If your sequences are not in a standard alphabet (DNA, RNA or protein), you must input a custom alphabet file.

Click on the menu at the left to see which of the following sequence input methods are available.
Type in sequences
When this option is available you may directly input multiple sequences by typing them. Sequences must be input in FASTA format.
Upload sequences
When this option is available you may upload a file containing sequences in FASTA format.
Upload BED file
When this option is available you may upload a file containing sequence coordinates in BED format.
Databases (select category)
When this option is available you may first select a category of sequence database from the list below it. Two additional menus will then appear where you can select the particular database and version desired, respectively. The full list of available sequence databases and their descriptions can be viewed HERE.
Submitted sequences
This option is only available when you have invoked the current program by clicking on a button in the output report of a different MEME Suite program. By selecting this option you will input the sequences sent by that program.
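As a rough illustration of the FASTA format required for typed or uploaded sequences, the following sketch reads FASTA text into (header, sequence) pairs. The function name `parse_fasta` and the example records are hypothetical and are not part of the MEME Suite; the server's own validation may be stricter.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A '>' line starts a new record; store the previous one first.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header line")
        else:
            chunks.append(line)  # sequence data may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical input with two records.
example = ">seq1 first example\nACGTACGT\nACGT\n>seq2\nTTGACA\n"
records = parse_fasta(example)
```

Each record begins with a `>` header line, and the sequence lines that follow are joined into a single string.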

Specify a file to upload containing sequence coordinates in BED format. The file must be based on the exact genome version you specified in the menus above.
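For readers unfamiliar with BED, the sketch below reads the first three columns (chromosome, start, end) of each line. It relies on the standard BED convention that the start is 0-based and the end is exclusive, so a region's length is `end - start`. The function name `parse_bed` and the example coordinates are hypothetical, and the checks the server performs on uploads are not described here.

```python
def parse_bed(text):
    """Return (chrom, start, end) tuples from the first three BED columns."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and optional track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError("BED line needs at least 3 columns: " + line)
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Assumption for this sketch: regions used as sequences must be non-empty.
        if start < 0 or end <= start:
            raise ValueError("invalid coordinates: " + line)
        regions.append((chrom, start, end))
    return regions


# Hypothetical coordinates; extra columns after the third are ignored.
example = "chr1\t100\t150\tpeak1\n" + "chr2\t2000\t2040\n"
regions = parse_bed(example)
lengths = [end - start for _, start, end in regions]
```

The coordinates only make sense relative to one genome assembly, which is why the file must match the genome version selected in the menus.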


Select an available sequence database from this menu.


Select an available version of the sequence database from this menu.


Select an available tissue/cell-specificity from this menu.


Selecting this option will filter the sequence menu to only contain databases that have additional information that is specific to a tissue or cell line.

This option causes the MEME Suite to use tissue/cell-specific information (typically from DNase I or histone modification ChIP-seq data) encoded as a position-specific prior created by the MEME Suite create-priors utility. You can see a description of the sequence databases for which we provide tissue/cell-specific priors here.

Note that you cannot upload or type in your own sequences when tissue/cell-specific scanning is selected.


Enter text naming or describing this analysis. The job description will be included in the notification email you receive and in the job output.


This is the width (number of characters in the sequence pattern) of a single motif. STREME chooses the optimal width of each motif individually using a heuristic function. You can choose limits for the minimum and maximum motif widths that STREME will consider. The width of each motif that STREME reports will lie within the limits you choose.


The background model normalizes for biased distribution of letters and groups of letters in your sequences. A 0-order model adjusts for single-letter biases (e.g., GC content in DNA sequences), a 1-order model adjusts for dimer biases (e.g., CpG depletion in DNA sequences), etc.
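To make the notion of model order concrete, the sketch below tabulates the relative frequencies of all words of length order + 1 in a set of sequences: single letters for order 0, dimers for order 1. This is a simplified illustration only. The function name `kmer_frequencies` is hypothetical, and a real background model may additionally handle both DNA strands, pseudocounts, or ambiguous letters, none of which is attempted here.

```python
from collections import Counter


def kmer_frequencies(sequences, order):
    """Relative frequencies of words of length order + 1 across all sequences."""
    k = order + 1
    counts = Counter()
    for seq in sequences:
        # Slide a window of width k along the sequence.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


# Hypothetical toy sequences.
seqs = ["ACGT", "AACC"]
order0 = kmer_frequencies(seqs, 0)  # single-letter frequencies
order1 = kmer_frequencies(seqs, 1)  # dimer frequencies
```

In this toy input, A and C each make up 3 of the 8 letters, and the dimer AC accounts for 2 of the 6 dimers.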

By default, STREME will determine the background Markov model from the control sequences (or from the primary sequences if you do not provide control sequences). The order of the background model depends on the sequence alphabet, but you can also set it manually (see option "What Markov order...", below). Alternatively, you may select "Upload background model" and input a file containing a background model.

The downloadable version of the MEME Suite contains a program named "fasta-get-markov" that you can use to create background model files in the correct format from FASTA sequence files.


Specify the order (m) for the background model and sequence shuffling. By default, STREME uses m=2 for DNA and RNA sequences, and m=0 for protein or custom alphabet sequences. Check this box and set the value of m if you want to override the default value of m that STREME uses.

If you upload a background model (see option above), STREME will only use the m-order portion of that model. If you do not upload a background model, STREME will create an order-m model from the control sequences that you provide, or from the shuffled primary sequences if you don't provide control sequences.

If you do not specify a set of control sequences, STREME will create one by shuffling each primary sequence while preserving the frequencies of all words of length k that it contains, where k=m+1.
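The property described above, that shuffling preserves the counts of all words of length k = m + 1, can be checked with a small word counter. The sketch below covers only the simplest case, m = 0, where an ordinary random permutation of the letters already preserves every single-letter count. For m > 0 a plain permutation generally does not preserve the word counts, and a dedicated k-mer-preserving shuffle is needed; that algorithm is not shown, and this is not STREME's implementation. The names `word_counts` and `shuffle_order0` are hypothetical.

```python
import random
from collections import Counter


def word_counts(seq, k):
    """Count every word of length k in seq (overlapping windows)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def shuffle_order0(seq, rng):
    """Randomly permute the letters of seq; valid only for order m = 0."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)


rng = random.Random(0)    # fixed seed so the sketch is repeatable
primary = "ACGTTGCAACGT"  # hypothetical primary sequence
control = shuffle_order0(primary, rng)

m = 0
same_words = word_counts(primary, m + 1) == word_counts(control, m + 1)
```

The same `word_counts` comparison with a larger k can be used to verify any higher-order shuffle.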


If this option is checked, STREME will NOT trim the control sequences even if their average length exceeds that of the primary sequences. This will cause STREME to use the (less accurate) binomial test rather than Fisher's exact test if the control sequences are longer (on average) than the primary sequences.
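For orientation, the sketch below shows a generic textbook one-sided Fisher's exact test applied to counts of primary and control sequences that contain a motif site. How STREME counts sites and computes its statistics is not described here, so the function name `fisher_exact_greater`, its arguments, and the example counts are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_greater(pos_hits, pos_total, neg_hits, neg_total):
    """One-sided p-value: probability of seeing at least pos_hits sites in the
    primary set if sites were spread at random over all sequences."""
    total = pos_total + neg_total
    hits = pos_hits + neg_hits
    denom = comb(total, pos_total)
    max_hits = min(hits, pos_total)
    p = 0
    # Sum hypergeometric probabilities for outcomes at least as extreme.
    for x in range(pos_hits, max_hits + 1):
        p += comb(hits, x) * comb(total - hits, pos_total - x)
    return p / denom


# Hypothetical counts: 3 of 4 primary and 1 of 4 control sequences have a site.
p_value = fisher_exact_greater(3, 4, 1, 4)
```

With these counts the p-value is 17/70, about 0.243. The test treats every sequence as one equally weighted trial, which is one reason comparable sequence lengths in the two sets matter.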


When your sequences are in the DNA alphabet but you want them to be treated as single-stranded RNA, check this box.


When this option is selected, if the FASTA sequence header of an input sequence contains genomic coordinates in UCSC or Galaxy format, the discovered motif sites will be output in genomic coordinates. If the sequence header does not contain valid coordinates, the sites will be output with the start of the sequence as position 1.
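The coordinate translation described above can be sketched as follows. The sketch assumes a UCSC-style header of the form `chrom:start-end`, in which `start` is the genomic position of the first base of the sequence. That layout, the regular expression, and the function name `site_to_genomic` are assumptions made for illustration; the Galaxy format is not handled, and STREME's own parsing rules may differ.

```python
import re

# Assumed header layout: "chr1:1000-2000", optionally followed by other text.
HEADER_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)")


def site_to_genomic(header, site_offset, site_width):
    """Map a site (0-based offset within its sequence) to output coordinates.

    Returns (chrom, first_position, last_position). chrom is None when the
    header has no coordinates, in which case positions are relative to the
    sequence, with its first base as position 1.
    """
    match = HEADER_RE.match(header)
    if match is None:
        return (None, site_offset + 1, site_offset + site_width)
    # Assumption: the header's start is the position of the first base.
    start = int(match.group("start")) + site_offset
    return (match.group("chrom"), start, start + site_width - 1)


# Hypothetical headers and site positions.
with_coords = site_to_genomic("chr1:1000-2000", 10, 8)
without_coords = site_to_genomic("seq1", 10, 8)
```

Here the first call yields positions 1010 to 1017 on chr1, while the second falls back to sequence-relative positions 11 to 18.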


Data Submission Form

Perform discriminative motif discovery in sequence datasets (including in very large datasets). The sequences may be in the DNA, RNA or protein alphabet, or in a custom alphabet.

Select the type of control sequences to use

Select the sequence alphabet

Use sequences with a standard alphabet or specify a custom alphabet.

Input the sequences

Enter the sequences in which you want to find motifs.


Specify the genome your BED file is based on.

Select the BED file to upload.

Input job details

(Optional) Enter your email address.

(Optional) Enter a job description.

Advanced options

How wide can motifs be?

How should the search be limited?

What should be used as the background model?

What Markov order should be used for shuffling sequences and background model creation?

Should STREME trim the control sequences if needed?

How should sequences be aligned for site positional diagrams?

Should STREME parse genomic coordinates?



This server's job quota is set to 10 unfinished jobs per hour.

Note: if the combined form inputs exceed 80 MB, the job will be rejected.