Usage:

meme-chip [options] <primary sequence file>

Description

Input

<primary sequence file>

The name of a file of sequences in FASTA format on which to perform comprehensive motif analysis. Ideally the sequences should all be the same length, between 100 and 500 base-pairs long, and centrally enriched for motifs. The immediate regions around individual ChIP-seq "peaks" from a transcription factor (TF) ChIP-seq experiment are ideal. The suggested 100 base-pair minimum size is based on the typical resolution of ChIP-seq peaks, but it is useful to have more of the surrounding sequence to give CentriMo the power to tell if a motif is centrally enriched. We recommend that you "repeat mask" your sequences, replacing repeat regions with the "N" character.
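The length recommendations above can be checked before running MEME-ChIP. The following is a minimal sketch and not part of the MEME Suite: the function names `read_fasta_lengths` and `check_lengths` are hypothetical, and the 100 to 500 bounds are simply the recommended range stated above.

```python
def read_fasta_lengths(text):
    """Return the length of each sequence in FASTA-formatted text."""
    lengths = []
    current = None  # length of the record being read, or None before the first header
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def check_lengths(lengths, lo=100, hi=500):
    """Return warnings about the length recommendations (empty list if none)."""
    warnings = []
    if len(set(lengths)) > 1:
        warnings.append("sequences are not all the same length")
    if any(n < lo or n > hi for n in lengths):
        warnings.append(f"some sequences are outside {lo}-{hi} bp")
    return warnings


# Two hypothetical 100 bp peak regions; the second is wrapped over two lines.
example = ">peak1\n" + "A" * 100 + "\n>peak2\n" + "C" * 60 + "\n" + "G" * 40 + "\n"
print(check_lengths(read_fasta_lengths(example)))
```

An empty list means the input meets both recommendations. Central enrichment cannot be checked this way; that is what CentriMo tests.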

Output

MEME-ChIP writes its output to files in a directory named memechip_out, which it creates if necessary. You can change the output directory using the -o or -oc options. The directory will contain the following files:

In addition, the MEME-ChIP output directory will contain sub-directories with the results of each of the individual analyses it performed. The results in these directories are all linked to from the MEME-ChIP HTML output file.

Note: See this detailed description of the MEME-ChIP output formats for more information.

Options

Option  Parameter  Description  Default Behavior
Output
Primary Sequences
Control Sequences and Background Model
-neg file: MEME-ChIP will look for motifs enriched in the primary sequences relative to this control set of sequences in FASTA format. These sequences will be input as control sequences to MEME, STREME and CentriMo. MEME will use its "Differential Enrichment" objective function. When this option is used, the primary and control sequences should all be the same length; otherwise CentriMo E-values will be inaccurate. If the primary sequences are ChIP-seq peak regions from a transcription factor ChIP-seq experiment, similar regions from a knockout cell line or organism are a possible choice for control sequences. The control sequences should be prepared in exactly the same way (e.g., repeat-masking) as the primary sequences. Default: No control sequences are used for MEME or CentriMo. MEME uses the "Classic" objective function. STREME will create the control set by shuffling the primary sequences while preserving the frequency statistics of the order specified by the -order option.
-order order: Set the order of the Markov background model that will be used. If a background model is not specified via -bfile, MEME-ChIP will create one from the control sequences (if given) or from the primary sequences. STREME will generate its own Markov background model of this order and, if control sequences are not given, it will create them by shuffling the primary sequences while preserving the frequency statistics of this order. Default: A Markov background model of order 2 is generated and passed to the programs that support it.
-bfile file: Pass the file specifying a background model in Markov Background Model Format to programs that support a background model (MEME, CentriMo, FIMO, SpaMo and Tomtom). Consult the documentation for those programs for details on how they use the background model. In short, a background model normalizes for a biased distribution of letters and groups of letters in your sequences. A 0-order model adjusts for single letter biases, a 1-order model adjusts for dimer biases (e.g., GC content in DNA sequences), etc. Default: A 0-order Markov background model is calculated from the input sequences and then passed by MEME-ChIP to the programs that support background models.
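To make "adjusts for single letter biases" concrete, the sketch below estimates 0-order letter frequencies from a set of DNA sequences. It is only an illustration under stated assumptions: `zero_order_background` is a hypothetical helper, it ignores any letter outside the chosen alphabet (such as the "N" produced by repeat masking), and it does not reproduce details that the MEME Suite tools may apply, such as pseudocounts or strand averaging. It also does not write Markov Background Model Format.

```python
from collections import Counter


def zero_order_background(sequences, alphabet="ACGT"):
    """Estimate single-letter frequencies (a 0-order model) from sequences."""
    counts = Counter()
    for seq in sequences:
        # Count only letters in the alphabet; masked "N" positions are skipped.
        counts.update(ch for ch in seq.upper() if ch in alphabet)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alphabet letters found in the input")
    return {letter: counts[letter] / total for letter in alphabet}


freqs = zero_order_background(["ACGTNN", "GGCC"])
for letter, freq in freqs.items():
    print(letter, f"{freq:.3f}")
```

A higher-order model would count overlapping words of length order + 1 in the same way.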
-psp-gen: Use psp-gen to create a position-specific prior for use by MEME. MEME will use this prior and the "Classic" objective function, rather than the "Differential Enrichment" objective function. Note 1: Requires the -neg option. Note 2: This was the default prior to MEME Suite version 5.0. Default: MEME uses the control sequences directly via its "Differential Enrichment" objective function, and no position-specific prior is created.
-seed seed: The seed for the randomized selection of sequences for MEME and the shuffling of sequences for STREME. Default: A seed value of 1 is used.
Input Motifs
-db motif file (recommended): The name of a file containing known motifs in MEME Motif format. Outputs from MEME, STREME and DREME are supported, as well as Minimal MEME Format. You can convert many other motif formats to MEME Motif format using conversion scripts available with the MEME Suite. This option may be repeated to pass multiple files of motifs. Default: When no files are provided, MEME-ChIP cannot report similar known motifs.
Alphabet
Output Filtering and Number of Motifs
-filter-thresh fthr: E-value threshold for including motifs in the output. Default: A value of 0.05 is used.
-time minutes: The maximum time that MEME-ChIP is allowed to run before terminating itself gracefully. Default: There is no time limit.
Motif Width
-minw width: The minimum width of motifs to find. Default: A minimum width of 6 is used, unless the maximum width has been set to less than 6, in which case the maximum width is used.
-maxw width: The maximum width of motifs to find. Default: A maximum width of 15 is used, unless the minimum width has been set to more than 15, in which case the minimum width is used.
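The way the two width defaults depend on each other can be written out as a small function. This is a sketch of the rule as described above, not code taken from MEME-ChIP; `resolve_widths` is a hypothetical name, and rejecting a minimum larger than the maximum is an assumption of this sketch.

```python
def resolve_widths(minw=None, maxw=None, default_min=6, default_max=15):
    """Return the (minimum, maximum) motif widths implied by the settings."""
    if minw is not None and maxw is not None:
        if minw > maxw:
            # Assumption: contradictory settings are treated as an error here.
            raise ValueError("minimum width exceeds maximum width")
        return minw, maxw
    if minw is None and maxw is None:
        return default_min, default_max
    if minw is None:
        # Default minimum is 6 unless the maximum was set below 6.
        return min(default_min, maxw), maxw
    # Default maximum is 15 unless the minimum was set above 15.
    return minw, max(default_max, minw)


print(resolve_widths())         # neither option given
print(resolve_widths(maxw=4))   # only -maxw 4 given
print(resolve_widths(minw=20))  # only -minw 20 given
```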
Misc
-db file: Use a file containing DNA motifs in MEME Motif format. This file will be used by Tomtom and CentriMo. This option may be used multiple times to pass multiple files. Default: When no files are provided, Tomtom cannot suggest similar motifs and CentriMo is limited to the discovered motifs.
-ccut size: For input to MEME and STREME, trim sequences to their central region of size base-pairs. (The full-length sequences are input to CentriMo and SpaMo.) A value of 0 indicates that sequences should not be trimmed before being passed to MEME and STREME. Default: A maximum size of 100 is used.
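The trimming to a central region can be pictured with the following sketch. `trim_center` is a hypothetical helper; how MEME-ChIP rounds when the excess length is odd is not stated above, so the floor division here is an assumption.

```python
def trim_center(seq, size=100):
    """Return the central `size` letters of seq; size 0 disables trimming."""
    if size < 0:
        raise ValueError("size must be non-negative")
    if size == 0 or len(seq) <= size:
        return seq
    # Assumption: any odd leftover letter is dropped from the right-hand side.
    start = (len(seq) - size) // 2
    return seq[start:start + size]


region = "A" * 50 + "C" * 100 + "G" * 50  # hypothetical 200 bp peak region
print(len(trim_center(region)))           # length after default trimming
print(len(trim_center(region, size=0)))   # trimming disabled
```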
-group-thresh gthr: Main threshold for clustering highly similar motifs in MEME-ChIP output. All motifs in a group will have a Tomtom E-value less than or equal to gthr when compared to the seed motif for the group, which is the most significant motif in the group. Default: A value of 0.05 is used.
-group-weak wthr: Secondary threshold for clustering highly similar motifs in MEME-ChIP output. If this is specified by the user, groups will be merged into a more significant group if all their motifs are weakly similar to the seed motif of the more significant group. wthr specifies the Tomtom E-value threshold for merging groups. Default: Set to twice the value of the main clustering threshold: 2 * gthr.
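The relationship between the two thresholds, and the invariant that every group member is within gthr of its seed, can be illustrated as follows. This is a hedged sketch only: `weak_threshold` and `group_motifs` are hypothetical names, the greedy first-match assignment is an assumption rather than MEME-ChIP's documented procedure, the `evalue` callable stands in for a Tomtom comparison, and the secondary merging step with wthr is not shown.

```python
def weak_threshold(gthr=0.05, wthr=None):
    """Return the secondary threshold, defaulting to twice the main one."""
    return 2 * gthr if wthr is None else wthr


def group_motifs(motifs, evalue, gthr=0.05):
    """Greedily group motifs that are listed from most to least significant.

    evalue(seed, motif) must return the similarity E-value of motif to seed.
    The first motif placed in each group is its seed.
    """
    groups = []
    for motif in motifs:
        for group in groups:
            if evalue(group[0], motif) <= gthr:
                group.append(motif)
                break
        else:
            groups.append([motif])  # no seed is similar enough: new group
    return groups


# Toy E-values between a seed and a candidate motif (made-up numbers).
toy = {("m1", "m2"): 0.01, ("m1", "m3"): 3.0}
groups = group_motifs(["m1", "m2", "m3"], lambda seed, m: toy[(seed, m)])
print(groups)
print(weak_threshold())
```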
-old-clustering: Pick seed motifs for clustering based only on significance. Default: Discovered motifs are preferentially used as seed motifs for clustering.
-noecho: Don't echo the commands run. Default: Echo the commands run to standard output.
MEME Specific Options
-meme-nmotifs num: The number of motifs that MEME should search for. If num is 0, MEME will not be run. Default: MEME will find 3 motifs.
-meme-searchsize searchsize: The maximum portion of the primary sequences (in characters) used by MEME in searching for motifs. See the description of the -searchsize option in the MEME documentation for more details. Default: MEME performs sampling if the primary sequences contain more than 100,000 characters.
-meme-norand: If your (primary) sequences are sorted in order of confidence (best to worst) then you should select this option. See the MEME documentation for the -norand option for more details. Default: MEME randomly selects the (primary) sequences to include in its initial analysis if there are more than searchsize primary sequences.
-meme-p np: Use the faster, parallel version of MEME with np processors. The parameter np may be a number or it may be a quoted string starting with a number and followed by arguments to the particular MPI run command for your installation (e.g., mpirun). Default: Use a single processor.
-meme-brief nbrief: If there are more than nbrief (primary) sequences, the size of MEME's output will be reduced by suppressing the inclusion of the sequence names, motif sites and scanned sites in MEME's HTML and XML outputs, and by suppressing the tables of sequence lengths, sites and block diagrams in MEME's text output. Default: A value of 1000 is used for nbrief.
-meme-mod oops|zoops|anr: The number of motif sites that MEME will find per sequence.
oops - One Occurrence Per Sequence,
zoops - Zero or One Occurrence Per Sequence,
anr - Any Number of Repetitions.
See -mod in the MEME command-line documentation.
Default: MEME uses zoops mode.
-meme-minsites sites: The minimum number of sites that MEME needs to find for a motif. Default: MEME doesn't require any minimum number of sites for a motif.
-meme-maxsites sites: The maximum number of sites that MEME will find for a motif. Default: MEME doesn't limit the number of sites it will find for a motif.
-meme-pal: Restrict MEME to searching for palindromes only. Default: MEME searches for any motif, not just palindromes.
STREME Specific Options
-streme-pvt p-value: Stop searching for more motifs when three successive motifs have p-values larger than this threshold. Default: A p-value threshold of 0.05 is used.
-streme-nmotifs count: Stop searching for more motifs when count motifs have been found. If count is 0, STREME will not be run. Default: Search stops when the -streme-pvt criterion has been satisfied.
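The two stopping rules can be sketched as a function over the sequence of motif p-values in the order the motifs are found. This is an illustration, not STREME's implementation: `motifs_examined` is a hypothetical name, and treating a given count as replacing the p-value criterion (rather than combining with it) is an assumption based on the default described above.

```python
def motifs_examined(pvalues, pvt=0.05, nmotifs=None):
    """Return how many motifs are examined before the search stops."""
    if nmotifs is not None:
        # Count criterion; a count of 0 means the search is not run at all.
        return min(nmotifs, len(pvalues))
    misses = 0  # current run of successive motifs with p-value above pvt
    for found, p in enumerate(pvalues, start=1):
        misses = misses + 1 if p > pvt else 0
        if misses == 3:
            return found
    return len(pvalues)


# Made-up p-values: the run of three misses ends at the sixth motif.
pvals = [0.001, 0.2, 0.01, 0.3, 0.4, 0.5, 0.001]
print(motifs_examined(pvals))
print(motifs_examined(pvals, nmotifs=2))
```

Note that a single significant motif resets the run of misses, so the second motif in the example does not stop the search.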
-streme-align left|center|right: For the STREME site positional distribution diagrams, align the sequences on their left ends (left), on their centers (center), or on their right ends (right). For visualizing motif distributions, center alignment is ideal for ChIP-seq and similar data; right alignment for sequences upstream of transcription start sites; left alignment for many proteins or 3' UTR sequences. Default: Align the sequences on their centers in the STREME site positional distribution diagrams.
-streme-totallength len: Tell STREME to randomly choose sequences from the input files so that the total length of the sequences chosen from each sequence file is at most len. This can prevent STREME from running out of memory. Default: STREME uses all the input sequences.
CentriMo Specific Options
-centrimo-ethresh E-value: Set the E-value threshold for reporting enriched central regions. Default: An E-value threshold of 10 will be used.
-centrimo-local: CentriMo will perform local motif enrichment analysis, computing enrichment in every possible sequence region. Default: CentriMo will perform central motif enrichment analysis, computing enrichment in centered regions only.
-centrimo-score score: Set the minimum accepted score for a match. Default: A minimum score of 5 is used.
-centrimo-maxreg region: Set the maximum region size tested. Default: CentriMo will test all valid region sizes.
-centrimo-noseq: Do not store sequence IDs in the output of CentriMo. Default: CentriMo stores a list of the sequence IDs with matches in the best region for each motif.
-centrimo-flip: Reflect the positions of matches on the reverse strand around the center. Default: Matches on the reverse strand are counted where they occur in the sequence.
SpaMo Specific Options
-spamo-skip: Do not run SpaMo. Can be combined with options -meme-nmotifs 0, -streme-nmotifs 0, and -fimo-skip to use MEME-ChIP to run CentriMo and cluster the significant motifs. Default: Run SpaMo using the most significant motif from each cluster as the primary motif.
FIMO Specific Options
-fimo-skip: Do not run FIMO. Can be combined with options -meme-nmotifs 0, -streme-nmotifs 0, and -spamo-skip to use MEME-ChIP to run CentriMo and cluster the significant motifs. Default: Run FIMO using the most significant motif from each cluster to scan the input sequences.
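The combination of options described above for a CentriMo-only run can be assembled as an argument list. This is a sketch under assumptions: `centrimo_only_command` is a hypothetical helper, the file names `peaks.fa` and `motifs.meme` are placeholders, and only options named in this document are used. The command is printed rather than executed.

```python
import shlex


def centrimo_only_command(sequences, motif_db, outdir="memechip_out"):
    """Build a meme-chip command that skips MEME, STREME, SpaMo and FIMO."""
    return [
        "meme-chip",
        "-oc", outdir,           # output directory
        "-db", motif_db,         # known motifs for CentriMo to test
        "-meme-nmotifs", "0",    # 0 motifs: MEME is not run
        "-streme-nmotifs", "0",  # 0 motifs: STREME is not run
        "-spamo-skip",
        "-fimo-skip",
        sequences,               # primary sequence file goes last
    ]


cmd = centrimo_only_command("peaks.fa", "motifs.meme")
print(shlex.join(cmd))
```

With MEME Suite installed, the list could be passed to `subprocess.run(cmd, check=True)`. Because no motifs are discovered in this mode, a motif file given with -db is what CentriMo has to work with.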

Citing