MEME-ChIP

<primary sequence file>

The name of a file of sequences in FASTA format on which to perform comprehensive motif analysis. Ideally the sequences should all be the same length, between 100 and 500 base-pairs long, and centrally enriched for motifs. The immediate regions around individual ChIP-seq "peaks" from a transcription factor (TF) ChIP-seq experiment are ideal. The suggested 100 base-pair minimum size is based on the typical resolution of ChIP-seq peaks, but it is useful to have more of the surrounding sequence to give CentriMo the power to tell if a motif is centrally enriched. We recommend that you "repeat mask" your sequences, replacing repeat regions with the "N" character.
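The guidelines above can be checked before running MEME-ChIP. The following is a minimal sketch, not part of the MEME Suite: the helper names `read_fasta` and `check_sequences` are hypothetical, and the 100 to 500 base-pair bounds are simply the suggested range from the paragraph above.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_sequences(records, min_len=100, max_len=500):
    """Summarise how well the records match the suggested input guidelines."""
    lengths = [len(seq) for _, seq in records]
    total = sum(lengths)
    masked = sum(seq.upper().count("N") for _, seq in records)
    return {
        "count": len(records),
        # True when every sequence has the same length.
        "same_length": len(set(lengths)) <= 1,
        # Headers of sequences outside the suggested length range.
        "out_of_range": [
            name for name, seq in records if not min_len <= len(seq) <= max_len
        ],
        # Fraction of all characters that are the masking character "N".
        "masked_fraction": masked / total if total else 0.0,
    }
```

A report with `same_length` set to `False` or a non-empty `out_of_range` list suggests re-extracting the peak regions at a fixed width before analysis.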

MEME-ChIP writes its output to files in a directory named memechip_out, which it creates if necessary. You can change the output directory using the -o or -oc options. The directory will contain the following files:

meme-chip.html - an HTML file that provides the results in an interactive, human-readable format that contains links to the other files produced by the analyses performed by MEME-ChIP
summary.tsv - a TSV (tab-separated values) file that provides a summary of the results in a format suitable for parsing by scripts and viewing with Excel
combined.meme - a text file that contains all the motifs identified by MEME-ChIP in MEME Motif Format

In addition, the MEME-ChIP output directory will contain sub-directories with the results of each of the individual analyses it performed. The results in these directories are all linked to from the MEME-ChIP HTML output file.

Note: See this detailed description of the MEME-ChIP output formats for more information.

Option	Parameter	Description	Default Behavior
Output
Primary Sequences
Control Sequences and Background Model
-neg	file	MEME-ChIP will look for motifs enriched in the primary sequences relative to this control set of sequences in FASTA format. These sequences will be input as control sequences to MEME, STREME and CentriMo. MEME will use its "Differential Enrichment" objective function. When this option is used, the primary and control sequences should all be the same length; otherwise CentriMo E-values will be inaccurate. If the primary sequences are ChIP-seq peak regions from a transcription factor ChIP-seq experiment, similar regions from a knockout cell line or organism are a possible choice for control sequences. The control sequences should be prepared in exactly the same way (e.g., repeat-masking) as the primary sequences.	No control sequences are used for MEME or CentriMo. MEME uses the "Classic" objective function. STREME will create the control set by shuffling the positive sequences while preserving the frequency statistics of the order specified by the -order option.
-order	order	Set the order of the Markov background model that will be used. If a background model is not specified via -bfile, MEME-ChIP will create one from the control sequences (if given) or from the primary sequences. STREME will generate its own Markov background model of this order, and, if control sequences are not given, it will create them from the primary sequences by shuffling while preserving the frequency statistics of this order.	A Markov background model of order 2 is generated and passed to the programs that support it.
-bfile	file	Pass the file specifying a background model in Markov Background Model Format to programs that support a background model (MEME, CentriMo, FIMO, SpaMo and Tomtom). Consult the documentation for those programs for details on how they use the background model. Basically, you can use a background model in order to normalize for biased distribution of letters and groups of letters in your sequences. A 0-order model adjusts for single letter biases, a 1-order model adjusts for dimer biases (e.g., GC content in DNA sequences), etc.	A 0-order Markov background model is calculated from the input sequences and then passed by MEME-ChIP to the programs that support background models.
-psp-gen		Use `psp-gen` to create a position-specific prior for use by MEME. MEME will use this prior and the "Classic" objective function, rather than the "Differential Enrichment" objective function. Note1: Requires the -neg option. Note2: This was the default prior to MEME Suite version 5.0.	MEME uses the control sequences directly via its "Differential Enrichment" objective function, and no position-specific prior is created.
-seed	seed	The seed for the randomized selection of sequences for MEME and the shuffling of sequences for STREME.	A seed value of 1 is used.
Input Motifs
-db	motif file	Recommended. The name of a file containing known motifs in MEME Motif Format. Outputs from MEME, STREME and DREME are supported, as well as Minimal MEME Format. You can convert many other motif formats to MEME Motif Format using conversion scripts available with the MEME Suite. This option may be repeated to pass multiple files of motifs.	When no files are provided, MEME-ChIP cannot report similar known motifs.
Alphabet
Output Filtering and Number of Motifs
-filter-thresh	fthr	E-value threshold for including motifs in the output.	A value of 0.05 is used.
-time	minutes	The maximum time that MEME-ChIP is allowed to run before terminating itself gracefully.	There is no time limit.
Motif Width
-minw	width	The minimum width of motifs to find.	A minimum width of 6 is used unless the maximum width has been set to be less than 6 in which case the maximum width is used.
-maxw	width	The maximum width of motifs to find.	A maximum width of 15 is used unless the minimum width has been set to be larger than 15 in which case the minimum width is used.
Misc
-db	file	Use a file containing DNA motifs in MEME Motif Format. This file will be used by Tomtom and CentriMo. This option may be used multiple times to pass multiple files.	When no files are provided, Tomtom can't suggest similar motifs and CentriMo is limited to the discovered motifs.
-ccut	size	For input to MEME and STREME, trim sequences to their central region of size base-pairs. (The full-length sequences are input to CentriMo and SpaMo.) A value of 0 indicates that sequences should not be trimmed before being passed to MEME and STREME.	A maximum size of 100 is used.
-group-thresh	gthr	Main threshold for clustering highly similar motifs in MEME-ChIP output. All motifs in a group will have a Tomtom E-value less than or equal to gthr when compared to the seed motif for the group, which is the most significant motif in the group.	A value of 0.05 is used.
-group-weak	wthr	Secondary threshold for clustering highly similar motifs in MEME-ChIP output. If this is specified by the user, groups will be merged into a more significant group if all their motifs are weakly similar to the seed motif of the more significant group. wthr specifies the Tomtom E-value threshold for merging groups.	Set to be equal to twice the value of the main clustering threshold: 2 * gthr.
-old-clustering		Pick seed motifs for clustering based only on significance.	Discovered motifs are preferentially used as seed motifs for clustering.
-noecho		Don't echo the commands run.	Echo the commands run to standard output.
MEME Specific Options
-meme-nmotifs	num	The number of motifs that MEME should search for. If num is 0, MEME will not be run.	MEME will find 3 motifs.
-meme-searchsize	searchsize	The maximum portion of the primary sequences (in characters) used by MEME in searching for motifs. See the MEME documentation for the -searchsize option for more details.	MEME performs sampling if the primary sequences contain more than 100,000 characters.
-meme-norand		If your (primary) sequences are sorted in order of confidence (best to worst) then you should select this option. See the MEME documentation for the -norand option for more details.	MEME randomly selects the (primary) sequences to include in its initial analysis if the primary sequences contain more than searchsize characters.
-meme-p	np	Use faster, parallel version of MEME with np processors. The parameter np may be a number or it may be a quoted string starting with a number and followed by arguments to the particular MPI run command for your installation (e.g., `mpirun`).	Use a single processor.
-meme-brief	nbrief	If there are more than nbrief (primary) sequences, the size of MEME's output will be reduced by suppressing the inclusion of the sequence names, motif sites and scanned sites in MEME's HTML and XML outputs, and by suppressing the tables of sequence lengths, sites and block diagrams in MEME's text output.	A value of 1000 is used for nbrief.
-meme-mod	oops|zoops|anr	The number of motif sites that MEME will find per sequence. oops - One Occurrence Per Sequence, zoops - Zero or One Occurrence Per Sequence, anr - Any Number of Repetitions. See -mod in the MEME command-line documentation.	MEME defaults to using zoops mode.
-meme-minsites	sites	The minimum number of sites that MEME needs to find for a motif.	MEME doesn't require any minimum number of sites for a motif.
-meme-maxsites	sites	The maximum number of sites that MEME will find for a motif.	MEME doesn't limit the number of sites it will find for a motif.
-meme-pal		Restrict MEME to searching for palindromes only.	MEME searches for any motif, not just palindromes.
STREME Specific Options
-streme-pvt	p-value	Stop searching for more motifs when three successive motifs have p-values larger than this threshold.	A p-value threshold of 0.05 is used.
-streme-nmotifs	count	Stop searching for more motifs when count motifs have been found. If count is 0, STREME will not be run.	Search stops when the -streme-pvt criterion has been satisfied.
-streme-align	left|center|right	For the STREME site positional distribution diagrams, align the sequences on their left ends (left), on their centers (center), or on their right ends (right). For visualizing motif distributions, center alignment is ideal for ChIP-seq and similar data; right alignment for sequences upstream of transcription start sites; left alignment for many proteins or 3' UTR sequences.	Align the sequences on their centers in the STREME site positional distribution diagrams.
-streme-totallength	len	Tell STREME to randomly choose sequences from the input files so that the total length of the sequences chosen from each sequence file is at most len. This can prevent STREME from running out of memory.	STREME uses all the input sequences.
CentriMo Specific Options
-centrimo-ethresh	E-value	Set the E-value threshold for reporting enriched central regions.	An E-value threshold of 10 will be used.
-centrimo-local		CentriMo will perform local motif enrichment analysis, computing enrichment in every possible sequence region.	CentriMo will perform central motif enrichment analysis, computing enrichment in centered regions only.
-centrimo-score	score	Set the minimum accepted score for a match.	A minimum score of 5 is used.
-centrimo-maxreg	region	Set the maximum region size to be tested.	CentriMo will test all valid region sizes.
-centrimo-noseq		Do not store sequence IDs in the output of CentriMo.	CentriMo stores a list of the sequence IDs with matches in the best region for each motif.
-centrimo-flip		Reflect the positions of matches on the reverse strand around the center.	Matches on the reverse strand are counted where they occur in the sequence.
SpaMo Specific Options
-spamo-skip		Do not run SpaMo. Can be combined with options -meme-nmotifs 0, -streme-nmotifs 0, and -fimo-skip to use MEME-ChIP to run CentriMo and cluster the significant motifs.	Run SpaMo using the most significant motif from each cluster as primary.
FIMO Specific Options
-fimo-skip		Do not run FIMO. Can be combined with options -meme-nmotifs 0, -streme-nmotifs 0, and -spamo-skip to use MEME-ChIP to run CentriMo and cluster the significant motifs.	Run FIMO using the most significant motif from each cluster to scan input sequences.
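To show how the options in the table combine, here is a hedged sketch that assembles a `meme-chip` argument list. The function `build_meme_chip_command` and all file names are hypothetical; only the flags themselves (-oc, -neg, -db, -meme-nmotifs, -ccut) come from this page, and the sketch assumes the primary sequence file is given last, after the options.

```python
def build_meme_chip_command(
    primary,
    out_dir=None,
    control=None,
    motif_dbs=(),
    meme_nmotifs=None,
    ccut=None,
):
    """Assemble an argument list for meme-chip from a few documented options.

    Options left as None (or empty) are omitted so that MEME-ChIP
    falls back to its default behavior for them.
    """
    args = ["meme-chip"]
    if out_dir is not None:
        args += ["-oc", str(out_dir)]
    if control is not None:
        args += ["-neg", str(control)]
    # -db may be repeated, once per motif file.
    for db in motif_dbs:
        args += ["-db", str(db)]
    if meme_nmotifs is not None:
        args += ["-meme-nmotifs", str(meme_nmotifs)]
    if ccut is not None:
        args += ["-ccut", str(ccut)]
    # Assumption: the primary sequence file follows the options.
    args.append(str(primary))
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)` on a system where the MEME Suite is installed; building a list rather than a single string avoids shell quoting problems with file names that contain spaces.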

The MEME Suite

Motif-based sequence analysis tools

Motif Analysis of Large Nucleotide Datasets
