The MEME Suite

Motif-based sequence analysis tools

The MEME Suite allows the biologist to discover novel motifs in collections of unaligned nucleotide or protein sequences, and to perform a wide variety of other motif-based analyses.

The MEME Suite supports motif-based analysis of DNA, RNA and protein sequences. You can also use it with sequences over novel sequence alphabets that you define. The Suite provides motif discovery algorithms using both probabilistic (MEME) and discrete models (STREME), which have complementary strengths. It also allows discovery of motifs with arbitrary insertions and deletions (GLAM2). The MEME Suite provides three tools for motif enrichment analysis--measuring the enrichment of known motifs in sets of sequences. The motif enrichment may be anywhere in the sequences (SEA, AME), or concentrated in the central regions of the sequences (CentriMo). The Suite also provides an algorithm for measuring the similarity between motifs (Tomtom). These three types of analysis are combined in a pipeline in two MEME Suite tools--XSTREME and MEME-ChIP--that perform comprehensive motif analysis in general sequences (XSTREME) or in ChIP-seq peaks (MEME-ChIP). In addition to motif discovery, the MEME Suite provides tools for scanning sequences for matches to motifs (FIMO, MAST and GLAM2Scan), scanning for clusters of motifs (MCAST), finding preferred spacings between motifs (SpaMo), predicting the biological roles of motifs (GOMo), and predicting the regulatory targets of transcription factors.

The MEME Suite comprises a collection of tools that work together, as shown below. Not all the tools are available as web services, so to get the full power of the MEME Suite you will need to download and install a local copy of the software. To see what has changed recently, you can peruse the release notes.

Motif Discovery
MEME	Find ungapped motifs in unaligned DNA, RNA or protein sequences.	Sample
STREME	Find ungapped motifs in large sets of DNA, RNA, protein or custom-alphabet sequences that are relatively enriched in your sequences compared to shuffled sequences or your control sequences.	Sample
XSTREME	Comprehensive motif analysis of datasets where the motifs can be anywhere in the sequences.	Sample
MEME-ChIP	Comprehensive motif analysis of datasets where the motifs tend to be centrally located, such as those from ChIP-seq experiments.	Sample
GLAM2	Find gapped motifs in DNA or protein sequences. It has a tutorial.	Sample
MoMo	Find PTM (post-translationally modified) motifs in fixed-width protein sequences.	Sample
DREME (deprecated)	Find short, ungapped motifs in large sets of DNA or RNA sequences. Also allows discriminative motif analysis using a set of control sequences.	Sample
Motif Enrichment Analysis
SEA	Simple Motif Enrichment Analysis: Find known DNA, RNA or protein motifs that are relatively enriched in the input sequences compared to shuffled versions of those sequences or control sequences.	Sample
CentriMo	Local Motif Enrichment Analysis: Find motifs that are enriched in local regions in equal-length sequences.	Sample
AME	Motif Enrichment Analysis: Find known DNA, RNA or protein motifs that are relatively enriched in the input sequences compared to shuffled versions of those sequences or control sequences, or that are enriched in sequences with small values of scores that you can specify with your input sequences.	Sample
SpaMo	Motif Spacing Analysis: Find known motifs that occur with preferred spacings relative to a primary motif in a set of DNA sequences.	Sample
GOMo	Gene Ontology Motif Enrichment: Identify possible roles (Gene Ontology terms) for DNA binding motifs.	Sample
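Several of the enrichment tools above compare how often a motif occurs in the input sequences versus control sequences. One common way to quantify that kind of difference is a one-sided Fisher exact test on the counts of sequences that contain a match. The sketch below is only an illustration of the idea, not the statistic any particular MEME Suite tool is confirmed to use; the function name and its arguments are hypothetical.

```python
from math import comb


def fisher_exact_enrichment(pos_hits, pos_total, neg_hits, neg_total):
    """One-sided Fisher exact p-value for enrichment in the primary set.

    pos_hits / pos_total: primary sequences with a motif match / all primary.
    neg_hits / neg_total: control sequences with a motif match / all control.
    (Hypothetical helper for illustration only.)
    """
    hits = pos_hits + neg_hits        # sequences with a match, both sets
    total = pos_total + neg_total     # all sequences, both sets
    denom = comb(total, pos_total)    # ways to choose the primary set
    upper = min(hits, pos_total)
    # Hypergeometric tail: probability of seeing pos_hits or more matches
    # in the primary set if matches were spread at random.
    numer = sum(
        comb(hits, k) * comb(total - hits, pos_total - k)
        for k in range(pos_hits, upper + 1)
    )
    return numer / denom


if __name__ == "__main__":
    # 3 of 3 primary sequences match, 0 of 3 control sequences match.
    print(fisher_exact_enrichment(3, 3, 0, 3))
```

A small p-value suggests the motif is over-represented in the primary sequences relative to the controls; with many motifs tested, the p-values would still need a multiple-testing correction.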
Motif Search
FIMO	Search a sequence database for occurrences of known motifs. This program treats each motif independently and reports all putative motif occurrences below a specified p-value threshold.	Sample
MAST	Search a sequence database for occurrences of known motifs. This program assumes exactly one occurrence of each motif per sequence, and each sequence in the database is assigned a p-value, based on the product of the p-values of the individual motif occurrences in that sequence.	Sample
MCAST	Search a sequence database for clusters of known motifs. MCAST employs a motif-based hidden Markov model, using a star topology and a novel scoring algorithm. The motifs may appear in any order.	Sample
MOTIPH	Search a set of aligned sequences for conserved matches to motifs. In addition to the aligned sequences and the motifs, this program requires a maximum likelihood phylogenetic tree estimating the evolutionary relationships and distances among the sequences.	Sample
GLAM2Scan	Search for occurrences of gapped motifs, discovered by GLAM2.	Sample
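The MAST entry above mentions a sequence p-value based on the product of the individual motif p-values. As a rough sketch of how such a product can be turned into a p-value, assume the n motif p-values are independent and uniformly distributed under the null hypothesis. The probability that their product is at most x is then x · Σ_{i=0}^{n-1} (−ln x)^i / i!. The function name below is hypothetical, and this is not MAST's own code.

```python
import math


def product_pvalue(pvalues):
    """P-value of the product of n independent, uniform(0, 1) p-values.

    Hypothetical helper: implements P(product <= x) =
    x * sum_{i=0}^{n-1} (-ln x)^i / i!  with x = prod(pvalues).
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 for p in pvalues):
        return 0.0
    log_prod = sum(math.log(p) for p in pvalues)  # ln(x), always <= 0
    t = -log_prod
    term = 1.0    # current (-ln x)^i / i!
    total = 1.0   # running sum, starting with the i = 0 term
    for i in range(1, n):
        term *= t / i
        total += term
    return min(1.0, math.exp(log_prod) * total)


if __name__ == "__main__":
    print(product_pvalue([0.1, 0.1]))
```

With a single motif the result is just that motif's p-value; with more motifs the correction grows, because a small product is easier to reach by chance when more factors are multiplied.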
Motif Comparison
Tomtom	Find motifs that are similar to a given DNA or RNA motif by searching a database of known motifs.	Sample
Gene Regulation
T-Gene	Predict regulatory links between regulatory elements (chromosomal regions) and genes.	Sample
Utilities
BED2FASTA	Extract the regions specified in a BED file from a genome.
Additional Primary Tools
AMA	Print the Average Motif Affinity score of each sequence in a database. The score is calculated by averaging the likelihood ratio scores for all feasible binding events to the given sequence and to its reverse strand.
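The averaging described for AMA can be sketched in a few lines. The code below assumes a DNA motif given as a list of per-position probability dictionaries and a background given as a single dictionary. It scores every window on the sequence and on its reverse complement with a likelihood ratio and returns the mean. All names are hypothetical, and the real tool handles details this sketch ignores, such as ambiguous letters and pseudocounts.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def likelihood_ratio(site, motif, background):
    """P(site | motif) / P(site | background) for one window."""
    ratio = 1.0
    for base, column in zip(site, motif):
        ratio *= column[base] / background[base]
    return ratio


def average_motif_affinity(seq, motif, background):
    """Mean likelihood ratio over all windows on both strands."""
    width = len(motif)
    scores = []
    for strand in (seq, reverse_complement(seq)):
        for start in range(len(strand) - width + 1):
            window = strand[start:start + width]
            scores.append(likelihood_ratio(window, motif, background))
    if not scores:
        raise ValueError("sequence is shorter than the motif")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    bg = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    motif = [{"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
             {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1}]
    print(average_motif_affinity("TTACGT", motif, bg))
```

A score near 1 means the sequence looks no more motif-like than background on average; larger values indicate higher overall affinity.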
Motif Format Conversion Scripts		Foreign Motif Formats
beeml2meme	Convert a BEEML matrix file to MEME format.
chen2meme	Convert a CHEN matrix file to MEME format.
elm2meme	Convert an ELM tab-separated file to MEME format.
iupac2meme	Convert an IUPAC string to MEME format.
jaspar2meme	Convert a directory of JASPAR files to MEME format.
matrix2meme	Convert count or frequency matrices to MEME format.
meme2meme	Convert and merge multiple MEME formatted files.
nmica2meme	Convert a nestedMICA (BioTiffin/XMS) matrix file to MEME format.
priority2meme	Convert a PRIORITY matrix file to MEME format.
prosite2meme	Convert a PROSITE pattern file to MEME format.
rna2meme	Convert a FASTA file with micro-RNA sequences into MEME motifs for their mRNA target sequences.
scpd2meme	Convert an SCPD matrix file to MEME format.
sites2meme	Convert files containing sites into MEME format.
taipale2meme	Convert a tab-separated file exported from a spreadsheet of Taipale results to MEME format.
tamo2meme	Convert a TAMO matrix file to MEME format.
transfac2meme	Convert a TRANSFAC matrix file to MEME format.
uniprobe2meme	Convert a UNIPROBE matrix file to MEME format.
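To give a feel for what such a conversion involves, here is a minimal sketch in the spirit of iupac2meme: it expands an IUPAC DNA string into a letter-probability matrix and prints it in a layout modelled on the minimal MEME motif format. The function names are hypothetical, the uniform background is an assumption, and the header layout is a best-effort reading of the format, so check it against the MEME Motif format documentation before relying on it. The installed script may also apply pseudocounts, which this sketch does not.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_matrix(pattern):
    """One row of A, C, G, T probabilities per IUPAC letter."""
    rows = []
    for code in pattern.upper():
        allowed = IUPAC[code]
        rows.append([1.0 / len(allowed) if base in allowed else 0.0
                     for base in "ACGT"])
    return rows


def to_meme_text(name, rows):
    """Render a matrix as text (assumed minimal MEME motif layout)."""
    lines = [
        "MEME version 4",
        "",
        "ALPHABET= ACGT",
        "",
        "strands: + -",
        "",
        "Background letter frequencies",
        "A 0.25 C 0.25 G 0.25 T 0.25",  # assumed uniform background
        "",
        f"MOTIF {name}",
        f"letter-probability matrix: alength= 4 w= {len(rows)}",
    ]
    lines += [" ".join(f"{p:.6f}" for p in row) for row in rows]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_meme_text("demo", iupac_to_matrix("TGAYNCA")))
```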
File Format Conversion Utilities
clustalw2fasta	Convert a Clustalw multiple alignment into FASTA format.
clustalw2phylip	Convert a Clustalw multiple alignment into Phylip format.
glam2format	Convert glam2 motifs to standard alignment formats.
obo2dag	Convert a Gene Ontology OBO file into a GO DAG file.
FASTA Sequence Utilities
dust	Mask low-complexity regions in DNA sequences in a FASTA file to `N` characters.
fasta-center	Output the central portion of each sequence in a FASTA file of sequences.
fasta-dinucleotide-shuffle	Shuffle the letters in each sequence in a FASTA file of nucleotide sequences, preserving the dinucleotide frequencies.
fasta-fetch	Fetch sequences from a FASTA sequence file. Requires an index file made by fasta-make-index.
fasta-file-indexer	Create an index for a FASTA file for use with programs such as BED2FASTA.
fasta-get-markov	Estimate a Markov model from a FASTA file of sequences.
fasta-grep	Find matches to a Perl regular expression in a FASTA file of sequences.
fasta-hamming-enrich	Compute the relative enrichment of a regular expression in two sets of sequences, where the shortest Hamming distance is used to classify sequences.
fasta-holdout-set	Split primary and control sequences into training and testing sets. Control sequences are generated by shuffling if not specified. Primary sequences will be centrally trimmed to a specified length if requested.
fasta-io	Read and write FASTA files.
fasta-make-index	Make an index for a FASTA file for use by fasta-fetch.
fasta-most	Writes the most frequently occurring sequence length and how many times it occurs.
fasta-shuffle-letters	Shuffle the letters in each sequence in a FASTA file, preserving k-mer frequencies. This makes use of uShuffle.
fasta-subsample	Extract a random selection of the sequences in a FASTA file. Can also subsample the sequences themselves.
fasta-unique-names	Copy a FASTA sequence file, changing any duplicate sequence names to ensure there are no duplicates.
gendb	Generate FASTA sequences from a Markov model.
getsize	Print statistics about or (higher-order) shuffle sequences read from a FASTA file.
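As an illustration of what estimating a Markov model from FASTA sequences involves (the task listed for fasta-get-markov), the sketch below parses FASTA text and computes the relative frequency of every tuple of length 1 up to order + 1. The function names are hypothetical, and this is not the utility's implementation: it ignores the reverse strand, ambiguous letters and pseudocounts.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA text into a dict of {name: sequence}."""
    seqs, name, chunks = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""  # first word of the header
            chunks = []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def tuple_frequencies(sequences, order):
    """Relative frequency of each k-mer for k = 1 .. order + 1."""
    freqs = {}
    for k in range(1, order + 2):
        counts = Counter()
        for seq in sequences:
            for i in range(len(seq) - k + 1):
                counts[seq[i:i + k]] += 1
        total = sum(counts.values())
        for kmer, count in counts.items():
            freqs[kmer] = count / total
    return freqs


if __name__ == "__main__":
    fasta = ">s1 example\nACGT\nAC\n>s2\nGG\n"
    model = tuple_frequencies(read_fasta(fasta).values(), order=1)
    for kmer in sorted(model, key=lambda s: (len(s), s)):
        print(kmer, round(model[kmer], 4))
```

An order-0 model is just the single-letter frequencies; higher orders add the longer tuples from which conditional probabilities can be derived.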
Other Utilities
alphtype	Classify a string passed as a command line argument as an instance of the DNA or protein alphabet.
ama-qvalues	Add q-values to AMA output.
ceqlogo	Create motif logos.
compute-prior-dist	Compute the distribution of priors in a MEME PSP format file.	Sample
compute-uniform-priors	Compute a uniform position-specific prior equal to the mean of the position-specific prior contained in a MEME PSP format file.	Sample
create-priors	Compute priors and their distribution from raw scores in a Wiggle format file.
fisher_exact	Compute the Fisher Exact test p-value.
fitevd	Fit an extreme value distribution to data.
glam2mask	Mask glam2 motifs out of sequences so that weaker motifs can be found.
gomo_highlight	Identify GO terms which are implied by other GO terms, allowing the most specific GO terms to be highlighted in the conversion to html.
index-sequence-db	Create index files for the genomes in the sequence database directory that was created by `update-sequence-db`.
meme2alph	Extract the sequence alphabet definition from a MEME Motif Format file (e.g., MEME, STREME or DREME output).
meme2images	Create motif logos from a MEME Motif Format file (e.g., MEME, STREME or DREME output).
meme-get-motif	Extract specified motifs from a MEME text (.txt) format file.
meme-rename	Easily rename MEME Suite HTML files to unique names incorporating the path name (rather than "meme.html").
motif-shuffle-columns	Shuffle the columns of each motif in a file of motifs in MEME motif format.
ncbi-genomes-tsv-to-csv	Convenience utility for creating an `update-sequence-db` CSV file specifying a list of NCBI species DNA and protein databases.
pmb_bf	Calculate the statistical power of phylogenetic motif models.
psp-gen	Generate position-specific priors from positive (likely to contain a feature of interest) and negative (unlikely to contain a feature of interest) sequences for use as an additional input to MEME.
purge	Remove highly similar members of a set of sequences.
qvalue	Compute q-values from p-values.
reconcile-tree-alignment	Given a tree and an alignment, identify the intersection of the sets of sequence IDs and leaf labels. Trim the extra sequences and leaves and print the resulting alignment and tree.
reduce-alignment	Extract specified columns from a multiple alignment.
remove-alignment-gaps	Remove from an alignment all columns that correspond to a gap in a specified species.
shadow	Perform phylogenetic shadowing on a given DNA alignment, using a given tree.
update-sequence-db	Create or update the sequence databases for a MEME Suite web server.
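For readers unfamiliar with q-values, the sketch below shows one well-known way to derive them from a list of p-values: the Benjamini-Hochberg step-up adjustment. It is a stand-in chosen for illustration; the qvalue utility listed above may use a different estimator (for example, one that also estimates the proportion of true null hypotheses). The function name is hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


if __name__ == "__main__":
    print(bh_qvalues([0.01, 0.04, 0.03, 0.5]))
```

Each q-value can be read as the smallest false discovery rate at which the corresponding result would still be called significant.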
Input File Formats
MEME Motif	The motif format which is supported by the MEME Suite.
Alphabet Definition	A method for defining custom alphabets.
Background Model	Background frequencies for DNA or protein sequences.
Peptide-Spectra Match	Peptides identified from tandem mass spectra.
Position-Specific Priors (PSP)	Priors (weights) on each position in each input sequence that can bias the search for motifs by MEME.
Dirichlet Mixtures	A Dirichlet mixture file specifies residues' tendencies to align with one another, and is the basis for scoring columns of aligned residues in MEME and GLAM2.
FASTA Sequence	DNA or protein sequences.
FASTA Coordinates	Sequence coordinates (e.g., genomic coordinates) in FASTA sequence headers.
ClustalW Alignment	A multiple alignment of DNA or protein sequences.
Foreign Motif Formats	Motif formats that can be converted into MEME motifs using the motif format conversion scripts available when you install the MEME Suite on your own computer.
GLAM2 Alphabet	A custom sequence alphabet for GLAM2. This can be used to provide alternate alphabets other than the standard DNA and protein.
GO DAG	A file format which stores the structure of the Gene Ontology so it can be used to improve GOMo output.
Motif Discovery Output Formats
STREME Output Formats	The file formats output by the STREME tool.
XSTREME Output Formats	The file formats output by the XSTREME tool.
MEME-ChIP Output Formats	The file formats output by the MEME-ChIP tool.
MoMo Output Formats	The file formats output by the MoMo tool.
Motif Enrichment Output Formats
SEA Output Formats	The file formats output by the SEA tool.
CentriMo Output Formats	The file formats output by the CentriMo tool.
AME Output Formats	The file formats output by the AME tool.
SpaMo Output Formats	The file formats output by the SpaMo tool.
GOMo Output Formats	The file formats output by the GOMo tool.
Motif Scanning Output Formats
FIMO Output Formats	The file formats output by the FIMO tool.
MCAST Output Formats	The file formats output by the MCAST tool.
Motif Comparison Output Formats
Tomtom Output Formats	The file formats output by the Tomtom tool.
Gene Regulation Output Formats
T-Gene Output Formats	The file formats output by the T-Gene tool.
Guides and Tutorials
Installation	How to install a local copy of the MEME Suite.
Release notes	A list of changes included in the latest release.

Development of the MEME Suite was funded by grant R01 GM103544 from the National Institutes of Health.
