For each motif that it discovers in the training set, MEME prints the following information:
NUCLEIC ACIDS | COLOR |
---|---|
A | RED |
C | BLUE |
G | ORANGE |
T | GREEN |

AMINO ACIDS | COLOR | PROPERTIES |
---|---|---|
A, C, F, I, L, V, W and M | BLUE | Most hydrophobic [Kyte and Doolittle, 1982] |
NQST | GREEN | Polar, non-charged, non-aliphatic residues |
DE | MAGENTA | Acidic |
KR | RED | Positively charged |
H | PINK | |
G | ORANGE | |
P | YELLOW | |
Y | TURQUOISE | |

J. Kyte and R. Doolittle, 1982. "A Simple Method for Displaying the Hydropathic Character of a Protein", J. Mol. Biol. 157, 105-132.
Summing the information content for each position in the motif gives the total information content of the motif (shown in parentheses to the left of the diagram). The total information content is approximately equal to the log likelihood ratio divided by the product of the number of occurrences and ln(2). The total information content measures how useful the motif is for database searches. As a rule, a motif is useful for database searches only if it contains at least log_2(N) bits of information, where N is the number of sequences in the database being searched. For example, to effectively search a database containing 100,000 sequences for occurrences of a single motif, the motif should have an information content (IC) of at least 16.6 bits. Motifs with lower information content are still useful when a family of sequences shares more than one motif, since they can be combined in multiple-motif searches (using MAST).
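The two rules of thumb above can be sketched in a few lines of Python. This is a minimal illustration, not part of MEME: the function names are hypothetical, and the conversion assumes that the log likelihood ratio is reported in natural-log units, which is what dividing by ln(2) suggests.

```python
import math


def min_bits_for_search(num_sequences: int) -> float:
    """Rule-of-thumb minimum information content in bits: log_2(N)."""
    if num_sequences < 1:
        raise ValueError("num_sequences must be at least 1")
    return math.log2(num_sequences)


def total_ic_from_llr(llr: float, nsites: int) -> float:
    """Approximate total IC in bits: llr / (nsites * ln 2).

    Assumes llr is in natural-log units (an assumption of this sketch).
    """
    if nsites < 1:
        raise ValueError("nsites must be at least 1")
    return llr / (nsites * math.log(2))


def is_useful_alone(total_ic_bits: float, num_sequences: int) -> bool:
    """True if a single motif meets the log_2(N) threshold for a database of N sequences."""
    return total_ic_bits >= min_bits_for_search(num_sequences)


if __name__ == "__main__":
    # The example from the text: a database of 100,000 sequences.
    print(f"{min_bits_for_search(100_000):.1f} bits")  # prints "16.6 bits"
```

A motif that falls below the threshold is not necessarily useless; as noted above, it may still contribute when combined with other motifs in a MAST search.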
    Multilevel TTATGTGAACGACGTCACACT
    consensus AA T A G A GA AA
    sequence T C TT T

This multilevel consensus sequence says several things about the motif. First, the most likely form of the motif can be read from the top line as TTATGTGAACGACGTCACACT. Second, only the letter A has probability greater than 0.2 in position 3 of the motif, both T and A have probability greater than 0.2 in position 1, and so on. Third, a rough approximation of the motif can be made by converting the multilevel consensus sequence into the Prosite signature
You can convert these blocks to PSSMs (position-specific scoring matrices), LOGOS (color representations of the motifs), or phylogeny trees, and search them against a database of other blocks, by pasting everything from the "BL" line to the "//" line (inclusive) into the Multiple Alignment Processor. If you include the -print_fasta switch on the command line, MEME prints the motif sites in FASTA format instead of BLOCKS format.
Note: Earlier versions of MEME gave the posterior probabilities (the probabilities after applying a prior on letter frequencies) rather than the observed frequencies. These versions of MEME also gave the number of possible positions for the motif rather than the actual number of occurrences. Output from these earlier versions can be recognized by "n=" rather than "nsites=" in the line preceding the matrix.
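When processing MEME output from mixed sources, the "n=" versus "nsites=" distinction can be checked programmatically. The sketch below is only an illustration: the function name is hypothetical, and the sample header lines are made up for the example rather than copied from real MEME output.

```python
import re


def matrix_header_style(header_line: str) -> str:
    """Classify the line preceding a matrix by the count field it carries.

    Returns "current" for "nsites=", "legacy" for "n=", "unknown" otherwise.
    """
    if re.search(r"\bnsites=", header_line):
        return "current"  # observed frequencies, actual number of occurrences
    if re.search(r"\bn=", header_line):
        return "legacy"   # posterior probabilities, number of possible positions
    return "unknown"


if __name__ == "__main__":
    # Illustrative header lines (assumed layout, not taken from the source).
    print(matrix_header_style("letter-probability matrix: alength= 4 w= 21 nsites= 17"))
    print(matrix_header_style("letter-probability matrix: alength= 4 w= 21 n= 1000"))
```

The word-boundary anchor `\b` keeps the "n=" test from matching inside "nsites=" or any other field name that merely ends in "n".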