MEME version 3.0 (Release date: 2004/07/16 05:53:30)
For further information on how to interpret these results or to get a copy of the MEME software please access http://meme.sdsc.edu.
This file may be used as input to the MAST algorithm for searching sequence databases for matches to groups of motifs. MAST is available for interactive use and downloading at http://meme.sdsc.edu.
If you use this program in your research, please cite:
Timothy L. Bailey and Charles Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, California, 1994.
DATAFILE= /home/meme/meme.3.0.6/tests/adh.s
ALPHABET= ACDEFGHIKLMNPQRSTVWY
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
2BHD_STREX               1.0000    255  3BHD_COMTE               1.0000    253
ADH_DROME                1.0000    255  AP27_MOUSE               1.0000    244
BA72_EUBSP               1.0000    249  BDH_HUMAN                1.0000    343
BPHB_PSEPS               1.0000    275  BUDC_KLETE               1.0000    241
DHES_HUMAN               1.0000    327  DHGB_BACME               1.0000    262
DHII_HUMAN               1.0000    292  DHMA_FLAS1               1.0000    270
ENTA_ECOLI               1.0000    248  FIXR_BRAJA               1.0000    278
GUTD_ECOLI               1.0000    259  HDE_CANTR                1.0000    906
HDHA_ECOLI               1.0000    255  LIGD_PSEPA               1.0000    305
NODG_RHIME               1.0000    245  RIDH_KLEAE               1.0000    249
YINL_LISMO               1.0000    248  YRTP_BACSU               1.0000    238
CSGA_MYXXA               1.0000    166  DHB2_HUMAN               1.0000    387
DHB3_HUMAN               1.0000    310  DHCA_HUMAN               1.0000    276
FABI_ECOLI               1.0000    262  FVT1_HUMAN               1.0000    332
HMTR_LEIMA               1.0000    287  MAS1_AGRRA               1.0000    476
PCR_PEA                  1.0000    399  RFBB_NEIGO               1.0000    346
YURA_MYXXA               1.0000    258
This information can also be useful in the event you wish to report a problem with the MEME software.

command: meme /home/meme/meme.3.0.6/tests/adh.s -mod tcm -protein -nmotifs 2

model:  mod=           tcm    nmotifs=         2    evt=           inf
object function=  E-value of product of p-values
width:  minw=            8    maxw=           50    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       50    wnsites=       0.8
theta:  prob=            1    spmap=         pam    spfuzz=        120
em:     prior=       megap    b=           49980    maxiter=        50
        distance=    1e-05
data:   n=            9996    N=              33

sample: seed=            0    seqfrac=         1
Dirichlet mixture priors file: prior30.plib

Letter frequencies in dataset:
A 0.111 C 0.012 D 0.050 E 0.055 F 0.036 G 0.090 H 0.018 I 0.057 K 0.052
L 0.092 M 0.027 N 0.041 P 0.041 Q 0.029 R 0.049 S 0.064 T 0.057 V 0.083
W 0.010 Y 0.027

Background letter frequencies (from dataset with add-one prior applied):
A 0.111 C 0.012 D 0.050 E 0.055 F 0.036 G 0.090 H 0.018 I 0.057 K 0.052
L 0.092 M 0.027 N 0.041 P 0.041 Q 0.029 R 0.049 S 0.064 T 0.057 V 0.083
W 0.010 Y 0.027
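As a quick consistency check, the "data:" values n= 9996 and N= 33 can be compared with the training-set table. The sketch below is only an illustration: the dictionary is copied by hand from the table above, and the helper name dataset_totals is made up for this example and is not part of MEME.

```python
# Sequence lengths copied by hand from the training-set table above.
LENGTHS = {
    "2BHD_STREX": 255, "3BHD_COMTE": 253, "ADH_DROME": 255, "AP27_MOUSE": 244,
    "BA72_EUBSP": 249, "BDH_HUMAN": 343, "BPHB_PSEPS": 275, "BUDC_KLETE": 241,
    "DHES_HUMAN": 327, "DHGB_BACME": 262, "DHII_HUMAN": 292, "DHMA_FLAS1": 270,
    "ENTA_ECOLI": 248, "FIXR_BRAJA": 278, "GUTD_ECOLI": 259, "HDE_CANTR": 906,
    "HDHA_ECOLI": 255, "LIGD_PSEPA": 305, "NODG_RHIME": 245, "RIDH_KLEAE": 249,
    "YINL_LISMO": 248, "YRTP_BACSU": 238, "CSGA_MYXXA": 166, "DHB2_HUMAN": 387,
    "DHB3_HUMAN": 310, "DHCA_HUMAN": 276, "FABI_ECOLI": 262, "FVT1_HUMAN": 332,
    "HMTR_LEIMA": 287, "MAS1_AGRRA": 476, "PCR_PEA": 399, "RFBB_NEIGO": 346,
    "YURA_MYXXA": 258,
}


def dataset_totals(lengths):
    """Return (number of sequences, total number of residues)."""
    return len(lengths), sum(lengths.values())


num_sequences, num_residues = dataset_totals(LENGTHS)
print("N =", num_sequences, " n =", num_residues)
```

If the table was transcribed correctly, the two totals should agree with the N= and n= fields of the "data:" line.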
Time 11.03 secs.
Time 20.26 secs.
CPU: chromo
MOTIFS
For each motif that it discovers in the training set, MEME prints the following information:
J. Kyte and R. Doolittle, 1982. "A Simple Method for Displaying the Hydropathic Character of a Protein", J. Mol Biol. 157, 105-132.
Summing the information content for each position in the motif gives the total information content of the motif (shown in parentheses to the left of the diagram). The total information content is approximately equal to the log likelihood ratio divided by the product of the number of occurrences and ln(2). The total information content measures how useful the motif is for database searches. As a rule, a motif must contain at least log_2(N) bits of information to be useful for searching a database of N sequences. For example, to effectively search a database containing 100,000 sequences for occurrences of a single motif, the motif should have an IC of at least 16.6 bits (log_2(100,000) is about 16.6). Motifs with lower information content are still useful when a family of sequences shares more than one motif, since they can be combined in multiple-motif searches (using MAST).
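The two rules of thumb in the paragraph above can be written out as a short calculation. This is a minimal sketch, not MEME code: the function names are invented for illustration, and the second function assumes the log likelihood ratio is expressed in natural-log units, which is what the division by ln(2) suggests.

```python
import math


def min_bits_for_database(num_sequences):
    """Rule-of-thumb minimum information content (bits) for a motif to be
    useful when searching a database of num_sequences sequences."""
    return math.log2(num_sequences)


def approx_total_ic(log_likelihood_ratio, occurrences):
    """Approximate total information content in bits:
    llr / (occurrences * ln 2). Assumes llr is in natural-log units."""
    return log_likelihood_ratio / (occurrences * math.log(2))


# The 100,000-sequence example from the text.
print(round(min_bits_for_database(100_000), 1))
```

A motif whose approximate total IC falls below the threshold for the target database may still be worth keeping if it will be searched together with other motifs.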
Multilevel TTATGTGAACGACGTCACACT
consensus  AA T A G A GA AA
sequence   T C TT T
You can convert these blocks to PSSMs (position-specific scoring matrices), LOGOS (color representations of the motifs), phylogeny trees and search them against a database of other blocks by pasting everything from the "BL" line to the "//" line (inclusive) into the Multiple Alignment Processor. If you include the -print_fasta switch on the command line, MEME prints the motif sites in FASTA format instead of BLOCKS format.
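Selecting everything from the "BL" line to the "//" line can also be done programmatically. The sketch below is an assumption-laden illustration: extract_blocks is a hypothetical helper, and SAMPLE is made-up placeholder text that only imitates the "BL" ... "//" framing described above, not real MEME output.

```python
from typing import List


def extract_blocks(meme_output: str) -> List[str]:
    """Return each span from a line starting with "BL" through the next
    line consisting of "//", inclusive, as one string per block."""
    blocks = []
    current = None  # lines of the block being collected, if any
    for line in meme_output.splitlines():
        if current is None:
            if line.startswith("BL"):
                current = [line]
        else:
            current.append(line)
            if line.strip() == "//":
                blocks.append("\n".join(current))
                current = None
    return blocks


# Made-up placeholder text; the site lines are NOT real MEME output.
SAMPLE = """text before the block
BL   MOTIF 1 width=8 seqs=2
seqA ( 10) ACDEFGHI 1
seqB ( 42) ACDEFGHK 1
//
text after the block
"""

for block in extract_blocks(SAMPLE):
    print(block)
```

An unterminated block (a "BL" line with no following "//") is dropped by this sketch rather than returned partially.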
Note: Earlier versions of MEME gave the posterior probabilities--the probability after applying a prior on letter frequencies--rather than the observed frequencies. These versions of MEME also gave the number of possible positions for the motif rather than the actual number of occurrences. The output from these earlier versions of MEME can be distinguished by "n=" rather than "nsites=" in the line preceding the matrix.
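The "n=" versus "nsites=" distinction in the note above can be checked mechanically. This is a hedged sketch: matrix_header_style is a hypothetical helper, and the two sample header lines are illustrative guesses at the layout of the line preceding the matrix, not lines taken from this file.

```python
import re


def matrix_header_style(header_line):
    """Classify the line preceding the matrix as "current" (has nsites=),
    "earlier" (has a standalone n=), or "unknown"."""
    # The lookbehind keeps field names that merely end in "n" from matching.
    if re.search(r"(?<![A-Za-z])nsites=", header_line):
        return "current"
    if re.search(r"(?<![A-Za-z])n=", header_line):
        return "earlier"
    return "unknown"


# Hypothetical header lines, for illustration only.
print(matrix_header_style("letter-probability matrix: alength= 20 w= 8 nsites= 33"))
print(matrix_header_style("letter-probability matrix: alength= 20 w= 8 n= 9000"))
```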