How Meta-MEME determines motif order and spacing information
Linear models
This section is not yet written.
Completely connected models
Meta-MEME uses the motif occurrence information provided by MEME to compute motif-to-motif transition probabilities, as well as average inter-motif distances. The motif occurrence information is given in the MEME output file in the following format:

    ICYA_MANSE 6.82e-42 3 189 1 13 1.43e-18 2 99 4.03e-19 3 127 4.33e-16
    LACB_BOVIN 2.15e-27 3 178 1 21 9.62e-15 2 104 6.57e-13 3 151 3.37e-11
    BBP_PIEBR 1.29e-40 3 173 1 12 9.75e-18 2 95 1.04e-18 3 123 6.65e-16
    RETB_BOVIN 2.11e-32 3 183 1 10 1.99e-16 3 35 2.92e-11 2 100 2.40e-16
    MUP2_MOUSE 5.96e-33 3 180 1 23 4.53e-14 3 58 1.43e-16 2 104 6.24e-14

Each line contains the motif occurrence information for one sequence in the training set. Each line begins with the sequence id, the sequence p-value, the number n of motif occurrences, and the length of the sequence. These four values are followed by n triples, each corresponding to one motif occurrence. Each triple consists of the motif id, its occurrence position in the sequence, and the motif occurrence p-value.

Say that the lengths of the three motifs are as follows:
- motif 1 = 19
- motif 2 = 20
- motif 3 = 18
Combining this information, we can find the start and end positions of each motif:

      [1]          [2]           [3]
     -----       ------        -------
    13   32     99   119     127   145

      [1]          [2]           [3]
     -----       -------       -------
    21   40    104   124     151   169

      [1]          [2]           [3]
     -----       ------        -------
    12   31     95   115     123   141

      [1]          [3]           [2]
     -----        -----        -------
    10   29     35    53     100   120

      [1]          [3]           [2]
     -----        -----        -------
    23   42     58    76     104   124

The motif occurrence diagrams would therefore look like this:
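
    13-[1]-67-[2]- 8-[3]-44   132
    21-[1]-64-[2]-27-[3]- 9   121
    12-[1]-64-[2]- 8-[3]-32   116
    10-[1]- 6-[3]-47-[2]-63   126
    23-[1]-16-[3]-28-[2]-56   123

The numbers between motifs are spacer lengths, and the last column is the total spacer length for the sequence.

Here are the unnormalized counts of motif transitions: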

             TO
          S  1  2  3  E
    F  S  0  5  0  0  0
    R  1  0  0  3  2  0
    O  2  0  0  0  3  2
    M  3  0  0  2  0  3
       E  0  0  0  0  0

This matrix indicates, for example, that we observed motif 1 occurring first in the sequence five times, and that we observed motif 3 after motif 2 three times. Here is the same matrix with a pseudocount of 1 added to all legal transitions:

             TO
          S  1  2  3  E
    F  S  0  6  1  1  0
    R  1  0  1  4  3  1
    O  2  0  1  1  4  3
    M  3  0  1  3  1  4
       E  0  0  0  0  0

Note that it's not possible to transition to the start or from the end, nor is it possible to go directly from the start to the end. Finally, here is the matrix after row normalization:

                 TO
            S    1    2    3    E
    F  S  .00  .75  .13  .13  .00
    R  1  .00  .11  .44  .33  .11
    O  2  .00  .11  .11  .44  .33
    M  3  .00  .11  .33  .11  .44
       E  .00  .00  .00  .00  .00

In a similar fashion, we can derive a matrix that contains the total spacer length between each pair of motifs:

                 TO
            S    1    2    3    E
    F  S    0   79    0    0    0
    R  1    0    0  195   22    0
    O  2    0    0    0   43  119
    M  3    0    0   75    0   85
       E    0    0    0    0    0

Dividing this element-wise by the unnormalized transition counts matrix (entries with a zero count stay at 0) yields the average spacer length between motifs:

                   TO
            S     1     2     3     E
    F  S    0  15.8     0     0     0
    R  1    0     0  65.0  11.0     0
    O  2    0     0     0  14.3  59.5
    M  3    0     0  37.5     0  28.3
       E    0     0     0     0     0
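
The whole calculation can be reproduced in a few lines of Python. The following is only a minimal sketch of the arithmetic described on this page, not Meta-MEME's actual source code: every function and variable name in it is hypothetical. It assumes that the spacer before the first motif equals that motif's position, that a motif ends at its position plus its length, and that occurrences can be ordered by position, all of which match the worked example.

```python
from collections import defaultdict

# Motif occurrence lines and motif lengths copied from the example on this page.
OCCURRENCE_LINES = """\
ICYA_MANSE 6.82e-42 3 189 1 13 1.43e-18 2 99 4.03e-19 3 127 4.33e-16
LACB_BOVIN 2.15e-27 3 178 1 21 9.62e-15 2 104 6.57e-13 3 151 3.37e-11
BBP_PIEBR 1.29e-40 3 173 1 12 9.75e-18 2 95 1.04e-18 3 123 6.65e-16
RETB_BOVIN 2.11e-32 3 183 1 10 1.99e-16 3 35 2.92e-11 2 100 2.40e-16
MUP2_MOUSE 5.96e-33 3 180 1 23 4.53e-14 3 58 1.43e-16 2 104 6.24e-14
""".splitlines()
MOTIF_LENGTHS = {1: 19, 2: 20, 3: 18}
STATES = ["S", 1, 2, 3, "E"]  # start, the three motifs, end


def parse_line(line):
    """Return (sequence length, [(motif id, position), ...]) for one line."""
    fields = line.split()
    n, seq_len = int(fields[2]), int(fields[3])
    triples = fields[4:4 + 3 * n]
    occurrences = [(int(triples[i]), int(triples[i + 1]))
                   for i in range(0, 3 * n, 3)]
    # Assumption: order occurrences by position before walking the sequence.
    return seq_len, sorted(occurrences, key=lambda occ: occ[1])


def count_transitions(lines, motif_lengths):
    """Accumulate unnormalized transition counts and total spacer lengths."""
    counts, spacers = defaultdict(int), defaultdict(int)
    for line in lines:
        seq_len, occurrences = parse_line(line)
        prev_state, prev_end = "S", 0
        for motif, pos in occurrences:
            counts[prev_state, motif] += 1
            spacers[prev_state, motif] += pos - prev_end
            prev_state, prev_end = motif, pos + motif_lengths[motif]
        counts[prev_state, "E"] += 1
        spacers[prev_state, "E"] += seq_len - prev_end
    return counts, spacers


def is_legal(src, dst):
    """No transitions into S, out of E, or directly from S to E."""
    return dst != "S" and src != "E" and (src, dst) != ("S", "E")


def transition_probabilities(counts, pseudocount=1):
    """Add the pseudocount to legal transitions, then normalize each row."""
    probs = {}
    for src in STATES:
        row = {dst: counts.get((src, dst), 0) + pseudocount
               for dst in STATES if is_legal(src, dst)}
        total = sum(row.values())
        for dst, value in row.items():
            probs[src, dst] = value / total
    return probs


def average_spacers(counts, spacers):
    """Divide total spacer length by the count for each observed transition."""
    return {pair: spacers[pair] / n for pair, n in counts.items() if n > 0}


counts, spacers = count_transitions(OCCURRENCE_LINES, MOTIF_LENGTHS)
probs = transition_probabilities(counts)
averages = average_spacers(counts, spacers)
print(counts["S", 1], spacers[1, 2], round(averages[2, 3], 1))
```

The dictionaries are keyed by (from, to) pairs, so `counts["S", 1]` corresponds to the S row and motif 1 column of the count matrix, and `averages[2, 3]` to the entry for motif 2 followed by motif 3 in the average spacer matrix. Transitions that were never observed are simply absent from `averages` rather than stored as 0.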
Author: William Stafford Noble.