How Meta-MEME determines motif order and spacing information

Linear models

This section is not yet written.

Completely connected models

Meta-MEME uses the motif occurrence information provided by MEME to compute motif-to-motif transition probabilities and average inter-motif distances. The motif occurrence information is given in the MEME output file in the following format:
ICYA_MANSE 6.82e-42 3 189  1 13 1.43e-18  2 99 4.03e-19  3 127 4.33e-16
LACB_BOVIN 2.15e-27 3 178  1 21 9.62e-15  2 104 6.57e-13  3 151 3.37e-11
BBP_PIEBR 1.29e-40 3 173  1 12 9.75e-18  2 95 1.04e-18  3 123 6.65e-16
RETB_BOVIN 2.11e-32 3 183  1 10 1.99e-16  3 35 2.92e-11  2 100 2.40e-16
MUP2_MOUSE 5.96e-33 3 180  1 23 4.53e-14  3 58 1.43e-16  2 104 6.24e-14
Each line contains the motif occurrence information for one sequence in the training set. The line begins with four values: the sequence id, the sequence p-value, the number n of motif occurrences, and the length of the sequence. These four values are followed by n triples, one per motif occurrence. Each triple consists of the motif id, the position of the occurrence in the sequence, and the motif occurrence p-value.
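
As an illustration of the format just described, here is a minimal Python sketch that splits one such line into its four leading values and its n triples. The names Occurrence, SequenceRecord, and parse_occurrence_line are hypothetical and are not part of Meta-MEME or MEME.

```python
from typing import NamedTuple


class Occurrence(NamedTuple):
    motif_id: int
    position: int
    p_value: float


class SequenceRecord(NamedTuple):
    seq_id: str
    p_value: float
    length: int
    occurrences: list


def parse_occurrence_line(line):
    """Parse: id, p-value, n, length, then n (motif id, position, p-value) triples."""
    fields = line.split()
    seq_id = fields[0]
    p_value = float(fields[1])
    n = int(fields[2])
    length = int(fields[3])
    triples = fields[4:]
    if len(triples) != 3 * n:
        raise ValueError(f"expected {3 * n} triple fields, found {len(triples)}")
    occurrences = [
        Occurrence(int(triples[i]), int(triples[i + 1]), float(triples[i + 2]))
        for i in range(0, len(triples), 3)
    ]
    return SequenceRecord(seq_id, p_value, length, occurrences)


record = parse_occurrence_line(
    "RETB_BOVIN 2.11e-32 3 183  1 10 1.99e-16  3 35 2.92e-11  2 100 2.40e-16"
)
print(record.seq_id, record.length,
      [(o.motif_id, o.position) for o in record.occurrences])
# RETB_BOVIN 183 [(1, 10), (3, 35), (2, 100)]
```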

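Say that the lengths of motifs 1, 2, and 3 are 19, 20, and 18, respectively (the values implied by the positions below, where each end position is the start position plus the motif length).

Combining this information, we can find the start and end positions of each motif occurrence. The five diagrams below follow the order of the sequences listed above: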
   [1]   [2]     [3]
  ----- ------ -------
  13 32 99 119 127 145

   [1]    [2]    [3]
  ----- ------- -------
  21 40 104 124 151 169

   [1]   [2]     [3]
  ----- ------ -------
  12 31 95 115 123 141

   [1]   [3]    [2]
  ----- ----- -------
  10 29 35 53 100 120

   [1]   [3]    [2]
  ----- ----- -------
  23 42 58 76 104 124
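
The step above can be sketched in Python as follows. The motif widths used here (19, 20, 18) are an assumption inferred from the diagrams, where each end position minus its start position gives the width. The names MOTIF_WIDTHS and motif_spans are hypothetical.

```python
# Assumed motif widths, inferred from the diagrams above (end minus start).
MOTIF_WIDTHS = {1: 19, 2: 20, 3: 18}


def motif_spans(occurrences, widths=MOTIF_WIDTHS):
    """Return (motif id, start, end) tuples, ordered by position in the sequence."""
    ordered = sorted(occurrences, key=lambda occ: occ[1])
    return [(motif, start, start + widths[motif]) for motif, start in ordered]


# RETB_BOVIN: motif 1 at 10, motif 3 at 35, motif 2 at 100.
spans = motif_spans([(1, 10), (3, 35), (2, 100)])
print(spans)  # [(1, 10, 29), (3, 35, 53), (2, 100, 120)]
```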

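The motif occurrence diagrams would therefore look like this. Each number is the length of a spacer before, between, or after the motif occurrences, and the final column is the total spacer length in the sequence:

 13-[1]-67-[2]- 8-[3]-44  132
 21-[1]-64-[2]-27-[3]- 9  121
 12-[1]-64-[2]- 8-[3]-32  116
 10-[1]- 6-[3]-47-[2]-63  126
 23-[1]-16-[3]-28-[2]-56  123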
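
The spacer lengths in these diagrams can be reproduced with a short Python sketch. It assumes motif widths of 19, 20, and 18 (inferred from the start and end positions above) and assumes that occurrences do not overlap. The name spacer_lengths is hypothetical.

```python
# Assumed motif widths, inferred from the start and end positions above.
MOTIF_WIDTHS = {1: 19, 2: 20, 3: 18}


def spacer_lengths(seq_length, occurrences, widths=MOTIF_WIDTHS):
    """Return the spacer lengths before, between, and after the motif occurrences."""
    spacers = []
    previous_end = 0
    for motif, start in sorted(occurrences, key=lambda occ: occ[1]):
        spacers.append(start - previous_end)
        previous_end = start + widths[motif]
    # The final spacer runs from the last motif to the end of the sequence.
    spacers.append(seq_length - previous_end)
    return spacers


# ICYA_MANSE: length 189, motif 1 at 13, motif 2 at 99, motif 3 at 127.
spacers = spacer_lengths(189, [(1, 13), (2, 99), (3, 127)])
print(spacers, sum(spacers))  # [13, 67, 8, 44] 132
```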

Here are the unnormalized counts of motif transitions, where S and E denote the start and end of the sequence:

        TO

     S 1 2 3 E   
F  S 0 5 0 0 0
R  1 0 0 3 2 0
O  2 0 0 0 3 2
M  3 0 0 2 0 3
   E 0 0 0 0 0
This matrix indicates, for example, that we observed motif 1 occurring first in the sequence five times, and that we observed motif 3 after motif 2 three times. Here is the same matrix with a pseudocount of 1 added to all legal transitions:
        TO

     S 1 2 3 E   
F  S 0 6 1 1 0
R  1 0 1 4 3 1
O  2 0 1 1 4 3
M  3 0 1 3 1 4
   E 0 0 0 0 0
Note that it is not possible to transition to the start or from the end, nor is it possible to go directly from the start to the end, so those entries receive no pseudocount. Finally, here is the matrix after row normalization, in which each entry is divided by the sum of its row:
            TO

     S   1   2   3   E   
F  S .00 .75 .13 .13 .00
R  1 .00 .11 .44 .33 .11
O  2 .00 .11 .11 .44 .33
M  3 .00 .11 .33 .11 .44
   E .00 .00 .00 .00 .00
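
The three matrices above (raw counts, counts with pseudocounts, and row-normalized probabilities) can be reproduced with the following Python sketch. It is only an illustration of the arithmetic, not Meta-MEME's actual code, and all function names are hypothetical. The motif orders are read off the five training sequences above.

```python
STATES = ["S", 1, 2, 3, "E"]

# Motif order in each of the five training sequences above.
MOTIF_ORDERS = [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 3, 2], [1, 3, 2]]


def count_transitions(orders):
    """Count transitions along S -> motifs -> E for every sequence."""
    counts = {a: {b: 0 for b in STATES} for a in STATES}
    for order in orders:
        path = ["S"] + order + ["E"]
        for a, b in zip(path, path[1:]):
            counts[a][b] += 1
    return counts


def is_legal(a, b):
    """No transitions into S, out of E, or directly from S to E."""
    return b != "S" and a != "E" and not (a == "S" and b == "E")


def add_pseudocounts(counts, pseudocount=1):
    """Add the pseudocount to every legal transition only."""
    return {
        a: {b: counts[a][b] + (pseudocount if is_legal(a, b) else 0) for b in STATES}
        for a in STATES
    }


def normalize_rows(counts):
    """Divide each row by its sum; an all-zero row (E) stays zero."""
    probs = {}
    for a, row in counts.items():
        total = sum(row.values())
        probs[a] = {b: (v / total if total else 0.0) for b, v in row.items()}
    return probs


counts = count_transitions(MOTIF_ORDERS)
probs = normalize_rows(add_pseudocounts(counts))
print(counts[2][3], probs["S"][1], round(probs[1][2], 2))  # 3 0.75 0.44
```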

In a similar fashion, we can derive a matrix that contains the total length of the spacers observed between each pair of consecutive motifs:

        TO

     S   1   2   3   E   
F  S 0  79   0   0   0
R  1 0   0 195  22   0
O  2 0   0   0  43 119
M  3 0   0  75   0  85
   E 0   0   0   0   0
Dividing this matrix element-wise by the unnormalized transition counts matrix (leaving entries with no observed transitions at zero) yields the average spacer length between motifs:
        TO

     S    1    2    3    E   
F  S 0 15.8    0    0    0
R  1 0    0 65.0 11.0    0
O  2 0    0    0 14.3 59.5
M  3 0    0 37.5    0 28.3
   E 0    0    0    0    0
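
The spacer totals and averages can be checked with a short Python sketch. The spacer lengths and motif orders are copied from the occurrence diagrams above. The name average_spacers is hypothetical, and pairs that were never observed are simply absent from the result rather than stored as zero.

```python
# Spacer lengths and motif order for each sequence, from the occurrence diagrams.
DIAGRAMS = [
    ([13, 67, 8, 44], [1, 2, 3]),
    ([21, 64, 27, 9], [1, 2, 3]),
    ([12, 64, 8, 32], [1, 2, 3]),
    ([10, 6, 47, 63], [1, 3, 2]),
    ([23, 16, 28, 56], [1, 3, 2]),
]


def average_spacers(diagrams):
    """Return total and average spacer length for each observed transition."""
    totals = {}
    counts = {}
    for spacers, motifs in diagrams:
        path = ["S"] + motifs + ["E"]
        # The i-th spacer lies between the i-th and (i+1)-th states on the path.
        for pair, spacer in zip(zip(path, path[1:]), spacers):
            totals[pair] = totals.get(pair, 0) + spacer
            counts[pair] = counts.get(pair, 0) + 1
    averages = {pair: totals[pair] / counts[pair] for pair in totals}
    return totals, averages


totals, averages = average_spacers(DIAGRAMS)
print(totals[(1, 2)], averages[("S", 1)], round(averages[(2, 3)], 1))
# 195 15.8 14.3
```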

Author: William Stafford Noble.
