Subject: MAST confirmation: alcohol dehydrogenase motifs
Your MAST search request 14019 is being processed:
  Motif file: adh
  Database to search: SwissProt
If you fail to receive the confirmation message, check your e-mail address and try resubmitting your MAST request.
Each section of the results file contains an explanation of how to interpret it.
If the sequence is
TAATGTTGGTGCTGGTTTTTGTGGCATCGGGCGAGAATAGCGC
   ========
and the motif is represented by the position-dependent scoring matrix (where each row of the matrix corresponds to a position in the motif)
=========|=================================
POSITION |    A       C       G       T
=========|=================================
    1    |  1.447   0.188  -4.025  -4.095
    2    |  0.739   1.339  -3.945  -2.325
    3    |  1.764  -3.562  -4.197  -3.895
    4    |  1.574  -3.784  -1.594  -1.994
    5    |  1.602  -3.935  -4.054  -1.370
    6    |  0.797  -3.647  -0.814   0.215
    7    | -1.280   1.873  -0.607  -1.933
    8    | -3.076   1.035   1.414  -3.913
=========|=================================
then the match score of the fourth position in the sequence (underlined) would be found by summing the score for T in position 1, G in position 2 and so on until G in position 8. So the match score would be
score = -4.095 + -3.945 + -3.895 + -1.994 + -4.054 + -0.814 + -1.933 + 1.414 = -19.316
The match scores for other positions in the sequence are calculated in the same way. Match scores are only calculated if the motif fits completely within the sequence; they are not calculated if the motif would overhang either end of the sequence.
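The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described here, not MAST's own code; the names `match_score` and `all_match_scores` are made up for this example, and positions are 1-based to agree with the text.

```python
# Position-dependent scoring matrix from the example above.
# One row per motif position; columns are in the order A, C, G, T.
ALPHABET = "ACGT"
MATRIX = [
    [1.447, 0.188, -4.025, -4.095],
    [0.739, 1.339, -3.945, -2.325],
    [1.764, -3.562, -4.197, -3.895],
    [1.574, -3.784, -1.594, -1.994],
    [1.602, -3.935, -4.054, -1.370],
    [0.797, -3.647, -0.814, 0.215],
    [-1.280, 1.873, -0.607, -1.933],
    [-3.076, 1.035, 1.414, -3.913],
]
SEQUENCE = "TAATGTTGGTGCTGGTTTTTGTGGCATCGGGCGAGAATAGCGC"


def match_score(sequence, matrix, position):
    """Return the match score with the motif starting at a 1-based position.

    Returns None when the motif would overhang either end of the sequence.
    """
    start = position - 1
    width = len(matrix)
    if start < 0 or start + width > len(sequence):
        return None
    return sum(
        matrix[i][ALPHABET.index(sequence[start + i])] for i in range(width)
    )


def all_match_scores(sequence, matrix):
    """Return {position: score} for every position where the motif fits."""
    last_start = len(sequence) - len(matrix) + 1
    return {p: match_score(sequence, matrix, p) for p in range(1, last_start + 1)}


score_at_4 = round(match_score(SEQUENCE, MATRIX, 4), 3)
print(score_at_4)
```

Rounding to three decimals only hides floating-point noise in the sum; the matrix entries themselves are given to three decimals.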
When nucleotide sequences are searched, the strand (+ or -) is indicated. When nucleotide sequences are searched with peptide motifs, the reading frame (a, b or c) of the best matches is also indicated. Matches are not all required to be in the same reading frame but must all be on the same strand.
In the ANNOTATED SEQUENCES section of the output, diagrams are shown like this:
When nucleotide databases are searched, all matches must be on the same strand, and the strand (+ or -) is indicated in the output. When peptide motifs are used to search nucleotide sequences, the reading frame (a, b or c) of each match is indicated next to the motif numbers in the motif diagrams found in the ANNOTATED SEQUENCES section of the output.
Note: If you specify the -hit_list switch to MAST, the motif "diagram" takes the form of a comma-separated list of motif occurrences ("hits"). Each "hit" has the format:
When peptide motifs are used to search nucleotide sequences, the reading frame (a, b or c) of each match is indicated with the motif number and the peptide translation of the matching sequence is shown just above the motif occurrence.
Database and Motifs
This section shows information on the database that was searched and the motifs in the search query. The database section gives the date the database was last updated as well as the number of sequences and the total number of sequence characters in it. The motifs are listed by motif number. For each motif, the width and the subsequence that would be given the best possible score are shown. If there is more than one motif in the query, all pairwise correlations between the motifs are shown. The correlations can range from -1 to +1, with +1 meaning that the shorter motif is identical to part or all of the longer motif. High correlations can cause some combined p-values and e-values to be inaccurate (too low). It may be advisable to remove enough motifs from the query to ensure that no pairs of motifs have high correlations. Any high correlations are indicated along with the suggestion that one of the motifs be removed from the query.
High-scoring Sequences
MAST lists the names and part of the descriptive text of all sequences whose e-value is less than E. Sequences shorter than one or more of the motifs are skipped. The sequences are sorted by increasing e-value. The value of E is set to 10 for the web server but is user-selectable in the downloadable version of MAST.
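As a rough illustration of the selection rule just described, the sketch below filters and sorts a handful of made-up records. The record layout, the function name `high_scoring_sequences` and the sample e-values are all hypothetical; the sketch assumes the e-values have already been computed and does not show how MAST derives them.

```python
def high_scoring_sequences(records, motif_widths, e_threshold=10.0):
    """Keep records with e-value below the threshold, sorted by e-value.

    A sequence shorter than one or more of the motifs is shorter than the
    widest motif, so it is enough to compare against the maximum width.
    """
    widest = max(motif_widths)
    kept = [
        r for r in records
        if len(r["sequence"]) >= widest and r["evalue"] < e_threshold
    ]
    return sorted(kept, key=lambda r: r["evalue"])


# Hypothetical records, used only to exercise the function.
records = [
    {"name": "seq_a", "sequence": "ACGTACGTACGT", "evalue": 0.5},
    {"name": "seq_b", "sequence": "ACGTACGTACGTAC", "evalue": 12.0},
    {"name": "seq_c", "sequence": "ACGT", "evalue": 0.01},
    {"name": "seq_d", "sequence": "ACGTACGTAC", "evalue": 2e-7},
]
names = [r["name"] for r in high_scoring_sequences(records, motif_widths=[8, 6])]
print(names)
```

Here seq_b is dropped because its e-value is not below 10, and seq_c is dropped because it is shorter than the widest motif.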
Motif Diagrams
Motif diagrams show the order and spacing of non-overlapping matches to the motifs in each high-scoring sequence. Motif occurrences are determined based on the position p-value of matches to the motif. In the MOTIF DIAGRAMS section of the output, diagrams are shown like this:
27-[3]-44-<4>-99-[1]-7
97-[6b]-17-[4a]-36-[3a]-45-[5a]-96-[7a]-59
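A diagram of this form can be taken apart mechanically. The following sketch is one possible reading of the two diagrams shown above and is not part of MAST: it assumes that bare numbers are spacer lengths, that a bracketed item is a motif number optionally followed by a reading frame letter (a, b or c), and it simply records which bracket style was used without interpreting it. The function name `parse_motif_diagram` is invented for this example, and a strand sign inside the brackets is not handled.

```python
import re

# A token is a square-bracketed motif, an angle-bracketed motif, or a spacer.
TOKEN = re.compile(r"\[(\d+)([abc]?)\]|<(\d+)([abc]?)>|(\d+)")


def parse_motif_diagram(diagram):
    """Split a diagram such as '27-[3]-44-<4>-99-[1]-7' into its parts."""
    parts = []
    tokens = []
    for m in TOKEN.finditer(diagram):
        tokens.append(m.group(0))
        if m.group(5) is not None:
            parts.append({"kind": "spacer", "length": int(m.group(5))})
        elif m.group(1) is not None:
            parts.append({"kind": "match", "motif": int(m.group(1)),
                          "frame": m.group(2) or None, "brackets": "[]"})
        else:
            parts.append({"kind": "match", "motif": int(m.group(3)),
                          "frame": m.group(4) or None, "brackets": "<>"})
    # Reject anything that is not exactly a hyphen-separated token list.
    if "-".join(tokens) != diagram:
        raise ValueError(f"unrecognized motif diagram: {diagram!r}")
    return parts


protein = parse_motif_diagram("27-[3]-44-<4>-99-[1]-7")
framed = parse_motif_diagram("97-[6b]-17-[4a]-36-[3a]-45-[5a]-96-[7a]-59")
print([p["motif"] for p in framed if p["kind"] == "match"])
```

The final check rebuilds the diagram from the recognized tokens, so input in a format the sketch does not cover raises an error instead of being silently misread.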
<strand><motif> <start> <end> <p-value>
where
<strand> is the strand (+ or - for DNA, blank for protein),
<motif> is the motif number,
<start> is the starting position of the hit,
<end> is the ending position of the hit, and
<p-value> is the position p-value of the hit.
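A hit list in this format is straightforward to read back into records. The sketch below assumes that the fields within a hit are separated by whitespace and that hits are separated by commas, as described above; the `Hit` type, the `parse_hit_list` function and the sample string are hypothetical and were not taken from real MAST output.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    strand: str      # "+" or "-" for DNA, "" for protein
    motif: int       # motif number
    start: int       # starting position of the hit
    end: int         # ending position of the hit
    pvalue: float    # position p-value of the hit


def parse_hit_list(text: str) -> List[Hit]:
    """Parse a comma-separated list of '<strand><motif> <start> <end> <p-value>'."""
    hits = []
    for chunk in text.split(","):
        fields = chunk.split()
        if not fields:
            continue  # tolerate a trailing comma or an empty list
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields in hit: {chunk!r}")
        motif_field, start, end, pvalue = fields
        # The strand, when present, is the first character of the motif field.
        strand = motif_field[0] if motif_field[0] in "+-" else ""
        motif = int(motif_field[len(strand):])
        hits.append(Hit(strand, motif, int(start), int(end), float(pvalue)))
    return hits


# Made-up example input: two DNA-style hits and one protein-style hit.
hits = parse_hit_list("+1 10 25 1.2e-05, -2 40 52 3.4e-04, 3 60 70 0.001")
print(hits[0])
```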
Annotated Sequences
MAST annotates each high-scoring sequence by printing the sequence along with the position and strength of all the non-overlapping motif occurrences. The four lines above each motif occurrence contain, respectively,
The best possible match to a motif is the sequence of letters which would achieve the highest match score.
Sample MAST Search Results
Here is an actual MAST search results file of a search of a nucleotide database with peptide motifs. It has been edited slightly to reduce its size by removing most of the 832 sequences which matched the motifs.