AME Tutorial

What AME does

Typical AME applications

How AME works

AME determines the (relative) enrichment of each motif in its input separately and reports the significant ones. The first step is to scan each primary and control sequence with the motif, computing an odds score (not log-odds) for each position in each sequence. Next, for each sequence, AME combines the scores according to the chosen scoring method. By default, the website version uses the "Average odds score", which is just what it sounds like: the average odds score over all positions in the sequence where the motif fits. AME then sorts all the sequences (primary and control) according to their scores, and applies a statistical test to determine if the primary sequences have significantly larger scores. By default, the website version of AME uses the "Rank sum test".

  1. Scan each sequence with a motif, computing one score per sequence.
  2. Apply a statistical test for the tendency of the primary sequences to have larger scores.
  3. Report motifs with significant (adjusted) p-values.
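
The first two steps can be sketched in Python for a single motif. This is only an illustration under stated assumptions, not AME's actual implementation: the function names, the dictionary-per-column motif representation, the uniform background, and the normal approximation used for the rank-sum p-value are all choices made here for brevity. Both-strand scanning, pseudocounts, and the multiple-testing adjustment from step 3 are left out.

```python
import math


def odds_scores(seq, motif, background):
    """Odds score (not log-odds) at every position where the motif fits."""
    width = len(motif)
    scores = []
    for start in range(len(seq) - width + 1):
        score = 1.0
        for column, letter in zip(motif, seq[start:start + width]):
            score *= column[letter] / background[letter]
        scores.append(score)
    return scores


def average_odds_score(seq, motif, background):
    """Step 1: combine the per-position scores into one score per sequence."""
    scores = odds_scores(seq, motif, background)
    return sum(scores) / len(scores) if scores else 0.0


def rank_sum_p_value(primary_scores, control_scores):
    """Step 2: one-sided rank-sum test that primary scores tend to be larger.

    Assumption: normal approximation, average ranks for ties, and no tie
    correction of the variance.
    """
    n1, n2 = len(primary_scores), len(control_scores)
    if n1 == 0 or n2 == 0:
        raise ValueError("both score lists must be non-empty")
    pooled = sorted([(s, True) for s in primary_scores] +
                    [(s, False) for s in control_scores])
    rank_sum = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1..j
        rank_sum += average_rank * sum(1 for k in range(i, j) if pooled[k][1])
        i = j
    u = rank_sum - n1 * (n1 + 1) / 2.0
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z >= z)


# Hypothetical toy data: a width-2 motif that strongly prefers "AC".
motif = [
    {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01},
    {"A": 0.01, "C": 0.97, "G": 0.01, "T": 0.01},
]
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
primary = ["GGACGG", "TTACTT", "ACGTGT", "GTGACT", "TGTGAC"]
control = ["GGGGGG", "TTTTTT", "GTGTGT", "TGTGTT", "GGTTGG"]

primary_scores = [average_odds_score(s, motif, background) for s in primary]
control_scores = [average_odds_score(s, motif, background) for s in control]
p = rank_sum_p_value(primary_scores, control_scores)
print(f"rank-sum p-value: {p:.4f}")
```

In this toy data every primary sequence contains one "AC" site and no control sequence does, so all primary sequences rank above all control sequences and the p-value is small.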

Sequence and Motif Alphabets

AME supports DNA, RNA, protein and custom sequence alphabets. The alphabet is specified in the motif file, and the sequences being searched must be compatible with that alphabet. (DNA motifs can be used to search RNA sequences and vice-versa.) If the motif alphabet is "complementable" (e.g., DNA-like alphabets), AME scores both strands of the sequences. If you have DNA motifs and wish to search only the given strand of each sequence, you can edit the alphabet section of the motif file to contain

rather than

You can also override the alphabet specified in the motif file with an alphabet that contains all the core symbols specified in the motif alphabet but which may contain additional core symbols. The motifs will be expanded to match this new alphabet, with 0's filling in the probabilities for the new symbols (prior to applying pseudocounts).
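
A minimal Python sketch of the motif expansion described above follows. The function name, the extra symbol "m", the background frequencies, and in particular the pseudocount formula are assumptions made for illustration; the text only states that the new symbols receive probability 0 before pseudocounts are applied.

```python
def expand_motif(motif, old_alphabet, new_alphabet, background, pseudocount=0.1):
    """Expand motif columns to new_alphabet, zero-filling the new symbols.

    motif is a list of columns, each a dict mapping symbol -> probability.
    Assumption: pseudocounts are added in proportion to the background
    frequencies, (p + c * bg) / (1 + c); AME's own formula may differ.
    """
    missing = [s for s in old_alphabet if s not in new_alphabet]
    if missing:
        raise ValueError(f"new alphabet lacks core symbols: {missing}")
    expanded = []
    for column in motif:
        # New symbols get probability 0 before pseudocounts are applied.
        zero_filled = {s: column.get(s, 0.0) for s in new_alphabet}
        total = sum(zero_filled.values()) + pseudocount
        expanded.append({
            s: (zero_filled[s] + pseudocount * background[s]) / total
            for s in new_alphabet
        })
    return expanded


# Hypothetical example: a DNA motif expanded to an alphabet with one extra
# core symbol "m".
dna_motif = [{"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}]
new_background = {"A": 0.25, "C": 0.24, "G": 0.25, "T": 0.25, "m": 0.01}
expanded = expand_motif(dna_motif, "ACGT", "ACGTm", new_background)
print(expanded[0]["m"])
```

Because the zero-fill happens first, the new symbol ends up with a small nonzero probability that comes entirely from the pseudocount.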

Primary Sequence set

These sequences should all be in the same sequence alphabet. Their lengths may vary. They can, for example, be a set of promoters thought to be co-regulated, a set of ChIP-seq regions, or a set of proteins thought to be phosphorylated by one or more kinases.

Control sequence set

AME detects motifs enriched in the primary sequences relative to these sequences. If you don't provide a control set, the website version of AME will create one by copying the primary sequences and shuffling the letters within each sequence. The shuffling preserves 2-mer frequencies in each sequence individually. The primary and control sequences should have similar length distributions; otherwise AME's reported p-values may not be accurate.
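
To illustrate what a shuffle that preserves 2-mer frequencies looks like, here is a sketch in Python in the style of the Altschul-Erickson approach: it treats each 2-mer as an edge in a graph and draws a random path that uses every edge exactly once. The text does not say which algorithm AME uses, so the algorithm choice and the function names are assumptions.

```python
import random
from collections import Counter


def dinucleotide_shuffle(seq, rng):
    """Shuffle seq while preserving its exact 2-mer counts (sketch)."""
    if len(seq) < 3:
        return seq
    # Each 2-mer "ab" is a directed edge a -> b.
    edges = {}
    for a, b in zip(seq, seq[1:]):
        edges.setdefault(a, []).append(b)
    last = seq[-1]
    # Pick a random final exit edge for every letter except the last one,
    # retrying until following those edges always leads to the last letter.
    while True:
        final_exit = {v: rng.choice(edges[v]) for v in edges if v != last}
        if all(_reaches(v, last, final_exit) for v in final_exit):
            break
    # Shuffle the remaining edges of each letter; keep the final exit last.
    for v in list(edges):
        successors = edges[v][:]
        if v != last:
            successors.remove(final_exit[v])
            rng.shuffle(successors)
            successors.append(final_exit[v])
        else:
            rng.shuffle(successors)
        edges[v] = successors
    # Walk the graph from the first letter, using each edge once.
    used = Counter()
    current = seq[0]
    out = [current]
    for _ in range(len(seq) - 1):
        nxt = edges[current][used[current]]
        used[current] += 1
        out.append(nxt)
        current = nxt
    return "".join(out)


def _reaches(start, target, final_exit):
    """True if following final_exit edges from start reaches target."""
    seen = set()
    v = start
    while v != target:
        if v in seen:
            return False  # cycle that never reaches the target
        seen.add(v)
        v = final_exit[v]
    return True


def make_control_set(primary_sequences, seed=0):
    """One shuffled copy per primary sequence, so lengths match exactly."""
    rng = random.Random(seed)
    return [dinucleotide_shuffle(s, rng) for s in primary_sequences]


original = "ACGTACGGTTAACCGT"
shuffled = make_control_set([original])[0]
print(original, shuffled)
```

A control set built this way has the same length distribution as the primary set, because every control sequence is a shuffled copy of one primary sequence.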

FAQ