Usage:

tomtom [options] <query motif file> <target motif file>+

Description

Inputs

<query motif file>

The name of a file containing one or more motifs in MEME format or the HTML (.html) or plain text (.txt) output of MEME, STREME or DREME. Each of these motifs will be searched against the target files. To search with only a subset of these motifs, use the -m and -mi options.

<target motif file>+

The names of one or more files containing MEME formatted motifs. Outputs from MEME, STREME and DREME are supported, as well as Minimal MEME Format. You can convert many other motif formats to MEME format using conversion scripts available with the MEME Suite.
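The usage line above can be wrapped in a small helper. The following Python sketch assembles the argument list (options first, then the query file, then one or more target files) and runs it only when a tomtom executable and the input files are actually present. The helper name tomtom_argv and the file names query.meme and targets.meme are hypothetical placeholders, not part of the MEME Suite.

```python
import os
import shutil
import subprocess


def tomtom_argv(query, targets, options=()):
    """Build the argument list: tomtom [options] <query> <target>+."""
    targets = list(targets)
    if not targets:
        raise ValueError("at least one target motif file is required")
    return ["tomtom", *options, query, *targets]


# Hypothetical file names, used only for illustration.
argv = tomtom_argv("query.meme", ["targets.meme"], options=["-thresh", "0.1"])
print(" ".join(argv))

# Run the command only if tomtom is installed and the input files exist.
inputs_exist = all(os.path.exists(p) for p in ("query.meme", "targets.meme"))
if shutil.which("tomtom") is not None and inputs_exist:
    subprocess.run(argv, check=False)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file names contain spaces.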

Output

Tomtom writes its output to files in a directory named tomtom_out, which it creates if necessary. You can change the output directory using the -o or -oc options. The directory will contain:

Only matches whose significance is less than or equal to the threshold set by the -thresh option (below) will be output. By default, significance is measured by the q-value of the match.
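The reporting rule can be pictured as a simple filter. In this Python sketch the tuple layout, the values, and the function name reportable are invented for illustration; they do not reflect Tomtom's actual output records or its internal code.

```python
def reportable(matches, thresh=0.5, use_evalue=False):
    """Keep matches whose significance is <= thresh.

    Each match is a (target_id, q_value, e_value) tuple (hypothetical layout).
    The q-value is compared by default; the E-value when use_evalue is True.
    """
    index = 2 if use_evalue else 1
    return [m for m in matches if m[index] <= thresh]


# Made-up values, only to exercise the filter.
matches = [("target_1", 0.02, 0.4), ("target_2", 0.5, 9.0), ("target_3", 0.8, 15.0)]
print(reportable(matches))                                # keeps target_1 and target_2
print(reportable(matches, thresh=10.0, use_evalue=True))  # keeps target_1 and target_2
```

Note that the comparison is inclusive: a match whose significance equals the threshold is still reported.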

Additional outputs may be requested using the -eps and -png options, as described below.

Note: See the detailed description of the Tomtom output formats for more information.

Options

Option  Parameter  Description  Default Behavior
Input
-m id  The name of a motif in the query file that will be used. This option may be repeated multiple times. Default: if neither this option nor the related -mi option is used, all motifs in the query file are used.
-mi index  The offset in the query file of a motif that will be used. This option may be repeated multiple times. Default: if neither this option nor the related -m option is used, all motifs in the query file are used.
-bfile file  Specify the source of a background model for converting a frequency matrix to a log-odds score matrix. The background model contains letter frequencies that can adjust for biases in the single-letter composition of the sequences in the organism from which the motif is derived. The file must be in Markov Background Model Format. The background model will be modified by averaging the frequencies of letters and their reverse complements. Note 1: The background model must be for the same alphabet as specified in the query motif file. Note 2: Not allowed with -dist ed, -dist kullback, -dist pearson, or -dist sandelin, as those distance measures are based only on the motif frequencies. Default: background frequencies are loaded from the query motif file. Previous versions used the first target motif file; this was changed because that design could not work with the -xalph option.
-motif-pseudocount  This option adds the specified pseudocount to the motifs. The pseudocount is distributed taking into account the background. Note that some comparison algorithms require motif probabilities that contain no zeros. If you set a pseudocount of zero with those comparison algorithms, any motif containing a probability of zero will be discarded and a warning will be emitted. Default: a pseudocount of 0.1 is added to the motifs.
Output
-png  Output motif logo alignment images in Portable Network Graphics (PNG) format. This format is useful for display on websites. Note: Ghostscript must be installed on your computer in order to use this option. Default: images are not output in PNG format.
-eps  Output motif logo alignment images in Encapsulated PostScript (EPS) format. This format is useful for inclusion in publications as it is a vector graphics format and can be easily scaled. Default: images are not output in EPS format.
-no-ssc  This option causes the LOGOs in the LOGO alignments output by Tomtom not to be corrected for small sample sizes. By default, the height of letters in the LOGOs is reduced when the number of samples on which a motif is based (nsites in the MEME motif) is small. The default setting can cause motifs based on very few sites to have "empty" LOGOs, so this switch can be used if your query or target motifs are based on few samples. Default: small-sample correction is used.
Scoring
--norc  Do not score the reverse complements of target motifs. Default: both the given and the reverse-complement target motifs are scored if the alphabet is complementable.
-incomplete-scores  Compute scores using only aligned columns. Default: columns that don't align are taken into account.
-thresh value  Only report matches with significance values ≤ value. Unless the -evalue option is specified, this value must be smaller than or equal to 1. Default: a threshold of 0.5 is used.
-evalue  Use the E-value of the match as the significance threshold. Default: the q-value is used as the significance threshold.
-dist allr | ed | kullback | pearson | sandelin | blic1 | blic5 | llr1 | llr5
Code      Name                                            Restrictions
allr      Average log-likelihood ratio                    Non-zero probabilities
ed        Euclidean distance
kullback  Kullback-Leibler divergence
pearson   Pearson's correlation coefficient
sandelin  Sandelin-Wasserman function
blic1     Bayesian Likelihood 2-Components (1 Dirichlet)  DNA only
blic5     Bayesian Likelihood 2-Components (5 Dirichlet)  DNA only
llr1      Log likelihood ratio (1 Dirichlet)              DNA only
llr5      Log likelihood ratio (5 Dirichlet)              DNA only
Detailed descriptions of these functions can be found in the published description of Tomtom.
Euclidean distance (ed) is used by default.
-internal  This parameter forces the shorter motif to be completely contained in the longer motif. Default: the shorter motif may extend outside the longer motif.
-min-overlap min overlap  Only report motif matches that overlap by min overlap positions or more. If a query motif is narrower than min overlap, the motif's width is used as the minimum overlap for that query. Default: a minimum overlap of 1 is required.
-time time  Maximum allowed running time (in CPU seconds). The time value must be > 0. Default: no limit on running time.
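Several of the option descriptions above state constraints that can be checked before Tomtom is run. The following Python sketch encodes only the rules written in this table; the function names check_options and effective_min_overlap, the message wording, and the file name bg.txt are hypothetical, and Tomtom's own argument checking may differ in detail.

```python
# Distance measures that, per the table above, cannot be combined with -bfile.
NO_BFILE_DISTS = {"ed", "kullback", "pearson", "sandelin"}
ALL_DISTS = NO_BFILE_DISTS | {"allr", "blic1", "blic5", "llr1", "llr5"}


def check_options(dist="ed", thresh=0.5, evalue=False, bfile=None, time=None):
    """Return a list of problems found; an empty list means no conflict."""
    problems = []
    if dist not in ALL_DISTS:
        problems.append(f"unknown -dist value: {dist}")
    if bfile is not None and dist in NO_BFILE_DISTS:
        problems.append(f"-bfile is not allowed with -dist {dist}")
    if not evalue and thresh > 1:
        problems.append("-thresh must be <= 1 unless -evalue is given")
    if time is not None and time <= 0:
        problems.append("-time must be > 0")
    return problems


def effective_min_overlap(min_overlap, query_width):
    """A query narrower than min_overlap uses its own width instead."""
    return min(min_overlap, query_width)


print(check_options(dist="pearson", bfile="bg.txt"))
print(check_options(dist="allr", bfile="bg.txt", thresh=10.0, evalue=True))
print(effective_min_overlap(5, 3))
```

Returning a list of problems instead of raising on the first one lets a caller report every conflicting option in a single pass.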
Miscellaneous

Citing