MCAST: Motif Cluster Alignment Search Tool

Usage: mcast [options] <query> <database>

Description:

MCAST searches a sequence database for statistically significant clusters of non-overlapping "hits" to the motifs in a query.

A "hit" is a sequence position that is sufficiently similar to a motif in the query. To be a hit, the p-value of the motif alignment score must be less than the significance threshold, pthresh (see option -p, below). The motif is aligned to the sequence position without gaps. To compute the p-value of a motif alignment score, MCAST assumes that the sequences in the database were generated by a 0-order Markov process; see option -bg, below. With DNA sequences, MCAST searches for hits on both the sequences given in the database and their reverse complements.

A cluster of non-overlapping hits is called a "match". The user specifies the maximum allowed gap between adjacent hits in a match. (Two hits separated by more than the maximum allowed gap will be reported in separate matches.)
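
The gap rule can be illustrated with a short Python sketch. This is not MCAST's search algorithm, only a simplified picture of how a maximum gap separates hits into clusters. The function name group_hits, the (start, end) representation of a hit, and the definition of the gap as the number of positions strictly between two hits are all assumptions made for illustration.

```python
def group_hits(hits, max_gap):
    """Split non-overlapping hits, given as (start, end) tuples, into clusters.

    Hypothetical helper: a new cluster starts whenever the gap to the
    previous hit exceeds max_gap.
    """
    clusters = []
    for hit in sorted(hits):
        # Gap = number of positions strictly between the previous hit and
        # this one (an assumed definition of "distance").
        if clusters and hit[0] - clusters[-1][-1][1] - 1 <= max_gap:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters


clusters = group_hits([(100, 107), (1, 8), (12, 19)], max_gap=50)
print(clusters)
```

Here the first two hits are 3 positions apart and fall in one cluster, while the third hit is 80 positions away and starts a new one.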

MCAST searches for all of the matches between the query and the sequences in the database. Each match is assigned an E-value, and matches whose E-values fall below the E-value threshold are printed in order of increasing E-value (see option -e, below).

The p-value of a hit is converted to a "p-score" in order to compute the total score of the match it participates in. The p-score for a hit with p-value p is

S = -log2(p/pthresh),
where the significance threshold pthresh may be specified by the user. The total score of a match is the sum of the p-scores of the hits making up the match. MCAST finds the matches with the maximum match scores.
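
As a minimal sketch of the scoring described above, the following Python computes a p-score and sums p-scores into a match score. The function names p_score and match_score and the example values are hypothetical. Because a hit must have p < pthresh, every hit contributes a positive p-score.

```python
import math


def p_score(p, pthresh):
    """p-score of a hit: S = -log2(p / pthresh)."""
    return -math.log2(p / pthresh)


def match_score(p_values, pthresh):
    """Total score of a match: the sum of the p-scores of its hits."""
    return sum(p_score(p, pthresh) for p in p_values)


# Illustrative values only: a threshold of 2**-10 and two hits.
pthresh = 2 ** -10
hit_p_values = [2 ** -13, 2 ** -12]
print(p_score(hit_p_values[0], pthresh))
print(match_score(hit_p_values, pthresh))
```

In this example the first hit is 8 times more significant than the threshold, so its p-score is log2(8) = 3; the second contributes log2(4) = 2, for a match score of 5.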

In order for E-values to be computed by MCAST, at least 100 matches must be found. If there are too few sequences in the database, or if certain other options are made too stringent (see Options, below), too few matches may exist for E-values to be computed. In this case, the results are sorted by match score, the E-value column is set to "NaN", and all matches are printed.
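
The reporting rules above can be sketched as follows. This is only an illustration of the ordering and filtering logic, not MCAST's code: the function select_matches and the dictionary layout of a match are hypothetical, E-values are taken as given rather than computed, and sorting by match score in descending order is an assumption.

```python
import math

MIN_MATCHES_FOR_EVALUES = 100  # minimum number of matches stated above


def select_matches(matches, e_thresh):
    """Order matches for reporting.

    matches: list of dicts with 'score' and 'evalue' keys (assumed layout).
    """
    if len(matches) < MIN_MATCHES_FOR_EVALUES:
        # Too few matches: report all of them with E-value set to NaN,
        # ordered by match score (descending order is an assumption).
        ranked = sorted(matches, key=lambda m: m["score"], reverse=True)
        return [{**m, "evalue": math.nan} for m in ranked]
    # Enough matches: keep those below the threshold, by increasing E-value.
    kept = [m for m in matches if m["evalue"] < e_thresh]
    return sorted(kept, key=lambda m: m["evalue"])


few = [{"score": 1.0, "evalue": 0.5}, {"score": 3.0, "evalue": 0.1}]
for m in select_matches(few, e_thresh=0.01):
    print(m["score"], m["evalue"])
```

With only two matches, both are returned regardless of the threshold, the higher-scoring one first.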

Input:

Output:

MCAST uses MHMMSCAN to generate its output. See the MHMMSCAN output documentation for a description of the output format.

Options:

Bugs: None known.


Author: Timothy Bailey and William Stafford Noble.
