iupac2meme
Usage: iupac2meme [options] [<iupac_motif>]+
Description:
Convert an IUPAC motif into a MEME version 4 formatted file suitable for use with MAST and other MEME Suite programs.
Input:
An IUPAC motif represents letter frequencies in one of two ways: an exact letter means that letter occurs at that position in all sites, while an ambiguous letter means that each of the letters it stands for occurs with equal frequency. This program additionally supports regular-expression bracket expressions, in which several letters are grouped into a single position with square brackets.
A background frequency file modifies the assumption of equal probability of all alternative letters.
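As a rough illustration of the interpretation described above, the following Python sketch expands a DNA IUPAC motif (including bracket expressions) into per-position letter sets and assigns frequencies to the alternatives. It is not the program's own code: the function names `parse_positions` and `position_frequencies` are hypothetical, and weighting alternatives in proportion to their background frequency is an assumption about how a background file modifies the equal-probability default.

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def parse_positions(motif):
    """Split a motif into positions; a [...] group counts as one position."""
    positions = []
    for token in re.findall(r"\[[^\]]+\]|.", motif.upper()):
        letters = set()
        for code in token.strip("[]"):
            letters.update(IUPAC_DNA[code])
        positions.append(letters)
    return positions

def position_frequencies(letters, background=None):
    """Frequencies for one position.

    With no background, every allowed letter is equally likely.
    With a background (assumption), allowed letters are weighted
    in proportion to their background frequency.
    """
    if background is None:
        background = {b: 0.25 for b in "ACGT"}
    total = sum(background[b] for b in letters)
    return {b: (background[b] / total if b in letters else 0.0) for b in "ACGT"}

if __name__ == "__main__":
    for letters in parse_positions("ACGGWN[ACGT]YCGT"):
        print(sorted(letters), position_frequencies(letters))
```

Protein motifs would need a different code table; only the DNA alphabet is covered here.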
A probability matrix and, optionally, a log-odds matrix are output for each motif provided on the command line. The probability matrix is computed using pseudocounts equal to the background frequency (see -bg, below) multiplied by the total pseudocounts (see -pseudo, below). The log-odds matrix uses the background frequencies in the denominator and is log base 2.
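The arithmetic described above might look like the following Python sketch for a single motif column. The exact formula is an assumption: it treats each frequency as a count out of `numseqs` sequences, adds `pseudo` times the letter's background frequency, and renormalizes. The function names are hypothetical, and any scaling or rounding the tool applies to its output is not modeled.

```python
import math

def column_with_pseudocounts(freqs, background, numseqs=20, pseudo=0.0):
    """Probability column after adding background-weighted pseudocounts.

    Assumed formula: (f * numseqs + pseudo * bg) / (numseqs + pseudo).
    """
    out = {}
    for letter, f in freqs.items():
        counts = f * numseqs + pseudo * background[letter]
        out[letter] = counts / (numseqs + pseudo)
    return out

def log_odds(probs, background):
    """Base-2 log-odds with the background frequency in the denominator."""
    return {
        letter: (math.log2(p / background[letter]) if p > 0 else float("-inf"))
        for letter, p in probs.items()
    }

if __name__ == "__main__":
    uniform = {b: 0.25 for b in "ACGT"}
    exact_a = {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0}  # an exact "A" position
    probs = column_with_pseudocounts(exact_a, uniform, numseqs=20, pseudo=1.0)
    print(probs)
    print(log_odds(probs, uniform))
```

With the default of zero total pseudocounts, letters that never occur keep probability 0, so their log-odds are undefined; this sketch reports them as negative infinity.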
Options:
- -alpha DNA|PROTEIN
- IUPAC alphabet; default: DNA
- -numseqs <numseqs>
- assume frequencies based on this many sequences; default: 20
- -bg <background file>
- file with background frequencies of letters; default: uniform background
- -pseudo <total pseudocounts>
- add <total pseudocounts> times letter background to each frequency; default: 0
- -logodds
- output the log-odds (PSSM) and frequency (PSPM) motifs; default: PSPM motif only
- -url <website>
- add a link to <website> to the output; this can be used to link to information about the motif. If <website> contains the keyword MOTIF_NAME, the IUPAC code is substituted for MOTIF_NAME in the output. For example, the output of iupac2meme -url http://big-box-of-motifs.com/motifs/MOTIF_NAME ATGATG will contain a link to http://big-box-of-motifs.com/motifs/ATGATG; default: no URL in the output
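The MOTIF_NAME substitution performed by -url can be pictured as a plain string replacement. The Python sketch below is illustrative only; `motif_url` is a hypothetical helper, not part of the tool.

```python
def motif_url(template, iupac):
    """Replace the MOTIF_NAME keyword in a URL template with the IUPAC code."""
    return template.replace("MOTIF_NAME", iupac)

if __name__ == "__main__":
    print(motif_url("http://big-box-of-motifs.com/motifs/MOTIF_NAME", "ATGATG"))
```

A template without the keyword is returned unchanged, so every motif would get the same link.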
Output:
Writes MEME format to standard output.
Sample Input:
DNA IUPAC motif:
ACGGWN[ACGT]YCGT
Protein IUPAC motif:
IKLVBZYXXHG