iupac2meme

Convert a consensus sequence into a motif in MEME motif format suitable for use with MEME Suite programs.

IUPAC Motifs

<consensus_sequence>+

One or more consensus sequences over an extended alphabet (such as IUPAC), each representing a motif.

The consensus sequences can be over either the nucleotide or protein IUPAC alphabets, or over a user-defined alphabet. This program additionally supports regular expression bracket expressions (character classes), in which several letters are grouped into a single position using square brackets. Negated character classes are also supported and are indicated by a caret ('^') immediately following the opening bracket.
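The way a consensus sequence maps to per-position letter sets can be illustrated with a short Python sketch. This is not the iupac2meme implementation: the function name expand_consensus and the DNA_IUPAC table are hypothetical, only the standard DNA IUPAC codes are covered, and it is assumed that ambiguity codes inside brackets expand to their member letters.

```python
# Standard IUPAC nucleotide codes mapped to the core letters they match.
DNA_IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_consensus(consensus, codes=DNA_IUPAC):
    """Return one set of matched core letters per motif position.

    Illustrative only: assumes well-formed input and the DNA alphabet.
    """
    alphabet = set("ACGT")
    positions = []
    i = 0
    while i < len(consensus):
        ch = consensus[i]
        if ch == "[":
            # Bracket expression: everything up to the closing bracket
            # describes a single position.
            end = consensus.index("]", i)
            body = consensus[i + 1:end]
            negate = body.startswith("^")
            if negate:
                body = body[1:]
            letters = set()
            for sym in body:
                # Assumption: codes inside brackets expand like bare codes.
                letters |= set(codes[sym])
            if negate:
                # Negated class: every core letter not listed.
                letters = alphabet - letters
            positions.append(letters)
            i = end + 1
        else:
            positions.append(set(codes[ch]))
            i += 1
    return positions
```

For instance, expand_consensus("ACGGWN[ACGT]YCGT") yields eleven positions, with W expanding to the letters A and T, and expand_consensus("[^AC]") yields a single position matching G and T.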

A probability matrix and, optionally, a log-odds matrix are output for each consensus sequence provided on the command line. For each position in the consensus sequence, a count is assigned to each letter that the symbol at that position matches (the count is controlled by option -numseqs, below). The probability matrix is computed using pseudocounts, where the pseudocount for each letter is its background frequency (see option -bg, below) multiplied by the total pseudocount (see option -pseudo, below). The log-odds matrix is the base-2 logarithm of each probability divided by the corresponding background frequency.
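The calculation for a single motif position can be sketched in Python as follows. The function names, the parameter names numseqs and pseudo (chosen to mirror the options), and the normalisation by total count plus total pseudocount are assumptions made for illustration; the program's exact arithmetic may differ.

```python
import math


def column_probs(matched, alphabet, bg, numseqs, pseudo):
    """Probabilities for one position (hypothetical helper).

    matched  -- set of letters the consensus symbol matches
    alphabet -- iterable of all core letters
    bg       -- dict of background frequencies summing to 1
    numseqs  -- count given to each matched letter (assumed behaviour)
    pseudo   -- total pseudocount, split across letters by background
    """
    counts = {a: (float(numseqs) if a in matched else 0.0) for a in alphabet}
    # Assumed normalisation: observed counts plus the total pseudocount.
    total = sum(counts.values()) + pseudo
    return {a: (counts[a] + pseudo * bg[a]) / total for a in alphabet}


def column_log_odds(probs, bg):
    """Base-2 log-odds against the background.

    Requires every probability to be positive, which holds when
    pseudo > 0 and all background frequencies are positive.
    """
    return {a: math.log2(probs[a] / bg[a]) for a in probs}
```

With a uniform DNA background, a position written as W (matching A and T) gives A and T equal probability well above background and leaves C and G with a small, nonzero probability that comes only from the pseudocount, so their log-odds scores are negative rather than undefined.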

Examples

DNA IUPAC motif:
ACGGWN[ACGT]YCGT

protein IUPAC motif:
IKLVB[^ILVM]ZYXXHG

Option	Description	Default Behavior
General Options
-nosort	Don't change the order of the motifs.	Sort the motifs alphabetically
-named	The program will expect the name of each motif to follow the regular expression.	The motifs will be named based on the regular expression used to create them.
