jaspar2meme
Usage: jaspar2meme [options] <Jaspar directory>
Description:
Convert a directory of JASPAR files into a MEME version 3 formatted file suitable for use with MEME Suite programs.
Input:
<Jaspar directory>
- a directory containing one or more JASPAR motif files. The possible formats are:
A JASPAR '.sites' file describes a motif in terms of a multiple alignment of sites. It contains a multiple alignment in modified FASTA format. Only capitalized sequence letters are part of the alignment.
A JASPAR count file ('.pfm') contains a count matrix where the rows correspond to A, C, G and T, respectively.
A CM count file ('.cm') prefixes the rows with 'A| ', 'C| ', 'G| ' and 'T| '.
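The two count formats differ only in the row labels, so one reader can handle both. The following is a minimal Python sketch of that idea, not the parser jaspar2meme itself uses; the function name `parse_count_rows` is hypothetical, and the sketch assumes exactly four rows in A, C, G, T order with whitespace-separated integer counts.

```python
def parse_count_rows(text):
    """Parse four count rows (A, C, G, T order) into a dict of lists.

    Hypothetical helper: accepts '.pfm'-style rows (bare numbers) and
    '.cm'-style rows (prefixed with 'A|', 'C|', 'G|', 'T|').
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # A '.cm' row carries a label such as 'A|'; drop everything up to the bar.
        if "|" in line:
            line = line.split("|", 1)[1]
        rows.append([int(tok) for tok in line.split()])
    if len(rows) != 4:
        raise ValueError("expected 4 rows (A, C, G, T), got %d" % len(rows))
    if len({len(r) for r in rows}) != 1:
        raise ValueError("rows differ in length")
    return dict(zip("ACGT", rows))


pfm_text = """0 3 79 40 66 48 65 11 65 0
94 75 4 3 1 2 5 2 3 3
1 0 3 4 1 0 5 3 28 88
2 19 11 50 29 47 22 81 1 6"""

cm_text = """A| 0 3 79 40 66 48 65 11 65 0
C| 94 75 4 3 1 2 5 2 3 3
G| 1 0 3 4 1 0 5 3 28 88
T| 2 19 11 50 29 47 22 81 1 6"""

pfm_counts = parse_count_rows(pfm_text)
cm_counts = parse_count_rows(cm_text)
```

Both calls yield the same matrix, since the labels are the only difference between the two inputs.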
A probability matrix is output for each motif file. The probability matrix is computed using pseudocounts equal to the background frequency (see -bg, below) multiplied by the total pseudocount (see -pseudo, below).
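The pseudocount rule can be illustrated with a short Python sketch. It assumes that each column is normalized by its count total plus the total pseudocount, so that the column sums to 1; that normalization is an assumption of this sketch, not something stated above. The function name `counts_to_probs` and its parameters are hypothetical.

```python
def counts_to_probs(counts, bg=None, pseudo=0.0):
    """Convert a count matrix to a probability matrix.

    counts: dict mapping 'A', 'C', 'G', 'T' to equal-length lists of counts.
    bg:     dict of background frequencies; uniform if omitted.
    pseudo: total pseudocount, spread over the letters in proportion to bg.
    Assumption: each column is divided by (column total + pseudo).
    """
    if bg is None:
        bg = {b: 0.25 for b in "ACGT"}
    width = len(counts["A"])
    probs = {b: [] for b in "ACGT"}
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT")
        denom = total + pseudo
        if denom == 0:
            raise ValueError("column %d has no counts and no pseudocount" % i)
        for b in "ACGT":
            probs[b].append((counts[b][i] + pseudo * bg[b]) / denom)
    return probs


# One column with counts A=0, C=4, G=6, T=0.
column = {"A": [0], "C": [4], "G": [6], "T": [0]}
plain = counts_to_probs(column)                # no pseudocount
smoothed = counts_to_probs(column, pseudo=1.0) # one pseudocount, uniform bg
```

With no pseudocount, letters that never occur get probability 0; with a total pseudocount of 1 and a uniform background, A moves from 0 to 0.25 / 11.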
Options:
-cm
- Read count file with line labels 'A|' etc. (.cm); default: site files (.sites)
-pfm
- Read JASPAR count files (.pfm); default: site files (.sites)
-numbers
- Use numbers instead of strings as motif IDs
-strands [1|2]
- Print '+' or '+ -' on the MEME strand line; default: 2 (prints '+ -')
-bg
- File with background frequencies in MEME background file format; default: uniform frequencies
-pseudo <A>
- Add <A> times the background frequency to each count when computing letter frequencies; default: 0
-logodds
- Output log-odds matrices, too; default: output probability matrices only. Log-odds matrices are required only by MAST. The log-odds matrix uses the background frequencies in the denominator and is log base 2.
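As a rough illustration of the -logodds description, the Python sketch below computes log base 2 of each probability divided by the background frequency of its letter. The function name `log_odds` is hypothetical, and returning negative infinity for a zero probability is a choice made for this sketch only; how jaspar2meme treats that case is not stated here.

```python
import math


def log_odds(probs, bg=None):
    """Return log2(p / background) for every entry of a probability matrix.

    probs: dict mapping 'A', 'C', 'G', 'T' to equal-length lists.
    bg:    dict of background frequencies; uniform if omitted.
    Assumption: a zero probability maps to -inf (sketch-only convention).
    """
    if bg is None:
        bg = {b: 0.25 for b in "ACGT"}
    scores = {}
    for b in "ACGT":
        scores[b] = [
            math.log2(p / bg[b]) if p > 0 else float("-inf")
            for p in probs[b]
        ]
    return scores


# Two columns: the first favors A, the second is uniform.
probs = {
    "A": [0.5, 0.25],
    "C": [0.25, 0.25],
    "G": [0.125, 0.25],
    "T": [0.125, 0.25],
}
scores = log_odds(probs)
```

A letter twice as frequent as background scores 1, a letter at background frequency scores 0, and a letter at half the background frequency scores -1. Using a nonzero -pseudo value keeps every probability positive and so avoids the zero case.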
Output:
Writes MEME format to standard output.
Sample Input:
.pfm format (counts):
0 3 79 40 66 48 65 11 65 0
94 75 4 3 1 2 5 2 3 3
1 0 3 4 1 0 5 3 28 88
2 19 11 50 29 47 22 81 1 6
.cm format (counts):
A| 0 3 79 40 66 48 65 11 65 0
C| 94 75 4 3 1 2 5 2 3 3
G| 1 0 3 4 1 0 5 3 28 88
T| 2 19 11 50 29 47 22 81 1 6
.sites format (motif sites):
>MA0024 E2F 1
aTTTGGCGC
>MA0024 E2F 2
TTTGGCGC
>MA0024 E2F 3
TTTGGCGC
>MA0024 E2F 4
TTTGGCGC
>MA0024 E2F 5
TTTCGCGC
>MA0024 E2F 6
TTTCGCGC
>MA0024 E2F 7
TTTCGCGC
>MA0024 E2F 8
TTTGCCGC
>MA0024 E2F 9
TTTCCCGC
>MA0024 E2F 10
TTTGGCGG

A [ 0 0 0 0 0 0 0 0 ]
C [ 0 0 0 4 2 10 0 9 ]
G [ 0 0 0 6 8 0 10 1 ]
T [10 10 10 0 0 0 0 0 ]
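The rule that only capitalized letters belong to the alignment can be checked against the sample with a small Python sketch. It keeps the uppercase letters of each sequence line and tallies them per column; the function name `count_sites` is hypothetical, and the sketch assumes one sequence line per FASTA header. This is an illustration, not the code jaspar2meme runs.

```python
SAMPLE_SITES = """\
>MA0024 E2F 1
aTTTGGCGC
>MA0024 E2F 2
TTTGGCGC
>MA0024 E2F 3
TTTGGCGC
>MA0024 E2F 4
TTTGGCGC
>MA0024 E2F 5
TTTCGCGC
>MA0024 E2F 6
TTTCGCGC
>MA0024 E2F 7
TTTCGCGC
>MA0024 E2F 8
TTTGCCGC
>MA0024 E2F 9
TTTCCCGC
>MA0024 E2F 10
TTTGGCGG
"""


def count_sites(fasta_text):
    """Tally per-column letter counts from a '.sites'-style alignment.

    Lowercase letters are flanking sequence and are ignored; only the
    capitalized letters of each sequence line form the aligned site.
    """
    sites = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and FASTA headers
        sites.append("".join(ch for ch in line if ch.isupper()))
    if not sites:
        raise ValueError("no sites found")
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("aligned sites differ in length")
    counts = {b: [0] * width for b in "ACGT"}
    for site in sites:
        for i, ch in enumerate(site):
            if ch not in counts:
                raise ValueError("unexpected letter %r" % ch)
            counts[ch][i] += 1
    return counts


site_counts = count_sites(SAMPLE_SITES)
```

For the ten sample sites, the leading lowercase 'a' in the first sequence is dropped, and the resulting counts match the A, C, G, T rows shown at the end of the sample.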