enr [options] --p <primary sequences> [--m <motifs>]+
The name of a file containing the primary (positive) sequences in FASTA format. The file must contain at least 2 valid sequences or ENR will reject it. Note that the command-line version of ENR does not attempt to detect the alphabet from the primary sequences, so you should specify it with the --dna, --rna, --protein or --alph options.
The name of a file containing motifs in MEME format that ENR will test for enrichment in the primary sequences. This argument may be given more than once, allowing you to analyze motifs from several motif files in a single run.
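As a rough illustration of the usage line above, the following Python sketch assembles an ENR argument list with one motif argument per motif file. The helper name `build_enr_command` and the file names are hypothetical, and the sketch assumes the options are written with a leading `--` (`--p`, `--m`, `--n`, `--dna`). It only builds the argument list; it does not run ENR.

```python
from typing import List, Optional


def build_enr_command(primary: str, motif_files: List[str],
                      control: Optional[str] = None,
                      alphabet: str = "dna") -> List[str]:
    """Assemble an argument list for ENR (hypothetical helper, not part of ENR)."""
    if not motif_files:
        raise ValueError("at least one motif file is required")
    # The alphabet is passed explicitly because the command-line version
    # of ENR does not detect it from the primary sequences.
    cmd = ["enr", f"--{alphabet}", "--p", primary]
    if control is not None:
        cmd += ["--n", control]
    # The motif argument may be repeated, once per motif file.
    for motif_file in motif_files:
        cmd += ["--m", motif_file]
    return cmd


if __name__ == "__main__":
    # File names are placeholders.
    print(" ".join(build_enr_command("peaks.fa", ["a.meme", "b.meme"])))
```

The resulting list could be handed to `subprocess.run`, with standard output captured to collect the TSV results.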
ENR writes its output to standard output in tab-separated values (TSV) format. The first line of the output contains the (tab-separated) names of the fields. The names and meanings of the fields are described in the table below.
field  name  contents 

1  ID  The ID of the motif. 
2  ALT_ID  The alternate ID of the motif (or blank). 
3  POS_MATCHES  The number of primary sequences matching the motif with scores greater than or equal to the optimal score threshold. 
4  NEG_MATCHES  The number of negative sequences matching the motif with scores greater than or equal to the optimal score threshold. 
5  SCORE_THR  The match score threshold giving the optimal p-value. This is the score threshold used by ENR to determine the values of POS_MATCHES and NEG_MATCHES. 
6  RATIO  The relative enrichment ratio of the motif in the primary vs. control sequences, defined as (POS_MATCHES/NPOS) / (NEG_MATCHES/NNEG), where NPOS is the number of primary sequences in the input, and NNEG is the number of negative sequences in the input. 
7  PVALUE  The statistical significance (p-value) of the motif's enrichment, according to the chosen objective function. 
8  LOG10_PVALUE  The base-10 logarithm of the p-value. 
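The field layout above can be read with any TSV parser. The following is a minimal Python sketch, assuming the output has exactly the eight columns listed in the table; the sample rows are made up for illustration and are not real ENR results. The helper names `read_enr_tsv` and `enrichment_ratio` are hypothetical.

```python
import csv
import io

# Made-up ENR output for illustration only; real values will differ.
SAMPLE_TSV = (
    "ID\tALT_ID\tPOS_MATCHES\tNEG_MATCHES\tSCORE_THR\tRATIO\tPVALUE\tLOG10_PVALUE\n"
    "motif_1\t\t40\t10\t8.5\t4.0\t1.0e-05\t-5.0\n"
    "motif_2\talt_2\t12\t11\t7.2\t1.09\t0.5\t-0.30\n"
)


def read_enr_tsv(text):
    """Parse ENR-style TSV text into a list of dicts with numeric fields converted."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        for key in ("POS_MATCHES", "NEG_MATCHES"):
            row[key] = int(row[key])
        for key in ("SCORE_THR", "RATIO", "PVALUE", "LOG10_PVALUE"):
            row[key] = float(row[key])
        rows.append(row)
    return rows


def enrichment_ratio(pos_matches, npos, neg_matches, nneg):
    """(POS_MATCHES/NPOS) / (NEG_MATCHES/NNEG), as defined for the RATIO field.

    This raises ZeroDivisionError when neg_matches is 0; how ENR itself
    reports that case is not stated here.
    """
    return (pos_matches / npos) / (neg_matches / nneg)


if __name__ == "__main__":
    for record in read_enr_tsv(SAMPLE_TSV):
        print(record["ID"], record["RATIO"], record["PVALUE"])
```

In the first sample row, 40 of 100 primary matches against 10 of 100 control matches would give a ratio of 4; the sequence counts of 100 are assumed for the example and do not appear in the output itself.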
Option  Parameter  Description  Default Behavior  

Objective Function  
--objfun  de|cd  This option is used to select the objective function that ENR optimizes in searching for motifs.  ENR uses the Differential Enrichment (de) function.  
Control Sequences and Holdout Set  
--n  control sequences  The name of a file containing control (negative) sequences in FASTA format. The control sequences must be in the same sequence alphabet as the primary sequences. If the average length of the control sequences is longer than that of the primary sequences, ENR trims the control sequences so that both sets have the same average length.  If you do not provide control sequences, ENR creates them by shuffling a copy of each primary sequence, preserving the frequencies of words of length k (see next option). Shuffling also preserves the positions of non-core (e.g., ambiguous) characters in each sequence to avoid artifacts.  
--kmer  k  Preserve the frequencies of words (k-mers) of this size when shuffling primary sequences to create control sequences. k must be in the range 1 to 6. ENR also estimates a background model of order k-1 from the primary (positive) sequences for use in log-likelihood scoring of motif sites.  ENR preserves the frequencies of words of length 3 (DNA and RNA), and 1 (Protein and Custom alphabets), and constructs background models of order 2 (DNA and RNA), and order 0 (Protein and Custom alphabets).  
--hofract  hofract  The fraction of the primary sequences that ENR will randomly select and hold out, mirroring the way STREME works. ENR will therefore report the same statistical significance for motifs found by STREME as STREME itself reports. Note: Set this option to zero if you want to measure the statistical significance of your motifs in the complete set of input sequences.  ENR sets hofract to 0.1 (10%) of the primary sequences.  
--seed  seed  Random seed for shuffling and sampling the holdout set sequences (see above).  ENR uses a random seed of 0.  
Alphabet  
Motif Scoring and Selection  
Misc  
--verbosity  1|2|3|4|5  A number that regulates the verbosity level of the informational messages. If set to 1 (quiet), ENR outputs only error messages, whereas the other extreme, 5 (dump), outputs a large amount of information intended for debugging.  The verbosity level is set to 2 (normal). 
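To make the holdout and shuffling options in the table above more concrete, here is a minimal Python sketch of a seeded holdout split and of a shuffle that leaves non-core characters in place. It is an illustration under stated assumptions, not ENR's implementation: the function names are hypothetical, the holdout size is rounded down, and the shuffle covers only the simplest case of word length 1 (preserving the frequencies of longer words needs a more involved algorithm that is not shown).

```python
import random


def split_holdout(sequences, hofract=0.1, seed=0):
    """Randomly hold out a fraction of the sequences.

    Returns (kept, holdout). Rounding the holdout size down is an
    assumption made for this sketch.
    """
    rng = random.Random(seed)
    n_hold = int(len(sequences) * hofract)
    hold_idx = set(rng.sample(range(len(sequences)), n_hold))
    kept = [s for i, s in enumerate(sequences) if i not in hold_idx]
    holdout = [s for i, s in enumerate(sequences) if i in hold_idx]
    return kept, holdout


def shuffle_preserving_noncore(seq, core="ACGT", seed=0):
    """Shuffle core letters (word length 1 only), leaving other characters in place."""
    rng = random.Random(seed)
    core_letters = [c for c in seq if c in core]
    rng.shuffle(core_letters)
    shuffled = iter(core_letters)
    # Non-core characters (for example the ambiguous 'N') keep their positions.
    return "".join(next(shuffled) if c in core else c for c in seq)


if __name__ == "__main__":
    seqs = ["ACGTACGTNN", "TTGACNNACG", "GGGCCCAAAT", "ACACACGTGT"] * 5
    kept, holdout = split_holdout(seqs, hofract=0.1, seed=0)
    print(len(kept), len(holdout))
    print(shuffle_preserving_noncore("ACGNNTTAC"))
```

Setting `hofract=0` in this sketch yields an empty holdout set, which corresponds to evaluating the motifs on the complete set of input sequences.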
ENR evaluates each motif in the motif file(s) for enrichment in the primary sequences.
ENR builds a single suffix tree that includes both the primary and control sequences (but not the holdout set sequences).
ENR converts each motif from a frequency matrix to a log-odds score matrix. By default, STREME creates a background model from the control sequences, but you can provide a different background model if you wish.
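The conversion from frequencies to log-odds scores can be sketched as follows. This is a simplified illustration, not ENR's code: it assumes an order-0 background (a single frequency per letter), base-2 logarithms, and a small background-weighted pseudocount so that zero frequencies give finite scores. The function name `to_log_odds` and the `pseudocount` parameter are hypothetical.

```python
import math


def to_log_odds(freq_matrix, background, pseudocount=0.01):
    """Convert per-position letter frequencies to log2 odds scores.

    freq_matrix: list of dicts mapping letter -> frequency, one dict per position.
    background: dict mapping letter -> background frequency (order 0 only).
    """
    scores = []
    for column in freq_matrix:
        row = {}
        for letter, bg in background.items():
            # Blend in the background so that zero frequencies stay finite.
            f = (column.get(letter, 0.0) + pseudocount * bg) / (1.0 + pseudocount)
            row[letter] = math.log2(f / bg)
        scores.append(row)
    return scores


if __name__ == "__main__":
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    motif = [
        {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
        {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},
    ]
    for position in to_log_odds(motif, uniform):
        print({letter: round(score, 2) for letter, score in position.items()})
```

Letters that are more frequent in the motif than in the background receive positive scores, and letters that are less frequent receive negative scores.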
ENR computes the unbiased statistical significance of the motif by using the motif and the optimal discriminative score threshold (based on the primary and control sequences) to classify the holdout set sequences, and then applying the statistical test (Fisher's exact test, binomial test, or the cumulative Bates distribution) to the classification. Classification is based on the best match to the motif in each sequence (on either strand when the alphabet is complementable).
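The classification and testing step can be sketched in Python as shown here. This is a simplified illustration under several assumptions, not ENR's implementation: it scans only the given strand, scores letters missing from the score matrix as 0, and shows only a one-sided Fisher's exact test (the binomial and Bates alternatives are not covered). All function names are hypothetical.

```python
from math import comb


def fisher_exact_greater(pos_matches, npos, neg_matches, nneg):
    """One-sided Fisher's exact test p-value for enrichment in the primary set."""
    total_matches = pos_matches + neg_matches
    denom = comb(npos + nneg, total_matches)
    upper = min(npos, total_matches)
    # Sum the hypergeometric probabilities of tables at least as extreme.
    tail = sum(comb(npos, k) * comb(nneg, total_matches - k)
               for k in range(pos_matches, upper + 1))
    return tail / denom


def best_match_score(seq, score_matrix):
    """Highest-scoring window of the motif in seq (single strand only)."""
    width = len(score_matrix)
    best = float("-inf")
    for start in range(len(seq) - width + 1):
        score = sum(score_matrix[i].get(seq[start + i], 0.0) for i in range(width))
        best = max(best, score)
    return best


def count_matches(sequences, score_matrix, threshold):
    """Count sequences whose best match scores at or above the threshold."""
    return sum(best_match_score(s, score_matrix) >= threshold for s in sequences)


if __name__ == "__main__":
    # Toy two-column score matrix favouring the word "AC".
    matrix = [{"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
              {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0}]
    primary = ["TTACGG", "ACACAC", "GGGACT"]
    control = ["TTTTTT", "GGGGGG", "CACACA"]
    pos = count_matches(primary, matrix, threshold=2.0)
    neg = count_matches(control, matrix, threshold=2.0)
    print(pos, neg, fisher_exact_greater(pos, len(primary), neg, len(control)))
```

Each sequence is counted at most once, no matter how many windows exceed the threshold, which matches the description of classification by the best match per sequence.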