dust

Usage:

dust <sequence file> [<cutoff>]

Description

Dust masks low-complexity regions in the sequences of a FASTA sequence file. Masking is recommended so that such regions do not distort motif searches or motif enrichment analyses performed with AME. Dust was written by R. Tatusov and D.J. Lipman (unpublished).

Input

<sequence file>

The name of a file containing FASTA formatted sequences.

<cutoff>

The minimum score at which a low-complexity region is masked. The default cutoff is 20. Lowering the cutoff causes more of each sequence to be masked.
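
To compare the effect of different cutoffs, dust can be driven from a script. The following is a minimal sketch, not part of dust itself. It assumes that the dust executable is on the PATH, that the cutoff may be omitted to get the default of 20, and that the masked sequences are written to standard output. The helper names dust_command and run_dust are hypothetical.

```python
import subprocess


def dust_command(fasta_path, cutoff=None):
    """Build the argument list for: dust <sequence file> <cutoff>."""
    cmd = ["dust", fasta_path]
    # Assumption: leaving the cutoff off makes dust fall back to its default (20).
    if cutoff is not None:
        cmd.append(str(cutoff))
    return cmd


def run_dust(fasta_path, cutoff=None):
    """Run dust and return whatever it prints.

    Assumption: dust is on the PATH and writes the masked FASTA
    to standard output.
    """
    result = subprocess.run(
        dust_command(fasta_path, cutoff),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if dust exits with an error
    )
    return result.stdout
```

For example, run_dust("seqs.fa", 10) would request more aggressive masking than run_dust("seqs.fa", 30), since a lower cutoff masks more.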

Output

Dust outputs a FASTA file in which every low-complexity region is replaced by a run of N characters.
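
Because masked regions appear as runs of N, the amount of masking can be estimated by counting N characters in the output. The sketch below is illustrative only, and the function name masked_fraction is hypothetical. It assumes that header lines start with ">" and that the input contained no N characters before masking, since any pre-existing N would be counted as masked.

```python
def masked_fraction(fasta_text):
    """Return the fraction of sequence characters that are 'N'."""
    total = 0
    masked = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        # Skip blank lines and FASTA header lines.
        if not line or line.startswith(">"):
            continue
        total += len(line)
        masked += line.count("N")
    # Avoid division by zero when the text holds no sequence data.
    return masked / total if total else 0.0


example = ">seq1\nACGTNNNNNNACGT\n>seq2\nNNNNACGT\n"
print(masked_fraction(example))
```

In this example 10 of the 22 sequence characters are N, so a little under half of the sequence data counts as masked.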