dust <sequence file> <cutoff>


Dust masks low-complexity regions in the sequences of a FASTA sequence file. Masking is recommended because low-complexity regions can distort motif searches and motif enrichment analyses using AME. Dust was written by R. Tatusov and D.J. Lipman (unpublished).


<sequence file>

The name of a file containing FASTA formatted sequences.


<cutoff>

The minimum score a region must reach to be masked as low-complexity. By default, the cutoff score is 20. Lowering the cutoff causes more regions to be masked.
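
To make the effect of the cutoff concrete, here is a minimal Python sketch of a toy complexity score based on repeated triplets. It is only an illustration: the function names triplet_score and would_mask, the scaling factor of 10, and the "score >= cutoff" comparison are all assumptions, and Dust's actual scoring may differ. It shows only why a lower cutoff masks more: more windows reach the minimum score.

```python
from collections import Counter


def triplet_score(window: str) -> float:
    """Toy low-complexity score: repeated triplets raise the score.

    Hypothetical helper for illustration; not Dust's actual scoring code.
    """
    triplets = [window[i:i + 3] for i in range(len(window) - 2)]
    if len(triplets) < 2:
        return 0.0
    counts = Counter(triplets)
    # Each triplet seen c times contributes c * (c - 1) / 2 repeated pairs.
    repeats = sum(c * (c - 1) / 2 for c in counts.values())
    # The factor 10 is an assumption, chosen so toy scores are on a
    # scale comparable to the default cutoff of 20.
    return 10.0 * repeats / (len(triplets) - 1)


def would_mask(window: str, cutoff: float = 20.0) -> bool:
    """Assumed rule: mask a window whose score reaches the cutoff."""
    return triplet_score(window) >= cutoff


print(triplet_score("AAAAAAAAAA"))  # 40.0 (one triplet repeated 8 times)
print(triplet_score("ACGTTGCAAG"))  # 0.0 (all triplets distinct)
print(would_mask("AAAAAAAAAA", cutoff=20.0))  # True
print(would_mask("AAAAAAAAAA", cutoff=50.0))  # False
```

In this toy model, raising the cutoff from 20 to 50 leaves the poly-A window unmasked, while lowering it would also mask windows with fewer repeats.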


Dust outputs a FASTA file in which every low-complexity region is replaced by a run of the character N.
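
When choosing a cutoff, it can help to check how much of each sequence was masked. The following Python sketch computes, for FASTA text such as Dust's output, the fraction of each sequence that consists of N. The function name masked_fraction and the example sequences are hypothetical. Note that any N already present in the input sequences is counted as well, so this is an upper bound on what Dust masked.

```python
def masked_fraction(fasta_text: str) -> dict:
    """Map each sequence name to the fraction of its residues that are N.

    Hypothetical helper for inspecting masked FASTA text; not part of Dust.
    """
    sequences = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header line as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            sequences[name] = []
        elif name is not None:
            sequences[name].append(line)

    fractions = {}
    for seq_name, chunks in sequences.items():
        seq = "".join(chunks)
        fractions[seq_name] = seq.upper().count("N") / len(seq) if seq else 0.0
    return fractions


# Made-up example resembling masked output.
example = ">seq1 example\nACGTNNNNNNACGT\n>seq2\nACGT\nACGT\n"
for seq_name, fraction in masked_fraction(example).items():
    print(seq_name, round(fraction, 3))
```

A sequence that is almost entirely N after masking may indicate that the cutoff is too low for the data.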