update-sequence-db [options] <sequence database directory>
Download sequence databases.
Creates a SQLite database called fasta_db.sqlite and downloads sequences from multiple sources while storing information about the sequences in the database.
The program starts in status display mode, in which it gives regular updates on what it is doing. Press Enter to switch to command mode. In command mode the two basic commands are "help", which shows the available commands, and "status", which switches back to status mode. While sequences are downloading you may use the command "exit" to stop any further downloading.
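Because fasta_db.sqlite is an ordinary SQLite file, it can be inspected after an update without knowing its schema in advance. The following is a minimal sketch, not part of the MEME Suite: the helper name `list_tables` is hypothetical, and no table names are assumed because the schema is not described here. It opens the file read-only and lists whatever tables exist.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the sorted names of all tables in a SQLite file.

    Hypothetical helper: it makes no assumption about the schema
    that update-sequence-db creates.
    """
    # Open read-only through a file: URI so the database is never modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Calling `list_tables("fasta_databases/fasta_db.sqlite")` on an updated directory would return the table names, which can then be explored with ordinary SELECT statements.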
The folder in which to store downloaded database files. The MEME Suite expects to find sequence databases in a folder called fasta_databases, either inside the folder MEME Install Folder/db or in the folder specified to the configure script with --with-db DB Install Folder. Depending on how you configured the MEME Suite, you should specify either MEME Install Folder/db/fasta_databases or DB Install Folder/fasta_databases.
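The choice between the two locations above can be sketched as a small path helper. This is an illustration only: the function name `sequence_db_dir` and its parameters are hypothetical, and the example install paths are placeholders.

```python
from pathlib import Path


def sequence_db_dir(meme_install=None, db_install=None):
    """Pick the fasta_databases folder to pass to update-sequence-db.

    db_install stands for the folder given to configure via --with-db;
    meme_install stands for the MEME install folder. Both names are
    hypothetical parameters used only for this sketch.
    """
    if db_install is not None:
        # Configured with --with-db: databases live directly below it.
        return Path(db_install) / "fasta_databases"
    if meme_install is not None:
        # Default layout: databases live below the install folder's db/.
        return Path(meme_install) / "db" / "fasta_databases"
    raise ValueError("give either meme_install or db_install")
```

For example, `sequence_db_dir(meme_install="/opt/meme")` yields `/opt/meme/db/fasta_databases`, while `sequence_db_dir(db_install="/data/memedb")` yields `/data/memedb/fasta_databases`.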
By default, all of the standard sequence databases supported by the MEME Suite will be updated. Specifying one or more specific types of databases overrides this default, and then only the specified types of sequence database will be updated. You can also specify individual types of database to omit using the --no_X options, where X is one of the allowed database types (see the section "Select Databases to Update", below).
The program creates a folder called downloads and a folder called logs. It also creates a SQLite database called fasta_db.sqlite. Every sequence database that is downloaded is initially put in the folder downloads until it has been completely downloaded. Once a sequence database has been downloaded, it is decompressed or merged from multiple sources as required and put into a sequence file with a .faa extension for protein sequences or a .fna extension for DNA sequences. Once the sequence file has been expanded, it is processed by fasta-get-markov to calculate a first-order background model, which is written to a file with the extension .bfile. Additionally, fasta-get-markov calculates the number of sequences and the shortest, longest and average sequence lengths, and all of this information is stored in the SQLite database.
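To make the statistics above concrete, the following sketch computes the sequence count, the shortest, longest and average lengths, and the relative frequencies of single letters and adjacent letter pairs from FASTA text. It is not the fasta-get-markov implementation: it assumes a first-order model can be summarized by these frequencies, and it ignores details such as strand handling, pseudocounts and the .bfile output format. All function names are hypothetical.

```python
from collections import Counter


def read_fasta(text):
    """Return the sequences in FASTA-formatted text as upper-case strings."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            current = []          # a header line starts a new sequence
            records.append(current)
        elif line and current is not None:
            current.append(line.upper())
    return ["".join(parts) for parts in records]


def summarize(seqs):
    """Count the sequences and report min, max and average length."""
    if not seqs:
        raise ValueError("no sequences")
    lengths = [len(s) for s in seqs]
    return {
        "count": len(seqs),
        "min": min(lengths),
        "max": max(lengths),
        "avg": sum(lengths) / len(lengths),
    }


def first_order_frequencies(seqs):
    """Return (single-letter, adjacent-pair) relative frequencies."""
    singles, pairs = Counter(), Counter()
    for s in seqs:
        singles.update(s)
        # Pairs never span two different sequences.
        pairs.update(s[i:i + 2] for i in range(len(s) - 1))
    n1, n2 = sum(singles.values()), sum(pairs.values())
    return (
        {k: v / n1 for k, v in singles.items()},
        {k: v / n2 for k, v in pairs.items()},
    )
```

For the two sequences ACGT and AC, `summarize` reports a count of 2, a minimum of 2, a maximum of 4 and an average of 3.0, and the pair AC accounts for half of all adjacent pairs.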
Configuration files that tweak the behaviours of the sequence database downloaders will be automatically generated in the conf/ subdirectory within the specified sequence database directory. Additionally, the miscellaneous source downloader will check the conf/ subdirectory for any files ending with the extension .csv, which it reads to determine sequence sources. The MEME Suite includes two files, db_general.csv and db_other_genomes.csv, in the distribution's etc folder, which may be moved into the conf folder, though this is not done automatically during install.
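Before running an update, it can be useful to check which .csv files the miscellaneous source downloader would find. The sketch below only lists candidate files. The helper name `misc_source_files` is hypothetical, the extension match is assumed to be case-sensitive, and the contents of the files are not parsed because their format is not described here.

```python
from pathlib import Path


def misc_source_files(db_dir):
    """List the .csv files in the conf/ subdirectory of db_dir.

    Hypothetical helper; returns an empty list when conf/ is absent.
    """
    conf = Path(db_dir) / "conf"
    if not conf.is_dir():
        return []
    # Only regular files directly inside conf/ are considered.
    return sorted(p for p in conf.iterdir() if p.is_file() and p.suffix == ".csv")
```

After copying db_general.csv into conf/, `misc_source_files(db_dir)` would include that file in its result.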
Option | Parameter | Description | Default Behaviour |
---|---|---|---|
Help | | | |
--help | | Display a help message and exit. | Run like normal. |
Select Databases to Update | | | |
--[no_]ensembl | | [Do not] update genomes from Ensembl. | Update all sequence databases. |
--[no_]genbank | | [Do not] update genomes from GenBank. | Update all sequence databases. |
--[no_]ucsc | | [Do not] update the genomes from UCSC. | Update all sequence databases. |
--[no_]rsat | | [Do not] update the upstream sequence databases from RSAT. | Update all sequence databases. |
--[no_]epd | | [Do not] update the Eukaryotic Promoter Database. | Update all sequence databases. |
--[no_]misc | | [Do not] update the miscellaneous sequence databases specified in .csv files in the database subdirectory conf/. There are two example .csv files in the MEME Suite etc/ directory. | Update all sequence databases. |
--updater | classname | Experimental. Specify the classname of a custom updater. | |
File Cleanup | | | |
--delete_old | | Sequence databases marked as obsolete (on a previous update) will be deleted. | Sequence databases marked as obsolete will be left untouched. |
--retain_missing | | Database entries for missing files are retained. | Database entries for missing files are removed. |
Backwards compatibility | | | |
--csv:directory | | Create a csv file and an index file that list all the databases, to enable backwards compatibility with older releases. The directory in which to create the csv and index files may be specified; if it is not, they are placed in the sequence database directory. | Don't create a csv or index file. |
Miscellaneous | | | |
--bin | directory | Specify the location of the fasta-get-markov tool. | The program searches the configured bin directory and, if fasta-get-markov is not present there, searches the path. |
--log | log file | Specify the file to write logs to. | A log is written to the logs directory below the sequence database directory. |
-v | log level | Specify the logging level [1-8]. | A default logging level of 3 is used, which outputs errors, warnings and summary information. |
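The options above can be assembled programmatically, for instance when the update is scheduled from a script. The following is a sketch under stated assumptions: `build_command` and its parameters are hypothetical, it covers only the database selection flags, --delete_old and -v, and it only builds the argument list without running anything.

```python
DB_TYPES = ("ensembl", "genbank", "ucsc", "rsat", "epd", "misc")


def build_command(db_dir, only=(), omit=(), delete_old=False, log_level=None):
    """Build an argument list for update-sequence-db.

    only -> --X flags (update just these database types)
    omit -> --no_X flags (skip these database types)
    """
    for name in (*only, *omit):
        if name not in DB_TYPES:
            raise ValueError(f"unknown database type: {name}")
    argv = ["update-sequence-db"]
    argv += [f"--{name}" for name in only]
    argv += [f"--no_{name}" for name in omit]
    if delete_old:
        argv.append("--delete_old")
    if log_level is not None:
        # The documented range for -v is 1 to 8.
        if not 1 <= log_level <= 8:
            raise ValueError("log level must be between 1 and 8")
        argv += ["-v", str(log_level)]
    argv.append(str(db_dir))  # the sequence database directory comes last
    return argv
```

The resulting list can be handed to `subprocess.run`. Passing a list rather than a single string avoids shell quoting problems when the directory name contains spaces.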