glam2 [options] <alphabet> <sequence file>
The alphabet of the sequences. This can be 'p' for protein sequences, 'n' for nucleotide sequences or the name of a GLAM2 alphabet file.
A file containing FASTA formatted sequences.
Outputs
GLAM2 writes its output to files in a directory named glam2_out, which it creates if necessary. You can change the output directory using the -o or -O options.
The directory will contain:
glam2.html - an HTML file containing the results in a human-readable format
glam2.txt - a plain text file containing the results
See the tutorial for a detailed description of GLAM2's output.
GLAM2 might occasionally issue this warning: 'accuracy loss due to numeric underflow'. If this happens, its ability to optimize the alignment may be harmed: see GLAM2 methods (PDF) for details. To fix this, try increasing -u, or possibly decreasing -b. Increasing -u somewhat should be harmless, as long as it stays well below 1.
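As a rough illustration of the advice above, the following Python sketch assembles a glam2 command line that follows the synopsis `glam2 [options] <alphabet> <sequence file>`. The helper name `build_glam2_command`, the file name `seqs.fa`, and the numeric values given for -u and -b are all hypothetical, chosen only for the example; suitable values depend on your data and on the defaults of your installed version.

```python
import shlex

def build_glam2_command(alphabet, sequence_file, options=None):
    """Assemble an argument list of the form: glam2 [options] <alphabet> <sequence file>.

    `options` maps a flag (e.g. "-u") to its parameter, or to None for
    flags that take no parameter (e.g. "-Q").
    """
    args = ["glam2"]
    for flag, value in (options or {}).items():
        args.append(flag)
        if value is not None:
            args.append(str(value))
    # Positional arguments come last, in the order given by the synopsis.
    args += [alphabet, sequence_file]
    return args

# Illustrative values only: raise the minimum temperature (-u), lower the
# maximum number of key positions (-b), and run quietly (-Q).
cmd = build_glam2_command("p", "seqs.fa", {"-u": 0.2, "-b": 30, "-Q": None})
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` if glam2 is installed and on the PATH. Passing a list rather than a single string avoids shell quoting problems with file names.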
Option | Parameter | Description | Default Behavior |
---|---|---|---|
Input/Output | |||
Alignment Options | |||
-2 | Search both strands. | ||
-z | minimum sites | Specify the minimum number of sequences that must participate in the alignment. If greater than the number of input sequences, then all the sequences must participate. | |
-a | minimum aligned | Specify the minimum number of key positions (aligned columns) in the alignment. | |
-b | maximum aligned | Specify the maximum number of key positions (aligned columns) in the alignment. | |
Scoring Scheme | |||
-d | pseudocount | Specify a Dirichlet mixture file, which describes residues' tendencies to align with one another. | |
-D | pseudocount | Specify the deletion pseudocount. | |
-E | pseudocount | Specify the 'no-deletion' pseudocount. | |
-I | pseudocount | Specify the insertion pseudocount. | |
-J | pseudocount | Specify the 'no-insertion' pseudocount. | |
-q | weight | Specify the weight for generic versus sequence-set-specific residue abundances. The residue abundances are estimated by counting the residue types in all the input sequences, and adding pseudocounts. The total number of pseudocounts is equal to the alphabet size multiplied by the -q parameter. The allocation of pseudocounts among the residue types depends on the alphabet. | |
Search Algorithm | |||
-r | num | Specify the number of alignment runs. | |
-n | n | Specify the number of iterations without improvement after which each alignment run ends. A run stops once this many iterations have passed since the highest-scoring alignment seen so far was found. | |
-w | aligned columns | Specify the initial number of key positions (aligned columns) in the alignment. If less than the minimum (-a) or greater than the maximum (-b), it is increased to the minimum or reduced to the maximum. | |
-t | temperature | Specify the initial temperature. | |
-c | cooling factor | Specify the cooling factor per n iterations. The temperature is multiplied by a constant factor after each iteration, chosen so that after n iterations it has dropped by the cooling factor. The value of n is set with the -n option. | |
-u | minimum temperature | Specify the minimum temperature. The temperature never drops below this level, to avoid numerical problems. | |
-m | probability | Specify the rate of column sampling relative to site sampling. On each iteration, glam2 randomly decides to try either realigning a sequence or adjusting an aligned column (key position). This parameter sets the probabilities for this decision. | |
-x | 0|1|2 | Specify the site sampling algorithm: 0=FAST, 1=SLOW, 2=FFT. See GLAM2 methods (PDF) for details. In summary, the FAST algorithm deviates slightly from the strict definition of simulated annealing, but works well in practice. The SLOW algorithm implements strict simulated annealing, but is much slower, especially for longer sequences. The FFT algorithm also implements strict simulated annealing, and has intermediate speed, but carries greater risk of numerical roundoff error. In order to use the FFT algorithm, it is necessary to install the FFTW library, and re-compile glam2 with 'make glam2fft'. | |
-s | seed | Specify the seed for pseudo-random number generation. Change this to avoid getting identical results each time the program is run. | |
Cosmetic | |||
-p | Print information about the algorithm's state before each iteration. The following information is printed: the temperature, the number of key positions in the alignment, the number of sequences in the alignment, and the alignment's score. In addition, some information about each move is printed: for site sampling moves, which sequence is picked, and for column sampling moves, which column is picked, whether or not it is deleted, and whether the direction is left or right. | ||
-Q | Run quietly, suppressing unnecessary messages. |
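The interaction of the -t, -c, -n and -u options described in the table can be sketched in a few lines of Python. This is a minimal illustration, not GLAM2's actual code. It assumes that "dropped by this amount" means the temperature is divided by the cooling factor every n iterations, so the per-iteration multiplier is the cooling factor raised to the power -1/n. The function name and all numeric values are hypothetical.

```python
def temperature_schedule(t0, cooling_factor, n, t_min, iterations):
    """Return the temperature used at each iteration.

    t0             -- initial temperature (-t)
    cooling_factor -- factor by which the temperature drops per n iterations (-c)
    n              -- number of iterations over which that drop occurs (-n)
    t_min          -- floor below which the temperature never falls (-u)
    """
    # Assumption: a constant multiplier applied n times divides the
    # temperature by cooling_factor.
    per_iteration = cooling_factor ** (-1.0 / n)
    temps = []
    t = t0
    for _ in range(iterations):
        temps.append(t)
        # Cool by one step, but never go below the minimum temperature.
        t = max(t * per_iteration, t_min)
    return temps

# Illustrative values only.
temps = temperature_schedule(t0=1.2, cooling_factor=1.44, n=4, t_min=0.1, iterations=9)
for i, t in enumerate(temps):
    print(i, round(t, 4))
```

Under this assumption, a larger -n makes cooling slower per iteration, while -u caps how cold the search can get, which is why raising -u can help with the numeric underflow warning.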
GLAM2 was developed by Martin C Frith, working at the Computational Biology Research Center in Tokyo, and Timothy L Bailey, working at the Institute for Molecular Bioscience in Brisbane. The source code and documentation are hereby released into the public domain.