Usage:

glam2 [options] <alphabet> <sequence file>

Description

Inputs

<alphabet>

The alphabet of the sequences. This can be 'p' for protein sequences, 'n' for nucleotide sequences or the name of a GLAM2 alphabet file.

<sequence file>

A file containing FASTA formatted sequences.
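
As an illustration of how the usage line fits together, the following minimal Python sketch assembles a glam2 command line, placing options first and then the two required inputs. The helper name build_glam2_command and the file and directory names are hypothetical, and only the -o and -r options documented on this page are used.

```python
import shlex


def build_glam2_command(alphabet, sequence_file, output_dir=None, runs=None):
    """Assemble an argument list of the form: glam2 [options] <alphabet> <sequence file>.

    Hypothetical helper for illustration; it only builds the list and does not run glam2.
    """
    cmd = ["glam2"]
    if output_dir is not None:
        cmd += ["-o", output_dir]   # change the output directory
    if runs is not None:
        cmd += ["-r", str(runs)]    # number of alignment runs
    # The two required positional inputs come after the options.
    cmd += [alphabet, sequence_file]
    return cmd


# Example file and directory names are placeholders.
cmd = build_glam2_command("p", "my_seqs.fa", output_dir="my_results", runs=5)
print(shlex.join(cmd))
```

The resulting list could be passed to subprocess.run(cmd, check=True) on a machine where glam2 is installed and on the PATH.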

Outputs

GLAM2 writes its output to files in a directory named glam2_out, which it creates if necessary. You can change the output directory using the -o or -O options. The directory will contain:

See the tutorial for a detailed description of GLAM2's output.

Warnings

GLAM2 might occasionally issue this warning: 'accuracy loss due to numeric underflow'. If this happens, its ability to optimize the alignment may be harmed: see GLAM2 methods (PDF) for details. To fix this, try increasing -u, or possibly decreasing -b. Increasing -u somewhat should be harmless, as long as it stays well below 1.

Options

Option Parameter Description Default Behavior
Input/Output
Alignment Options
-2 Search both strands.
-z <minimum sites> Specify the minimum number of sequences that must participate in the alignment. If greater than the number of input sequences, then all the sequences must participate.
-a <minimum aligned> Specify the minimum number of key positions (aligned columns) in the alignment.
-b <maximum aligned> Specify the maximum number of key positions (aligned columns) in the alignment.
Scoring Scheme
-d <file> Specify a Dirichlet mixture file, which describes residues' tendencies to align with one another.
-D <pseudocount> Specify the deletion pseudocount.
-E <pseudocount> Specify the 'no-deletion' pseudocount.
-I <pseudocount> Specify the insertion pseudocount.
-J <pseudocount> Specify the 'no-insertion' pseudocount.
-q <weight> Specify the weight for generic versus sequence-set-specific residue abundances. The residue abundances are estimated by counting the residue types in all the input sequences, and adding pseudocounts. The total number of pseudocounts is equal to the alphabet size multiplied by the -q parameter. The allocation of pseudocounts among the residue types depends on the alphabet.
Search Algorithm
-r <num> Specify the number of alignment runs.
-n <n> Specify how many iterations must pass without improving on the highest-scoring alignment seen so far before each alignment run ends.
-w <aligned columns> Specify the initial number of key positions (aligned columns) in the alignment. If less than the minimum (-a) or greater than the maximum (-b), it is increased to the minimum or reduced to the maximum.
-t <temperature> Specify the initial temperature.
-c <cooling factor> Specify the cooling factor per n iterations. The temperature is multiplied by a constant factor after each iteration, chosen so that after n iterations it has dropped by the cooling factor. The value of n is set with the -n option.
-u <minimum temperature> Specify the minimum temperature. The temperature never drops below this level, to avoid numerical problems.
-m <probability> Specify the rate of column sampling relative to site sampling. On each iteration, glam2 randomly decides to try either realigning a sequence or adjusting an aligned column (key position). This parameter sets the probabilities for this decision.
-x <0|1|2> Specify the site sampling algorithm: 0=FAST, 1=SLOW, 2=FFT. See GLAM2 methods (PDF) for details. In summary, the FAST algorithm deviates slightly from the strict definition of simulated annealing, but works well in practice. The SLOW algorithm implements strict simulated annealing, but is much slower, especially for longer sequences. The FFT algorithm also implements strict simulated annealing, and has intermediate speed, but carries greater risk of numerical roundoff error. To use the FFT algorithm, it is necessary to install the FFTW library and recompile glam2 with 'make glam2fft'.
-s <seed> Specify the seed for pseudo-random number generation. Change this to avoid getting identical results each time the program is run.
Cosmetic
-p  Print information about the algorithm's state before each iteration. The following information is printed: the temperature, the number of key positions in the alignment, the number of sequences in the alignment, and the alignment's score. In addition, some information about each move is printed: for site sampling moves, which sequence is picked, and for column sampling moves, which column is picked, whether or not it is deleted, and whether the direction is left or right.
-Q  Run quietly, suppressing unnecessary messages.
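
The interaction of the temperature options (-t, -c, -u and -n) can be pictured with a small Python sketch. It is not taken from the GLAM2 source. It assumes that "dropping by the cooling factor" means the temperature is divided by that factor over n iterations, and that the minimum temperature acts as a simple floor. The function name and the numeric values are illustrative only and are not GLAM2's defaults.

```python
def temperature_schedule(t_initial, cooling_factor, n, t_min, iterations):
    """Yield one temperature per iteration under a geometric cooling schedule.

    Assumption: after every n iterations the temperature has been divided
    by cooling_factor, so the per-iteration multiplier is cooling_factor ** (-1 / n).
    """
    per_step = cooling_factor ** (-1.0 / n)
    t = t_initial
    for _ in range(iterations):
        # The temperature in use never falls below the minimum (the role of -u).
        yield max(t, t_min)
        t *= per_step


# Illustrative values: start at 1.0, halve every 4 iterations, floor at 0.1.
temps = list(temperature_schedule(1.0, 2.0, 4, 0.1, 9))
for i, t in enumerate(temps):
    print(f"iteration {i}: temperature {t:.4f}")
```

Under these assumptions, raising the minimum temperature simply makes the schedule flatten out earlier, which is consistent with the advice in the Warnings section to increase -u when numeric underflow is reported.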

About

GLAM2 was developed by Martin C Frith, working at the Computational Biology Research Center in Tokyo, and Timothy L Bailey, working at the Institute for Molecular Bioscience in Brisbane. The source code and documentation are hereby released into the public domain.

Citing