GLAM2Scan

<alphabet>

The alphabet of the motif and sequences. This can be 'p' for protein sequences, 'n' for nucleotide sequences, or the name of a GLAM2 alphabet file.

<glam2 motif file>

The name of a file containing a GLAM2 motif. If the file contains multiple motifs, GLAM2Scan considers only the top (first) one. The file may be either GLAM2's plain text (glam2.txt) or HTML (glam2.html) output.

The minimal required part of a GLAM2 motif is illustrated by the example GLAM2 DNA motif and GLAM2 protein motif.

<sequence file>

The name of a file containing FASTA formatted sequences.

GLAM2Scan writes its output to files in a directory named glam2scan_out, which it creates if necessary. You can change the output directory using the -o or -O options. The directory will contain:

glam2scan.html - an HTML file containing the results in a human-readable format
glam2scan.txt - a plain text file containing the results (the HTML format contains slightly less information but is easier to read)

Output begins with some general information stating the program name, version and the actual command line:

  GLAM2scan
  Version 9999

  glam2scan p prot_motif.glam2 lotsa_prots.fa

This is followed by motif matches, sorted in order of score. A motif match looks like this:

NAME	START	SITE	END	STRAND	SCORE
		**.*****..*****.*
At1g01140.2_SnRK3.12	196	YDGAAADVWSCGVI.FVLMAGYLPFDEPN	223	+	66.9

The name of the sequence with the match appears on the left; the start and end coordinates of the match appear on either side of the matching sequence; the match score appears on the right. The plus sign indicates the strand of the match (only meaningful when considering both strands of nucleotide sequences with the -2 option). The stars indicate the key positions of the motif: the alignment of the match to the key positions is shown. (The HTML format output does not include the stars over the matched sequence.)
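
The match layout described above can be read back programmatically. The following Python sketch is illustrative only and is not part of GLAM2Scan; it assumes each match line in glam2scan.txt holds six whitespace-separated fields in the order NAME, START, SITE, END, STRAND, SCORE, as in the example. The names `Match` and `parse_match_line` are hypothetical.

```python
from typing import NamedTuple, Optional


class Match(NamedTuple):
    name: str      # sequence name
    start: int     # start coordinate of the match
    site: str      # matched sequence aligned to the motif ('.' marks a gap)
    end: int       # end coordinate of the match
    strand: str    # '+' or '-'
    score: float   # match score


def parse_match_line(line: str) -> Optional[Match]:
    """Return a Match for a hit line, or None for any other line.

    Assumes six whitespace-separated fields per hit, as in the example
    above. The column header and the line of stars are skipped because
    they do not parse as (text, int, text, int, text, float).
    """
    fields = line.split()
    if len(fields) != 6:
        return None
    name, start, site, end, strand, score = fields
    try:
        return Match(name, int(start), site, int(end), strand, float(score))
    except ValueError:
        return None  # e.g. the NAME/START/... header line


example = "At1g01140.2_SnRK3.12\t196\tYDGAAADVWSCGVI.FVLMAGYLPFDEPN\t223\t+\t66.9"
hit = parse_match_line(example)
```

Applied to every line of glam2scan.txt, a function like this would keep only the hits and silently drop the program name, version, command line and star lines.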

Option	Parameter	Description	Default Behavior
Basic Options
-t		Output in text format only to standard output.	The program behaves as if `-O glam2scan_out` had been specified.
-n	n	Report n matches. If scores are tied, they are sorted in alphabetical order of sequence name. If sequence names are also identical, the order is arbitrary.
-2		Search both strands of nucleotide sequences.
Advanced Options
The remaining options are somewhat specialized. For typical usage, it is reasonable to set them to exactly the same values as were used with glam2 to discover the motif.
-D	pseudocount	Specify the deletion pseudocount.
-E	pseudocount	Specify the 'no-deletion' pseudocount.
-I	pseudocount	Specify the insertion pseudocount.
-J	pseudocount	Specify the 'no-insertion' pseudocount.
-d	file	Specify a Dirichlet mixture file.
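
For scripted use, the options above can be assembled into an argument list. This Python sketch is a convenience wrapper, not an official interface: the function name `build_glam2scan_command` and its keyword arguments are hypothetical, and placing options before the three positional arguments is an assumption. Only flags listed in the table above are used.

```python
from typing import Dict, List, Optional

# Pseudocount flags listed under "Advanced Options" above.
PSEUDOCOUNT_FLAGS = ("D", "E", "I", "J")


def build_glam2scan_command(
    alphabet: str,
    motif_file: str,
    sequence_file: str,
    *,
    out_dir: Optional[str] = None,
    n_matches: Optional[int] = None,
    both_strands: bool = False,
    pseudocounts: Optional[Dict[str, float]] = None,
    dirichlet_file: Optional[str] = None,
) -> List[str]:
    """Build an argument list for glam2scan (options first: an assumption)."""
    cmd = ["glam2scan"]
    if out_dir is not None:
        cmd += ["-O", out_dir]
    if n_matches is not None:
        cmd += ["-n", str(n_matches)]
    if both_strands:
        cmd.append("-2")
    for flag, value in (pseudocounts or {}).items():
        if flag not in PSEUDOCOUNT_FLAGS:
            raise ValueError(f"unknown pseudocount flag: -{flag}")
        cmd += [f"-{flag}", str(value)]
    if dirichlet_file is not None:
        cmd += ["-d", dirichlet_file]
    cmd += [alphabet, motif_file, sequence_file]
    return cmd


# File names taken from the example command line in the Output section.
cmd = build_glam2scan_command(
    "p", "prot_motif.glam2", "lotsa_prots.fa", n_matches=10
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where glam2scan is installed. Passing a list rather than a single string avoids shell quoting problems with unusual file names.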

Some users may wish to make 'fake' glam2 motifs for input to glam2scan, for instance based on motifs found by other tools. Most of the glam2 output is ignored by glam2scan, and a minimal motif file looks like this:

                  **..****
  seq1         10 HP..D.IG
  seq2          5 HPGADLIG
  seq3          7 HP..ELIG
  seq4          5 HP..ELLA

The sequence names and coordinates are ignored, but some placeholder characters should be present. The stars indicating key positions are necessary, and the first and last columns must be starred.
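
A minimal motif file of this kind can be generated from an existing alignment. The Python sketch below assumes the layout of the example above (a line of stars aligned over the sites, then one line per sequence with a placeholder name and coordinate). The function name `format_minimal_motif` is hypothetical, and whether glam2scan requires the exact column alignment produced here has not been verified.

```python
from typing import List, Tuple


def format_minimal_motif(rows: List[Tuple[str, int, str]], key_line: str) -> str:
    """Format aligned sites as a minimal GLAM2-style motif.

    rows:     (name, start, aligned_site) tuples; names and starts are
              placeholders, since glam2scan ignores them.
    key_line: '*' for key positions, '.' for other columns.
    """
    if not key_line or set(key_line) - {"*", "."}:
        raise ValueError("key line must consist of '*' and '.' only")
    if key_line[0] != "*" or key_line[-1] != "*":
        raise ValueError("the first and last columns must be starred")
    for name, _, site in rows:
        if len(site) != len(key_line):
            raise ValueError(f"site for {name} does not match key line length")

    name_width = max(len(name) for name, _, _ in rows)
    start_width = max(len(str(start)) for _, start, _ in rows)
    # Pad the key line so the stars sit directly above the site columns.
    pad = " " * (name_width + 1 + start_width + 1)
    lines = [pad + key_line]
    for name, start, site in rows:
        lines.append(f"{name:<{name_width}} {start:>{start_width}} {site}")
    return "\n".join(lines) + "\n"


rows = [
    ("seq1", 10, "HP..D.IG"),
    ("seq2", 5, "HPGADLIG"),
    ("seq3", 7, "HP..ELIG"),
    ("seq4", 5, "HP..ELLA"),
]
motif_text = format_minimal_motif(rows, "**..****")
```

The validation step mirrors the two requirements stated above: every row spans the same columns as the key line, and the first and last columns are starred.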
