Usage:

cismapper [options] <locus_file> <rna_source>

Description

Input

<locus_file>

The name of a file containing chromosome locations (loci) of potential regulatory elements in BED format. Typically, these would be transcription factor (TF) peaks from a TF ChIP-seq experiment, output by a peak-caller such as MACS.
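Before running CisMapper it can help to sanity-check the locus file. The sketch below is only an illustration and assumes the standard three leading BED fields (chromosome, 0-based start, end). The function name read_loci and the sample peaks are hypothetical, and CisMapper's own parser may be stricter or more lenient.

```python
def read_loci(lines):
    """Yield (chrom, start, end) tuples from BED-formatted lines.

    Hypothetical helper for illustration; not part of CisMapper.
    """
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        # Skip blank lines, comments, and browser/track header lines.
        if not fields or fields[0].startswith("#") or fields[0] in ("track", "browser"):
            continue
        if len(fields) < 3:
            raise ValueError(f"line {lineno}: expected at least 3 fields, got {len(fields)}")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if start < 0 or end < start:
            raise ValueError(f"line {lineno}: invalid interval {start}-{end}")
        yield chrom, start, end


# Made-up example peaks, in the style of peak-caller output.
example_peaks = [
    "track name=example_peaks",
    "chr1\t1000\t1500\tpeak_1\t87",
    "chr2\t20000\t20350\tpeak_2\t42",
]
loci = list(read_loci(example_peaks))
```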

<rna_source>

The type of RNA expression data that you are providing. This must be one of "LongPap", "LongPapMouse", "LongPam", "Short" or "Cage". See below under option -expression-file-type for more information.
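The usage line above can be assembled programmatically. The following is a minimal sketch, not an official wrapper: the helper build_cismapper_command, the file name peaks.bed, and the directory my_out are assumptions made for illustration. Only the option names and the allowed <rna_source> values are taken from this page.

```python
import shlex

# Allowed <rna_source> values, as listed on this page.
RNA_SOURCES = ("LongPap", "LongPapMouse", "LongPam", "Short", "Cage")


def build_cismapper_command(locus_file, rna_source, **options):
    """Return an argument list following: cismapper [options] <locus_file> <rna_source>.

    Hypothetical helper: keyword names map to option names by replacing
    '_' with '-', and a value of True produces a flag with no parameter.
    """
    if rna_source not in RNA_SOURCES:
        raise ValueError(f"rna_source must be one of {RNA_SOURCES}")
    cmd = ["cismapper"]
    for name, value in options.items():
        flag = "-" + name.replace("_", "-")
        if value is True:
            cmd.append(flag)
        else:
            cmd.extend([flag, str(value)])
    cmd.extend([locus_file, rna_source])
    return cmd


cmd = build_cismapper_command(
    "peaks.bed", "Cage", oc="my_out", max_html_score=0.01, nostatus=True
)
print(shlex.join(cmd))  # shell-quoted form, for copying into a terminal
```

The resulting list could be passed to subprocess.run on a system where cismapper is installed and on the PATH.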

Output

CisMapper writes its output to files in a directory named cismapper_out, which it creates if necessary. You can change the output directory using the -o or -oc options.

Note: See the detailed description of the CisMapper output formats for more information about the files that the directory will contain.

Options

Option Parameter Description Default Behavior
General Options
-tissues tissues A comma-separated list (no spaces) of tissue names that are the sources of the histone and expression data. These names are assumed to also be the names of the subfolders where the histone and expression data files are to be found by CisMapper. See below under options -histone-root and -expression-root for more information. The value of tissues is set equal to 'Ag04450,Gm12878,H1hesc,Helas3,Hepg2,Huvec,K562'.
-histone-root hrd The root directory containing the histone modification files. The files are assumed to be in ENCODE broadPeak format. The histone modification files should be in subdirectories under the histone root directory, where each subdirectory is named according to the tissue from which the data is taken. (See option -tissues, above.) The subdirectories should be named '<hrd>/<t>', where <t> is one of the tissue names in the comma-separated tissues list. The value of hrd is set to 'MappingData/Human/Histone'.
-histone-names hnames A comma-separated list (no spaces) of histone modification names. The histone modification file names must match '<hrd>/<t>/*<hname>*broadPeak', where <t> is one of the tissue names in the comma-separated tissues list, and <hname> is one of the histone names in the comma-separated hnames list. The value of hnames is set to 'H3k27ac,H3k4me3'.
-max-link-distances mlds A comma-separated list (no spaces) of maximum distances between a potential regulatory element (RE) and its target. Note: there must be one distance for each histone name in hnames, and each distance is used with the corresponding histone modification. The value of mlds is set to '500000,1000'.
-expression-root erd The root directory containing the RNA expression files. The RNA expression files should be in subdirectories under the expression root directory, where each subdirectory is named according to the tissue from which the data is taken. (See option -tissues, above.) The subdirectories should be named '<erd>/<t>', where <t> is one of the tissue names in the comma-separated tissues list. The value of erd is set to 'MappingData/Human/Expression'.
-expression-file-type eft The file extension of the RNA expression files. The files should be in GTF format. The RNA expression file names must match '<erd>/<t>/<rna_source>.<eft>', where <t> is one of the tissue names in the comma-separated tissues list, and <rna_source> was specified by you on the command line. The value of eft is set to 'gtf'.
-annotation-file-name afile The name of an annotation file containing information on each of the genes and transcription start sites referenced in your RNA expression files. The annotation file should be in the variety of GTF format determined by option -annotation-type, below. The value of afile is set to 'MappingData/Human/gencode.v7.transcripts.gtf'.
-annotation-type GenCode|RefSeq The type of annotation file that you are providing. GenCode
-transcript-types ttypes A comma-separated list (no spaces) of RNA transcript types. Only RNA expression data for these types of transcript will be used by CisMapper. The value of ttypes is set to 'protein_coding,processed_transcript'.
-min-feature-count mfc CisMapper will only consider links where there is both histone and expression data for at least this many tissues. 7
-min-max-expression mme CisMapper will only consider targets (genes or transcription start sites) whose maximum expression across all tissues is at least mme. Furthermore, there must also be at least a two-fold variation around the average expression of the target. 2
-max-html-score mhs Only links whose (unadjusted) score is less than or equal to mhs will be included in the HTML output of CisMapper. This threshold does not affect the other outputs of CisMapper (e.g., the TSV files). 0.05
-desc text Plain text description of this run of CisMapper, which is included in the HTML output file. No description is included in the HTML output file.
-fdesc file A plain text file containing a description of this run of CisMapper. This text will be included in the HTML output file. No description is included in the HTML output file.
-noecho Do not echo commands as they are run. Echo commands as they are run.
-nostatus Do not print progress reports to the terminal. Print progress reports to the terminal.
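The file-naming rules given for -histone-root, -histone-names, -expression-root and -expression-file-type can be turned into a quick pre-flight check. The sketch below is an illustration under stated assumptions: the function names expected_inputs and missing_inputs are hypothetical, the defaults are copied from the table above, and CisMapper itself may resolve files differently. It also checks the rule that -max-link-distances needs one distance per histone name.

```python
import glob
import os


def expected_inputs(rna_source,
                    tissues="Ag04450,Gm12878,H1hesc,Helas3,Hepg2,Huvec,K562",
                    hnames="H3k27ac,H3k4me3",
                    mlds="500000,1000",
                    hrd="MappingData/Human/Histone",
                    erd="MappingData/Human/Expression",
                    eft="gtf"):
    """Derive the input paths implied by the documented naming patterns."""
    tissue_list = tissues.split(",")
    hname_list = hnames.split(",")
    mld_list = [int(d) for d in mlds.split(",")]
    # One maximum link distance is required per histone name.
    if len(mld_list) != len(hname_list):
        raise ValueError("need exactly one max link distance per histone name")
    # Pattern from the table: <hrd>/<t>/*<hname>*broadPeak
    histone_patterns = [os.path.join(hrd, t, f"*{h}*broadPeak")
                        for t in tissue_list for h in hname_list]
    # Pattern from the table: <erd>/<t>/<rna_source>.<eft>
    expression_files = [os.path.join(erd, t, f"{rna_source}.{eft}")
                        for t in tissue_list]
    return histone_patterns, expression_files, dict(zip(hname_list, mld_list))


def missing_inputs(histone_patterns, expression_files):
    """Return the patterns and files that match nothing on disk."""
    missing = [p for p in histone_patterns if not glob.glob(p)]
    missing += [f for f in expression_files if not os.path.isfile(f)]
    return missing


patterns, expr_files, distances = expected_inputs("Cage")
for item in missing_inputs(patterns, expr_files):
    print("missing:", item)
```

With the default seven tissues and two histone names this checks 14 histone patterns and 7 expression files. A tissue with missing files would count against the -min-feature-count threshold.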

Citing

If you use CisMapper in your research, please cite the following paper:
Timothy L. Bailey and Philip Machanick, "Inferring direct DNA binding from ChIP-seq", Nucleic Acids Research, 40:e128, 2012.