Installing cisCall on Your Computer
This section explains how to install the cisCall environment manually on your computer, without using the VirtualBox image. We recommend trying the image first.
1. cisCall Source Code
2. Required Environments for Running cisCall
OS:
- Linux. The CPU architecture must be x86_64; otherwise, you must compile Samtools and BWA yourself.
The following environments are required:
- Bash
- Java (OpenJDK 1.8.0 or later)
- Perl (5 or later)
- Python (3.4 or later)
- R (3.2.2 or later)
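As a quick sanity check, the sketch below reports which of the required interpreters are on the PATH. The helper name check_tool and the command names (python3, R) are assumptions, not part of cisCall, and the script does not compare version numbers.

```shell
# check_tool NAME: print the resolved path if NAME is on PATH, else report it
# as missing and return 1 (hypothetical helper, not part of cisCall).
check_tool() {
    local path
    if path=$(command -v "$1"); then
        printf 'found    %s -> %s\n' "$1" "$path"
    else
        printf 'MISSING  %s\n' "$1"
        return 1
    fi
}

# Command names are assumptions; adjust them to your installation.
for tool in bash java perl python3 R; do
    check_tool "$tool" || true
done

# Versions can then be inspected by hand, for example:
#   java -version; perl -v; python3 --version; R --version
```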
Because Python calls R functions through rpy2, both Python and R must be compiled from source with the following options passed to the ./configure command:
- Python:
./configure --enable-shared --enable-loadable-sqlite-extensions
- R:
./configure --enable-R-shlib --with-libpng
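A minimal sketch of such a source build is shown below. The helper build_from_source, the source directory names, and the installation prefixes are placeholders chosen for illustration; only the ./configure options come from this section.

```shell
# build_from_source SRC_DIR PREFIX [CONFIGURE_OPTIONS...]
# Runs configure, make, and make install inside SRC_DIR (hypothetical helper).
build_from_source() {
    local src_dir="$1" prefix="$2"
    shift 2
    (
        cd "$src_dir" || exit 1
        ./configure --prefix="$prefix" "$@" && make && make install
    )
}

# Example calls; the directories and prefixes are placeholders:
#   build_from_source ./python-src "$HOME/opt/python" \
#       --enable-shared --enable-loadable-sqlite-extensions
#   build_from_source ./R-src "$HOME/opt/R" \
#       --enable-R-shlib --with-libpng
#
# With a shared-library build, the installed lib directory usually has to be
# visible to the dynamic linker at run time (for example via LD_LIBRARY_PATH).
```

Installing into a private prefix keeps these builds separate from any system-wide Python or R.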
3. Required Packages
To run cisCall, the following packages are also required.
- R packages:
- KernSmooth
- MASS
- RPMM
- VGAM
- exactRankTests
- coin
- parallel
- grDevices
- cluster
- Python packages:
- rpy2
- six
- scipy
- numpy
- Perl packages:
- Math::CDF
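Whether the packages listed above can be loaded may be checked with a sketch like the following. The helper names are hypothetical, and the commands Rscript, python3, and perl are assumed to be the interpreters cisCall will use.

```shell
# Each helper returns 0 when the named package can be loaded
# (hypothetical helpers, not part of cisCall).
has_r_package() {
    Rscript -e "if (!requireNamespace('$1', quietly = TRUE)) quit(status = 1)" >/dev/null 2>&1
}
has_python_module() { python3 -c "import $1" >/dev/null 2>&1; }
has_perl_module()   { perl -M"$1" -e 1 >/dev/null 2>&1; }

# report KIND NAME: print one status line for a package.
report() {
    if "has_$1" "$2"; then echo "ok       $1 $2"; else echo "MISSING  $1 $2"; fi
}

for p in KernSmooth MASS RPMM VGAM exactRankTests coin parallel grDevices cluster; do
    report r_package "$p"
done
for m in rpy2 six scipy numpy; do
    report python_module "$m"
done
report perl_module Math::CDF
```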
4. Packages Prepared in cisCall
The following packages are bundled with cisCall:
- BWA (0.7.10)
- Samtools (0.1.18)
- bigWigToBedGraph
Therefore, you do not need to install them unless you want a specific version or plan to run cisCall on a platform other than x86_64 Linux. To use self-compiled BWA and Samtools, change their command paths in the cisCall parameter settings (see the User-Prepared Data section).
5. Prerequisite Files
- SAMPLE_NAME: target (foreground) sample name
- BG_SAMPLE_NAME: control (background) sample name
- GROUP: group name to identify a group of input files in the execution directory
Use the above names for the following prerequisite files, and place the files under the cisCall execution directory.
The execution directories for cisMuton, cisFusion, and cisCton must be separate.
*Note: The files listed in the following table are based on hg19.
No. | File Type | cisMuton/cisFusion | cisCton |
---|---|---|---|
1 | ${SAMPLE_NAME}.${GROUP}.fastq.R{1,2}.gz | Required | - |
2 | ${BG_SAMPLE_NAME}.${GROUP}.fastq.R{1,2}.gz | Required | - |
3 | tmp/${SAMPLE_NAME}.${GROUP}.MUTON.TARGET/common.fastq.aln.bam | - | Required |
4 | tmp/${BG_SAMPLE_NAME}.${GROUP}.MUTON.TARGET/common.fastq.aln.bam | - | Required |
5 | ${GROUP}.target.bed | Required | - |
6 | ${GROUP}.gene.bed | Required | - |
7 | ${GROUP}.bed | - | Required |
8 | ${GROUP}.fusion.bed | Required | Required (rename to target_fusion.txt) |
9 | ${GROUP}.fasta | Required | Required |
10 | ${GROUP}.refGene.txt.gz | Required (${GROUP}.refGene.txt.gz) | Required (${GROUP}.refGene.txt) |
11 | ${GROUP}.geneName.txt.gz | Required | - |
12 | ${GROUP}.DBexome | Required | - |
13 | ${GROUP}.snp.dbsnp | Required | - |
14 | ${GROUP}.genomicSuperDups.txt | Required | - |
15 | ${GROUP}.rmsk | Required | - |
16 | ${GROUP}.simpleRepeat | Required | - |
17 | wgEncodeDukeMapabilityUniqueness20bp.bigWig | - | Required |
18 | ciscall.config | - | Required |
For the sample test data, all these prerequisite files are included in the VirtualBox image.
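The naming scheme can be checked before a run with a sketch like the one below, which lists the files marked Required for cisMuton/cisFusion in the paired-end case and reports any that are absent. The helper names are hypothetical, and the sample and group names in the usage line are placeholders.

```shell
# list_muton_prereqs SAMPLE_NAME BG_SAMPLE_NAME GROUP
# Prints the file names required by cisMuton/cisFusion (paired-end reads).
list_muton_prereqs() {
    local s="$1" b="$2" g="$3" r suffix
    for r in R1 R2; do
        echo "$s.$g.fastq.$r.gz"
        echo "$b.$g.fastq.$r.gz"
    done
    for suffix in target.bed gene.bed fusion.bed fasta refGene.txt.gz \
                  geneName.txt.gz DBexome snp.dbsnp genomicSuperDups.txt \
                  rmsk simpleRepeat; do
        echo "$g.$suffix"
    done
}

# missing_prereqs DIR SAMPLE_NAME BG_SAMPLE_NAME GROUP
# Prints every expected file that does not exist in DIR.
missing_prereqs() {
    local dir="$1" f
    shift
    list_muton_prereqs "$@" | while IFS= read -r f; do
        [ -e "$dir/$f" ] || echo "$f"
    done
}

# Usage (names are placeholders):
#   missing_prereqs /path/to/muton_dir tumor1 normal1 panelA
```

An empty result means every expected file name was found. The reference index files are not covered by this check.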
5.1. Explanations of Each File
(1) ${SAMPLE_NAME}.${GROUP}.fastq.R{1,2}.gz
- .fastq files of the target (foreground) sample to be analyzed.
- For a .fastq file of single-end sequence reads, change the file name to ${SAMPLE_NAME}.${GROUP}.fastq.gz.
(2) ${BG_SAMPLE_NAME}.${GROUP}.fastq.R{1,2}.gz
- .fastq files of the control (background) sample to be used as a control for calling mutations.
- For a .fastq file of single-end sequence reads, change the file name to ${BG_SAMPLE_NAME}.${GROUP}.fastq.gz.
(3) tmp/${SAMPLE_NAME}.${GROUP}.MUTON.TARGET/common.fastq.aln.bam
- .bam file of the target (foreground) sample generated by cisMuton.
- For FFPE samples, the file name common.fastq.aln.bam may differ slightly, for example common.fastq.aln.none.bam.dup1331.bam or common.fastq.aln.bam.dup1110.bam, depending on the cisMuton settings.
(4) tmp/${BG_SAMPLE_NAME}.${GROUP}.MUTON.TARGET/common.fastq.aln.bam
- .bam file of the control (background) sample generated by cisMuton.
- For FFPE samples, the file name common.fastq.aln.bam may differ slightly, for example common.fastq.aln.none.bam.dup1331.bam or common.fastq.aln.bam.dup1110.bam, depending on the cisMuton settings.
(5) ${GROUP}.target.bed
- .bed file of target-capture regions for cisMuton/cisFusion.
- Format:
- Tab-delimited file without a header in ascending order by start position
- Columns: [Chromosome number], [start position (1-based)], [end position (1-based)]
- All items after the third column are ignored.
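The format rules above can be checked with a small awk sketch such as the following. The helper check_bed is hypothetical, and it makes one assumption beyond the text: "ascending order by start position" is interpreted as non-decreasing start positions among consecutive lines of the same chromosome.

```shell
# check_bed FILE: print format problems found in FILE and return non-zero
# if there are any (hypothetical helper, not part of cisCall).
check_bed() {
    awk -F '\t' '
        NF < 3 {
            printf "line %d: fewer than 3 tab-separated columns\n", NR
            bad = 1; next
        }
        $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
            printf "line %d: non-numeric position\n", NR
            bad = 1; next
        }
        $2 + 0 > $3 + 0 {
            printf "line %d: start is greater than end\n", NR
            bad = 1
        }
        $1 == chrom && $2 + 0 < prev {
            printf "line %d: start position out of order\n", NR
            bad = 1
        }
        { chrom = $1; prev = $2 + 0 }   # remember the last valid line
        END { exit bad }
    ' "$1"
}

# Usage: check_bed "${GROUP}.target.bed" && echo "format looks fine"
```

Columns after the third are not inspected, matching the rule that cisMuton/cisFusion ignore them in this file.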
(6) ${GROUP}.gene.bed
- .bed file of all gene regions for cisMuton/cisFusion.
- Format:
- Tab-delimited file without a header in ascending order by start position
- Columns: [Chromosome number], [start position (1-based)], [end position (1-based)], [gene name]
(7) ${GROUP}.bed
- .bed file of target-capture regions for cisCton.
- Format:
- Tab-delimited file without a header in ascending order by start position
- Columns: [Chromosome number], [start position (1-based)], [end position (1-based)], [gene name]
- Gene names in the gene name column must be included in ${GROUP}.refGene.txt.gz.
- cisCton defines amplifications or deletions using predefined "baseline regions." Entries (lines) whose gene name column starts with chr are interpreted as baseline regions. If baseline regions are not explicitly defined, cisCton uses all target regions as baseline regions.
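To see how many baseline regions a file defines, the entries whose gene name column starts with chr can be counted as in the sketch below. The helper count_baseline is hypothetical; a result of 0 corresponds to the case where cisCton falls back to using all target regions.

```shell
# count_baseline FILE: number of lines whose 4th (gene name) column
# starts with "chr" (hypothetical helper, not part of cisCall).
count_baseline() {
    awk -F '\t' '$4 ~ /^chr/ { n++ } END { print n + 0 }' "$1"
}

# Usage: count_baseline "${GROUP}.bed"
```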
(8) ${GROUP}.fusion.bed
- .bed file of regions that include the possible breakpoints of fusion genes.
- All cisCall modules require this file.
- Name it target_fusion.txt for cisCton.
- Format:
- Tab-delimited file without a header in ascending order by start position
- Columns: [Chromosome number], [start position (1-based)], [end position (1-based)], [gene name]
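Because the execution directories must be separate, one way to provide the renamed file is to copy it rather than move it, as sketched below. The helper stage_fusion_bed and the directory arguments are hypothetical.

```shell
# stage_fusion_bed MUTON_DIR GROUP CTON_DIR
# Copies GROUP.fusion.bed into the cisCton directory under the name
# target_fusion.txt, leaving the original in place (hypothetical helper).
stage_fusion_bed() {
    cp "$1/$2.fusion.bed" "$3/target_fusion.txt"
}

# Usage (paths are placeholders):
#   stage_fusion_bed /path/to/muton_dir "$GROUP" /path/to/cton_dir
```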
(9) ${GROUP}.fasta
- Source: hg19 genome.
- Concatenate into a single .fasta in ascending order by start position.
- The following index files for the reference genome are also required to run cisCall.
- ${GROUP}.fasta.fai:
  - All cisCall modules (cisMuton, cisFusion, and cisCton) require this file.
  - Samtools included in cisCall can generate it with samtools faidx ${GROUP}.fasta.
- ${GROUP}.fasta.ann, ${GROUP}.fasta.amb, ${GROUP}.fasta.bwt, ${GROUP}.fasta.pac, ${GROUP}.fasta.sa:
  - cisMuton and cisFusion require these files.
  - BWA included in cisCall can generate them with bwa index ${GROUP}.fasta.
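The two indexing commands and a check for their outputs can be combined as in the following sketch. It assumes samtools and bwa are reachable on the PATH, whereas the copies bundled with cisCall may live elsewhere. The helper names are hypothetical.

```shell
# build_indexes FASTA: create the .fai index and the BWA index files
# next to FASTA, using the two commands named in the text.
build_indexes() {
    samtools faidx "$1" && bwa index "$1"
}

# missing_indexes DIR GROUP: print the index files not yet present in DIR.
missing_indexes() {
    local ext
    for ext in fai amb ann bwt pac sa; do
        [ -e "$1/$2.fasta.$ext" ] || echo "$2.fasta.$ext"
    done
}

# Usage (run where the .fasta lives; GROUP is a placeholder):
#   build_indexes "${GROUP}.fasta"
#   missing_indexes . "${GROUP}"
```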
(10) ${GROUP}.refGene.txt.gz
- Source: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/refGene.txt.gz
- Keep the file gzipped for cisMuton/cisFusion; decompress it with gunzip for cisCton.
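Since the two forms go into separate execution directories, both can be produced from one download, as in this sketch. The helper prepare_refgene and the directory arguments are hypothetical.

```shell
# prepare_refgene DOWNLOADED_GZ GROUP MUTON_DIR CTON_DIR
# Places the gzipped table in the cisMuton/cisFusion directory and a
# decompressed copy in the cisCton directory (hypothetical helper).
prepare_refgene() {
    cp "$1" "$3/$2.refGene.txt.gz" &&
    gunzip -c "$1" > "$4/$2.refGene.txt"
}

# Usage (paths are placeholders):
#   prepare_refgene refGene.txt.gz "$GROUP" /path/to/muton_dir /path/to/cton_dir
```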
(11) ${GROUP}.geneName.txt.gz
(12) ${GROUP}.DBexome
(13) ${GROUP}.snp.dbsnp
- Source: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/snp138.txt.gz
- Run gunzip and change the file extension from .txt to .snp.dbsnp.
(14) ${GROUP}.genomicSuperDups.txt
- Source: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/genomicSuperDups.txt.gz
- Run gunzip.
(15) ${GROUP}.rmsk
- Source: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/rmsk.txt.gz
- Run gunzip and remove the file extension .txt.
(16) ${GROUP}.simpleRepeat
- Source: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/simpleRepeat.txt.gz
- Run gunzip and remove the file extension .txt.
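The gunzip-and-rename steps for files (13) to (16) can be scripted as in the sketch below, which works on archives that have already been downloaded from the Source URLs. The helper prepare_ucsc_tables is hypothetical. Giving each output the ${GROUP} prefix is an assumption taken from the file names in the prerequisite table.

```shell
# prepare_ucsc_tables SRC_DIR GROUP OUT_DIR
# Decompresses each downloaded table in SRC_DIR into OUT_DIR under the
# name cisCall expects, leaving the .gz files untouched (hypothetical helper).
prepare_ucsc_tables() {
    local src="$1" group="$2" out="$3" table suffix
    while read -r table suffix; do
        gunzip -c "$src/$table.txt.gz" > "$out/$group.$suffix" || return 1
    done <<EOF
snp138 snp.dbsnp
genomicSuperDups genomicSuperDups.txt
rmsk rmsk
simpleRepeat simpleRepeat
EOF
}

# Usage (paths are placeholders):
#   prepare_ucsc_tables ./downloads "$GROUP" /path/to/muton_dir
```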
(17) wgEncodeDukeMapabilityUniqueness20bp.bigWig
(18) cisCall.config
- Source: included in cisCton modules.
- Log setting file for cisCton.
- Copy this file into the cisCton execution directory.