Somatic variant detection works best when both tumor and matched normal (germline) samples are available. Sequence data may come from whole genome sequencing, exome capture, or small gene panels in which only the exons of clinically relevant genes are sequenced. Adequate sequencing depth is paramount for detecting somatic variants, especially when tumor heterogeneity or clonality is a concern. A depth of >=80x over 95% of target base pairs is needed to confidently detect somatic variants at a 10% allele frequency. Likewise, the matched normal needs a depth of >=60x to remove both germline variants and sequencing artifacts. The 80x and 60x figures refer to unique observations remaining after duplicate reads and overlapping paired-end base pairs are removed, not to mean coverage.
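As a rough illustration of these thresholds, the Python sketch below checks what fraction of target bases reach a minimum unique-observation depth and estimates, under a simple binomial model, how likely a variant at a given allele frequency is to be supported by at least some number of reads. The function names, the example depths, and the minimum of 5 supporting reads are assumptions made for illustration; they are not taken from the CBI workflows, and real callers use more elaborate error models.

```python
from math import comb


def fraction_covered(depths, min_depth):
    """Fraction of target bases whose unique-observation depth is >= min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(d >= min_depth for d in depths) / len(depths)


def passes_depth_target(depths, min_depth=80, min_fraction=0.95):
    """True if at least min_fraction of target bases reach min_depth.

    The defaults mirror the tumor target described above (>=80x over 95%).
    """
    return fraction_covered(depths, min_depth) >= min_fraction


def prob_at_least_k_alt(depth, allele_freq, min_alt_reads):
    """P(X >= min_alt_reads) for X ~ Binomial(depth, allele_freq).

    Simplified model: each unique observation independently shows the
    variant allele with probability allele_freq; sequencing error is ignored.
    """
    p_fewer = sum(
        comb(depth, i) * allele_freq**i * (1 - allele_freq) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_fewer


# Hypothetical per-base depths for a tiny target region (made-up values).
tumor_depths = [85, 92, 80, 101, 77, 88]
print("fraction >= 80x:", fraction_covered(tumor_depths, 80))
print("meets tumor target:", passes_depth_target(tumor_depths))

# Chance of seeing >= 5 supporting reads for a 10% allele frequency variant
# at 80x versus 30x (the 5-read minimum is an illustrative assumption).
print("80x:", prob_at_least_k_alt(80, 0.10, 5))
print("30x:", prob_at_least_k_alt(30, 0.10, 5))
```

At 80x, a variant at 10% allele frequency is expected to appear in about 8 unique observations, which is why lower depths quickly lose power for low-frequency or subclonal variants.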

To analyze these datasets, the Cancer Bioinformatics (CBI) Shared Resource has developed more than a dozen workflows for the detection, annotation and prioritization of germline and somatic variants from tumor and tumor-normal DNASeq datasets.

Tumor-Normal Workflows:

  • Dna Align QC – BWA-MEM hg38 alignment, GATK recalibration/realignment, unique observation read coverage estimation, and comprehensive data quality control
  • Somatic Caller – Optimized Manta/Strelka2 somatic variant calling, background error estimation, and recall frequency annotation
  • GATK Somatic – Optimized somatic variant calling using GATK’s Mutect2 application
  • GATK Haplotyping and JointGenotyping – GATK’s joint genotyping, vt normalization, and filtering for germline variants
  • Annotator – SnpEff/ClinVar/dbNSFP/Splice annotators and identification of ACMG incidental germline findings
  • Copy Analysis – Copy number variation analysis with optimized copy ratio GATK 4.0 tools
  • Rna Align QC – RNASeq transcriptome analysis with STAR and Picard QC tools
  • Sample Concordance – Sample concordance checking using DNASeq and RNASeq bam/cram alignment files
  • MSI – Microsatellite instability estimation using MANTIS
  • LoH – Identification of regions displaying loss of heterozygosity
  • Tempus/Caris/Invitae/Avatar – Workflows for processing patient-derived test results and integrating them with HCI’s large Patient Molecular Repository

These workflows use Docker/Singularity containers running the Snakemake workflow manager to enable portability, reproducibility, and ease of use. Although each workflow is designed to be run individually, the USeq tool TNRunner2 was developed to launch all of them in CHPC’s protected environment, processing hundreds of patient sample datasets in parallel.

Somatic Variant Identification Workflow Visualization