# Bioinformatic Components

The 4 main sections of the pipeline are:

  1. Read alignment
  2. Somatic variant detection
  3. Germline variant detection
  4. Quality control

Various QC metrics are also generated. The individual modules and the tools they use are described below. The following diagram outlines the workflow:

Note: The pipeline can be run with already-aligned BAM files as input, which skips the read alignment module.

# Read Alignment

Tempo accepts as input sequencing reads from one or more FASTQ file pairs per sample (each pair corresponding to a separate sequencing lane), as described. These are aligned against the human reference genome using common practices, which include:

# Somatic Analyses

# Germline Analyses

# Quality Control

# MultiQC Report

A combined MultiQC report is produced at the BAM level, somatic level and cohort level of analysis, highlighting QC metrics and high-level summaries produced by Tempo. Documentation for the QC tool can be found here.

The Tempo team has internally derived pass/warn/fail thresholds, which are presented in the QC report. These should be used with discretion by the analyst, as a failure flagged in the QC report may not be a true failure. By default, the following metrics are assessed and a Status value is produced in the report:

  • Tumor_Contamination : Estimates percentage of cross-individual contamination in the tumor sample. Cross-contamination may be higher if the sample comes from the recipient of allograft tissue.
  • Normal_Contamination : Estimates percentage of cross-individual contamination in the normal sample. Cross-contamination may be higher if the sample comes from the recipient of allograft tissue.
  • Concordance : Measures the likelihood of samples coming from the same individual. Low values may indicate a sample swap or contamination.
  • facets_qc and purity : Reported by facets-preview. A facets_qc value of FALSE indicates that the pair has failed Facets QC.
  • Fold Enrichment : The fold by which the baited region has been amplified above genomic background. A low metric could indicate inefficiency of the bait selection kit during sample preparation.
  • Target Bases 50X : The fraction of all target bases achieving 50X or greater coverage. A low metric could indicate insufficient coverage.
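
As a rough illustration of how pass/warn/fail thresholds of this kind can be applied, the following Python sketch assigns a status to a metric value and computes a Target Bases 50X style fraction from per-base depths. It is not Tempo's implementation: the `Threshold` class, the `status` and `fraction_at_depth` functions, and every cutoff value in `THRESHOLDS` are hypothetical placeholders chosen for the example. The real thresholds are those derived by the Tempo team and shown in the QC report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical warn/fail cutoffs for one QC metric."""
    warn: float
    fail: float
    higher_is_worse: bool


# Placeholder cutoffs for illustration only; these are NOT Tempo's values.
THRESHOLDS = {
    "Tumor_Contamination": Threshold(warn=1.0, fail=5.0, higher_is_worse=True),
    "Concordance": Threshold(warn=95.0, fail=90.0, higher_is_worse=False),
    "Target Bases 50X": Threshold(warn=0.8, fail=0.5, higher_is_worse=False),
}


def status(metric: str, value: float) -> str:
    """Return 'pass', 'warn' or 'fail' for a metric value."""
    t = THRESHOLDS[metric]
    if t.higher_is_worse:
        # Metrics such as contamination: larger values are worse.
        if value >= t.fail:
            return "fail"
        if value >= t.warn:
            return "warn"
        return "pass"
    # Metrics such as concordance or coverage: smaller values are worse.
    if value <= t.fail:
        return "fail"
    if value <= t.warn:
        return "warn"
    return "pass"


def fraction_at_depth(depths: list[int], min_depth: int = 50) -> float:
    """Fraction of target bases covered at min_depth or greater."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(d >= min_depth for d in depths) / len(depths)


if __name__ == "__main__":
    covered = fraction_at_depth([10, 50, 60, 100])
    print(covered, status("Target Bases 50X", covered))
    print(status("Tumor_Contamination", 2.0))
```

Keeping the direction of each metric (`higher_is_worse`) next to its cutoffs avoids separate code paths for contamination-style and coverage-style metrics. As noted above, a computed status is a prompt for review by the analyst, not a final verdict.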