Structural variation and fusion detection using targeted sequencing data from circulating cell free DNA

Alexander R. Gawroński, Yen Yi Lin, Brian McConeghy, Stephane LeBihan, Hossein Asghari, Can Koçkan, Baraa Orabi, Nabil Adra, Roberto Pili, Colin C. Collins, S. Cenk Sahinalp, Faraz Hach

Research output: Contribution to journal › Article

Abstract

MOTIVATION: Cancer is a complex disease that involves rapidly evolving cells, often forming multiple distinct clones. To understand the progression of a patient-specific tumor, one needs to comprehensively sample tumor DNA at multiple time points, ideally through inexpensive and minimally invasive techniques. Current sequencing technologies make the 'liquid biopsy' possible, which involves sampling a patient's blood or urine and sequencing the circulating cell free DNA (cfDNA). A certain percentage of this DNA, known as circulating tumor DNA (ctDNA), originates from the tumor. The fraction of ctDNA in the sample may be extremely low, and the ctDNA may originate from multiple tumors or clones. These factors present unique challenges for applying existing tools and workflows to the analysis of ctDNA, especially in the detection of structural variations, which relies on sufficient read coverage.

RESULTS: Here we introduce SViCT, a structural variation (SV) detection tool designed to handle the challenges associated with cfDNA analysis. SViCT can detect breakpoints and sequences of various structural variations including deletions, insertions, inversions, duplications and translocations. SViCT extracts discordant read pairs, one-end anchors and soft-clipped/split reads, assembles them into contigs, and re-maps contig intervals to a reference genome using an efficient k-mer indexing approach. The intervals are then joined using a combination of graph and greedy algorithms to identify specific structural variant signatures. We assessed the performance of SViCT and compared it to state-of-the-art tools using simulated cfDNA datasets with properties matching those of real cfDNA samples. The positive predictive value and sensitivity of our tool were superior to those of all the tested tools, and reasonable performance was maintained down to the lowest dilution of 0.01% tumor DNA in simulated datasets. Additionally, SViCT detected all known SVs in two real cfDNA reference datasets (at 0.6-5% ctDNA) and predicted a novel structural variant in a prostate cancer cohort.

AVAILABILITY: SViCT is available at https://github.com/vpc-ccg/svict. Contact: faraz.hach@ubc.ca.
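The re-mapping step described in the abstract can be illustrated with a toy example. The following is a minimal sketch of k-mer indexing in general, not SViCT's actual implementation: the function names (`build_kmer_index`, `map_contig`), the value of `k`, and the toy sequences are all assumptions made for illustration. It indexes every k-mer of a reference, then reports maximal runs of collinear k-mer hits for a contig as (contig start, reference start, length) intervals.

```python
from collections import defaultdict


def build_kmer_index(reference: str, k: int) -> dict:
    """Map every k-mer in the reference to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_contig(contig: str, index: dict, k: int) -> list:
    """Return sorted (contig_start, ref_start, length) intervals.

    Consecutive k-mer hits on the same diagonal (ref_pos - contig_pos)
    are merged into a single interval.
    """
    intervals = []
    open_runs = {}  # diagonal -> [contig_start, ref_start, last_contig_pos]
    for cpos in range(len(contig) - k + 1):
        for rpos in index.get(contig[cpos:cpos + k], ()):
            diag = rpos - cpos
            run = open_runs.get(diag)
            if run is not None and run[2] == cpos - 1:
                run[2] = cpos  # extend the current run by one position
            else:
                if run is not None:
                    # Close the older, non-adjacent run on this diagonal.
                    intervals.append((run[0], run[1], run[2] - run[0] + k))
                open_runs[diag] = [cpos, rpos, cpos]
    for run in open_runs.values():
        intervals.append((run[0], run[1], run[2] - run[0] + k))
    return sorted(intervals)


# Hypothetical toy data: the contig lacks reference positions 10..19,
# which mimics the signature of a deletion.
reference = "ACGTACGGTC" + "TAGCTTAGGA" + "GTCGATTGCA"
contig = reference[:10] + reference[20:]
k = 5

intervals = map_contig(contig, build_kmer_index(reference, k), k)
for contig_start, ref_start, length in intervals:
    print(contig_start, ref_start, length)
```

In this toy case, two adjacent contig intervals map to reference positions separated by a gap, which is the kind of pattern a later joining step could interpret as a deletion. A real tool would also need to handle sequencing errors, repeats and reverse-complement matches, none of which are covered here.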

Original language: English (US)
Pages (from-to): e38
Journal: Nucleic acids research
Volume: 47
Issue number: 7
DOIs: https://doi.org/10.1093/nar/gkz067
State: Published - Apr 23 2019

Fingerprint

DNA
Neoplasms
Clone Cells
Workflow
Prostatic Neoplasms
Urine
Genome
Technology
Biopsy
Datasets

ASJC Scopus subject areas

  • Genetics

Cite this

Gawroński, A. R., Lin, Y. Y., McConeghy, B., LeBihan, S., Asghari, H., Koçkan, C., ... Hach, F. (2019). Structural variation and fusion detection using targeted sequencing data from circulating cell free DNA. Nucleic acids research, 47(7), e38. https://doi.org/10.1093/nar/gkz067

@article{0d2666927b924e40bf64bd62456f8e51,
title = "Structural variation and fusion detection using targeted sequencing data from circulating cell free DNA",
author = "Gawroński, {Alexander R.} and Lin, {Yen Yi} and Brian McConeghy and Stephane LeBihan and Hossein Asghari and Can Ko{\c{c}}kan and Baraa Orabi and Nabil Adra and Roberto Pili and Collins, {Colin C.} and Sahinalp, {S. Cenk} and Faraz Hach",
year = "2019",
month = "4",
day = "23",
doi = "10.1093/nar/gkz067",
language = "English (US)",
volume = "47",
pages = "e38",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "7",

}

TY - JOUR

T1 - Structural variation and fusion detection using targeted sequencing data from circulating cell free DNA

AU - Gawroński, Alexander R.

AU - Lin, Yen Yi

AU - McConeghy, Brian

AU - LeBihan, Stephane

AU - Asghari, Hossein

AU - Koçkan, Can

AU - Orabi, Baraa

AU - Adra, Nabil

AU - Pili, Roberto

AU - Collins, Colin C.

AU - Sahinalp, S. Cenk

AU - Hach, Faraz

PY - 2019/4/23

Y1 - 2019/4/23

UR - http://www.scopus.com/inward/record.url?scp=85064976833&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064976833&partnerID=8YFLogxK

U2 - 10.1093/nar/gkz067

DO - 10.1093/nar/gkz067

M3 - Article

C2 - 30759232

AN - SCOPUS:85064976833

VL - 47

SP - e38

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 7

ER -