RNA-Seq mapping and detection of gene fusions with a suffix array algorithm

Onur Sakarya, Heinz Breu, Milan Radovich, Yongzhi Chen, Yulei N. Wang, Catalin Barbacioru, Sowmi Utiramerur, Penn P. Whitley, Joel P. Brockman, Paolo Vatta, Zheng Zhang, Liviu Popescu, Matthew W. Muller, Vidya Kudlingar, Nriti Garg, Chieh Yuan Li, Benjamin S. Kong, John P. Bodeau, Robert C. Nutter, Jian Gu, Kelli S. Bramlett, Jeffrey K. Ichikawa, Fiona C. Hyland, Asim S. Siddiqui

Research output: Contribution to journal › Article

26 Citations (Scopus)

Abstract

High-throughput RNA sequencing enables quantification of transcripts (both known and novel), exon/exon junctions and fusions of exons from different genes. Discovery of gene fusions, particularly those expressed at low abundance, is a challenge with short- and medium-length sequencing reads. To address this challenge, we implemented an RNA-Seq mapping pipeline within the LifeScope software. We introduced new features including filter and junction mapping, annotation-aided pairing rescue and accurate mapping quality values. We combined this pipeline with a Suffix Array Spliced Read (SASR) aligner to detect chimeric transcripts. Performing paired-end RNA-Seq of the breast cancer cell line MCF-7 using the SOLiD system, we called 40 gene fusions among over 120,000 splicing junctions. We validated 36 of these 40 fusions with TaqMan assays, of which 25 were expressed in MCF-7 but not the Human Brain Reference. An intra-chromosomal gene fusion involving the estrogen receptor alpha gene ESR1, and another involving the RPS6KB1 gene (ribosomal protein S6 kinase beta-1), were recurrently expressed in a number of breast tumor cell lines and a clinical tumor sample.
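The abstract does not describe how the SASR aligner works internally, so the following is only a toy sketch of the general idea of using a suffix array to find reads that span two genes. Everything in it is an assumption made for illustration: the gene sequences, the names `build_index`, `find_exact` and `split_read_candidates`, the `min_anchor` parameter, and the restriction to exact matches. It is not the published algorithm and ignores mismatches, color space, pairing and mapping quality.

```python
def build_suffix_array(text):
    """Suffix start positions in lexicographic order (naive construction, fine for a toy)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_exact(text, sa, pattern):
    """All start positions of `pattern` in `text`, found by binary search on the suffix array."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # lower bound: first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    hits = []
    while lo < len(sa) and text[sa[lo]:sa[lo] + m] == pattern:
        hits.append(sa[lo])
        lo += 1
    return sorted(hits)


def build_index(genes):
    """Concatenate gene sequences with '$' separators and index the result."""
    text, starts = "", []
    for name, seq in genes.items():
        starts.append((len(text), name))
        text += seq + "$"  # '$' never occurs in a read, so no match spans two genes
    return text, starts, build_suffix_array(text)


def locate(starts, pos):
    """Map a position in the concatenated text back to (gene name, offset in gene)."""
    for start, name in reversed(starts):
        if pos >= start:
            return name, pos - start


def split_read_candidates(read, index, min_anchor=8):
    """Try each split point; keep splits whose two halves match exactly in different genes.

    Returns tuples (cut, left_gene, left_breakpoint, right_gene, right_breakpoint).
    """
    text, starts, sa = index
    out = []
    for cut in range(min_anchor, len(read) - min_anchor + 1):
        for lp in find_exact(text, sa, read[:cut]):
            for rp in find_exact(text, sa, read[cut:]):
                left_gene, left_off = locate(starts, lp)
                right_gene, right_off = locate(starts, rp)
                if left_gene != right_gene:
                    out.append((cut, left_gene, left_off + cut, right_gene, right_off))
    return out


# Hypothetical sequences, invented for this example only.
genes = {
    "geneA": "ACGTACGGTCCATGAGCTTAGC",
    "geneB": "TTGACCGATAGGCTAACGTTCA",
}
index = build_index(genes)
chimeric_read = genes["geneA"][2:12] + genes["geneB"][11:21]
print(split_read_candidates(chimeric_read, index))
# prints [(10, 'geneA', 12, 'geneB', 11)]
```

A read taken entirely from one gene yields an empty list, because both halves then map to the same gene. A real aligner would also have to tolerate sequencing errors and resolve ambiguous breakpoints, which this sketch does not attempt.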

Original language: English
Article number: e1002464
Journal: PLoS Computational Biology
Volume: 8
Issue number: 4
DOI: https://doi.org/10.1371/journal.pcbi.1002464
State: Published - Apr 2012

Fingerprint

Suffix Array
gene fusion
Gene Fusion
RNA
exons
Exons
Fusion
Fusion reactions
Genes
Gene
breast neoplasms
gene
cell lines
Breast Neoplasms
High-Throughput Nucleotide Sequencing
Estrogen Receptor alpha
tumor
ribosomal proteins
Sequencing
Tumors

ASJC Scopus subject areas

  • Cellular and Molecular Neuroscience
  • Ecology
  • Molecular Biology
  • Genetics
  • Ecology, Evolution, Behavior and Systematics
  • Modeling and Simulation
  • Computational Theory and Mathematics

Cite this

Sakarya, O., Breu, H., Radovich, M., Chen, Y., Wang, Y. N., Barbacioru, C., ... Siddiqui, A. S. (2012). RNA-Seq mapping and detection of gene fusions with a suffix array algorithm. PLoS Computational Biology, 8(4), [e1002464]. https://doi.org/10.1371/journal.pcbi.1002464

@article{d98f77d1a99e4e3c890249803527cfdd,
title = "RNA-Seq mapping and detection of gene fusions with a suffix array algorithm",
abstract = "High-throughput RNA sequencing enables quantification of transcripts (both known and novel), exon/exon junctions and fusions of exons from different genes. Discovery of gene fusions, particularly those expressed at low abundance, is a challenge with short- and medium-length sequencing reads. To address this challenge, we implemented an RNA-Seq mapping pipeline within the LifeScope software. We introduced new features including filter and junction mapping, annotation-aided pairing rescue and accurate mapping quality values. We combined this pipeline with a Suffix Array Spliced Read (SASR) aligner to detect chimeric transcripts. Performing paired-end RNA-Seq of the breast cancer cell line MCF-7 using the SOLiD system, we called 40 gene fusions among over 120,000 splicing junctions. We validated 36 of these 40 fusions with TaqMan assays, of which 25 were expressed in MCF-7 but not the Human Brain Reference. An intra-chromosomal gene fusion involving the estrogen receptor alpha gene ESR1, and another involving the RPS6KB1 gene (ribosomal protein S6 kinase beta-1), were recurrently expressed in a number of breast tumor cell lines and a clinical tumor sample.",
author = "Onur Sakarya and Heinz Breu and Milan Radovich and Yongzhi Chen and Wang, {Yulei N.} and Catalin Barbacioru and Sowmi Utiramerur and Whitley, {Penn P.} and Brockman, {Joel P.} and Paolo Vatta and Zheng Zhang and Liviu Popescu and Muller, {Matthew W.} and Vidya Kudlingar and Nriti Garg and Li, {Chieh Yuan} and Kong, {Benjamin S.} and Bodeau, {John P.} and Nutter, {Robert C.} and Jian Gu and Bramlett, {Kelli S.} and Ichikawa, {Jeffrey K.} and Hyland, {Fiona C.} and Siddiqui, {Asim S.}",
year = "2012",
month = "4",
doi = "10.1371/journal.pcbi.1002464",
language = "English",
volume = "8",
journal = "PLoS Computational Biology",
issn = "1553-734X",
publisher = "Public Library of Science",
number = "4",
}

TY  - JOUR
T1  - RNA-Seq mapping and detection of gene fusions with a suffix array algorithm
AU  - Sakarya, Onur
AU  - Breu, Heinz
AU  - Radovich, Milan
AU  - Chen, Yongzhi
AU  - Wang, Yulei N.
AU  - Barbacioru, Catalin
AU  - Utiramerur, Sowmi
AU  - Whitley, Penn P.
AU  - Brockman, Joel P.
AU  - Vatta, Paolo
AU  - Zhang, Zheng
AU  - Popescu, Liviu
AU  - Muller, Matthew W.
AU  - Kudlingar, Vidya
AU  - Garg, Nriti
AU  - Li, Chieh Yuan
AU  - Kong, Benjamin S.
AU  - Bodeau, John P.
AU  - Nutter, Robert C.
AU  - Gu, Jian
AU  - Bramlett, Kelli S.
AU  - Ichikawa, Jeffrey K.
AU  - Hyland, Fiona C.
AU  - Siddiqui, Asim S.
PY  - 2012/4
Y1  - 2012/4
AB  - High-throughput RNA sequencing enables quantification of transcripts (both known and novel), exon/exon junctions and fusions of exons from different genes. Discovery of gene fusions, particularly those expressed at low abundance, is a challenge with short- and medium-length sequencing reads. To address this challenge, we implemented an RNA-Seq mapping pipeline within the LifeScope software. We introduced new features including filter and junction mapping, annotation-aided pairing rescue and accurate mapping quality values. We combined this pipeline with a Suffix Array Spliced Read (SASR) aligner to detect chimeric transcripts. Performing paired-end RNA-Seq of the breast cancer cell line MCF-7 using the SOLiD system, we called 40 gene fusions among over 120,000 splicing junctions. We validated 36 of these 40 fusions with TaqMan assays, of which 25 were expressed in MCF-7 but not the Human Brain Reference. An intra-chromosomal gene fusion involving the estrogen receptor alpha gene ESR1, and another involving the RPS6KB1 gene (ribosomal protein S6 kinase beta-1), were recurrently expressed in a number of breast tumor cell lines and a clinical tumor sample.
UR  - http://www.scopus.com/inward/record.url?scp=84861141246&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84861141246&partnerID=8YFLogxK
U2  - 10.1371/journal.pcbi.1002464
DO  - 10.1371/journal.pcbi.1002464
M3  - Article
VL  - 8
JO  - PLoS Computational Biology
JF  - PLoS Computational Biology
SN  - 1553-734X
IS  - 4
M1  - e1002464
ER  - 