Improving alignment accuracy on homopolymer regions for semiconductor-based sequencing technologies

Weixing Feng, Sen Zhao, Dingkai Xue, Fengfei Song, Ziwei Li, Duojiao Chen, Bo He, Yangyang Hao, Yadong Wang, Yunlong Liu

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Background: Ion Torrent and Ion Proton are semiconductor-based sequencing technologies that feature rapid sequencing speed and low upfront and operating costs, thanks to the avoidance of modified nucleotides and optical measurements. Despite these advantages, Ion semiconductor sequencing technologies suffer from much reduced sequencing accuracy at genomic loci with homopolymer repeats of the same nucleotide. This limitation significantly reduces their efficiency for biological applications that aim to accurately identify genetic variants. Results: In this study, we propose a Bayesian inference-based method that takes advantage of the signal distributions of the electrical voltages measured for all homopolymers of a fixed length. By cross-referencing the length of homopolymers in the reference genome with the voltage signal distribution derived from the experiment, the proposed integrated model significantly improves alignment accuracy around homopolymer regions. Conclusions: Beyond improving alignment accuracy on homopolymer regions for semiconductor-based sequencing technologies, similar strategies can also be applied to other high-throughput sequencing technologies that share similar limitations.
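
The idea in the Results paragraph can be illustrated with a minimal Python sketch. It is not the authors' model: the abstract does not give the form of the signal distributions or the prior, so the sketch assumes a Gaussian signal distribution for each homopolymer length (centred on the length, with a spread that grows with length) and a prior that places most of its mass on the length found in the reference genome. The function names and the parameters `sd_base`, `sd_slope`, and `p_match` are hypothetical and the values are illustrative only.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def length_posterior(signal, ref_length, max_length=8,
                     sd_base=0.1, sd_slope=0.05, p_match=0.9):
    """Posterior over homopolymer length given one normalized signal value.

    Assumptions (not taken from the paper):
      - signal for a length-n homopolymer ~ Normal(n, sd_base + sd_slope * n)
      - prior puts p_match on the reference length and spreads the
        remainder uniformly over the other candidate lengths
    """
    if not 1 <= ref_length <= max_length:
        raise ValueError("ref_length must lie in 1..max_length")
    lengths = range(1, max_length + 1)
    p_other = (1.0 - p_match) / (max_length - 1)

    unnormalized = {}
    for n in lengths:
        prior = p_match if n == ref_length else p_other
        likelihood = gaussian_pdf(signal, float(n), sd_base + sd_slope * n)
        unnormalized[n] = prior * likelihood

    total = sum(unnormalized.values())
    return {n: value / total for n, value in unnormalized.items()}


def most_probable_length(posterior):
    """Return the length with the highest posterior probability."""
    return max(posterior, key=posterior.get)


# A signal of 4.4 is closer to 4 than to 5, but the reference says 5.
with_reference = length_posterior(4.4, ref_length=5)
flat_prior = length_posterior(4.4, ref_length=5, p_match=1.0 / 8)
print(most_probable_length(with_reference), most_probable_length(flat_prior))
```

Under these assumed parameters, the signal alone favours a length of 4, while combining it with the reference length shifts the call to 5. This is the kind of cross-referencing the abstract describes, shown here only in toy form.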

Original language: English (US)
Article number: 521
Journal: BMC Genomics
Volume: 17
DOI: 10.1186/s12864-016-2894-9
State: Published - Aug 22 2016

Keywords

  • Alignment
  • Bayesian
  • Homopolymer
  • Ion Torrent/Proton

ASJC Scopus subject areas

  • Biotechnology
  • Genetics

Cite this

Improving alignment accuracy on homopolymer regions for semiconductor-based sequencing technologies. / Feng, Weixing; Zhao, Sen; Xue, Dingkai; Song, Fengfei; Li, Ziwei; Chen, Duojiao; He, Bo; Hao, Yangyang; Wang, Yadong; Liu, Yunlong.

In: BMC Genomics, Vol. 17, 521, 22.08.2016.

Feng, Weixing ; Zhao, Sen ; Xue, Dingkai ; Song, Fengfei ; Li, Ziwei ; Chen, Duojiao ; He, Bo ; Hao, Yangyang ; Wang, Yadong ; Liu, Yunlong. / Improving alignment accuracy on homopolymer regions for semiconductor-based sequencing technologies. In: BMC Genomics. 2016 ; Vol. 17.
@article{d81dfccae9ed4594bf228f270ee6896d,
title = "Improving alignment accuracy on homopolymer regions for semiconductor-based sequencing technologies",
keywords = "Alignment, Bayesian, Homopolymer, Ion Torrent/Proton",
author = "Weixing Feng and Sen Zhao and Dingkai Xue and Fengfei Song and Ziwei Li and Duojiao Chen and Bo He and Yangyang Hao and Yadong Wang and Yunlong Liu",
year = "2016",
month = "8",
day = "22",
doi = "10.1186/s12864-016-2894-9",
language = "English (US)",
volume = "17",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Improving alignment accuracy on homopolymer regions for semiconductor-based sequencing technologies

AU - Feng, Weixing

AU - Zhao, Sen

AU - Xue, Dingkai

AU - Song, Fengfei

AU - Li, Ziwei

AU - Chen, Duojiao

AU - He, Bo

AU - Hao, Yangyang

AU - Wang, Yadong

AU - Liu, Yunlong

PY - 2016/8/22

Y1 - 2016/8/22

KW - Alignment

KW - Bayesian

KW - Homopolymer

KW - Ion Torrent/Proton

UR - http://www.scopus.com/inward/record.url?scp=84983292126&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84983292126&partnerID=8YFLogxK

U2 - 10.1186/s12864-016-2894-9

DO - 10.1186/s12864-016-2894-9

M3 - Article

C2 - 27556417

AN - SCOPUS:84983292126

VL - 17

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

M1 - 521

ER -