Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes

Ibrahim Numanagić, Salem Malikić, Michael Ford, Xiang Qin, Lorraine Toji, Milan Radovich, Todd Skaar, Victoria M. Pratt, Bonnie Berger, Steve Scherer, S. Cenk Sahinalp

Research output: Contribution to journal › Article

Abstract

High-throughput sequencing provides the means to determine the allelic decomposition of any gene of interest, that is, the number of copies of the gene and the exact sequence content of each copy. Although many clinically and functionally important genes are highly polymorphic and have undergone structural alterations, no high-throughput sequencing data analysis tool has yet been designed to effectively solve the full allelic decomposition problem. Here we introduce a combinatorial optimization framework that successfully resolves this challenging problem, including for genes with structural alterations. We provide an associated computational tool, Aldy, that performs allelic decomposition of highly polymorphic, multi-copy genes using whole-genome or targeted sequencing data. For a large, diverse sequencing data set, Aldy identifies multiple rare and novel alleles for several important pharmacogenes, significantly improving upon the accuracy and utility of current genotyping assays. As more data sets become available, we expect Aldy to become an essential component of genotyping toolkits.
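To make the notion of allelic decomposition as a combinatorial optimization problem concrete, the following is a minimal toy sketch in Python. It is not Aldy's algorithm or API. The allele table, the variant names, the assumption that per-variant copy counts have already been estimated from read depth, and the brute-force search are all hypothetical choices made for illustration only.

```python
from itertools import combinations_with_replacement

# Hypothetical toy database: allele name -> set of variants it carries.
# Real pharmacogene allele tables are far larger; these names are invented.
ALLELES = {
    "*1": frozenset(),
    "*2": frozenset({"v1"}),
    "*3": frozenset({"v1", "v2"}),
    "*4": frozenset({"v3"}),
}


def decompose(observed, copy_number, alleles=ALLELES):
    """Brute-force search for the multiset of alleles that best explains
    the observed per-variant copy counts.

    observed    -- dict mapping variant name to its estimated copy count
                   (assumed to be derived from read depth beforehand)
    copy_number -- total number of gene copies (assumed known here)

    Returns (error, combo), where error is the summed absolute difference
    between expected and observed variant counts.
    """
    variants = set(observed) | {v for vs in alleles.values() for v in vs}
    best = None
    for combo in combinations_with_replacement(sorted(alleles), copy_number):
        error = 0
        for v in variants:
            # Number of selected gene copies that carry variant v.
            expected = sum(v in alleles[a] for a in combo)
            error += abs(expected - observed.get(v, 0))
        if best is None or error < best[0]:
            best = (error, combo)
    return best


if __name__ == "__main__":
    # Three gene copies; v1 seen on two copies and v2 on one.
    print(decompose({"v1": 2, "v2": 1}, copy_number=3))
```

Exhaustive enumeration is only feasible for a toy table like this one. It is used here to show the shape of the problem (choose a multiset of known alleles that minimizes disagreement with the data), not to suggest how a practical tool solves it.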

Original language: English (US)
Article number: 828
Journal: Nature Communications
Volume: 9
Issue number: 1
DOI: https://doi.org/10.1038/s41467-018-03273-1
State: Published - Dec 1 2018

Fingerprint

Sequencing
Genes
Decomposition
Gene Dosage
Throughput
Genome
Alleles
Combinatorial optimization
Assays
Optimization
Datasets

ASJC Scopus subject areas

  • Chemistry(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Physics and Astronomy(all)

Cite this

Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes. / Numanagić, Ibrahim; Malikić, Salem; Ford, Michael; Qin, Xiang; Toji, Lorraine; Radovich, Milan; Skaar, Todd; Pratt, Victoria M.; Berger, Bonnie; Scherer, Steve; Sahinalp, S. Cenk.

In: Nature Communications, Vol. 9, No. 1, 828, 01.12.2018.

Numanagić, I, Malikić, S, Ford, M, Qin, X, Toji, L, Radovich, M, Skaar, T, Pratt, VM, Berger, B, Scherer, S & Sahinalp, SC 2018, 'Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes', Nature Communications, vol. 9, no. 1, 828. https://doi.org/10.1038/s41467-018-03273-1
@article{f5259f39fa47466189a4321b40584882,
title = "Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes",
author = "Ibrahim Numanagić and Salem Malikić and Michael Ford and Xiang Qin and Lorraine Toji and Milan Radovich and Todd Skaar and Pratt, {Victoria M.} and Bonnie Berger and Steve Scherer and Sahinalp, {S. Cenk}",
year = "2018",
month = "12",
day = "1",
doi = "10.1038/s41467-018-03273-1",
language = "English (US)",
volume = "9",
journal = "Nature Communications",
issn = "2041-1723",
publisher = "Nature Publishing Group",
number = "1",

}
