Copy number variation accuracy in genome-wide association studies

Peng Lin, Sarah M. Hartz, Jen Chyong Wang, Robert F. Krueger, Tatiana Foroud, Howard Edenberg, John Nurnberger, Andrew I. Brooks, Jay A. Tischfield, Laura Almasy, Bradley T. Webb, Victor M. Hesselbrock, Bernice Porjesz, Alison M. Goate, Laura J. Bierut, John P. Rice

Research output: Contribution to journal › Article

7 Citations (Scopus)

Abstract

Background/Aim: Copy number variations (CNVs) are a major source of genetic variation among individuals and a potential risk factor in many diseases. Numerous diseases have been linked to deletions and duplications of chromosomal segments. Data from genome-wide association studies and other microarrays can be used to identify CNVs with several different computer programs, but the reliability of the results has been questioned.

Methods: To help researchers reduce the number of false-positive CNVs that need to be followed up with laboratory testing, we evaluated the relative performance of CNVPartition, PennCNV and QuantiSNP. We also developed a statistical method for estimating the sensitivity and positive predictive value of CNV calls and tested it on 96 duplicate samples in our dataset.

Results: We found that the positive predictive value increases with the number of probes in the CNV and with the size of the CNV, and is highest for CNVs of at least 500 kb and at least 100 probes. Our analysis also indicates that restricting attention to CNVs reported by multiple programs can greatly improve the reproducibility rate and the positive predictive value.

Conclusion: Our methods can be used by investigators to identify CNVs in genome-wide data with greater reliability.
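The filtering ideas in the abstract can be illustrated with a small Python sketch. It is not the authors' statistical method, which the abstract does not spell out. It only shows three simpler steps under stated assumptions: keeping calls of at least 500 kb and at least 100 probes, keeping calls that more than one program reports, and measuring how often a call reappears in a duplicate sample. The `CNV` record, all function names, and the 50% reciprocal-overlap rule for deciding that two calls match are hypothetical choices made for this example, not details taken from the paper or from any of the three programs.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CNV:
    """One CNV call (hypothetical record layout for this sketch)."""
    sample: str
    chrom: str
    start: int     # first base of the call, inclusive
    end: int       # last base of the call, inclusive
    kind: str      # "del" or "dup"
    n_probes: int  # number of array probes inside the call

    @property
    def size(self) -> int:
        return self.end - self.start + 1


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Smaller of the two overlap fractions; 0.0 if the calls cannot match."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    return min(shared / a.size, shared / b.size)


def passes_filter(c: CNV, min_size: int = 500_000, min_probes: int = 100) -> bool:
    """Thresholds from the abstract: at least 500 kb and at least 100 probes."""
    return c.size >= min_size and c.n_probes >= min_probes


def supported_by(call: CNV, others: Iterable[CNV], min_overlap: float = 0.5) -> bool:
    """True if any call in `others` overlaps `call` enough (assumed 50% rule)."""
    return any(reciprocal_overlap(call, o) >= min_overlap for o in others)


def consensus(primary: List[CNV], *other_programs: List[CNV],
              min_overlap: float = 0.5) -> List[CNV]:
    """Calls from `primary` matched, in the same sample, by every other program."""
    kept = []
    for c in primary:
        if all(
            supported_by(c, [o for o in prog if o.sample == c.sample], min_overlap)
            for prog in other_programs
        ):
            kept.append(c)
    return kept


def reproducibility_rate(run_a: List[CNV], run_b: List[CNV],
                         min_overlap: float = 0.5) -> float:
    """Fraction of calls in `run_a` that reappear in its duplicate `run_b`."""
    if not run_a:
        return float("nan")
    hits = sum(supported_by(c, run_b, min_overlap) for c in run_a)
    return hits / len(run_a)


# Made-up calls for one sample, a second program, and a duplicate array.
penn = [CNV("s1", "1", 1_000_000, 1_600_000, "del", 150),
        CNV("s1", "2", 5_000_000, 5_020_000, "dup", 8)]
quanti = [CNV("s1", "1", 1_050_000, 1_620_000, "del", 140)]
dup_run = [CNV("s1_dup", "1", 990_000, 1_590_000, "del", 148)]

agreed = consensus(penn, quanti)            # only the large deletion survives
large = [c for c in penn if passes_filter(c)]
rate = reproducibility_rate(penn, dup_run)  # 1 of 2 calls replicates
print(len(agreed), len(large), rate)
```

In this toy data the small 8-probe duplication is dropped by every step, while the large deletion is retained, which mirrors the direction of the reported result without reproducing any of its numbers.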

Original language: English
Pages (from-to): 141-147
Number of pages: 7
Journal: Human Heredity
Volume: 71
Issue number: 3
DOI: https://doi.org/10.1159/000324683
State: Published - Jul 2011

Fingerprint

Genome-Wide Association Study
Research Personnel
Chromosome Duplication
Reproducibility of Results
Software
Genome

Keywords

  • Accuracy
  • Copy number variations
  • False positives
  • Genome-wide association studies

ASJC Scopus subject areas

  • Genetics (clinical)
  • Genetics

Cite this

Lin, P., Hartz, S. M., Wang, J. C., Krueger, R. F., Foroud, T., Edenberg, H., ... Rice, J. P. (2011). Copy number variation accuracy in genome-wide association studies. Human Heredity, 71(3), 141-147. https://doi.org/10.1159/000324683

@article{0da49d5f27494bf0be227c2919829fc2,
title = "Copy number variation accuracy in genome-wide association studies",
keywords = "Accuracy, Copy number variations, False positives, Genome-wide association studies",
author = "Peng Lin and Hartz, {Sarah M.} and Wang, {Jen Chyong} and Krueger, {Robert F.} and Tatiana Foroud and Howard Edenberg and John Nurnberger and Brooks, {Andrew I.} and Tischfield, {Jay A.} and Laura Almasy and Webb, {Bradley T.} and Hesselbrock, {Victor M.} and Bernice Porjesz and Goate, {Alison M.} and Bierut, {Laura J.} and Rice, {John P.}",
year = "2011",
month = "7",
doi = "10.1159/000324683",
language = "English",
volume = "71",
pages = "141--147",
journal = "Human Heredity",
issn = "0001-5652",
publisher = "S. Karger AG",
number = "3",

}

TY - JOUR
T1 - Copy number variation accuracy in genome-wide association studies
AU - Lin, Peng
AU - Hartz, Sarah M.
AU - Wang, Jen Chyong
AU - Krueger, Robert F.
AU - Foroud, Tatiana
AU - Edenberg, Howard
AU - Nurnberger, John
AU - Brooks, Andrew I.
AU - Tischfield, Jay A.
AU - Almasy, Laura
AU - Webb, Bradley T.
AU - Hesselbrock, Victor M.
AU - Porjesz, Bernice
AU - Goate, Alison M.
AU - Bierut, Laura J.
AU - Rice, John P.
PY - 2011/7
Y1 - 2011/7

KW - Accuracy
KW - Copy number variations
KW - False positives
KW - Genome-wide association studies
UR - http://www.scopus.com/inward/record.url?scp=79960751640&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79960751640&partnerID=8YFLogxK
U2 - 10.1159/000324683
DO - 10.1159/000324683
M3 - Article
C2 - 21778733
AN - SCOPUS:79960751640
VL - 71
SP - 141
EP - 147
JO - Human Heredity
JF - Human Heredity
SN - 0001-5652
IS - 3
ER -