Simultaneous inferences based on empirical Bayes methods and false discovery rates in eQTL data analysis

Arindom Chakraborty, Guanglong Jiang, Malaz Boustani, Yunlong Liu, Todd Skaar, Lang Li

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Background: Genome-wide association studies (GWAS) have identified hundreds of genetic variants associated with complex human diseases, clinical conditions and traits. Genetic mapping of expression quantitative trait loci (eQTLs) is revealing novel functional effects of thousands of single nucleotide polymorphisms (SNPs). In a classical quantitative trait loci (QTL) mapping problem, multiple tests are done to assess whether one trait is associated with a number of loci. In contrast to QTL studies, thousands of traits are measured along with thousands of gene expressions in an eQTL study. Such a study requires a huge number of tests (~10^6). This extreme multiplicity gives rise to many computational and statistical problems. In this paper we try to address these issues using two closely related inferential approaches: an empirical Bayes method, which has a Bayesian flavor without requiring much a priori knowledge, and the frequentist method of false discovery rates. A three-component t-mixture model has been used for the parametric empirical Bayes (PEB) method. Inferences have been obtained using the Expectation/Conditional Maximization Either (ECME) algorithm. A simulation study has also been performed, and the PEB method has been compared with a nonparametric empirical Bayes (NPEB) alternative.

Results: The results show that PEB has an edge over NPEB. The proposed methodology has been applied to human liver cohort (LHC) data. Our method discovers more significant SNPs with FDR<10% than the previous study by Yang et al. (Genome Research, 2010).

Conclusions: In contrast to previously available methods based on p-values, the empirical Bayes method uses the local false discovery rate (lfdr) as the threshold. This method controls the false positive rate.
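As a rough illustration of the lfdr threshold mentioned in the abstract, the sketch below computes a local false discovery rate from a three-component t-mixture. It is not the authors' implementation. It assumes the three components are a central null, a negative-effect and a positive-effect component, and that the mixture parameters have already been fitted (for example by ECME). The function names and all numeric values are hypothetical.

```python
import math


def t_pdf(x, mu, sigma, nu):
    """Density of a location-scale Student t distribution with nu degrees of freedom."""
    z = (x - mu) / sigma
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi) - math.log(sigma))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(z * z / nu))


def local_fdr(x, weights, components, null_index=0):
    """Posterior probability that statistic x was drawn from the null component."""
    dens = [w * t_pdf(x, *c) for w, c in zip(weights, components)]
    return dens[null_index] / sum(dens)


# Hypothetical fitted parameters: mixing weights and (mu, sigma, nu) per component.
# Component 0 is assumed to be the null; 1 and 2 are the two alternatives.
weights = [0.90, 0.05, 0.05]
components = [(0.0, 1.0, 10.0), (-4.0, 1.0, 10.0), (4.0, 1.0, 10.0)]

# Hypothetical test statistics, one per SNP-expression pair.
stats = [0.2, -1.5, 3.8, 5.1]
lfdr = [local_fdr(s, weights, components) for s in stats]

# Call a test a discovery when its lfdr falls below the chosen threshold.
discoveries = [s for s, value in zip(stats, lfdr) if value < 0.10]
```

The threshold is applied to each test's own posterior null probability rather than to a p-value, which is the contrast the Conclusions draw.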

Original language: English
Article number: S8
Journal: BMC Genomics
Volume: 14
Issue number: SUPP 8
DOI: 10.1186/1471-2164-14-S8-S8
State: Published - Dec 9 2013

Fingerprint

Quantitative Trait Loci
Single Nucleotide Polymorphism
Bayes Theorem
Genome-Wide Association Study
Genome
Gene Expression
Liver
Research

ASJC Scopus subject areas

  • Biotechnology
  • Genetics

Cite this

Simultaneous inferences based on empirical Bayes methods and false discovery rates in eQTL data analysis. / Chakraborty, Arindom; Jiang, Guanglong; Boustani, Malaz; Liu, Yunlong; Skaar, Todd; Li, Lang.

In: BMC Genomics, Vol. 14, No. SUPP 8, S8, 09.12.2013.

@article{de21523f684643efb6dd2fc21db4b9a2,
title = "Simultaneous inferences based on empirical Bayes methods and false discovery rates in eQTL data analysis",
abstract = "Background: Genome-wide association studies (GWAS) have identified hundreds of genetic variants associated with complex human diseases, clinical conditions and traits. Genetic mapping of expression quantitative trait loci (eQTLs) is revealing novel functional effects of thousands of single nucleotide polymorphisms (SNPs). In a classical quantitative trait loci (QTL) mapping problem, multiple tests are done to assess whether one trait is associated with a number of loci. In contrast to QTL studies, thousands of traits are measured along with thousands of gene expressions in an eQTL study. Such a study requires a huge number of tests ($\sim 10^{6}$). This extreme multiplicity gives rise to many computational and statistical problems. In this paper we try to address these issues using two closely related inferential approaches: an empirical Bayes method, which has a Bayesian flavor without requiring much a priori knowledge, and the frequentist method of false discovery rates. A three-component t-mixture model has been used for the parametric empirical Bayes (PEB) method. Inferences have been obtained using the Expectation/Conditional Maximization Either (ECME) algorithm. A simulation study has also been performed, and the PEB method has been compared with a nonparametric empirical Bayes (NPEB) alternative. Results: The results show that PEB has an edge over NPEB. The proposed methodology has been applied to human liver cohort (LHC) data. Our method discovers more significant SNPs with FDR<10{\%} than the previous study by Yang et al. (Genome Research, 2010). Conclusions: In contrast to previously available methods based on p-values, the empirical Bayes method uses the local false discovery rate (lfdr) as the threshold. This method controls the false positive rate.",
author = "Arindom Chakraborty and Guanglong Jiang and Malaz Boustani and Yunlong Liu and Todd Skaar and Lang Li",
year = "2013",
month = "12",
day = "9",
doi = "10.1186/1471-2164-14-S8-S8",
language = "English",
volume = "14",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",
number = "SUPP 8",

}

TY - JOUR
T1 - Simultaneous inferences based on empirical Bayes methods and false discovery rates in eQTL data analysis
AU - Chakraborty, Arindom
AU - Jiang, Guanglong
AU - Boustani, Malaz
AU - Liu, Yunlong
AU - Skaar, Todd
AU - Li, Lang
PY - 2013/12/9
Y1 - 2013/12/9
N2 - Background: Genome-wide association studies (GWAS) have identified hundreds of genetic variants associated with complex human diseases, clinical conditions and traits. Genetic mapping of expression quantitative trait loci (eQTLs) is revealing novel functional effects of thousands of single nucleotide polymorphisms (SNPs). In a classical quantitative trait loci (QTL) mapping problem, multiple tests are done to assess whether one trait is associated with a number of loci. In contrast to QTL studies, thousands of traits are measured along with thousands of gene expressions in an eQTL study. Such a study requires a huge number of tests (~10^6). This extreme multiplicity gives rise to many computational and statistical problems. In this paper we try to address these issues using two closely related inferential approaches: an empirical Bayes method, which has a Bayesian flavor without requiring much a priori knowledge, and the frequentist method of false discovery rates. A three-component t-mixture model has been used for the parametric empirical Bayes (PEB) method. Inferences have been obtained using the Expectation/Conditional Maximization Either (ECME) algorithm. A simulation study has also been performed, and the PEB method has been compared with a nonparametric empirical Bayes (NPEB) alternative. Results: The results show that PEB has an edge over NPEB. The proposed methodology has been applied to human liver cohort (LHC) data. Our method discovers more significant SNPs with FDR<10% than the previous study by Yang et al. (Genome Research, 2010). Conclusions: In contrast to previously available methods based on p-values, the empirical Bayes method uses the local false discovery rate (lfdr) as the threshold. This method controls the false positive rate.
AB - Background: Genome-wide association studies (GWAS) have identified hundreds of genetic variants associated with complex human diseases, clinical conditions and traits. Genetic mapping of expression quantitative trait loci (eQTLs) is revealing novel functional effects of thousands of single nucleotide polymorphisms (SNPs). In a classical quantitative trait loci (QTL) mapping problem, multiple tests are done to assess whether one trait is associated with a number of loci. In contrast to QTL studies, thousands of traits are measured along with thousands of gene expressions in an eQTL study. Such a study requires a huge number of tests (~10^6). This extreme multiplicity gives rise to many computational and statistical problems. In this paper we try to address these issues using two closely related inferential approaches: an empirical Bayes method, which has a Bayesian flavor without requiring much a priori knowledge, and the frequentist method of false discovery rates. A three-component t-mixture model has been used for the parametric empirical Bayes (PEB) method. Inferences have been obtained using the Expectation/Conditional Maximization Either (ECME) algorithm. A simulation study has also been performed, and the PEB method has been compared with a nonparametric empirical Bayes (NPEB) alternative. Results: The results show that PEB has an edge over NPEB. The proposed methodology has been applied to human liver cohort (LHC) data. Our method discovers more significant SNPs with FDR<10% than the previous study by Yang et al. (Genome Research, 2010). Conclusions: In contrast to previously available methods based on p-values, the empirical Bayes method uses the local false discovery rate (lfdr) as the threshold. This method controls the false positive rate.
UR - http://www.scopus.com/inward/record.url?scp=84889673757&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84889673757&partnerID=8YFLogxK
U2 - 10.1186/1471-2164-14-S8-S8
DO - 10.1186/1471-2164-14-S8-S8
M3 - Article
VL - 14
JO - BMC Genomics
JF - BMC Genomics
SN - 1471-2164
IS - SUPP 8
M1 - S8
ER -