A mixture model approach in gene-gene and gene-environmental interactions for binary phenotypes

Lang Li, Menggang Yu, Robarge D. Jason, Changyu Shen, Faouzi Azzouz, Howard L. McLeod, Silvana Borges-Gonzales, Anne Nguyen, Todd Skaar, Zeruesenay Desta, Christopher J. Sweeney, David A. Flockhart

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

In translational research, a genetic association study of a binary outcome has a twofold aim: to test whether genetic/environmental variables or their combinations are associated with a clinical phenotype, and to determine how those combinations are grouped to predict the phenotype (i.e., which combinations have a similarly distributed phenotype, and which have differently distributed phenotypes). The second part of this aim has high clinical appeal because it can directly facilitate clinical decisions. Although traditional logistic regression can detect gene-gene or gene-environmental interaction effects on binary phenotypes, it cannot decisively determine how genotype combinations are grouped to predict the phenotype. Our proposed mixture model approach is valuable in this context. It concurrently detects main and interaction effects of genetic and environmental variables through a likelihood ratio test (LRT) and conducts phenotype cluster analysis based on genetic and environmental variable combinations. The theoretical distribution of the proposed mixture model's likelihood ratio test is robust not only to small sample sizes but also to unequal sample sizes across genotype and environmental subgroups. Hypothesis testing through a likelihood ratio test results in a fast algorithm for p-value calculations. Extensive simulation studies demonstrate that the mixture model, the overall test in logistic regression, and Monte Carlo-based logic regression consistently have the best power to detect multi-way gene/environmental combinations. The mixture model approach has the highest probability of recovering the true partition in the simulation studies. Its applications are exemplified in interim data analyses for two cancer studies.
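As a rough illustration of the likelihood ratio idea mentioned in the abstract, the sketch below tests whether a binary phenotype has the same rate across genotype/environment combinations. It is not the authors' mixture model: it compares a single pooled rate against a separate rate per combination, and it uses the standard asymptotic chi-square reference distribution rather than the theoretical distribution derived in the paper. The function names, the combination labels, and the counts are all hypothetical.

```python
import math

def binom_loglik(cases, n, p):
    """Bernoulli log-likelihood of `cases` events among `n` subjects at rate p."""
    ll = 0.0
    if cases > 0:
        ll += cases * math.log(p)
    if n - cases > 0:
        ll += (n - cases) * math.log(1.0 - p)
    return ll

def chi2_sf(x, df):
    """Upper tail of a chi-square distribution with integer df >= 1.

    Uses the recurrence Q(a+1, h) = Q(a, h) + h^a e^(-h) / Gamma(a+1)
    with h = x/2, starting from the closed forms for df = 1 and df = 2.
    """
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(h)), 0.5   # df = 1
    else:
        q, a = math.exp(-h), 1.0              # df = 2
    while 2 * a < df:
        q += h ** a * math.exp(-h) / math.gamma(a + 1)
        a += 1
    return q

def lrt_homogeneity(counts):
    """LRT of one common phenotype rate vs. one rate per combination.

    counts maps a combination label to (cases, n); every n is assumed > 0.
    Returns (statistic, degrees of freedom, asymptotic p-value).
    """
    groups = list(counts.values())
    pooled = sum(c for c, _ in groups) / sum(n for _, n in groups)
    ll_null = sum(binom_loglik(c, n, pooled) for c, n in groups)
    ll_alt = sum(binom_loglik(c, n, c / n) for c, n in groups)
    stat = 2.0 * (ll_alt - ll_null)
    df = len(groups) - 1
    return stat, df, chi2_sf(stat, df)

if __name__ == "__main__":
    # Hypothetical counts: (subjects with the phenotype, subjects in the group).
    example = {
        "AA / unexposed": (3, 25),
        "AA / exposed": (4, 20),
        "Aa / unexposed": (5, 30),
        "Aa / exposed": (14, 22),
    }
    stat, df, p = lrt_homogeneity(example)
    print(f"LRT statistic = {stat:.3f}, df = {df}, p = {p:.4g}")
```

A small p-value here only says that the combinations do not all share one rate; deciding which combinations belong together, the clustering step the abstract emphasizes, is exactly what this simple test does not do.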

Original language: English
Pages (from-to): 1150-1177
Number of pages: 28
Journal: Journal of Biopharmaceutical Statistics
Volume: 18
Issue number: 6
DOI: https://doi.org/10.1080/10543400802369038
State: Published - Nov 2008

Keywords

  • Gene-gene interaction
  • Mixture model
  • Pharmacogenetics

ASJC Scopus subject areas

  • Pharmacology (medical)
  • Pharmacology
  • Statistics and Probability

Cite this

Li, L., Yu, M., Jason, R. D., Shen, C., Azzouz, F., McLeod, H. L., ... Flockhart, D. A. (2008). A mixture model approach in gene-gene and gene-environmental interactions for binary phenotypes. Journal of Biopharmaceutical Statistics, 18(6), 1150-1177. https://doi.org/10.1080/10543400802369038

@article{2b513c261c594d1aa31583fe26ee969f,
title = "A mixture model approach in gene-gene and gene-environmental interactions for binary phenotypes",
keywords = "Gene-gene interaction, Mixture model, Pharmacogenetics",
author = "Lang Li and Menggang Yu and Jason, {Robarge D.} and Changyu Shen and Faouzi Azzouz and McLeod, {Howard L.} and Silvana Borges-Gonzales and Anne Nguyen and Todd Skaar and Zeruesenay Desta and Sweeney, {Christopher J.} and Flockhart, {David A.}",
year = "2008",
month = "11",
doi = "10.1080/10543400802369038",
language = "English",
volume = "18",
pages = "1150--1177",
journal = "Journal of Biopharmaceutical Statistics",
issn = "1054-3406",
publisher = "Taylor and Francis Ltd.",
number = "6",

}

TY - JOUR

T1 - A mixture model approach in gene-gene and gene-environmental interactions for binary phenotypes

AU - Li, Lang

AU - Yu, Menggang

AU - Jason, Robarge D.

AU - Shen, Changyu

AU - Azzouz, Faouzi

AU - McLeod, Howard L.

AU - Borges-Gonzales, Silvana

AU - Nguyen, Anne

AU - Skaar, Todd

AU - Desta, Zeruesenay

AU - Sweeney, Christopher J.

AU - Flockhart, David A.

PY - 2008/11

Y1 - 2008/11

KW - Gene-gene interaction

KW - Mixture model

KW - Pharmacogenetics

UR - http://www.scopus.com/inward/record.url?scp=57349087128&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=57349087128&partnerID=8YFLogxK

U2 - 10.1080/10543400802369038

DO - 10.1080/10543400802369038

M3 - Article

C2 - 18991114

AN - SCOPUS:57349087128

VL - 18

SP - 1150

EP - 1177

JO - Journal of Biopharmaceutical Statistics

JF - Journal of Biopharmaceutical Statistics

SN - 1054-3406

IS - 6

ER -