Robust tests for gene–environment interaction in case-control and case-only designs

Yong Zang, Wing Kam Fung, Sha Cao, Hon Keung Tony Ng, Chi Zhang

Research output: Contribution to journal › Article

Abstract

The case-control and case-only designs are commonly used to detect gene–environment (G–E) interaction. In principle, tests based on these two designs require a pre-specified genetic model to achieve the expected power for detecting the G–E interaction. Unfortunately, for most complex diseases the underlying genetic model is unknown. It is well known that mis-specification of the genetic model can result in a substantial loss of power in the detection of the main genetic effect. However, limited effort has been dedicated to studying its consequences for G–E interaction. This article investigates the issue and concludes that genetic model mis-specification can not only undermine the power of detecting G–E interaction in both case-control and case-only designs but also distort the type I error rate in the case-control design. To tackle this problem, a class of robust tests, namely MAX3, is proposed for both the case-control and case-only designs. The proposed tests control the type I error rate well and yield satisfactory power even when the genetic model is mis-specified. The asymptotic distribution and the p-value formula for MAX3 are also derived. Comprehensive simulation studies and a real data application to a genome-wide association study (GWAS) have been conducted using these analytical tools, and the results demonstrate desirable operating characteristics of the proposed robust tests.
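The abstract does not spell out the MAX3 statistic, so the following is only a minimal sketch of the generic MAX3 idea for a case-only analysis, under stated assumptions: genotypes are coded as 0, 1 or 2 copies of the risk allele, the exposure is binary, G and E are independent in the source population, and the p-value is obtained by permutation rather than by the asymptotic formula derived in the article. The function names and the score sets are hypothetical choices for illustration, not the authors' implementation.

```python
import math
import random

# Genotype scores (0, 1, 2 risk alleles) under three candidate genetic models.
# These score sets are a common convention, assumed here for illustration.
SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def trend_z(genotypes, exposure, scores):
    """Z statistic (sqrt(n) times the Pearson correlation) between the
    model-coded genotype and the exposure among cases."""
    n = len(genotypes)
    x = [scores[g] for g in genotypes]
    mean_x = sum(x) / n
    mean_e = sum(exposure) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    see = sum((ei - mean_e) ** 2 for ei in exposure)
    if sxx == 0 or see == 0:
        return 0.0  # no variation, so no evidence of association
    sxe = sum((xi - mean_x) * (ei - mean_e) for xi, ei in zip(x, exposure))
    return math.sqrt(n) * sxe / math.sqrt(sxx * see)


def max3(genotypes, exposure):
    """Maximum absolute Z statistic over the three genetic models."""
    return max(abs(trend_z(genotypes, exposure, s)) for s in SCORES.values())


def max3_permutation_pvalue(genotypes, exposure, n_perm=2000, seed=0):
    """Permutation p-value for MAX3 (a stand-in for the asymptotic formula)."""
    rng = random.Random(seed)
    observed = max3(genotypes, exposure)
    shuffled = list(exposure)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any G-E association among cases
        if max3(genotypes, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Taking the maximum over the recessive, additive and dominant codings avoids committing to a single genetic model in advance; the permutation step accounts for the correlation among the three statistics at the cost of extra computation.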

Original language: English (US)
Pages (from-to): 79-92
Number of pages: 14
Journal: Computational Statistics and Data Analysis
Volume: 129
DOI: 10.1016/j.csda.2018.08.014
State: Published - Jan 1 2019

Keywords

  • Case-control design
  • Case-only design
  • Genetic model
  • Gene–environment interaction
  • Robust test

ASJC Scopus subject areas

  • Statistics and Probability
  • Computational Mathematics
  • Computational Theory and Mathematics
  • Applied Mathematics

Cite this

Robust tests for gene–environment interaction in case-control and case-only designs. / Zang, Yong; Fung, Wing Kam; Cao, Sha; Ng, Hon Keung Tony; Zhang, Chi.

In: Computational Statistics and Data Analysis, Vol. 129, 01.01.2019, p. 79-92.

@article{a884783440314183be50c93cc8e8d087,
title = "Robust tests for gene–environment interaction in case-control and case-only designs",
abstract = "The case-control and case-only designs are commonly used to detect the gene–environment (G–E) interaction. In principle, the tests based on these two designs require a pre-specified genetic model to achieve an expected power of detecting the G–E interaction. Unfortunately, for most complex diseases the underlying genetic models are unknown. It is well known that mis-specification of the genetic model can result in a substantial loss of power in the detection of the main genetic effect. However, limited effort has been dedicated to the study of G–E interaction. This issue has been investigated in this article with a conclusion that the genetic model mis-specification can not only undermine the power of detecting G–E interaction in both case-control and case-only designs but also distort the type I error rate in case-control design. To tackle this problem, a class of robust tests, namely MAX3, have been proposed for both the case-control and case-only designs. The proposed tests can well control the type I error rate and yield satisfactory power even when the genetic model is mis-specified. The asymptotic distribution and the p-value formula for MAX3 have also been derived. Comprehensive simulation studies and a real data application on the genome-wide association study (GWAS) have been conducted using these analytical tools and the results demonstrate desirable operating characteristics of the proposed robust tests.",
keywords = "Case-control design, Case-only design, Genetic model, gene–environment interaction, Robust test",
author = "Zang, Yong and Fung, Wing Kam and Cao, Sha and Ng, Hon Keung Tony and Zhang, Chi",
year = "2019",
month = "1",
day = "1",
doi = "10.1016/j.csda.2018.08.014",
language = "English (US)",
volume = "129",
pages = "79--92",
journal = "Computational Statistics and Data Analysis",
issn = "0167-9473",
publisher = "Elsevier",
}

TY - JOUR

T1 - Robust tests for gene–environment interaction in case-control and case-only designs

AU - Zang, Yong

AU - Fung, Wing Kam

AU - Cao, Sha

AU - Ng, Hon Keung Tony

AU - Zhang, Chi

PY - 2019/1/1

Y1 - 2019/1/1

AB - The case-control and case-only designs are commonly used to detect the gene–environment (G–E) interaction. In principle, the tests based on these two designs require a pre-specified genetic model to achieve an expected power of detecting the G–E interaction. Unfortunately, for most complex diseases the underlying genetic models are unknown. It is well known that mis-specification of the genetic model can result in a substantial loss of power in the detection of the main genetic effect. However, limited effort has been dedicated to the study of G–E interaction. This issue has been investigated in this article with a conclusion that the genetic model mis-specification can not only undermine the power of detecting G–E interaction in both case-control and case-only designs but also distort the type I error rate in case-control design. To tackle this problem, a class of robust tests, namely MAX3, have been proposed for both the case-control and case-only designs. The proposed tests can well control the type I error rate and yield satisfactory power even when the genetic model is mis-specified. The asymptotic distribution and the p-value formula for MAX3 have also been derived. Comprehensive simulation studies and a real data application on the genome-wide association study (GWAS) have been conducted using these analytical tools and the results demonstrate desirable operating characteristics of the proposed robust tests.

KW - Case-control design

KW - Case-only design

KW - Genetic model

KW - gene–environment interaction

KW - Robust test

UR - http://www.scopus.com/inward/record.url?scp=85053005960&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85053005960&partnerID=8YFLogxK

U2 - 10.1016/j.csda.2018.08.014

DO - 10.1016/j.csda.2018.08.014

M3 - Article

AN - SCOPUS:85053005960

VL - 129

SP - 79

EP - 92

JO - Computational Statistics and Data Analysis

JF - Computational Statistics and Data Analysis

SN - 0167-9473

ER -