A two-latent-class model for smoking cessation data with informative dropouts

Li Qin, Lisa A. Weissfeld, Changyu Shen, Michele D. Levine

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Nonignorable missing data are a common problem in longitudinal studies. Latent class models are attractive for simplifying the modeling of missing data when the data are subject to either a monotone or an intermittent missing data pattern. We propose a new two-latent-class model for categorical data with informative dropouts that divides the observed data into two latent classes: one in which the outcomes are deterministic and a second in which the outcomes can be modeled using logistic regression. In the model, the latent classes connect the longitudinal responses and the missingness process under the assumption of conditional independence. Parameters are estimated by maximum likelihood based on the above assumptions and the tetrachoric correlation between responses within the same subject. We compare the proposed method with the shared parameter model and the weighted GEE model using the areas under the ROC curves, both in simulations and in an application to a smoking cessation data set. The simulation results indicate that the proposed two-latent-class model performs well under different missingness mechanisms. In the application, the proposed method performs better than the shared parameter model and the weighted GEE model.
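The comparison criterion named in the abstract, the area under the ROC curve (AUC), can be illustrated with a minimal sketch. It uses the rank-based (Mann-Whitney) formulation of AUC. The function name `auc`, the labels, and the two sets of predicted probabilities are hypothetical values chosen for illustration; they are not taken from the paper or its smoking cessation data.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical binary outcomes (1 = event) and predicted probabilities
# from two hypothetical competing models.
labels = [1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.8, 0.3, 0.4, 0.7, 0.2]
model_b = [0.6, 0.4, 0.5, 0.3, 0.7, 0.2]

print("AUC, model A:", auc(model_a, labels))
print("AUC, model B:", auc(model_b, labels))
```

Under this criterion, the model with the larger AUC ranks observed events above non-events more consistently. The pairwise loop is quadratic in the sample size, which is acceptable for a sketch but not for large data sets.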

Original language: English
Pages (from-to): 2604-2619
Number of pages: 16
Journal: Communications in Statistics - Theory and Methods
Volume: 38
Issue number: 15
DOI: 10.1080/03610920802585849
State: Published - Sep 2009

Fingerprint

Informative Dropout
Latent Class Model
Smoking
Latent Class
Missing Data
Nonignorable Missing Data
Model
Conditional Independence
Nominal or categorical data
Longitudinal Study
Receiver Operating Characteristic Curve
Logistic Regression
Maximum Likelihood Estimation
Monotone
Simulation
Modeling

Keywords

  • Area under ROC curve
  • Informative dropout
  • Latent class
  • Tetrachoric correlation

ASJC Scopus subject areas

  • Statistics and Probability

Cite this

A two-latent-class model for smoking cessation data with informative dropouts. / Qin, Li; Weissfeld, Lisa A.; Shen, Changyu; Levine, Michele D.

In: Communications in Statistics - Theory and Methods, Vol. 38, No. 15, 09.2009, p. 2604-2619.

Qin, Li ; Weissfeld, Lisa A. ; Shen, Changyu ; Levine, Michele D. / A two-latent-class model for smoking cessation data with informative dropouts. In: Communications in Statistics - Theory and Methods. 2009 ; Vol. 38, No. 15. pp. 2604-2619.
@article{e90ccde6f2e34a60809ec9e8d1205ef6,
title = "A two-latent-class model for smoking cessation data with informative dropouts",
abstract = "Non ignorable missing data is a common problem in longitudinal studies. Latent class models are attractive for simplifying the modeling of missing data when the data are subject to either a monotone or intermittent missing data pattern. In our study, we propose a new two-latent-class model for categorical data with informative dropouts, dividing the observed data into two latent classes; one class in which the outcomes are deterministic and a second one in which the outcomes can be modeled using logistic regression. In the model, the latent classes connect the longitudinal responses and the missingness process under the assumption of conditional independence. Parameters are estimated by the method of maximum likelihood estimation based on the above assumptions and the tetrachoric correlation between responses within the same subject. We compare the proposed method with the shared parameter model and the weighted GEE model using the areas under the ROC curves in the simulations and the application to the smoking cessation data set. The simulation results indicate that the proposed two-latent-class model performs well under different missing procedures. The application results show that our proposed method is better than the shared parameter model and the weighted GEE model.",
keywords = "Area under ROC curve, Informative dropout, Latent class, Tetrachoric correlation",
author = "Li Qin and Weissfeld, {Lisa A.} and Changyu Shen and Levine, {Michele D.}",
year = "2009",
month = "9",
doi = "10.1080/03610920802585849",
language = "English",
volume = "38",
pages = "2604--2619",
journal = "Communications in Statistics - Theory and Methods",
issn = "0361-0926",
publisher = "Taylor and Francis Ltd.",
number = "15",

}

TY - JOUR

T1 - A two-latent-class model for smoking cessation data with informative dropouts

AU - Qin, Li

AU - Weissfeld, Lisa A.

AU - Shen, Changyu

AU - Levine, Michele D.

PY - 2009/9

Y1 - 2009/9

N2 - Non ignorable missing data is a common problem in longitudinal studies. Latent class models are attractive for simplifying the modeling of missing data when the data are subject to either a monotone or intermittent missing data pattern. In our study, we propose a new two-latent-class model for categorical data with informative dropouts, dividing the observed data into two latent classes; one class in which the outcomes are deterministic and a second one in which the outcomes can be modeled using logistic regression. In the model, the latent classes connect the longitudinal responses and the missingness process under the assumption of conditional independence. Parameters are estimated by the method of maximum likelihood estimation based on the above assumptions and the tetrachoric correlation between responses within the same subject. We compare the proposed method with the shared parameter model and the weighted GEE model using the areas under the ROC curves in the simulations and the application to the smoking cessation data set. The simulation results indicate that the proposed two-latent-class model performs well under different missing procedures. The application results show that our proposed method is better than the shared parameter model and the weighted GEE model.

AB - Non ignorable missing data is a common problem in longitudinal studies. Latent class models are attractive for simplifying the modeling of missing data when the data are subject to either a monotone or intermittent missing data pattern. In our study, we propose a new two-latent-class model for categorical data with informative dropouts, dividing the observed data into two latent classes; one class in which the outcomes are deterministic and a second one in which the outcomes can be modeled using logistic regression. In the model, the latent classes connect the longitudinal responses and the missingness process under the assumption of conditional independence. Parameters are estimated by the method of maximum likelihood estimation based on the above assumptions and the tetrachoric correlation between responses within the same subject. We compare the proposed method with the shared parameter model and the weighted GEE model using the areas under the ROC curves in the simulations and the application to the smoking cessation data set. The simulation results indicate that the proposed two-latent-class model performs well under different missing procedures. The application results show that our proposed method is better than the shared parameter model and the weighted GEE model.

KW - Area under ROC curve

KW - Informative dropout

KW - Latent class

KW - Tetrachoric correlation

UR - http://www.scopus.com/inward/record.url?scp=70449560952&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70449560952&partnerID=8YFLogxK

U2 - 10.1080/03610920802585849

DO - 10.1080/03610920802585849

M3 - Article

VL - 38

SP - 2604

EP - 2619

JO - Communications in Statistics - Theory and Methods

JF - Communications in Statistics - Theory and Methods

SN - 0361-0926

IS - 15

ER -