Variable selection and nonlinear effect discovery in partially linear mixture cure rate models

Abdullah Al Masud, Zhangsheng Yu, Wanzhu Tu

Research output: Contribution to journal › Article

Abstract

Survival data with long-term survivors are common in clinical investigations. Such data are often analyzed with mixture cure rate models. Existing model selection procedures do not readily discriminate nonlinear effects from linear ones. Here, we propose a procedure for accommodating nonlinear effects and for determining the cure rate model composition. The procedure is based on the Least Absolute Shrinkage and Selection Operator (LASSO). Specifically, we partition the effect of each variable into a linear and a nonlinear component and use LASSO to select among these components. Operationally, we model the nonlinear components with cubic B-splines. The procedure adds to existing variable selection methods the ability to discover hidden nonlinear effects in a cure rate model setting. To implement the procedure, we obtain the maximum likelihood estimates using an Expectation-Maximization (EM) algorithm. We conduct an extensive simulation study to assess the operating characteristics of the selection procedure. We illustrate the use of the method by analyzing data from a real clinical study.
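
The structure described in the abstract can be sketched in formulas. This is an illustrative outline only, not the authors' exact formulation: the notation (\pi, S_u, \beta_j, g_j, B_k, \theta_{jk}, \lambda), the number of basis functions K, and the plain L1 form of the penalty are all assumptions introduced here, since the abstract does not specify them.

```latex
% Illustrative sketch; all symbols are assumed notation, not taken from the paper.
\begin{align}
  % Mixture cure rate model: a fraction 1 - \pi(z) is cured (long-term survivors),
  % and the remaining fraction \pi(z) follows the survival function S_u.
  S_{\mathrm{pop}}(t \mid x, z) &= 1 - \pi(z) + \pi(z)\, S_u(t \mid x) \\
  % Effect of the j-th variable split into a linear and a nonlinear component.
  f_j(x_j) &= \beta_j x_j + g_j(x_j) \\
  % Nonlinear component approximated with K cubic B-spline basis functions.
  g_j(x_j) &\approx \sum_{k=1}^{K} \theta_{jk} B_k(x_j) \\
  % LASSO-type penalized log-likelihood (penalty form assumed for illustration).
  (\hat\beta, \hat\theta) &= \arg\max_{\beta,\theta}
    \Big\{ \ell(\beta, \theta)
      - \lambda \sum_{j} \Big( |\beta_j| + \sum_{k=1}^{K} |\theta_{jk}| \Big) \Big\}
\end{align}
```

Under this sketch, a variable with \hat\beta_j = 0 and all \hat\theta_{jk} = 0 is dropped from the model, a variable with \hat\beta_j \neq 0 and all \hat\theta_{jk} = 0 is kept with a linear effect, and a variable with some \hat\theta_{jk} \neq 0 is kept with a nonlinear effect.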

Original language: English (US)
Pages (from-to): 156-177
Number of pages: 22
Journal: Biostatistics and Epidemiology
Volume: 3
Issue number: 1
State: Published - Jan 1 2019

Keywords

  • cubic B-splines
  • LASSO
  • mixture cure rate models
  • variable selection

ASJC Scopus subject areas

  • Epidemiology
  • Health Informatics
