Variable selection and nonlinear effect discovery in partially linear mixture cure rate models

Abdullah Al Masud, Zhangsheng Yu, Wanzhu Tu

Research output: Contribution to journal › Article

Abstract

Survival data with long-term survivors are common in clinical investigations. Such data are often analyzed with mixture cure rate models. Existing model selection procedures do not readily discriminate nonlinear effects from linear ones. Here, we propose a procedure for accommodating nonlinear effects and for determining the cure rate model composition. The procedure is based on the Least Absolute Shrinkage and Selection Operators (LASSO). Specifically, by partitioning each variable into linear and nonlinear components, we use LASSO to select linear and nonlinear components. Operationally, we model the nonlinear components by cubic B-splines. The procedure adds to the existing variable selection methods an ability to discover hidden nonlinear effects in a cure rate model setting. To implement, we ascertain the maximum likelihood estimates by using an Expectation Maximization (EM) algorithm. We conduct an extensive simulation study to assess the operating characteristics of the selection procedure. We illustrate the use of the method by analyzing data from a real clinical study.
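The abstract mentions an EM algorithm without giving its steps. As a rough illustration only, the sketch below shows the E-step that is standard for mixture cure rate models in general; it is not taken from the article, and the function name `e_step_weights` and its arguments are hypothetical. It assumes the usual formulation in which a subject is uncured with probability pi and uncured subjects have survival function S_u(t), so a censored subject's posterior probability of being uncured is pi * S_u(t) / (1 - pi + pi * S_u(t)). The LASSO penalty and the cubic B-spline components described in the abstract would enter in the M-step, which is not shown.

```python
import numpy as np


def e_step_weights(delta, pi_uncured, surv_uncured):
    """Posterior probability that each subject is uncured (hypothetical helper).

    delta        : 1 if the event was observed, 0 if the time is censored
    pi_uncured   : model probability of being uncured for each subject
    surv_uncured : survival function of uncured subjects at the observed time
    """
    delta = np.asarray(delta, dtype=float)
    pi_uncured = np.asarray(pi_uncured, dtype=float)
    surv_uncured = np.asarray(surv_uncured, dtype=float)

    # A subject with an observed event is uncured with certainty (weight 1).
    # A censored subject is either cured (probability 1 - pi) or uncured and
    # still event-free at the censoring time (probability pi * S_u(t)).
    # Note: pi = 1 together with S_u(t) = 0 gives 0/0 and is not handled here.
    still_at_risk = pi_uncured * surv_uncured
    censored_weight = still_at_risk / (1.0 - pi_uncured + still_at_risk)
    return delta + (1.0 - delta) * censored_weight


# One observed event and one censored subject, both with pi = 0.6.
weights = e_step_weights([1, 0], [0.6, 0.6], [0.3, 0.5])
print(weights)
```

In an EM fit these weights would typically be recomputed at every iteration and then used as case weights when updating the cure-probability and survival components.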

Original language: English (US)
Pages (from-to): 156-177
Number of pages: 22
Journal: Biostatistics and Epidemiology
Volume: 3
Issue number: 1
DOIs: 10.1080/24709360.2019.1663665
State: Published - Jan 1 2019

Fingerprint

Likelihood Functions
Survivors
Survival
Clinical Studies

Keywords

  • cubic B-splines
  • LASSO
  • Mixture cure rate models
  • variable selection

ASJC Scopus subject areas

  • Epidemiology
  • Health Informatics

Cite this

Variable selection and nonlinear effect discovery in partially linear mixture cure rate models. / Masud, Abdullah Al; Yu, Zhangsheng; Tu, Wanzhu.

In: Biostatistics and Epidemiology, Vol. 3, No. 1, 01.01.2019, p. 156-177.


@article{7ee0b910e43c48f9a9721bc53c5bcdce,
title = "Variable selection and nonlinear effect discovery in partially linear mixture cure rate models",
abstract = "Survival data with long-term survivors are common in clinical investigations. Such data are often analyzed with mixture cure rate models. Existing model selection procedures do not readily discriminate nonlinear effects from linear ones. Here, we propose a procedure for accommodating nonlinear effects and for determining the cure rate model composition. The procedure is based on the Least Absolute Shrinkage and Selection Operators (LASSO). Specifically, by partitioning each variable into linear and nonlinear components, we use LASSO to select linear and nonlinear components. Operationally, we model the nonlinear components by cubic B-splines. The procedure adds to the existing variable selection methods an ability to discover hidden nonlinear effects in a cure rate model setting. To implement, we ascertain the maximum likelihood estimates by using an Expectation Maximization (EM) algorithm. We conduct an extensive simulation study to assess the operating characteristics of the selection procedure. We illustrate the use of the method by analyzing data from a real clinical study.",
keywords = "cubic B-splines, LASSO, Mixture cure rate models, variable selection",
author = "Masud, {Abdullah Al} and Zhangsheng Yu and Wanzhu Tu",
year = "2019",
month = "1",
day = "1",
doi = "10.1080/24709360.2019.1663665",
language = "English (US)",
volume = "3",
pages = "156--177",
journal = "Biostatistics and Epidemiology",
issn = "2470-9360",
publisher = "Taylor and Francis Ltd.",
number = "1",

}

TY - JOUR

T1 - Variable selection and nonlinear effect discovery in partially linear mixture cure rate models

AU - Masud, Abdullah Al

AU - Yu, Zhangsheng

AU - Tu, Wanzhu

PY - 2019/1/1

Y1 - 2019/1/1

AB - Survival data with long-term survivors are common in clinical investigations. Such data are often analyzed with mixture cure rate models. Existing model selection procedures do not readily discriminate nonlinear effects from linear ones. Here, we propose a procedure for accommodating nonlinear effects and for determining the cure rate model composition. The procedure is based on the Least Absolute Shrinkage and Selection Operators (LASSO). Specifically, by partitioning each variable into linear and nonlinear components, we use LASSO to select linear and nonlinear components. Operationally, we model the nonlinear components by cubic B-splines. The procedure adds to the existing variable selection methods an ability to discover hidden nonlinear effects in a cure rate model setting. To implement, we ascertain the maximum likelihood estimates by using an Expectation Maximization (EM) algorithm. We conduct an extensive simulation study to assess the operating characteristics of the selection procedure. We illustrate the use of the method by analyzing data from a real clinical study.

KW - cubic B-splines

KW - LASSO

KW - Mixture cure rate models

KW - variable selection

UR - http://www.scopus.com/inward/record.url?scp=85073454812&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85073454812&partnerID=8YFLogxK

U2 - 10.1080/24709360.2019.1663665

DO - 10.1080/24709360.2019.1663665

M3 - Article

AN - SCOPUS:85073454812

VL - 3

SP - 156

EP - 177

JO - Biostatistics and Epidemiology

JF - Biostatistics and Epidemiology

SN - 2470-9360

IS - 1

ER -