Selecting pre-screening items for early intervention trials of dementia - A case study

Lang Li, Jeffrey Huang, Sharon Sun, Jianzhao Shen, Frederick Unverzagt, Sujuan Gao, Hugh Hendrie, Kathleen Hall, Siu Hui

Research output: Contribution to journal (article)

4 Citations (Scopus)

Abstract

Our goal was to review and extend statistical methods for discriminating between normal subjects and those with dementia or cognitive impairment. We compared six different methods to one constructed by expert opinion, in their brevity and predictive power. The methods include logistic regression and neural networks, with standard and least absolute shrinkage and selection operator (LASSO) variable selection, as well as decision trees with and without boosting. These methods were applied to the baseline data of a subgroup of subjects in a dementia study, using their screening interview items to predict their clinical diagnosis of normal or non-normal (cognitively impaired or demented). The derived models were then validated on a different subgroup of subjects in the same study who had the screening and clinical diagnosis two to five years later. Performance of different models was compared based on their sensitivity and specificity in the validation sample. Generally, the six statistical methods performed slightly to moderately better than the expert-opinion model. Neural networks generally performed better than the logistic and decision tree models. LASSO improved the performance of logistic and neural network models, but it eliminated few input variables in the neural network. The single decision tree performed at least as well as the standard logistic model, and with fewer items, making it an attractive pre-screening tool. Using the boosting option for decision trees did not substantially improve the performance. We recommend that for each situation, different methods of classification should be attempted to obtain optimal results for a given purpose.
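The abstract compares models by their sensitivity and specificity in the validation sample. As a minimal sketch of how those two quantities are computed for a binary normal/non-normal outcome (the function name and the example labels below are illustrative assumptions, not taken from the study):

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed coding: 1 = non-normal (impaired or demented), 0 = normal.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # impaired, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # impaired, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal, flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up validation labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Computing both numbers on a held-out sample, as the study did, lets a shorter item set be judged by how much of either quantity it gives up.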

Original language: English
Pages (from-to): 271-283
Number of pages: 13
Journal: Statistics in Medicine
Volume: 23
Issue number: 2
DOI: 10.1002/sim.1715
State: Published - Jan 30 2004

Keywords

  • Classification
  • Decision tree
  • Discrimination
  • LASSO
  • Logistic regression
  • Neural network

ASJC Scopus subject areas

  • Epidemiology

Cite this

Selecting pre-screening items for early intervention trials of dementia - A case study. / Li, Lang; Huang, Jeffrey; Sun, Sharon; Shen, Jianzhao; Unverzagt, Frederick; Gao, Sujuan; Hendrie, Hugh; Hall, Kathleen; Hui, Siu.

In: Statistics in Medicine, Vol. 23, No. 2, 30.01.2004, p. 271-283.

@article{83fb7b8661864393b2b8dc0a72774cd1,
title = "Selecting pre-screening items for early intervention trials of dementia - A case study",
abstract = "Our goal was to review and extend statistical methods for discriminating between normal subjects and those with dementia or cognitive impairment. We compared six different methods to one constructed by expert opinion, in their brevity and predictive power. The methods include logistic regression and neural networks, with standard and least absolute shrinkage and selection operator (LASSO) variable selection, as well as decision trees with and without boosting. These methods were applied to the baseline data of a subgroup of subjects in a dementia study, using their screening interview items to predict their clinical diagnosis of normal or non-normal (cognitively impaired or demented). The derived models were then validated on a different subgroup of subjects in the same study who had the screening and clinical diagnosis two to five years later. Performance of different models was compared based on their sensitivity and specificity in the validation sample. Generally, the six statistical methods performed slightly to moderately better than the expert-opinion model. Neural networks generally performed better than the logistic and decision tree models. LASSO improved the performance of logistic and neural network models, but it eliminated few input variables in the neural network. The single decision tree performed at least as well as the standard logistic model, and with fewer items, making it an attractive pre-screening tool. Using the boosting option for decision trees did not substantially improve the performance. We recommend that for each situation, different methods of classification should be attempted to obtain optimal results for a given purpose.",
keywords = "Classification, Decision tree, Discrimination, LASSO, Logistic regression, Neural network",
author = "Lang Li and Jeffrey Huang and Sharon Sun and Jianzhao Shen and Frederick Unverzagt and Sujuan Gao and Hugh Hendrie and Kathleen Hall and Siu Hui",
year = "2004",
month = "1",
day = "30",
doi = "10.1002/sim.1715",
language = "English",
volume = "23",
pages = "271--283",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "2",
}

TY  - JOUR
T1  - Selecting pre-screening items for early intervention trials of dementia - A case study
AU  - Li, Lang
AU  - Huang, Jeffrey
AU  - Sun, Sharon
AU  - Shen, Jianzhao
AU  - Unverzagt, Frederick
AU  - Gao, Sujuan
AU  - Hendrie, Hugh
AU  - Hall, Kathleen
AU  - Hui, Siu
PY  - 2004/1/30
Y1  - 2004/1/30
AB  - Our goal was to review and extend statistical methods for discriminating between normal subjects and those with dementia or cognitive impairment. We compared six different methods to one constructed by expert opinion, in their brevity and predictive power. The methods include logistic regression and neural networks, with standard and least absolute shrinkage and selection operator (LASSO) variable selection, as well as decision trees with and without boosting. These methods were applied to the baseline data of a subgroup of subjects in a dementia study, using their screening interview items to predict their clinical diagnosis of normal or non-normal (cognitively impaired or demented). The derived models were then validated on a different subgroup of subjects in the same study who had the screening and clinical diagnosis two to five years later. Performance of different models was compared based on their sensitivity and specificity in the validation sample. Generally, the six statistical methods performed slightly to moderately better than the expert-opinion model. Neural networks generally performed better than the logistic and decision tree models. LASSO improved the performance of logistic and neural network models, but it eliminated few input variables in the neural network. The single decision tree performed at least as well as the standard logistic model, and with fewer items, making it an attractive pre-screening tool. Using the boosting option for decision trees did not substantially improve the performance. We recommend that for each situation, different methods of classification should be attempted to obtain optimal results for a given purpose.
KW  - Classification
KW  - Decision tree
KW  - Discrimination
KW  - LASSO
KW  - Logistic regression
KW  - Neural network
UR  - http://www.scopus.com/inward/record.url?scp=0347093538&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=0347093538&partnerID=8YFLogxK
U2  - 10.1002/sim.1715
DO  - 10.1002/sim.1715
M3  - Article
C2  - 14716728
AN  - SCOPUS:0347093538
VL  - 23
SP  - 271
EP  - 283
JO  - Statistics in Medicine
JF  - Statistics in Medicine
SN  - 0277-6715
IS  - 2
ER  -