On the uncertainty of individual prediction because of sampling predictors

Changyu Shen, Xiaochun Li

Research output: Contribution to journal › Article

Abstract

Prediction of an outcome for a given unit based on prediction models built on a training sample plays a major role in many research areas. The uncertainty of the prediction is predominantly characterized by the subject sampling variation in current practice, where prediction models built on hypothetically re-sampled units yield variable predictions for the same unit of interest. It is almost always true that the predictors used to build prediction models are simply a subset of the entirety of factors related to the outcome. Following the frequentist principle, we can account for the variation because of hypothetically re-sampled predictors used to build the prediction models. This is particularly important in medicine, where the prediction has important and sometimes life-or-death consequences on a patient's health status. In this article, we discuss some rationale along this line in the context of medicine. We propose a simple approach to estimate the standard error of the prediction that accounts for the variation because of sampling both subjects and predictors under logistic and Cox regression models. A simulation study is presented to support our argument and demonstrate the performance of our method. The concept and method are applied to a real data set.

Original language: English (US)
Journal: Statistics in Medicine
DOI: 10.1002/sim.6849
State: Accepted/In press - 2015

Keywords

  • Conditional distribution
  • Frequentist principle
  • Prediction uncertainty
  • Predictor-sampling variation
  • Subject-sampling variation

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

Cite this

On the uncertainty of individual prediction because of sampling predictors. / Shen, Changyu; Li, Xiaochun.

In: Statistics in Medicine, 2015.

@article{b17d2076d8c24885b68aa0be573f6493,
title = "On the uncertainty of individual prediction because of sampling predictors",
abstract = "Prediction of an outcome for a given unit based on prediction models built on a training sample plays a major role in many research areas. The uncertainty of the prediction is predominantly characterized by the subject sampling variation in current practice, where prediction models built on hypothetically re-sampled units yield variable predictions for the same unit of interest. It is almost always true that the predictors used to build prediction models are simply a subset of the entirety of factors related to the outcome. Following the frequentist principle, we can account for the variation because of hypothetically re-sampled predictors used to build the prediction models. This is particularly important in medicine, where the prediction has important and sometimes life-or-death consequences on a patient's health status. In this article, we discuss some rationale along this line in the context of medicine. We propose a simple approach to estimate the standard error of the prediction that accounts for the variation because of sampling both subjects and predictors under logistic and Cox regression models. A simulation study is presented to support our argument and demonstrate the performance of our method. The concept and method are applied to a real data set.",
keywords = "Conditional distribution, Frequentist principle, Prediction uncertainty, Predictor-sampling variation, Subject-sampling variation",
author = "Changyu Shen and Xiaochun Li",
year = "2015",
doi = "10.1002/sim.6849",
language = "English (US)",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
}

TY  - JOUR
T1  - On the uncertainty of individual prediction because of sampling predictors
AU  - Shen, Changyu
AU  - Li, Xiaochun
PY  - 2015
Y1  - 2015
AB  - Prediction of an outcome for a given unit based on prediction models built on a training sample plays a major role in many research areas. The uncertainty of the prediction is predominantly characterized by the subject sampling variation in current practice, where prediction models built on hypothetically re-sampled units yield variable predictions for the same unit of interest. It is almost always true that the predictors used to build prediction models are simply a subset of the entirety of factors related to the outcome. Following the frequentist principle, we can account for the variation because of hypothetically re-sampled predictors used to build the prediction models. This is particularly important in medicine, where the prediction has important and sometimes life-or-death consequences on a patient's health status. In this article, we discuss some rationale along this line in the context of medicine. We propose a simple approach to estimate the standard error of the prediction that accounts for the variation because of sampling both subjects and predictors under logistic and Cox regression models. A simulation study is presented to support our argument and demonstrate the performance of our method. The concept and method are applied to a real data set.
KW  - Conditional distribution
KW  - Frequentist principle
KW  - Prediction uncertainty
KW  - Predictor-sampling variation
KW  - Subject-sampling variation
UR  - http://www.scopus.com/inward/record.url?scp=84952705866&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84952705866&partnerID=8YFLogxK
U2  - 10.1002/sim.6849
DO  - 10.1002/sim.6849
M3  - Article
C2  - 26712471
AN  - SCOPUS:84952705866
JO  - Statistics in Medicine
JF  - Statistics in Medicine
SN  - 0277-6715
ER  -