Machine learning techniques for prediction of early childhood obesity

Tamara M. Dugan, S. Mukhopadhyay, A. Carroll, S. Downs

Research output: Contribution to journal › Article

16 Citations (Scopus)

Abstract

Objectives: This paper aims to predict childhood obesity after age two, using only data collected prior to the second birthday by a clinical decision support system called CHICA.

Methods: Six machine learning methods (RandomTree, RandomForest, J48, ID3, Naïve Bayes, and Bayes) were trained on CHICA data; the analyses show that an accurate, sensitive model can be created.

Results: Of the methods analyzed, the ID3 model trained on the CHICA dataset showed the best overall performance, with an accuracy of 85% and a sensitivity of 89%. Additionally, the ID3 model had a positive predictive value of 84% and a negative predictive value of 88%. The structure of the tree also gives insight into the strongest predictors of future obesity in children. Many of the strongest predictors seen in the ID3 modeling of the CHICA dataset have been independently validated in the literature as correlated with obesity, thereby supporting the validity of the model.

Conclusions: This study demonstrated that data from a production clinical decision support system can be used to build an accurate machine learning model to predict obesity in children after age two.
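The four figures reported in the Results (accuracy, sensitivity, positive predictive value, negative predictive value) are all derived from a binary confusion matrix. The sketch below shows those standard definitions in Python. The class name `ConfusionMatrix` and the example counts are illustrative assumptions, not values or code from the paper, which does not give its confusion matrix in this abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive class: obese after age two)."""

    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    tn: int  # predicted negative, actually negative
    fn: int  # predicted negative, actually positive

    def accuracy(self) -> float:
        # Share of all predictions that were correct.
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    def sensitivity(self) -> float:
        # Share of actual positives that the model caught (recall).
        return self.tp / (self.tp + self.fn)

    def ppv(self) -> float:
        # Share of positive predictions that were correct (precision).
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Share of negative predictions that were correct.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts chosen only to exercise the formulas;
# they are NOT taken from the CHICA study.
cm = ConfusionMatrix(tp=40, fp=10, tn=35, fn=15)

for name, value in [
    ("accuracy", cm.accuracy()),
    ("sensitivity", cm.sensitivity()),
    ("PPV", cm.ppv()),
    ("NPV", cm.npv()),
]:
    print(f"{name}: {value:.1%}")
```

Sensitivity and NPV both depend on the false-negative count, which is why a screening model that misses few future cases tends to score well on both.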

Original language: English (US)
Pages (from-to): 506-520
Number of pages: 15
Journal: Applied Clinical Informatics
Volume: 6
Issue number: 3
DOI: 10.4338/ACI-2015-03-RA-0036
State: Published - Aug 12 2015

Fingerprint

Pediatric Obesity
Clinical Decision Support Systems
Learning systems
Decision support systems
Obesity
Machine Learning
Datasets

Keywords

  • Artificial intelligence
  • Bayes theorem
  • Decision trees
  • Obesity
  • Predictive analytics

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications
  • Health Information Management

Cite this

Machine learning techniques for prediction of early childhood obesity. / Dugan, Tamara M.; Mukhopadhyay, S.; Carroll, A.; Downs, S.

In: Applied Clinical Informatics, Vol. 6, No. 3, 12.08.2015, p. 506-520.


@article{5e2a25e4f84248f4a1f27a97ad10bcb2,
title = "Machine learning techniques for prediction of early childhood obesity",
abstract = "Objectives: This paper aims to predict childhood obesity after age two, using only data collected prior to the second birthday by a clinical decision support system called CHICA. Methods: Analyses of six different machine learning methods: RandomTree, RandomForest, J48, ID3, Na{\"i}ve Bayes, and Bayes trained on CHICA data show that an accurate, sensitive model can be created. Results: Of the methods analyzed, the ID3 model trained on the CHICA dataset proved the best overall performance with accuracy of 85{\%} and sensitivity of 89{\%}. Additionally, the ID3 model had a positive predictive value of 84{\%} and a negative predictive value of 88{\%}. The structure of the tree also gives insight into the strongest predictors of future obesity in children. Many of the strongest predictors seen in the ID3 modeling of the CHICA dataset have been independently validated in the literature as correlated with obesity, thereby supporting the validity of the model. Conclusions: This study demonstrated that data from a production clinical decision support system can be used to build an accurate machine learning model to predict obesity in children after age two.",
keywords = "Artificial intelligence, Bayes theorem, Decision trees, Obesity, Predictive analytics",
author = "Dugan, {Tamara M.} and S. Mukhopadhyay and A. Carroll and S. Downs",
year = "2015",
month = "8",
day = "12",
doi = "10.4338/ACI-2015-03-RA-0036",
language = "English (US)",
volume = "6",
pages = "506--520",
journal = "Applied Clinical Informatics",
issn = "1869-0327",
publisher = "Schattauer GmbH",
number = "3",
}

TY - JOUR
T1 - Machine learning techniques for prediction of early childhood obesity
AU - Dugan, Tamara M.
AU - Mukhopadhyay, S.
AU - Carroll, A.
AU - Downs, S.
PY - 2015/8/12
Y1 - 2015/8/12

N2 - Objectives: This paper aims to predict childhood obesity after age two, using only data collected prior to the second birthday by a clinical decision support system called CHICA. Methods: Analyses of six different machine learning methods: RandomTree, RandomForest, J48, ID3, Naïve Bayes, and Bayes trained on CHICA data show that an accurate, sensitive model can be created. Results: Of the methods analyzed, the ID3 model trained on the CHICA dataset proved the best overall performance with accuracy of 85% and sensitivity of 89%. Additionally, the ID3 model had a positive predictive value of 84% and a negative predictive value of 88%. The structure of the tree also gives insight into the strongest predictors of future obesity in children. Many of the strongest predictors seen in the ID3 modeling of the CHICA dataset have been independently validated in the literature as correlated with obesity, thereby supporting the validity of the model. Conclusions: This study demonstrated that data from a production clinical decision support system can be used to build an accurate machine learning model to predict obesity in children after age two.

AB - Objectives: This paper aims to predict childhood obesity after age two, using only data collected prior to the second birthday by a clinical decision support system called CHICA. Methods: Analyses of six different machine learning methods: RandomTree, RandomForest, J48, ID3, Naïve Bayes, and Bayes trained on CHICA data show that an accurate, sensitive model can be created. Results: Of the methods analyzed, the ID3 model trained on the CHICA dataset proved the best overall performance with accuracy of 85% and sensitivity of 89%. Additionally, the ID3 model had a positive predictive value of 84% and a negative predictive value of 88%. The structure of the tree also gives insight into the strongest predictors of future obesity in children. Many of the strongest predictors seen in the ID3 modeling of the CHICA dataset have been independently validated in the literature as correlated with obesity, thereby supporting the validity of the model. Conclusions: This study demonstrated that data from a production clinical decision support system can be used to build an accurate machine learning model to predict obesity in children after age two.

KW - Artificial intelligence
KW - Bayes theorem
KW - Decision trees
KW - Obesity
KW - Predictive analytics
UR - http://www.scopus.com/inward/record.url?scp=84939805066&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84939805066&partnerID=8YFLogxK
U2 - 10.4338/ACI-2015-03-RA-0036
DO - 10.4338/ACI-2015-03-RA-0036
M3 - Article
C2 - 26448795
AN - SCOPUS:84939805066
VL - 6
SP - 506
EP - 520
JO - Applied Clinical Informatics
JF - Applied Clinical Informatics
SN - 1869-0327
IS - 3
ER -