Identification of high risk patients for colonoscopy surveillance from EMR text

Eric Sherer, Hsin Ying Huang, Thomas Imperiale, Gaurav Nanda, Mark Lehto

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

In the United States, colorectal cancer (CRC) is the third most common cancer diagnosed in both men and women. Screening tests such as colonoscopy are therefore used to detect CRC. In addition, the removal of pre-cancerous adenomatous polyps via polypectomy during a colonoscopy is associated with lower lifetime CRC incidence (Winawer et al., 1993) and mortality (Baxter et al., 2009). However, CRC risk is not uniform across the population, so follow-up colonoscopies need to be scheduled regularly, especially for high-risk patients. The purpose of this study is to calibrate and validate a computerized method to identify average-risk patients for screening colonoscopies, patients with inadequate preparation quality, and patients at high risk for colorectal cancer based on free text in the electronic medical record. This study presents the use of a naïve Bayes machine learning tool for prediction. The results show that this automated method is effective in terms of sensitivity (98%, 96%, and 100%) and positive predictive value (98%, 98%, and 85%), reducing the effort of manual coding and improving its accuracy.
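As a rough illustration of the approach the abstract describes, the sketch below trains a small multinomial naïve Bayes classifier on free text and computes sensitivity and positive predictive value for one class. It is not the authors' tool: the tokenizer, the add-one smoothing, the class labels, and the six example notes are all assumptions invented for this sketch, and the numbers it produces have no relation to the results reported in the chapter.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesText:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def sensitivity_and_ppv(y_true, y_pred, positive):
    """Sensitivity = TP / (TP + FN); PPV = TP / (TP + FP) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return sens, ppv


# Hypothetical training notes and labels, invented for illustration only.
train_texts = [
    "screening colonoscopy no family history normal exam",
    "average risk screening exam normal colon",
    "poor bowel preparation stool obscured mucosa",
    "inadequate prep repeat colonoscopy recommended",
    "multiple adenomatous polyps removed family history of colorectal cancer",
    "large adenoma polypectomy performed high grade dysplasia",
]
train_labels = [
    "average_risk", "average_risk",
    "inadequate_prep", "inadequate_prep",
    "high_risk", "high_risk",
]

model = NaiveBayesText().fit(train_texts, train_labels)
print(model.predict("inadequate bowel prep"))
```

Calling `sensitivity_and_ppv(true_labels, predicted_labels, "high_risk")` on a held-out set yields the two measures the abstract reports per category; a real evaluation would need far more notes than this toy set and labels assigned by manual chart review.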

Original language: English (US)
Title of host publication: Advances in Physical Ergonomics and Safety
Publisher: CRC Press
Pages: 88-95
Number of pages: 8
ISBN (Electronic): 9781439870594
ISBN (Print): 9781439870389
DOI: https://doi.org/10.1201/b12323
State: Published - Jan 1 2012

Fingerprint

Screening
Electronic medical equipment
Learning systems

Keywords

  • And colonoscopy surveillance
  • Coding accuracy
  • Computerized coding
  • Manual coding
  • Text mining

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Sherer, E., Huang, H. Y., Imperiale, T., Nanda, G., & Lehto, M. (2012). Identification of high risk patients for colonoscopy surveillance from EMR text. In Advances in Physical Ergonomics and Safety (pp. 88-95). CRC Press. https://doi.org/10.1201/b12323

@inbook{ae9e42d78f6b475f9eb1d1ec4faf86dc,
title = "Identification of high risk patients for colonoscopy surveillance from EMR text",
abstract = "In the United States, colorectal cancer (CRC) is the third most common cancer diagnosed in both men and women. Screening tests such as colonoscopy are therefore used to detect CRC. In addition, the removal of pre-cancerous adenomatous polyps via polypectomy during a colonoscopy is associated with lower lifetime CRC incidence (Winawer et al., 1993) and mortality (Baxter et al., 2009). However, CRC risk is not uniform across the population, so follow-up colonoscopies need to be scheduled regularly, especially for high-risk patients. The purpose of this study is to calibrate and validate a computerized method to identify average-risk patients for screening colonoscopies, patients with inadequate preparation quality, and patients at high risk for colorectal cancer based on free text in the electronic medical record. This study presents the use of a na{\"i}ve Bayes machine learning tool for prediction. The results show that this automated method is effective in terms of sensitivity (98{\%}, 96{\%}, and 100{\%}) and positive predictive value (98{\%}, 98{\%}, and 85{\%}), reducing the effort of manual coding and improving its accuracy.",
keywords = "And colonoscopy surveillance, Coding accuracy, Computerized coding, Manual coding, Text mining",
author = "Eric Sherer and Huang, {Hsin Ying} and Thomas Imperiale and Gaurav Nanda and Mark Lehto",
year = "2012",
month = "1",
day = "1",
doi = "10.1201/b12323",
language = "English (US)",
isbn = "9781439870389",
pages = "88--95",
booktitle = "Advances in Physical Ergonomics and Safety",
publisher = "CRC Press",

}

TY - CHAP

T1 - Identification of high risk patients for colonoscopy surveillance from EMR text

AU - Sherer, Eric

AU - Huang, Hsin Ying

AU - Imperiale, Thomas

AU - Nanda, Gaurav

AU - Lehto, Mark

PY - 2012/1/1

Y1 - 2012/1/1

N2 - In the United States, colorectal cancer (CRC) is the third most common cancer diagnosed in both men and women. Screening tests such as colonoscopy are therefore used to detect CRC. In addition, the removal of pre-cancerous adenomatous polyps via polypectomy during a colonoscopy is associated with lower lifetime CRC incidence (Winawer et al., 1993) and mortality (Baxter et al., 2009). However, CRC risk is not uniform across the population, so follow-up colonoscopies need to be scheduled regularly, especially for high-risk patients. The purpose of this study is to calibrate and validate a computerized method to identify average-risk patients for screening colonoscopies, patients with inadequate preparation quality, and patients at high risk for colorectal cancer based on free text in the electronic medical record. This study presents the use of a naïve Bayes machine learning tool for prediction. The results show that this automated method is effective in terms of sensitivity (98%, 96%, and 100%) and positive predictive value (98%, 98%, and 85%), reducing the effort of manual coding and improving its accuracy.

AB - In the United States, colorectal cancer (CRC) is the third most common cancer diagnosed in both men and women. Screening tests such as colonoscopy are therefore used to detect CRC. In addition, the removal of pre-cancerous adenomatous polyps via polypectomy during a colonoscopy is associated with lower lifetime CRC incidence (Winawer et al., 1993) and mortality (Baxter et al., 2009). However, CRC risk is not uniform across the population, so follow-up colonoscopies need to be scheduled regularly, especially for high-risk patients. The purpose of this study is to calibrate and validate a computerized method to identify average-risk patients for screening colonoscopies, patients with inadequate preparation quality, and patients at high risk for colorectal cancer based on free text in the electronic medical record. This study presents the use of a naïve Bayes machine learning tool for prediction. The results show that this automated method is effective in terms of sensitivity (98%, 96%, and 100%) and positive predictive value (98%, 98%, and 85%), reducing the effort of manual coding and improving its accuracy.

KW - And colonoscopy surveillance

KW - Coding accuracy

KW - Computerized coding

KW - Manual coding

KW - Text mining

UR - http://www.scopus.com/inward/record.url?scp=85055491691&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85055491691&partnerID=8YFLogxK

U2 - 10.1201/b12323

DO - 10.1201/b12323

M3 - Chapter

AN - SCOPUS:85055491691

SN - 9781439870389

SP - 88

EP - 95

BT - Advances in Physical Ergonomics and Safety

PB - CRC Press

ER -