Identification of high risk patients for colonoscopy surveillance from EMR text

Eric Sherer, Hsin Ying Huang, Thomas Imperiale, Gaurav Nanda, Mark Lehto

Research output: Chapter in Book/Report/Conference proceeding › Chapter


In the United States, colorectal cancer (CRC) is the third most common cancer diagnosed in both men and women. Screening tests such as colonoscopy are therefore used to detect CRC. In addition, the removal of pre-cancerous adenomatous polyps via polypectomy during a colonoscopy is associated with lower lifetime CRC incidence (Winawer et al., 1993) and mortality (Baxter et al., 2009). However, CRC risk is not uniform across the population, so follow-up colonoscopies need to be scheduled regularly, especially for high-risk patients. The purpose of this study is to calibrate and validate a computerized method that uses free text in the electronic medical record (EMR) to identify average-risk patients for screening colonoscopies, patients with inadequate preparation quality, and patients at high risk for colorectal cancer. The study uses a naïve Bayes machine learning tool for prediction. The results show the effectiveness of this automated method in terms of sensitivity (98%, 96%, and 100%) and positive predictive value (98%, 98%, and 85%), which lessens the effort of manual coding and improves its accuracy.
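To make the approach concrete, the following is a minimal sketch of a naïve Bayes text classifier together with the two reported metrics, sensitivity and positive predictive value. It is an illustration under stated assumptions, not the authors' tool: the abstract does not describe their features, software, or training data, so the class `NaiveBayesText`, the helper `sensitivity_and_ppv`, the label names, and the toy notes are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a note and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesText:
    """Multinomial naive Bayes with add-one (Laplace) smoothing (hypothetical helper)."""

    def fit(self, notes, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # word frequencies per class
        for note, label in zip(notes, labels):
            self.word_counts[label].update(tokenize(note))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, note):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(note):
                if word in self.vocab:             # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def sensitivity_and_ppv(y_true, y_pred, positive):
    """Return (sensitivity, positive predictive value) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, ppv


# Invented toy notes for illustration only; NOT data from the study.
TRAIN_NOTES = [
    "three tubular adenomas removed, largest 12 mm",
    "villous adenoma with high grade dysplasia removed",
    "normal colon, no polyps seen",
    "small hyperplastic polyp, otherwise normal exam",
]
TRAIN_LABELS = ["high_risk", "high_risk", "not_high_risk", "not_high_risk"]

model = NaiveBayesText().fit(TRAIN_NOTES, TRAIN_LABELS)
```

Calling `model.predict(...)` on a new note returns the label with the highest posterior score. Sensitivity is the share of truly positive notes that the classifier flags, and positive predictive value is the share of flagged notes that are truly positive, which is why the two figures reported for the same category can differ.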

Original language: English (US)
Title of host publication: Advances in Physical Ergonomics and Safety
Publisher: CRC Press
Number of pages: 8
ISBN (Electronic): 9781439870594
ISBN (Print): 9781439870389
State: Published - Jan 1 2012


  • Colonoscopy surveillance
  • Coding accuracy
  • Computerized coding
  • Manual coding
  • Text mining

ASJC Scopus subject areas

  • Engineering (all)


Cite this

Sherer, E., Huang, H. Y., Imperiale, T., Nanda, G., & Lehto, M. (2012). Identification of high risk patients for colonoscopy surveillance from EMR text. In Advances in Physical Ergonomics and Safety (pp. 88-95). CRC Press.