Multi-center colonoscopy quality measurement utilizing natural language processing

Timothy Imler, Justin Morea, Charles Kahi, Jon Cardwell, Cynthia S. Johnson, Huiping Xu, Dennis Ahnen, Fadi Antaki, Christopher Ashley, Gyorgy Baffy, Ilseung Cho, Jason Dominitz, Jason Hou, Mark Korsten, Anil Nagar, Kittichai Promrat, Douglas Robertson, Sameer Saini, Amandeep Shergill, Walter Smalley, Thomas Imperiale

Research output: Contribution to journal (Article)

25 Citations (Scopus)

Abstract

Background: An accurate system for tracking of colonoscopy quality and surveillance intervals could improve the effectiveness and cost-effectiveness of colorectal cancer (CRC) screening and surveillance. The purpose of this study was to create and test such a system across multiple institutions utilizing natural language processing (NLP).

Methods: From 42,569 colonoscopies with pathology records from 13 centers, we randomly sampled 750 paired reports. We trained (n=250) and tested (n=500) an NLP-based program with 19 measurements that encompass colonoscopy quality measures and surveillance interval determination, using blinded, paired, annotated expert manual review as the reference standard. The remaining 41,819 nonannotated documents were processed through the NLP system without manual review to assess performance consistency. The primary outcome was system accuracy across the 19 measures.

Results: A total of 176 (23.5%) documents with 252 (1.8%) discrepant content points resulted from paired annotation. Error rate within the 500 test documents was 31.2% for NLP and 25.4% for the paired annotators (P=0.001). At the content point level within the test set, the error rate was 3.5% for NLP and 1.9% for the paired annotators (P=0.04). When eight vaguely worded documents were removed, 125 of 492 (25.4%) were incorrect by NLP and 104 of 492 (21.1%) by the initial annotator (P=0.07). Rates of pathologic findings calculated from NLP were similar to those calculated by annotation for the majority of measurements. Test set accuracy was 99.6% for CRC, 95% for advanced adenoma, 94.6% for nonadvanced adenoma, 99.8% for advanced sessile serrated polyps, 99.2% for nonadvanced sessile serrated polyps, 96.8% for large hyperplastic polyps, and 96.0% for small hyperplastic polyps. Lesion location showed high accuracy (87.0-99.8%). Accuracy for number of adenomas was 92%.

Conclusions: NLP can accurately report adenoma detection rate and the components for determining guideline-adherent colonoscopy surveillance intervals across multiple sites that utilize different methods for reporting colonoscopy findings.
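The counts quoted in the abstract can be rechecked with a few lines of arithmetic, and the gap between the document-level and content-point-level error rates is easier to see with a toy example. The following is only a minimal sketch, not the study's software. The helper names `percent` and `error_rates` are hypothetical, the toy labels are made up for illustration, and the rule that a document counts as incorrect when any one of its content points disagrees with the reference is an assumption that the abstract does not state.

```python
def percent(numerator: int, denominator: int, digits: int = 1) -> float:
    """Express numerator/denominator as a percentage rounded to `digits` places."""
    return round(100.0 * numerator / denominator, digits)


# Counts taken directly from the abstract.
TOTAL_RECORDS = 42_569
TRAIN_N = 250
TEST_N = 500

sampled = TRAIN_N + TEST_N                 # annotated reports (750 in the abstract)
unannotated = TOTAL_RECORDS - sampled      # processed without manual review

discrepant_docs_pct = percent(176, sampled)   # documents with annotator discrepancies
nlp_error_pct = percent(125, 492)             # NLP errors, vague documents removed
annotator_error_pct = percent(104, 492)       # initial annotator errors, same subset


def error_rates(predicted, reference):
    """Return (document_error_rate, content_point_error_rate).

    Assumption (not stated in the abstract): a document is counted as
    incorrect if at least one of its content points differs from the
    reference standard.
    """
    wrong_docs = 0
    wrong_points = 0
    total_points = 0
    for pred_doc, ref_doc in zip(predicted, reference):
        mismatches = sum(p != r for p, r in zip(pred_doc, ref_doc))
        wrong_points += mismatches
        total_points += len(ref_doc)
        wrong_docs += int(mismatches > 0)
    return wrong_docs / len(reference), wrong_points / total_points


# Made-up toy data: 4 documents with 3 content points each (the study used 19).
toy_reference = [[1, 0, 0], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
toy_predicted = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
doc_rate, point_rate = error_rates(toy_predicted, toy_reference)

if __name__ == "__main__":
    print(sampled, unannotated)
    print(discrepant_docs_pct, nlp_error_pct, annotator_error_pct)
    print(doc_rate, point_rate)
```

Under that assumption, a single wrong content point marks the whole document as incorrect, which is one way a document-level error rate can sit far above the content-point-level rate, as the reported figures do.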

Original language: English
Pages (from-to): 543-552
Number of pages: 10
Journal: American Journal of Gastroenterology
Volume: 110
Issue number: 4
DOI: https://doi.org/10.1038/ajg.2015.51
State: Published - Apr 16 2015

Fingerprint

Natural Language Processing
Colonoscopy
Polyps
Adenoma
Colorectal Neoplasms
Early Detection of Cancer
Cost-Benefit Analysis
Guidelines
Pathology

ASJC Scopus subject areas

  • Gastroenterology
  • Medicine (all)

Cite this

Multi-center colonoscopy quality measurement utilizing natural language processing. / Imler, Timothy; Morea, Justin; Kahi, Charles; Cardwell, Jon; Johnson, Cynthia S.; Xu, Huiping; Ahnen, Dennis; Antaki, Fadi; Ashley, Christopher; Baffy, Gyorgy; Cho, Ilseung; Dominitz, Jason; Hou, Jason; Korsten, Mark; Nagar, Anil; Promrat, Kittichai; Robertson, Douglas; Saini, Sameer; Shergill, Amandeep; Smalley, Walter; Imperiale, Thomas.

In: American Journal of Gastroenterology, Vol. 110, No. 4, 16.04.2015, p. 543-552.

Imler, T, Morea, J, Kahi, C, Cardwell, J, Johnson, CS, Xu, H, Ahnen, D, Antaki, F, Ashley, C, Baffy, G, Cho, I, Dominitz, J, Hou, J, Korsten, M, Nagar, A, Promrat, K, Robertson, D, Saini, S, Shergill, A, Smalley, W & Imperiale, T 2015, 'Multi-center colonoscopy quality measurement utilizing natural language processing', American Journal of Gastroenterology, vol. 110, no. 4, pp. 543-552. https://doi.org/10.1038/ajg.2015.51
Imler, Timothy ; Morea, Justin ; Kahi, Charles ; Cardwell, Jon ; Johnson, Cynthia S. ; Xu, Huiping ; Ahnen, Dennis ; Antaki, Fadi ; Ashley, Christopher ; Baffy, Gyorgy ; Cho, Ilseung ; Dominitz, Jason ; Hou, Jason ; Korsten, Mark ; Nagar, Anil ; Promrat, Kittichai ; Robertson, Douglas ; Saini, Sameer ; Shergill, Amandeep ; Smalley, Walter ; Imperiale, Thomas. / Multi-center colonoscopy quality measurement utilizing natural language processing. In: American Journal of Gastroenterology. 2015 ; Vol. 110, No. 4. pp. 543-552.
@article{fea9f494848e47b0872d4392b589354b,
title = "Multi-center colonoscopy quality measurement utilizing natural language processing",
author = "Timothy Imler and Justin Morea and Charles Kahi and Jon Cardwell and Johnson, {Cynthia S.} and Huiping Xu and Dennis Ahnen and Fadi Antaki and Christopher Ashley and Gyorgy Baffy and Ilseung Cho and Jason Dominitz and Jason Hou and Mark Korsten and Anil Nagar and Kittichai Promrat and Douglas Robertson and Sameer Saini and Amandeep Shergill and Walter Smalley and Thomas Imperiale",
year = "2015",
month = "4",
day = "16",
doi = "10.1038/ajg.2015.51",
language = "English",
volume = "110",
pages = "543--552",
journal = "American Journal of Gastroenterology",
issn = "0002-9270",
publisher = "Nature Publishing Group",
number = "4",

}

TY - JOUR

T1 - Multi-center colonoscopy quality measurement utilizing natural language processing

AU - Imler, Timothy

AU - Morea, Justin

AU - Kahi, Charles

AU - Cardwell, Jon

AU - Johnson, Cynthia S.

AU - Xu, Huiping

AU - Ahnen, Dennis

AU - Antaki, Fadi

AU - Ashley, Christopher

AU - Baffy, Gyorgy

AU - Cho, Ilseung

AU - Dominitz, Jason

AU - Hou, Jason

AU - Korsten, Mark

AU - Nagar, Anil

AU - Promrat, Kittichai

AU - Robertson, Douglas

AU - Saini, Sameer

AU - Shergill, Amandeep

AU - Smalley, Walter

AU - Imperiale, Thomas

PY - 2015/4/16

Y1 - 2015/4/16

UR - http://www.scopus.com/inward/record.url?scp=84927800663&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84927800663&partnerID=8YFLogxK

U2 - 10.1038/ajg.2015.51

DO - 10.1038/ajg.2015.51

M3 - Article

C2 - 25756240

AN - SCOPUS:84927800663

VL - 110

SP - 543

EP - 552

JO - American Journal of Gastroenterology

JF - American Journal of Gastroenterology

SN - 0002-9270

IS - 4

ER -