Evaluating Methods for Identifying Cancer in Free-Text Pathology Reports Using Various Machine Learning and Data Preprocessing Approaches

Research output: Chapter in Book/Report/Conference proceedingConference contribution

2 Scopus citations

Abstract

Automated detection methods can address delays and incompleteness in cancer case reporting. Existing automated efforts are largely dependent on complex dictionaries and coded data. Using a gold standard of manually reviewed pathology reports, we evaluated the performance of alternative input formats and decision models on a convenience sample of free-text pathology reports. Results showed that the input format significantly impacted performance, and specific algorithms yielded better results for presicion, recall and accuracy. We conclude that our approach is sufficiently accurate for practical purposes and represents a generalized process.

Original languageEnglish (US)
Title of host publicationStudies in Health Technology and Informatics
PublisherIOS Press
Pages1070
Number of pages1
Volume216
ISBN (Print)9781614995630
DOIs
StatePublished - 2015
Event15th World Congress on Health and Biomedical Informatics, MEDINFO 2015 - Sao Paulo, Brazil
Duration: Aug 19 2015Aug 23 2015

Publication series

NameStudies in Health Technology and Informatics
Volume216
ISSN (Print)09269630
ISSN (Electronic)18798365

Other

Other15th World Congress on Health and Biomedical Informatics, MEDINFO 2015
CountryBrazil
CitySao Paulo
Period8/19/158/23/15

Keywords

  • cancer
  • data preprocessing
  • decision models
  • ontologies
  • pathology
  • Public health reporting

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
  • Health Information Management

Fingerprint Dive into the research topics of 'Evaluating Methods for Identifying Cancer in Free-Text Pathology Reports Using Various Machine Learning and Data Preprocessing Approaches'. Together they form a unique fingerprint.

  • Cite this