An efficient pancreatic cyst identification methodology using natural language processing

Saeed Mehrabi, C. Max Schmidt, Joshua A. Waters, Chris Beesley, Anand Krishnan, Joe Kesterson, Paul Dexter, Mohammed A. Al-Haddad, William M. Tierney, Mathew Palakal

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Citations (Scopus)

Abstract

Pancreatic cancer is one of the deadliest cancers and is mostly diagnosed at late stages. Patients with pancreatic cysts are at higher risk of developing cancer, and their surveillance can help diagnose the disease at earlier stages. In this retrospective study, we collected a corpus of 1064 records from 44 patients at Indiana University Hospital from 1990 to 2012. A Natural Language Processing (NLP) system was developed and used to identify patients with pancreatic cysts. The NegEx algorithm was used initially to identify the negation status of concepts, which resulted in precision and recall of 98.9% and 89%, respectively. The Stanford Dependency Parser (SDP) was then used to improve NegEx performance, resulting in precision of 98.9% and recall of 95.7%. Features related to pancreatic cysts were also extracted from patient medical records using regular expressions and the NegEx algorithm, with 98.5% precision and 97.43% recall. SDP improved the NegEx algorithm by increasing the recall to 98.12%.
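
To make the negation step and the reported metrics concrete, the following is a minimal sketch of a NegEx-style check, not the system described in the abstract. The trigger list, the concept pattern, the five-token window, and the function names `is_negated` and `precision_recall` are all assumptions chosen for illustration; the published NegEx lexicon is much larger, and the dependency-parser refinement is not reproduced here.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical, heavily reduced trigger list (assumption); the published
# NegEx lexicon is far larger and also handles post-concept triggers.
NEGATION_TRIGGERS = ["no evidence of", "negative for", "without", "no"]

# Assumed target concept pattern, for illustration only.
CONCEPT = re.compile(r"\bpancreatic\s+cysts?\b", re.IGNORECASE)

# Assumed scope: at most this many tokens between trigger and concept.
WINDOW = 5


def is_negated(sentence: str) -> Optional[bool]:
    """Return None if the concept is absent, True if a trigger precedes it
    within WINDOW tokens, and False otherwise."""
    match = CONCEPT.search(sentence)
    if match is None:
        return None
    preceding = sentence[: match.start()].lower()
    for trigger in NEGATION_TRIGGERS:
        pattern = r"\b" + re.escape(trigger) + r"\b"
        for hit in re.finditer(pattern, preceding):
            tokens_between = preceding[hit.end():].split()
            if len(tokens_between) <= WINDOW:
                return True
    return False


def precision_recall(predicted: List[bool], gold: List[bool]) -> Tuple[float, float]:
    """Compute precision and recall for parallel lists of boolean labels."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up example sentences, not taken from the study corpus.
    examples = [
        "No evidence of pancreatic cyst.",
        "There is a 2 cm pancreatic cyst in the tail.",
        "The liver is normal.",
    ]
    for text in examples:
        print(text, "->", is_negated(text))
```

A flat token window like this can attach a trigger to the wrong concept, for example when an unrelated negated finding appears shortly before the cyst mention. The abstract reports that adding a dependency parser raised recall; this sketch does not attempt to model how.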

Original language: English (US)
Title of host publication: MEDINFO 2013 - Proceedings of the 14th World Congress on Medical and Health Informatics
Publisher: IOS Press
Pages: 822-826
Number of pages: 5
Edition: 1-2
ISBN (Print): 9781614992882
DOIs: https://doi.org/10.3233/978-1-61499-289-9-822
State: Published - Jan 1 2013
Event: 14th World Congress on Medical and Health Informatics, MEDINFO 2013 - Copenhagen, Denmark
Duration: Aug 20 2013 → Aug 23 2013

Publication series

Name: Studies in Health Technology and Informatics
Number: 1-2
Volume: 192
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365

Other

Other: 14th World Congress on Medical and Health Informatics, MEDINFO 2013
Country: Denmark
City: Copenhagen
Period: 8/20/13 → 8/23/13

Fingerprint

Natural Language Processing
Pancreatic Cyst
Natural language processing systems
Processing
Pancreatic Neoplasms
Medical Records
Neoplasms
Retrospective Studies

Keywords

  • dependency parser
  • Natural language processing
  • negation
  • Pancreatic cyst
  • Unstructured Information Management Architecture

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
  • Health Information Management
  • Medicine(all)

Cite this

Mehrabi, S., Schmidt, C. M., Waters, J. A., Beesley, C., Krishnan, A., Kesterson, J., ... Palakal, M. (2013). An efficient pancreatic cyst identification methodology using natural language processing. In MEDINFO 2013 - Proceedings of the 14th World Congress on Medical and Health Informatics (1-2 ed., pp. 822-826). (Studies in Health Technology and Informatics; Vol. 192, No. 1-2). IOS Press. https://doi.org/10.3233/978-1-61499-289-9-822

@inproceedings{54da4cd996424936a63affed62a70873,
title = "An efficient pancreatic cyst identification methodology using natural language processing",
abstract = "Pancreatic cancer is one of the deadliest cancers, mostly diagnosed at late stages. Patients with pancreatic cysts are at higher risk of developing cancer and their surveillance can help to diagnose the disease in earlier stages. In this retrospective study we collected a corpus of 1064 records from 44 patients at Indiana University Hospital from 1990 to 2012. A Natural Language Processing (NLP) system was developed and used to identify patients with pancreatic cysts. NegEx algorithm was used initially to identify the negation status of concepts that resulted in precision and recall of 98.9{\%} and 89{\%} respectively. Stanford Dependency parser (SDP) was then used to improve the NegEx performance resulting in precision of 98.9{\%} and recall of 95.7{\%}. Features related to pancreatic cysts were also extracted from patient medical records using regex and NegEx algorithm with 98.5{\%} precision and 97.43{\%} recall. SDP improved the NegEx algorithm by increasing the recall to 98.12{\%}.",
keywords = "dependency parser, Natural language processing, negation, Pancreatic cyst, Unstructured Information Management Architecture",
author = "Saeed Mehrabi and Schmidt, {C. Max} and Waters, {Joshua A.} and Chris Beesley and Anand Krishnan and Joe Kesterson and Paul Dexter and Al-Haddad, {Mohammed A.} and Tierney, {William M.} and Mathew Palakal",
year = "2013",
month = "1",
day = "1",
doi = "10.3233/978-1-61499-289-9-822",
language = "English (US)",
isbn = "9781614992882",
series = "Studies in Health Technology and Informatics",
publisher = "IOS Press",
number = "1-2",
pages = "822--826",
booktitle = "MEDINFO 2013 - Proceedings of the 14th World Congress on Medical and Health Informatics",
edition = "1-2",

}

TY - GEN

T1 - An efficient pancreatic cyst identification methodology using natural language processing

AU - Mehrabi, Saeed

AU - Schmidt, C. Max

AU - Waters, Joshua A.

AU - Beesley, Chris

AU - Krishnan, Anand

AU - Kesterson, Joe

AU - Dexter, Paul

AU - Al-Haddad, Mohammed A.

AU - Tierney, William M.

AU - Palakal, Mathew

PY - 2013/1/1

Y1 - 2013/1/1

N2 - Pancreatic cancer is one of the deadliest cancers, mostly diagnosed at late stages. Patients with pancreatic cysts are at higher risk of developing cancer and their surveillance can help to diagnose the disease in earlier stages. In this retrospective study we collected a corpus of 1064 records from 44 patients at Indiana University Hospital from 1990 to 2012. A Natural Language Processing (NLP) system was developed and used to identify patients with pancreatic cysts. NegEx algorithm was used initially to identify the negation status of concepts that resulted in precision and recall of 98.9% and 89% respectively. Stanford Dependency parser (SDP) was then used to improve the NegEx performance resulting in precision of 98.9% and recall of 95.7%. Features related to pancreatic cysts were also extracted from patient medical records using regex and NegEx algorithm with 98.5% precision and 97.43% recall. SDP improved the NegEx algorithm by increasing the recall to 98.12%.

KW - dependency parser

KW - Natural language processing

KW - negation

KW - Pancreatic cyst

KW - Unstructured Information Management Architecture

UR - http://www.scopus.com/inward/record.url?scp=84894328654&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84894328654&partnerID=8YFLogxK

U2 - 10.3233/978-1-61499-289-9-822

DO - 10.3233/978-1-61499-289-9-822

M3 - Conference contribution

C2 - 23920672

AN - SCOPUS:84894328654

SN - 9781614992882

T3 - Studies in Health Technology and Informatics

SP - 822

EP - 826

BT - MEDINFO 2013 - Proceedings of the 14th World Congress on Medical and Health Informatics

PB - IOS Press

ER -