DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx

Saeed Mehrabi, Anand Krishnan, Sunghwan Sohn, Alexandra M. Roch, Heidi Schmidt, Joe Kesterson, Chris Beesley, Paul Dexter, C. Schmidt, Hongfang Liu, Mathew Palakal

Research output: Contribution to journal (Article)

14 Citations (Scopus)

Abstract

In Electronic Health Records (EHRs), much valuable information regarding patients' conditions is embedded in free-text format. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge in clinical NLP is that the meaning of clinical entities is heavily affected by modifiers such as negation. A negation detection algorithm, NegEx, applies a simplistic approach that has been shown to be powerful in clinical NLP. However, because it does not consider the contextual relationship between words within a sentence, NegEx fails to correctly capture the negation status of concepts in complex sentences. Incorrect negation assignment could cause inaccurate diagnosis of patients' conditions or contaminated study cohorts. We developed a negation algorithm called DEEPEN to decrease NegEx's false positives by taking into account the dependency relationship between negation words and concepts within a sentence, using the Stanford dependency parser. The system was developed and tested using EHR data from Indiana University (IU), and it was further evaluated on a Mayo Clinic dataset to assess its generalizability. The evaluation results demonstrate that DEEPEN, which incorporates dependency parsing into NegEx, can reduce the number of incorrect negation assignments for patients with positive findings, and therefore improve the identification of patients with the target clinical findings in EHRs.
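
The idea in the abstract, a surface trigger match followed by a dependency check, can be illustrated with a small sketch. This is not the published DEEPEN rule set: the trigger list, the `Token` type, the window size, and the "trigger's head must govern the concept" rule are all assumptions made for illustration, and the hand-written head indices stand in for what a dependency parser such as the Stanford parser would produce.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional

# Hypothetical, abbreviated trigger list; a real NegEx lexicon is much larger.
NEGATION_TRIGGERS = {"no", "not", "denies", "without"}


@dataclass
class Token:
    text: str
    head: int  # index of the syntactic head; -1 marks the root


def negex_like(tokens: List[Token], concept_idx: int, window: int = 5) -> Optional[int]:
    """Surface rule: return the index of a trigger found within `window`
    tokens before the concept, or None if there is none."""
    start = max(0, concept_idx - window)
    for i in range(start, concept_idx):
        if tokens[i].text.lower() in NEGATION_TRIGGERS:
            return i
    return None


def ancestors(tokens: List[Token], idx: int) -> Iterator[int]:
    """Yield idx and every head above it, up to the root."""
    while idx != -1:
        yield idx
        idx = tokens[idx].head


def dependency_confirms(tokens: List[Token], trigger_idx: int, concept_idx: int) -> bool:
    """Illustrative rule (an assumption, not the paper's rules): keep the
    negation only if the word the trigger attaches to is the concept itself
    or one of the concept's ancestors in the dependency tree."""
    governor = tokens[trigger_idx].head
    return governor in set(ancestors(tokens, concept_idx))


def is_negated(tokens: List[Token], concept_idx: int) -> bool:
    """Surface match first, then the dependency filter to drop false positives."""
    trigger = negex_like(tokens, concept_idx)
    if trigger is None:
        return False
    return dependency_confirms(tokens, trigger, concept_idx)


# "No evidence of pneumonia": "No" attaches to "evidence", which governs "pneumonia".
simple = [Token("No", 1), Token("evidence", -1), Token("of", 3), Token("pneumonia", 1)]

# "Patient with no family history has a cyst": "no" attaches to "history",
# which does not govern "cyst", so the surface match is rejected.
complex_ = [
    Token("Patient", 5), Token("with", 4), Token("no", 4), Token("family", 4),
    Token("history", 0), Token("has", -1), Token("a", 7), Token("cyst", 5),
]

print(is_negated(simple, 3))                 # True
print(negex_like(complex_, 7) is not None)   # True: the surface rule alone would negate "cyst"
print(is_negated(complex_, 7))               # False: the dependency check removes the false positive
```

In the second sentence the window-based rule alone would mark "cyst" as negated, which is the kind of false positive the abstract describes; the dependency check rejects it because the trigger and the concept sit in different parts of the tree.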

Original language: English (US)
Pages (from-to): 213-219
Number of pages: 7
Journal: Journal of Biomedical Informatics
Volume: 54
DOI: https://doi.org/10.1016/j.jbi.2015.02.010
State: Published - Apr 1 2015

Keywords

  • Dependency parser
  • Natural language processing
  • Negation

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics

Cite this

Mehrabi, S., Krishnan, A., Sohn, S., Roch, A. M., Schmidt, H., Kesterson, J., ... Palakal, M. (2015). DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx. Journal of Biomedical Informatics, 54, 213-219. https://doi.org/10.1016/j.jbi.2015.02.010

DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx. / Mehrabi, Saeed; Krishnan, Anand; Sohn, Sunghwan; Roch, Alexandra M.; Schmidt, Heidi; Kesterson, Joe; Beesley, Chris; Dexter, Paul; Schmidt, C.; Liu, Hongfang; Palakal, Mathew.

In: Journal of Biomedical Informatics, Vol. 54, 01.04.2015, p. 213-219.

Mehrabi, S, Krishnan, A, Sohn, S, Roch, AM, Schmidt, H, Kesterson, J, Beesley, C, Dexter, P, Schmidt, C, Liu, H & Palakal, M 2015, 'DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx', Journal of Biomedical Informatics, vol. 54, pp. 213-219. https://doi.org/10.1016/j.jbi.2015.02.010

Mehrabi, Saeed; Krishnan, Anand; Sohn, Sunghwan; Roch, Alexandra M.; Schmidt, Heidi; Kesterson, Joe; Beesley, Chris; Dexter, Paul; Schmidt, C.; Liu, Hongfang; Palakal, Mathew. / DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx. In: Journal of Biomedical Informatics. 2015; Vol. 54. pp. 213-219.

@article{228df0a2f9c541a9a5884b09e6e6c2e5,
title = "DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx",
abstract = "In Electronic Health Records (EHRs), much valuable information regarding patients' conditions is embedded in free-text format. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge in clinical NLP is that the meaning of clinical entities is heavily affected by modifiers such as negation. A negation detection algorithm, NegEx, applies a simplistic approach that has been shown to be powerful in clinical NLP. However, because it does not consider the contextual relationship between words within a sentence, NegEx fails to correctly capture the negation status of concepts in complex sentences. Incorrect negation assignment could cause inaccurate diagnosis of patients' conditions or contaminated study cohorts. We developed a negation algorithm called DEEPEN to decrease NegEx's false positives by taking into account the dependency relationship between negation words and concepts within a sentence, using the Stanford dependency parser. The system was developed and tested using EHR data from Indiana University (IU), and it was further evaluated on a Mayo Clinic dataset to assess its generalizability. The evaluation results demonstrate that DEEPEN, which incorporates dependency parsing into NegEx, can reduce the number of incorrect negation assignments for patients with positive findings, and therefore improve the identification of patients with the target clinical findings in EHRs.",
keywords = "Dependency parser, Natural language processing, Negation",
author = "Saeed Mehrabi and Anand Krishnan and Sunghwan Sohn and Roch, {Alexandra M.} and Heidi Schmidt and Joe Kesterson and Chris Beesley and Paul Dexter and C. Schmidt and Hongfang Liu and Mathew Palakal",
year = "2015",
month = "4",
day = "1",
doi = "10.1016/j.jbi.2015.02.010",
language = "English (US)",
volume = "54",
pages = "213--219",
journal = "Journal of Biomedical Informatics",
issn = "1532-0464",
publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - DEEPEN

T2 - A negation detection system for clinical text incorporating dependency relation into NegEx

AU - Mehrabi, Saeed

AU - Krishnan, Anand

AU - Sohn, Sunghwan

AU - Roch, Alexandra M.

AU - Schmidt, Heidi

AU - Kesterson, Joe

AU - Beesley, Chris

AU - Dexter, Paul

AU - Schmidt, C.

AU - Liu, Hongfang

AU - Palakal, Mathew

PY - 2015/4/1

Y1 - 2015/4/1

N2 - In Electronic Health Records (EHRs), much valuable information regarding patients' conditions is embedded in free-text format. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge in clinical NLP is that the meaning of clinical entities is heavily affected by modifiers such as negation. A negation detection algorithm, NegEx, applies a simplistic approach that has been shown to be powerful in clinical NLP. However, because it does not consider the contextual relationship between words within a sentence, NegEx fails to correctly capture the negation status of concepts in complex sentences. Incorrect negation assignment could cause inaccurate diagnosis of patients' conditions or contaminated study cohorts. We developed a negation algorithm called DEEPEN to decrease NegEx's false positives by taking into account the dependency relationship between negation words and concepts within a sentence, using the Stanford dependency parser. The system was developed and tested using EHR data from Indiana University (IU), and it was further evaluated on a Mayo Clinic dataset to assess its generalizability. The evaluation results demonstrate that DEEPEN, which incorporates dependency parsing into NegEx, can reduce the number of incorrect negation assignments for patients with positive findings, and therefore improve the identification of patients with the target clinical findings in EHRs.

KW - Dependency parser

KW - Natural language processing

KW - Negation

UR - http://www.scopus.com/inward/record.url?scp=84927927981&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84927927981&partnerID=8YFLogxK

U2 - 10.1016/j.jbi.2015.02.010

DO - 10.1016/j.jbi.2015.02.010

M3 - Article

C2 - 25791500

AN - SCOPUS:84927927981

VL - 54

SP - 213

EP - 219

JO - Journal of Biomedical Informatics

JF - Journal of Biomedical Informatics

SN - 1532-0464

ER -