Identifying symptom clusters in breast cancer and colorectal cancer patients using EHR data

Priyanka Gandhi, Xiao Luo, Susan Storey, Zuoyi Zhang, Zhi Han, Kun Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Patients with chronic conditions such as breast cancer and colorectal cancer often present with a range of symptoms, such as ‘fatigue’, ‘pain’ and ‘depression’. If left untreated, these symptoms add to patients’ distress and functional impairment. In this research, we investigate a symptom clustering and association mining framework that first extracts and clusters symptoms from Electronic Health Record (EHR) clinical reports, and then analyzes the associations between symptom clusters and clinical attributes. The Universal Sentence Encoder and a modified seed-based k-means algorithm are used for symptom encoding and clustering. The results show that the symptom clusters have different associations in breast cancer and colorectal cancer, as well as in different time frames after chemotherapy. The results also show that breast cancer patients have slightly more symptoms from these three symptom clusters than colorectal cancer patients within 12 months after chemotherapy, whereas the colorectal cancer cohort has slightly more depression on average between 48 and 54 months after chemotherapy. By applying association rule mining, we find some informative rules, such as ‘if a patient is at a higher stage of colorectal cancer (3B) but has no fatigue symptom, he or she likely does not have depression and peripheral neuropathy’. Our methods can be generalized to analyze symptom clusters of other chronic diseases where symptom management is critical.
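
The two analysis steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the seed-based k-means was modified, so the first function is plain k-means whose centroids are initialised from hand-picked seed examples, and the second computes the standard support and confidence of a single association rule. All names (`seeded_kmeans`, `seed_groups`, `rule_metrics`) and the input formats are assumptions made for illustration; in practice the vectors would be phrase embeddings produced by a sentence encoder.

```python
from math import dist


def seeded_kmeans(points, seed_groups, n_iter=50):
    """Plain k-means with centroids initialised from seed examples.

    points      -- list of equal-length numeric vectors (e.g. phrase embeddings)
    seed_groups -- one list of point indices per cluster (hypothetical format)
    Returns (labels, centroids).
    """
    def mean(vectors):
        # Component-wise mean of a non-empty list of vectors.
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    def assign(centroids):
        # Index of the nearest centroid (Euclidean distance) for every point.
        return [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                for p in points]

    # Each initial centroid is the mean of the seed vectors for that cluster.
    centroids = [mean([points[i] for i in group]) for group in seed_groups]
    for _ in range(n_iter):
        labels = assign(centroids)
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # Keep the previous centroid if a cluster ends up empty.
            updated.append(mean(members) if members else old)
        if updated == centroids:  # converged: centroids stopped moving
            break
        centroids = updated
    return assign(centroids), centroids


def rule_metrics(records, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    records -- list of item sets, one per patient (hypothetical encoding)
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(antecedent <= r for r in records)
    n_both = sum((antecedent | consequent) <= r for r in records)
    support = n_both / len(records)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence
```

A rule such as the one quoted in the abstract would be scored by passing items like `{"stage_3B", "no_fatigue"}` as the antecedent and `{"no_depression"}` as the consequent; these item labels are invented here and do not come from the paper's data.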

Original language: English (US)
Title of host publication: ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
Publisher: Association for Computing Machinery, Inc
Pages: 405-413
Number of pages: 9
ISBN (Electronic): 9781450366663
DOIs: https://doi.org/10.1145/3307339.3342164
State: Published - Sep 4 2019
Event: 10th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2019 - Niagara Falls, United States
Duration: Sep 7 2019 → Sep 10 2019

Publication series

Name: ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics

Conference

Conference: 10th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2019
Country: United States
City: Niagara Falls
Period: 9/7/19 → 9/10/19

Fingerprint

Chemotherapy
Electronic Health Records
Colorectal Neoplasms
Health
Breast Neoplasms
Fatigue of materials
Association rules
Depression
Drug Therapy
Fatigue
Cluster Analysis
Seed
Peripheral Nervous System Diseases
Disease Management
Seeds
Chronic Disease
Pain
Research
Neoplasms

Keywords

  • Association Mining
  • Breast Cancer
  • Colorectal Cancer
  • EHR data
  • Medical Term Embedding
  • Symptom Clustering

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Biomedical Engineering
  • Health Informatics

Cite this

Gandhi, P., Luo, X., Storey, S., Zhang, Z., Han, Z., & Huang, K. (2019). Identifying symptom clusters in breast cancer and colorectal cancer patients using EHR data. In ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics (pp. 405-413). (ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics). Association for Computing Machinery, Inc. https://doi.org/10.1145/3307339.3342164

@inproceedings{5299c2d871ab4499b722c95825c74929,
title = "Identifying symptom clusters in breast cancer and colorectal cancer patients using EHR data",
keywords = "Association Mining, Breast Cancer, Colorectal Cancer, EHR data, Medical Term Embedding, Symptom Clustering",
author = "Priyanka Gandhi and Xiao Luo and Susan Storey and Zuoyi Zhang and Zhi Han and Kun Huang",
year = "2019",
month = "9",
day = "4",
doi = "10.1145/3307339.3342164",
language = "English (US)",
series = "ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics",
publisher = "Association for Computing Machinery, Inc",
pages = "405--413",
booktitle = "ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics",

}

TY - GEN

T1 - Identifying symptom clusters in breast cancer and colorectal cancer patients using EHR data

AU - Gandhi, Priyanka

AU - Luo, Xiao

AU - Storey, Susan

AU - Zhang, Zuoyi

AU - Han, Zhi

AU - Huang, Kun

PY - 2019/9/4

Y1 - 2019/9/4

KW - Association Mining

KW - Breast Cancer

KW - Colorectal Cancer

KW - EHR data

KW - Medical Term Embedding

KW - Symptom Clustering

UR - http://www.scopus.com/inward/record.url?scp=85073145893&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85073145893&partnerID=8YFLogxK

U2 - 10.1145/3307339.3342164

DO - 10.1145/3307339.3342164

M3 - Conference contribution

AN - SCOPUS:85073145893

T3 - ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics

SP - 405

EP - 413

BT - ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics

PB - Association for Computing Machinery, Inc

ER -