Linkage of Indiana State Cancer Registry and Indiana Network for Patient Care Data

Laura P. Ruppert, Jinghua He, Joel Martin, George Eckert, Fangqian Ouyang, Abby Church, Paul Dexter, Siu Hui, David Haggstrom

Research output: Contribution to journal (Article)

Abstract

BACKGROUND: Large automated electronic health records (EHRs), if brought together in a federated data model, have the potential to serve as valuable population-based tools for studying the patterns and effectiveness of treatment. The Indiana Network for Patient Care (INPC) is a federated EHR data repository that contains data collected from a large population across various health care settings throughout the state of Indiana. The INPC clinical data environment allows quick access to, and extraction of, information from medical charts. The purpose of this project was to evaluate 2 different methods of record linkage between the Indiana State Cancer Registry (ISCR) and INPC, determine the match rate for linkage between the ISCR and INPC data for patients diagnosed with cancer, and assess the completeness of the ISCR based on additional validated cancer cases identified in the INPC EHRs.

METHODS: Deterministic and probabilistic algorithms were applied to link ISCR cases to the INPC. The linkage results were validated by manual review, and accuracy was assessed with positive predictive value (PPV). Medical charts of melanoma and lung cancer cases identified in INPC but not linked to ISCR were manually reviewed to identify true incident cancers missed by the ISCR, from which the completeness of the ISCR was estimated for each cancer.

RESULTS: Both deterministic and probabilistic approaches to linking ISCR and INPC had extremely high PPV (>99%) for identifying true matches for the overall cohort and each subcohort. The combined match rate for melanoma and lung cancer cases identified in the ISCR that matched to any patient occurrence in INPC (not by disease) was 85.5% for the complete cohort, 94.4% for melanoma, and 84.4% for lung cancer. The estimated completeness of capture by the ISCR was 84% for melanoma and 98% for lung cancer.

CONCLUSION: Cancer registries can be successfully linked, with a high match rate, to patients’ EHR data from institutions participating in a regional health information organization (RHIO). A pragmatic approach to data linkage may apply both deterministic and probabilistic approaches together for the diverse purposes of cancer control research. The RHIO has the potential to add value to the state cancer registry through the identification of additional true incident cases, but more advanced approaches, such as natural language processing, are needed.
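The abstract does not state which fields, weights, or thresholds the study used, so the following is only a minimal sketch of how deterministic and probabilistic linkage generally differ, and of how PPV is computed from manually reviewed links. The field names, the m- and u-probabilities, and the sample records are all hypothetical; the probabilistic score follows the general Fellegi-Sunter idea and is not the authors' algorithm.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not from the study):
# m = P(field agrees | records belong to the same person)
# u = P(field agrees | records belong to different people)
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.02),
    "birth_date": (0.97, 0.003),
    "zip_code": (0.85, 0.05),
}


def deterministic_match(a, b, keys=("last_name", "first_name", "birth_date")):
    """Link two records only if every key field is present and agrees exactly."""
    return all(a.get(k) is not None and a.get(k) == b.get(k) for k in keys)


def probabilistic_score(a, b, probs=FIELD_PROBS):
    """Sum log2 agreement and disagreement weights across the compared fields."""
    score = 0.0
    for field, (m, u) in probs.items():
        if a.get(field) is None or b.get(field) is None:
            continue  # a missing value contributes no evidence either way
        if a[field] == b[field]:
            score += math.log2(m / u)  # agreement raises the score
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return score


def ppv(true_links, false_links):
    """Positive predictive value: share of reviewed links that are true matches."""
    return true_links / (true_links + false_links)


# Hypothetical records: same person, but the ZIP code differs between sources.
registry = {"last_name": "DOE", "first_name": "JANE",
            "birth_date": "1950-01-31", "zip_code": "46202"}
ehr_same = dict(registry, zip_code="46203")
ehr_other = {"last_name": "ROE", "first_name": "JOHN",
             "birth_date": "1962-07-04", "zip_code": "47401"}

print(deterministic_match(registry, ehr_same))       # exact match on key fields
print(probabilistic_score(registry, ehr_same) > 0)   # evidence favors a link
print(probabilistic_score(registry, ehr_other) < 0)  # evidence against a link
```

In a sketch like this, the deterministic rule is strict and easy to audit, while the probabilistic score tolerates a disagreeing field (here, the ZIP code) and would be compared against a chosen threshold, which is one reason the two approaches can complement each other.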

Original language: English (US)
Pages (from-to): 174-178
Number of pages: 5
Journal: Journal of registry management
Volume: 43
Issue number: 4
State: Published - Dec 1 2016

Fingerprint

Registries
Patient Care
Neoplasms
Electronic Health Records
Melanoma
Lung Neoplasms
Information Storage and Retrieval
Natural Language Processing
Health
Population

ASJC Scopus subject areas

  • Medicine (all)

Cite this

Ruppert, L. P., He, J., Martin, J., Eckert, G., Ouyang, F., Church, A., ... Haggstrom, D. (2016). Linkage of Indiana State Cancer Registry and Indiana Network for Patient Care Data. Journal of registry management, 43(4), 174-178.

@article{88e8a09e366d48c9be11ec7bf6406f97,
title = "Linkage of Indiana State Cancer Registry and Indiana Network for Patient Care Data",
author = "Ruppert, {Laura P.} and Jinghua He and Joel Martin and George Eckert and Fangqian Ouyang and Abby Church and Paul Dexter and Siu Hui and David Haggstrom",
year = "2016",
month = "12",
day = "1",
language = "English (US)",
volume = "43",
pages = "174--178",
journal = "Journal of registry management",
issn = "1945-6123",
publisher = "National Cancer Registrars Association",
number = "4",

}

TY - JOUR

T1 - Linkage of Indiana State Cancer Registry and Indiana Network for Patient Care Data

AU - Ruppert, Laura P.

AU - He, Jinghua

AU - Martin, Joel

AU - Eckert, George

AU - Ouyang, Fangqian

AU - Church, Abby

AU - Dexter, Paul

AU - Hui, Siu

AU - Haggstrom, David

PY - 2016/12/1

Y1 - 2016/12/1

UR - http://www.scopus.com/inward/record.url?scp=85046276068&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85046276068&partnerID=8YFLogxK

M3 - Article

VL - 43

SP - 174

EP - 178

JO - Journal of registry management

JF - Journal of registry management

SN - 1945-6123

IS - 4

ER -