Evaluating the effect of data standardization and validation on patient matching accuracy

Shaun J. Grannis, Huiping Xu, Joshua R. Vest, Suranga Kasthurirathne, Na Bo, Ben Moscovitch, Rita Torkzadeh, Josh Rising

Research output: Contribution to journal, Article

Abstract

Objective: This study evaluated the degree to which recommendations for demographic data standardization improve patient matching accuracy using real-world datasets. Materials and Methods: We used 4 manually reviewed datasets, containing a random selection of matches and nonmatches. Matching datasets included health information exchange (HIE) records, public health registry records, Social Security Death Master File records, and newborn screening records. Standardized fields included last name, telephone number, social security number, date of birth, and address. Matching performance was evaluated using 4 metrics: sensitivity, specificity, positive predictive value, and accuracy. Results: Standardizing address was independently associated with improved matching sensitivities for both the public health and HIE datasets of approximately 0.6% and 4.5%. Overall accuracy was unchanged for both datasets due to reduced match specificity. We observed no similar impact for address standardization in the death master file dataset. Standardizing last name yielded improved matching sensitivity of 0.6% for the HIE dataset, while overall accuracy remained the same due to a decrease in match specificity. We noted no similar impact for other datasets. Standardizing other individual fields (telephone, date of birth, or social security number) showed no matching improvements. As standardizing address and last name improved matching sensitivity, we examined the combined effect of address and last name standardization, which showed that standardization improved sensitivity from 81.3% to 91.6% for the HIE dataset. Conclusions: Data standardization can improve match rates, thus ensuring that patients and clinicians have better data on which to make decisions to enhance care quality and safety.
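The four metrics named in the abstract can be computed from the counts of a manually reviewed set of record pairs. The sketch below is illustrative only: the `MatchCounts` type, the `matching_metrics` function, and the counts are hypothetical and are not taken from the study. It shows how a gain in sensitivity that is offset by a loss in specificity can leave overall accuracy unchanged, which is the pattern the Results describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchCounts:
    """Confusion-matrix counts for reviewed record pairs (hypothetical type)."""
    tp: int  # true matches that the algorithm linked
    fp: int  # true nonmatches that the algorithm linked
    tn: int  # true nonmatches that the algorithm left unlinked
    fn: int  # true matches that the algorithm missed


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def matching_metrics(c: MatchCounts) -> dict[str, float]:
    """Compute sensitivity, specificity, PPV, and accuracy from the counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "accuracy": _ratio(c.tp + c.tn, total),
    }


# Made-up counts: 100 true matches and 100 true nonmatches in each scenario.
before = matching_metrics(MatchCounts(tp=80, fp=5, tn=95, fn=20))
after = matching_metrics(MatchCounts(tp=85, fp=10, tn=90, fn=15))

for name in ("sensitivity", "specificity", "ppv", "accuracy"):
    print(f"{name:12s} before={before[name]:.3f} after={after[name]:.3f}")
```

In this made-up case, five additional true matches are found but five additional nonmatches are linked in error, so sensitivity rises, specificity falls, and the number of correctly classified pairs stays the same.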

Original language: English (US)
Pages (from-to): 447-456
Number of pages: 10
Journal: Journal of the American Medical Informatics Association
Volume: 26
Issue number: 5
DOI: 10.1093/jamia/ocy191
State: Published - Mar 8 2019

Fingerprint

Names
Social Security
Telephone
Public Health
Parturition
Datasets
Quality of Health Care
Registries
Demography
Newborn Infant
Safety
Sensitivity and Specificity
Health Information Exchange

Keywords

  • data standards
  • interoperability
  • patient identification
  • patient matching
  • record linkage

ASJC Scopus subject areas

  • Health Informatics

Cite this

Evaluating the effect of data standardization and validation on patient matching accuracy. / Grannis, Shaun J.; Xu, Huiping; Vest, Joshua R.; Kasthurirathne, Suranga; Bo, Na; Moscovitch, Ben; Torkzadeh, Rita; Rising, Josh.

In: Journal of the American Medical Informatics Association, Vol. 26, No. 5, 08.03.2019, p. 447-456.

Grannis, Shaun J. ; Xu, Huiping ; Vest, Joshua R. ; Kasthurirathne, Suranga ; Bo, Na ; Moscovitch, Ben ; Torkzadeh, Rita ; Rising, Josh. / Evaluating the effect of data standardization and validation on patient matching accuracy. In: Journal of the American Medical Informatics Association. 2019 ; Vol. 26, No. 5. pp. 447-456.
@article{23d24a3a009c44f3a6b7d19e8d686ec3,
title = "Evaluating the effect of data standardization and validation on patient matching accuracy",
abstract = "Objective: This study evaluated the degree to which recommendations for demographic data standardization improve patient matching accuracy using real-world datasets. Materials and Methods: We used 4 manually reviewed datasets, containing a random selection of matches and nonmatches. Matching datasets included health information exchange (HIE) records, public health registry records, Social Security Death Master File records, and newborn screening records. Standardized fields included last name, telephone number, social security number, date of birth, and address. Matching performance was evaluated using 4 metrics: sensitivity, specificity, positive predictive value, and accuracy. Results: Standardizing address was independently associated with improved matching sensitivities for both the public health and HIE datasets of approximately 0.6{\%} and 4.5{\%}. Overall accuracy was unchanged for both datasets due to reduced match specificity. We observed no similar impact for address standardization in the death master file dataset. Standardizing last name yielded improved matching sensitivity of 0.6{\%} for the HIE dataset, while overall accuracy remained the same due to a decrease in match specificity. We noted no similar impact for other datasets. Standardizing other individual fields (telephone, date of birth, or social security number) showed no matching improvements. As standardizing address and last name improved matching sensitivity, we examined the combined effect of address and last name standardization, which showed that standardization improved sensitivity from 81.3{\%} to 91.6{\%} for the HIE dataset. Conclusions: Data standardization can improve match rates, thus ensuring that patients and clinicians have better data on which to make decisions to enhance care quality and safety.",
keywords = "data standards, interoperability, patient identification, patient matching, record linkage",
author = "Grannis, {Shaun J.} and Huiping Xu and Vest, {Joshua R.} and Suranga Kasthurirathne and Na Bo and Ben Moscovitch and Rita Torkzadeh and Josh Rising",
year = "2019",
month = "3",
day = "8",
doi = "10.1093/jamia/ocy191",
language = "English (US)",
volume = "26",
pages = "447--456",
journal = "Journal of the American Medical Informatics Association : JAMIA",
issn = "1067-5027",
publisher = "Oxford University Press",
number = "5",

}

TY - JOUR

T1 - Evaluating the effect of data standardization and validation on patient matching accuracy

AU - Grannis, Shaun J.

AU - Xu, Huiping

AU - Vest, Joshua R.

AU - Kasthurirathne, Suranga

AU - Bo, Na

AU - Moscovitch, Ben

AU - Torkzadeh, Rita

AU - Rising, Josh

PY - 2019/3/8

Y1 - 2019/3/8

N2 - Objective: This study evaluated the degree to which recommendations for demographic data standardization improve patient matching accuracy using real-world datasets. Materials and Methods: We used 4 manually reviewed datasets, containing a random selection of matches and nonmatches. Matching datasets included health information exchange (HIE) records, public health registry records, Social Security Death Master File records, and newborn screening records. Standardized fields included last name, telephone number, social security number, date of birth, and address. Matching performance was evaluated using 4 metrics: sensitivity, specificity, positive predictive value, and accuracy. Results: Standardizing address was independently associated with improved matching sensitivities for both the public health and HIE datasets of approximately 0.6% and 4.5%. Overall accuracy was unchanged for both datasets due to reduced match specificity. We observed no similar impact for address standardization in the death master file dataset. Standardizing last name yielded improved matching sensitivity of 0.6% for the HIE dataset, while overall accuracy remained the same due to a decrease in match specificity. We noted no similar impact for other datasets. Standardizing other individual fields (telephone, date of birth, or social security number) showed no matching improvements. As standardizing address and last name improved matching sensitivity, we examined the combined effect of address and last name standardization, which showed that standardization improved sensitivity from 81.3% to 91.6% for the HIE dataset. Conclusions: Data standardization can improve match rates, thus ensuring that patients and clinicians have better data on which to make decisions to enhance care quality and safety.

AB - Objective: This study evaluated the degree to which recommendations for demographic data standardization improve patient matching accuracy using real-world datasets. Materials and Methods: We used 4 manually reviewed datasets, containing a random selection of matches and nonmatches. Matching datasets included health information exchange (HIE) records, public health registry records, Social Security Death Master File records, and newborn screening records. Standardized fields included last name, telephone number, social security number, date of birth, and address. Matching performance was evaluated using 4 metrics: sensitivity, specificity, positive predictive value, and accuracy. Results: Standardizing address was independently associated with improved matching sensitivities for both the public health and HIE datasets of approximately 0.6% and 4.5%. Overall accuracy was unchanged for both datasets due to reduced match specificity. We observed no similar impact for address standardization in the death master file dataset. Standardizing last name yielded improved matching sensitivity of 0.6% for the HIE dataset, while overall accuracy remained the same due to a decrease in match specificity. We noted no similar impact for other datasets. Standardizing other individual fields (telephone, date of birth, or social security number) showed no matching improvements. As standardizing address and last name improved matching sensitivity, we examined the combined effect of address and last name standardization, which showed that standardization improved sensitivity from 81.3% to 91.6% for the HIE dataset. Conclusions: Data standardization can improve match rates, thus ensuring that patients and clinicians have better data on which to make decisions to enhance care quality and safety.

KW - data standards

KW - interoperability

KW - patient identification

KW - patient matching

KW - record linkage

UR - http://www.scopus.com/inward/record.url?scp=85063712650&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063712650&partnerID=8YFLogxK

U2 - 10.1093/jamia/ocy191

DO - 10.1093/jamia/ocy191

M3 - Article

C2 - 30848796

AN - SCOPUS:85063712650

VL - 26

SP - 447

EP - 456

JO - Journal of the American Medical Informatics Association : JAMIA

JF - Journal of the American Medical Informatics Association : JAMIA

SN - 1067-5027

IS - 5

ER -