Linkage of patient records from disparate sources

Xiaochun Li, Changyu Shen

Research output: Contribution to journal › Article

16 Citations (Scopus)

Abstract

We review ideas, approaches and progress in the field of record linkage. We point out that the latent class models used in probabilistic matching have been well developed and applied in a different context of diagnostic testing when the true disease status is unknown. The methodology developed in the diagnostic testing setting can be potentially translated and applied in record linkage. Although there are many methods for record linkage, a comprehensive evaluation of methods for a wide range of real-world data with different data characteristics and with true match status is absent due to lack of data sharing. However, the recent availability of generators of synthetic data with realistic characteristics renders such evaluations feasible.

Original language: English
Pages (from-to): 31-38
Number of pages: 8
Journal: Statistical Methods in Medical Research
Volume: 22
Issue number: 1
DOI: 10.1177/0962280211403600
State: Published - Feb 2013

Fingerprint

Record Linkage
Linkage
Information Dissemination
Diagnostics
Latent Class Model
Testing
Comprehensive Evaluation
Data Sharing
Synthetic Data
Availability
Generator
Unknown
Methodology
Evaluation
Range of data

Keywords

  • Bayesian methods
  • diagnostic tests
  • Fellegi-Sunter model
  • k-means
  • latent class model
  • patient matching
  • record linkage

ASJC Scopus subject areas

  • Epidemiology
  • Health Information Management
  • Statistics and Probability

Cite this

Linkage of patient records from disparate sources. / Li, Xiaochun; Shen, Changyu.

In: Statistical Methods in Medical Research, Vol. 22, No. 1, 02.2013, p. 31-38.

@article{9738514506e04a6e8bace2367d0ebb45,
title = "Linkage of patient records from disparate sources",
abstract = "We review ideas, approaches and progress in the field of record linkage. We point out that the latent class models used in probabilistic matching have been well developed and applied in a different context of diagnostic testing when the true disease status is unknown. The methodology developed in the diagnostic testing setting can be potentially translated and applied in record linkage. Although there are many methods for record linkage, a comprehensive evaluation of methods for a wide range of real-world data with different data characteristics and with true match status is absent due to lack of data sharing. However, the recent availability of generators of synthetic data with realistic characteristics renders such evaluations feasible.",
keywords = "Bayesian methods, diagnostic tests, Fellegi-Sunter model, k-means, latent class model, patient matching, record linkage",
author = "Xiaochun Li and Changyu Shen",
year = "2013",
month = "2",
doi = "10.1177/0962280211403600",
language = "English",
volume = "22",
pages = "31--38",
journal = "Statistical Methods in Medical Research",
issn = "0962-2802",
publisher = "SAGE Publications Ltd",
number = "1",

}

TY  - JOUR
T1  - Linkage of patient records from disparate sources
AU  - Li, Xiaochun
AU  - Shen, Changyu
PY  - 2013/2
Y1  - 2013/2
N2  - We review ideas, approaches and progress in the field of record linkage. We point out that the latent class models used in probabilistic matching have been well developed and applied in a different context of diagnostic testing when the true disease status is unknown. The methodology developed in the diagnostic testing setting can be potentially translated and applied in record linkage. Although there are many methods for record linkage, a comprehensive evaluation of methods for a wide range of real-world data with different data characteristics and with true match status is absent due to lack of data sharing. However, the recent availability of generators of synthetic data with realistic characteristics renders such evaluations feasible.
AB  - We review ideas, approaches and progress in the field of record linkage. We point out that the latent class models used in probabilistic matching have been well developed and applied in a different context of diagnostic testing when the true disease status is unknown. The methodology developed in the diagnostic testing setting can be potentially translated and applied in record linkage. Although there are many methods for record linkage, a comprehensive evaluation of methods for a wide range of real-world data with different data characteristics and with true match status is absent due to lack of data sharing. However, the recent availability of generators of synthetic data with realistic characteristics renders such evaluations feasible.
KW  - Bayesian methods
KW  - diagnostic tests
KW  - Fellegi-Sunter model
KW  - k-means
KW  - latent class model
KW  - patient matching
KW  - record linkage
UR  - http://www.scopus.com/inward/record.url?scp=84874497959&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84874497959&partnerID=8YFLogxK
U2  - 10.1177/0962280211403600
DO  - 10.1177/0962280211403600
M3  - Article
VL  - 22
SP  - 31
EP  - 38
JO  - Statistical Methods in Medical Research
JF  - Statistical Methods in Medical Research
SN  - 0962-2802
IS  - 1
ER  - 