Literature mining on pharmacokinetics numerical data: A feasibility study

Zhiping Wang, Seongho Kim, Sara K. Quinney, Yingying Guo, Stephen D. Hall, Luis M. Rocha, Lang Li

Research output: Contribution to journal › Article

17 Scopus citations


A feasibility study of literature mining is conducted on drug pharmacokinetic (PK) parameter numerical data using a sequential mining strategy. First, an entity template library is built to retrieve pharmacokinetics-relevant articles. Then a set of tagging and extraction rules is applied to retrieve PK data from the article abstracts. To estimate the population-average mean and between-study variance of each PK parameter, a linear mixed meta-analysis model and an E-M algorithm are developed to describe the probability distributions of PK parameters. Finally, a cross-validation procedure is developed to ascertain false-positive mining results. Using this approach to mine midazolam (MDZ) PK data, an 88% precision rate and a 92% recall rate are achieved, with an F-score of 90%. It greatly outperforms a conventional data mining approach (support vector machine), which has an F-score of 68.1%. Further investigation of 7 more drugs reveals comparable performance of our sequential mining approach.
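As a rough illustration of two quantities named in the abstract, the sketch below shows (1) that the reported precision and recall are consistent with a balanced F-score of about 90%, and (2) one textbook way an E-M algorithm can estimate a population-average mean and a between-study variance. It is not the authors' implementation. The model form (a single random intercept per study, with each study's sampling variance treated as known), the function names `f_score` and `em_random_effects`, and the toy numbers in the usage lines are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def em_random_effects(
    means: Sequence[float],
    variances: Sequence[float],
    n_iter: int = 500,
) -> Tuple[float, float]:
    """E-M estimates of (mu, tau2) for the assumed model y_i = mu + b_i + e_i.

    b_i ~ N(0, tau2) is the between-study effect and e_i ~ N(0, variances[i])
    is sampling error whose variance is treated as known (an assumption).
    """
    n = len(means)
    # Start from the plain average and the spread of the reported means.
    mu = sum(means) / n
    tau2 = sum((y - mu) ** 2 for y in means) / n
    weights = [1.0 / s2 for s2 in variances]

    for _ in range(n_iter):
        # E-step: posterior mean and variance of each study effect b_i.
        shrink = [tau2 / (tau2 + s2) for s2 in variances]
        post_mean = [k * (y - mu) for k, y in zip(shrink, means)]
        post_var = [k * s2 for k, s2 in zip(shrink, variances)]

        # M-step: precision-weighted mean, then average of E[b_i^2].
        mu = sum(
            w * (y - m) for w, y, m in zip(weights, means, post_mean)
        ) / sum(weights)
        tau2 = sum(m * m + v for m, v in zip(post_mean, post_var)) / n

    return mu, tau2


# Precision and recall as reported in the abstract.
print(round(f_score(0.88, 0.92), 2))  # 0.9

# Toy study means and sampling variances, invented for this example only.
mu_hat, tau2_hat = em_random_effects([1.0, 2.0, 3.0, 4.0, 5.0], [0.5] * 5)
print(mu_hat, tau2_hat)
```

Treating the within-study variances as known keeps both E-M steps in closed form; a fuller model would also estimate them or add covariates.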

Original language: English (US)
Pages (from-to): 726-735
Number of pages: 10
Journal: Journal of biomedical informatics
Issue number: 4
State: Published - Aug 1 2009


Keywords

  • Clearance
  • Data mining
  • Entity recognition
  • Information extraction
  • Linear mixed model
  • Midazolam
  • Pharmacokinetics

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics
