An empirical Bayes model using a competition score for metabolite identification in gas chromatography mass spectrometry

Jaesik Jeong, Xue Shi, Xiang Zhang, Seongho Kim, Changyu Shen

Research output: Contribution to journal › Article

13 Citations (Scopus)

Abstract

Background: Mass spectrometry (MS) based metabolite profiling has become increasingly popular for scientific and biomedical studies, primarily due to recent technological developments such as comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GCxGC/TOF-MS). Nevertheless, the identification of metabolites from complex samples is subject to errors. Statistical and computational approaches that improve identification accuracy and the estimation of false positives are greatly needed. We propose an empirical Bayes model that accounts for a competition score in addition to the similarity score to tackle this problem. The competition score characterizes the propensity of a candidate metabolite to be matched to some spectrum, based on the metabolite's similarity score with other spectra in the library searched against. The competition score allows the model to properly assess the evidence on the presence/absence status of a metabolite based on whether or not the metabolite is matched to some sample spectrum.

Results: With a mixture of metabolite standards, we demonstrated that our method has better identification accuracy than four other existing methods. Moreover, our method yields a reliable false discovery rate estimate. We also applied our method to data collected from the plasma of a rat and identified some metabolites from the plasma while controlling the false discovery rate.

Conclusions: We developed an empirical Bayes model for metabolite identification and validated the method through a mixture of metabolite standards and rat plasma. The results show that our hierarchical model improves identification accuracy as compared with methods that do not structurally model the involved variables. The improvement in identification accuracy is likely to facilitate downstream analysis such as peak alignment and biomarker identification. Raw data and result matrices can be found at http://www.biostat.iupui.edu/~ChangyuShen/index.htm.

Trial Registration: 2123938128573429.
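The abstract names three quantities without giving formulas: a similarity score, a competition score, and a false discovery rate estimate. The sketch below illustrates how such quantities could be computed. It is not the authors' model. It assumes cosine similarity as the similarity score, takes the competition score to be a library entry's highest similarity to any other library entry, and estimates the false discovery rate as the average posterior probability of a wrong call among accepted identifications. The function names and the toy intensity vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two intensity vectors on a shared m/z grid.

    Assumption: the abstract does not say which similarity measure is used.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def competition_score(name, library):
    """Highest similarity between one library entry and any other entry.

    Illustrative definition only: a high value means the entry resembles
    other library spectra and so is more easily matched by chance.
    """
    return max(
        cosine_similarity(library[name], spectrum)
        for other, spectrum in library.items()
        if other != name
    )


def estimated_fdr(posteriors, threshold):
    """Mean posterior probability of a wrong call among accepted calls.

    `posteriors` holds, for each identification, the posterior probability
    that the metabolite is truly present (assumed to come from a fitted model).
    """
    accepted = [p for p in posteriors if p >= threshold]
    if not accepted:
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


# Toy library: hypothetical intensities, not real reference spectra.
library = {
    "alanine": [0, 10, 0, 5],
    "valine": [0, 9, 1, 5],
    "glucose": [8, 0, 3, 0],
}

scores = {name: competition_score(name, library) for name in library}
fdr = estimated_fdr([0.99, 0.95, 0.80, 0.40], threshold=0.75)
```

In this toy library the two similar entries receive high competition scores and the dissimilar one a low score, which is the kind of contrast a model could use to discount matches that are easy to obtain by chance.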

Original language: English
Article number: 392
Journal: BMC Bioinformatics
Volume: 12
DOI: 10.1186/1471-2105-12-392
State: Published - Oct 10 2011

Fingerprint

Gas Chromatography
Empirical Bayes
Mass Spectrometry
Metabolites
Gas chromatography
Gas Chromatography-Mass Spectrometry
Mass spectrometry
Identification (control systems)
Plasma
Model
Propensity Score
Plasmas
Status Epilepticus
Model Identification
Time-of-flight
Biomarkers
Hierarchical Model
Rats
Profiling
False Positive

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
  • Structural Biology

Cite this

An empirical Bayes model using a competition score for metabolite identification in gas chromatography mass spectrometry. / Jeong, Jaesik; Shi, Xue; Zhang, Xiang; Kim, Seongho; Shen, Changyu.

In: BMC Bioinformatics, Vol. 12, 392, 10.10.2011.

Jeong, Jaesik ; Shi, Xue ; Zhang, Xiang ; Kim, Seongho ; Shen, Changyu. / An empirical Bayes model using a competition score for metabolite identification in gas chromatography mass spectrometry. In: BMC Bioinformatics. 2011 ; Vol. 12.
@article{099b5ffb61c24fcbad132c73a361e2d4,
title = "An empirical Bayes model using a competition score for metabolite identification in gas chromatography mass spectrometry",
author = "Jaesik Jeong and Xue Shi and Xiang Zhang and Seongho Kim and Changyu Shen",
year = "2011",
month = "10",
day = "10",
doi = "10.1186/1471-2105-12-392",
language = "English",
volume = "12",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - An empirical Bayes model using a competition score for metabolite identification in gas chromatography mass spectrometry

AU - Jeong, Jaesik

AU - Shi, Xue

AU - Zhang, Xiang

AU - Kim, Seongho

AU - Shen, Changyu

PY - 2011/10/10

Y1 - 2011/10/10

UR - http://www.scopus.com/inward/record.url?scp=80053600914&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80053600914&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-12-392

DO - 10.1186/1471-2105-12-392

M3 - Article

C2 - 21985394

AN - SCOPUS:80053600914

VL - 12

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 392

ER -