Biomarker discovery for arsenic exposure using functional data. Analysis and feature learning of mass spectrometry proteomic data

Jaroslaw Harezlak, Michael C. Wu, Mike Wang, Armin Schwartzman, David C. Christiani, Xihong Lin

Research output: Contribution to journal › Article

24 Citations (Scopus)

Abstract

Plasma biomarkers of exposure to environmental contaminants play an important role in early detection of disease. The emerging field of proteomics presents an attractive opportunity for candidate biomarker discovery, as it simultaneously measures and analyzes a large number of proteins. This article presents a case study for measuring arsenic concentrations in a population residing in an As-endemic region of Bangladesh using plasma protein expressions measured by SELDI-TOF mass spectrometry. We analyze the data using a unified statistical method based on functional learning to preprocess mass spectra and extract mass spectrometry (MS) features and to associate the selected MS features with arsenic exposure measurements. The task is challenging due to several factors: the high dimensionality of mass spectrometry data, complicated error structures, and a multiple comparison problem. We use nonparametric functional regression techniques for MS modeling, peak detection based on the significant zero-downcrossing method, and peak alignment using a warping algorithm. Our results show significant associations of arsenic exposure with either under- or overexpression of 20 proteins.
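The abstract names peak detection by zero-downcrossing but gives no implementation details. As a rough illustration only, the general idea (a peak sits where the slope of a smoothed spectrum changes from positive to negative) might be sketched as below. This is not the authors' method: the moving-average smoother stands in for their nonparametric functional regression, the fixed `min_slope` threshold stands in for a statistical significance test on the derivative, and the function names are hypothetical.

```python
from typing import List


def moving_average(y: List[float], half_width: int) -> List[float]:
    """Smooth y with a centered moving average (window shrinks at the edges)."""
    n = len(y)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out


def zero_downcrossings(
    y: List[float], half_width: int = 1, min_slope: float = 0.0
) -> List[int]:
    """Return indices of peaks in the smoothed signal.

    A peak is reported where the slope goes from clearly positive
    (> min_slope) to clearly negative (< -min_slope). The threshold is an
    assumed stand-in for a significance test on the derivative.
    """
    s = moving_average(y, half_width)
    # Forward differences: d[i] approximates the slope between i and i + 1.
    d = [s[i + 1] - s[i] for i in range(len(s) - 1)]
    peaks = []
    last_up = None  # index of the most recent clearly positive slope
    for i, slope in enumerate(d):
        if slope > min_slope:
            last_up = i
        elif slope < -min_slope and last_up is not None:
            # The peak lies between the last rise and this fall;
            # report the position of the largest smoothed value there.
            segment = range(last_up + 1, i + 1)
            peaks.append(max(segment, key=lambda k: s[k]))
            last_up = None
    return peaks
```

Raising `min_slope` discards shallow bumps, which loosely mimics keeping only downcrossings that are distinguishable from noise; a real analysis would derive that cutoff from the estimated error structure rather than fix it by hand.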

Original language: English
Pages (from-to): 217-224
Number of pages: 8
Journal: Journal of Proteome Research
Volume: 7
Issue number: 1
DOI: 10.1021/pr070491n
State: Published - Jan 2008

Fingerprint

Arsenic
Biomarkers
Proteomics
Mass spectrometry
Mass Spectrometry
Learning
Bangladesh
Environmental Exposure
Blood Proteins
Early Diagnosis
Statistical methods
Proteins
Impurities
Plasmas
Population

ASJC Scopus subject areas

  • Genetics
  • Biotechnology
  • Biochemistry

Cite this

Biomarker discovery for arsenic exposure using functional data. Analysis and feature learning of mass spectrometry proteomic data. / Harezlak, Jaroslaw; Wu, Michael C.; Wang, Mike; Schwartzman, Armin; Christiani, David C.; Lin, Xihong.

In: Journal of Proteome Research, Vol. 7, No. 1, 01.2008, p. 217-224.

Harezlak, Jaroslaw ; Wu, Michael C. ; Wang, Mike ; Schwartzman, Armin ; Christiani, David C. ; Lin, Xihong. / Biomarker discovery for arsenic exposure using functional data. Analysis and feature learning of mass spectrometry proteomic data. In: Journal of Proteome Research. 2008 ; Vol. 7, No. 1. pp. 217-224.
@article{94923a973715429998cde0cc1b41ca72,
title = "Biomarker discovery for arsenic exposure using functional data. Analysis and feature learning of mass spectrometry proteomic data",
abstract = "Plasma biomarkers of exposure to environmental contaminants play an important role in early detection of disease. The emerging field of proteomics presents an attractive opportunity for candidate biomarker discovery, as it simultaneously measures and analyzes a large number of proteins. This article presents a case study for measuring arsenic concentrations in a population residing in an As-endemic region of Bangladesh using plasma protein expressions measured by SELDI-TOF mass spectrometry. We analyze the data using a unified statistical method based on functional learning to preprocess mass spectra and extract mass spectrometry (MS) features and to associate the selected MS features with arsenic exposure measurements. The task is challenging due to several factors, the high dimensionality of mass spectrometry data, complicated error structures, and a multiple comparison problem. We use nonparametric functional regression techniques for MS modeling, peak detection based on the significant zero-downcrossing method, and peak alignment using a warping algorithm. Our results show significant associations of arsenic exposure to either under- or overexpressions of 20 proteins.",
author = "Jaroslaw Harezlak and Wu, {Michael C.} and Mike Wang and Armin Schwartzman and Christiani, {David C.} and Xihong Lin",
year = "2008",
month = "1",
doi = "10.1021/pr070491n",
language = "English",
volume = "7",
pages = "217--224",
journal = "Journal of Proteome Research",
issn = "1535-3893",
publisher = "American Chemical Society",
number = "1",

}

TY - JOUR

T1 - Biomarker discovery for arsenic exposure using functional data. Analysis and feature learning of mass spectrometry proteomic data

AU - Harezlak, Jaroslaw

AU - Wu, Michael C.

AU - Wang, Mike

AU - Schwartzman, Armin

AU - Christiani, David C.

AU - Lin, Xihong

PY - 2008/1

Y1 - 2008/1

N2 - Plasma biomarkers of exposure to environmental contaminants play an important role in early detection of disease. The emerging field of proteomics presents an attractive opportunity for candidate biomarker discovery, as it simultaneously measures and analyzes a large number of proteins. This article presents a case study for measuring arsenic concentrations in a population residing in an As-endemic region of Bangladesh using plasma protein expressions measured by SELDI-TOF mass spectrometry. We analyze the data using a unified statistical method based on functional learning to preprocess mass spectra and extract mass spectrometry (MS) features and to associate the selected MS features with arsenic exposure measurements. The task is challenging due to several factors, the high dimensionality of mass spectrometry data, complicated error structures, and a multiple comparison problem. We use nonparametric functional regression techniques for MS modeling, peak detection based on the significant zero-downcrossing method, and peak alignment using a warping algorithm. Our results show significant associations of arsenic exposure to either under- or overexpressions of 20 proteins.

AB - Plasma biomarkers of exposure to environmental contaminants play an important role in early detection of disease. The emerging field of proteomics presents an attractive opportunity for candidate biomarker discovery, as it simultaneously measures and analyzes a large number of proteins. This article presents a case study for measuring arsenic concentrations in a population residing in an As-endemic region of Bangladesh using plasma protein expressions measured by SELDI-TOF mass spectrometry. We analyze the data using a unified statistical method based on functional learning to preprocess mass spectra and extract mass spectrometry (MS) features and to associate the selected MS features with arsenic exposure measurements. The task is challenging due to several factors, the high dimensionality of mass spectrometry data, complicated error structures, and a multiple comparison problem. We use nonparametric functional regression techniques for MS modeling, peak detection based on the significant zero-downcrossing method, and peak alignment using a warping algorithm. Our results show significant associations of arsenic exposure to either under- or overexpressions of 20 proteins.

UR - http://www.scopus.com/inward/record.url?scp=38649132568&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=38649132568&partnerID=8YFLogxK

U2 - 10.1021/pr070491n

DO - 10.1021/pr070491n

M3 - Article

C2 - 18173220

AN - SCOPUS:38649132568

VL - 7

SP - 217

EP - 224

JO - Journal of Proteome Research

JF - Journal of Proteome Research

SN - 1535-3893

IS - 1

ER -