A neural network approach to multi-biomarker panel discovery by high-throughput plasma proteomics profiling of breast cancer

Fan Zhang, Jake Chen, Mu Wang, Renee Drabier

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Background: In the past several years, there has been increasing interest and enthusiasm in molecular biomarkers as tools for early detection of cancer. Liquid chromatography tandem mass spectrometry (LC/MS/MS)-based plasma proteomics profiling is a promising technology platform for studying candidate protein biomarkers for early detection of cancer. Factors such as inherent variability, limited protein detectability, and peptide discovery biases among LC/MS/MS platforms have made the classification and prediction of proteomics profiles challenging. Developing neural network-based proteomics data analysis methods to identify multi-protein biomarker panels for breast cancer diagnosis provides hope for improving both the sensitivity and the specificity of candidate cancer biomarkers for early detection.

Results: In our previous method, we developed a Feed Forward Neural Network (FFNN)-based method to build a classifier for plasma samples of breast cancer and then applied the classifier to predict a blind dataset of breast cancer. However, the optimal combination C∗ in our previous method was actually determined by applying the trained FFNN to the testing set with that combination. Therefore, in this paper, we applied a three-way data split to the FFNN, with separate sets for training, validation, and testing. We found that the FFNN model based on the three-way data split outperforms our previous method: prediction performance on the testing set improved from AUC = 0.8706, precision = 82.5%, accuracy = 82.5%, sensitivity = 82.5%, and specificity = 82.5% to AUC = 0.895, precision = 86.84%, accuracy = 85%, sensitivity = 82.5%, and specificity = 87.5%.

Conclusions: Further pathway analysis showed that the top three five-marker panels are associated with complement and coagulation cascades, signaling, activation, and hemostasis, which is consistent with previous findings. We believe the new approach is a better solution for multi-biomarker panel discovery and that it can be applied to other clinical proteomics studies.
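
As a rough illustration of the three-way data split and the testing-set metrics mentioned in the abstract, here is a minimal Python sketch. It is not the authors' code. The function names, the 50/25/25 split fractions, the 160-sample pool, and the confusion-matrix counts (40 cases and 40 controls) are all assumptions made for illustration; the abstract does not state the sample sizes. The hypothetical counts were chosen only because they happen to reproduce the reported percentages (86.84%, 85%, 82.5%, 87.5%).

```python
import random


def three_way_split(samples, train_frac=0.5, val_frac=0.25, seed=0):
    """Shuffle samples and cut them into training, validation, and testing parts.

    The fractions are illustrative assumptions, not values from the paper.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def binary_metrics(tp, fn, tn, fp):
    """Return precision, accuracy, sensitivity, and specificity as percentages."""
    return {
        "precision": 100.0 * tp / (tp + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
    }


# Hypothetical pool of 160 sample indices.
train, val, test = three_way_split(range(160))
# In a three-way scheme, candidate marker panels are fitted on `train`,
# compared on `val`, and only the chosen panel is scored once on `test`,
# so the testing set plays no part in selecting the panel.

# Hypothetical testing-set counts (40 cases, 40 controls); NOT from the paper.
metrics = binary_metrics(tp=33, fn=7, tn=35, fp=5)
print({name: round(value, 2) for name, value in metrics.items()})
```

The point of the split is that the panel is selected on the validation set, so the testing-set figures are not biased by the selection step, which is the weakness the abstract attributes to the previous method.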

Original language: English (US)
Article number: S10
Journal: BMC Proceedings
Volume: 7
DOI: 10.1186/1753-6561-7-S7-S10
State: Published - Dec 20 2013

Fingerprint

Biomarkers
Proteomics
Throughput
Breast Neoplasms
Neural networks
Plasmas
Feedforward neural networks
Testing
Classifiers
Early Detection of Cancer
Sensitivity and Specificity
Area Under Curve
Proteins
Liquid chromatography
Tumor Biomarkers
Coagulation
Mass spectrometry
Tandem Mass Spectrometry
Hemostasis
Liquid Chromatography

ASJC Scopus subject areas

  • Medicine(all)
  • Biochemistry, Genetics and Molecular Biology(all)

Cite this

A neural network approach to multi-biomarker panel discovery by high-throughput plasma proteomics profiling of breast cancer. / Zhang, Fan; Chen, Jake; Wang, Mu; Drabier, Renee.

In: BMC Proceedings, Vol. 7, S10, 20.12.2013.

@article{c24a4c3e6b7d48c2bf945d4c153fa016,
title = "A neural network approach to multi-biomarker panel discovery by high-throughput plasma proteomics profiling of breast cancer",
abstract = "Background: In the past several years, there has been increasing interest and enthusiasm in molecular biomarkers as tools for early detection of cancer. Liquid chromatography tandem mass spectrometry (LC/MS/MS) based plasma proteomics profiling technique is a promising technology platform to study candidate protein biomarkers for early detection of cancer. Factors such as inherent variability, protein detectability limitation, and peptide discovery biases among LC/MS/MS platforms have made the classification and prediction of proteomics profiles challenging. Developing proteomics data analysis methods to identify multi-protein biomarker panels for breast cancer diagnosis based on neural networks provides hope for improving both the sensitivity and the specificity of candidate cancer biomarkers for early detection. Results: In our previous method, we developed a Feed Forward Neural Network-based method to build the classifier for plasma samples of breast cancer and then applied the classifier to predict blind dataset of breast cancer. However, the optimal combination C∗ in our previous method was actually determined by applying the trained FFNN on the testing set with the combination. Therefore, in this paper, we applied a three way data split to the Feed Forward Neural Network for training, validation and testing based. We found that the prediction performance of the FFNN model based on the three way data split outperforms our previous method and the prediction performance is improved from (AUC = 0.8706, precision = 82.5{\%}, accuracy = 82.5{\%}, sensitivity = 82.5{\%}, specificity = 82.5{\%} for the testing set) to (AUC = 0.895, precision = 86.84{\%}, accuracy = 85{\%}, sensitivity = 82.5{\%}, specificity = 87.5{\%} for the testing set). Conclusions: Further pathway analysis showed that the top three five-marker panels are associated with complement and coagulation cascades, signaling, activation, and hemostasis, which are consistent with previous findings. We believe the new approach is a better solution for multi-biomarker panel discovery and it can be applied to other clinical proteomics.",
author = "Fan Zhang and Jake Chen and Mu Wang and Renee Drabier",
year = "2013",
month = "12",
day = "20",
doi = "10.1186/1753-6561-7-S7-S10",
language = "English (US)",
volume = "7",
journal = "BMC Proceedings",
issn = "1753-6561",
publisher = "BioMed Central",

}

TY - JOUR

T1 - A neural network approach to multi-biomarker panel discovery by high-throughput plasma proteomics profiling of breast cancer

AU - Zhang, Fan

AU - Chen, Jake

AU - Wang, Mu

AU - Drabier, Renee

PY - 2013/12/20

Y1 - 2013/12/20

N2 - Background: In the past several years, there has been increasing interest and enthusiasm in molecular biomarkers as tools for early detection of cancer. Liquid chromatography tandem mass spectrometry (LC/MS/MS) based plasma proteomics profiling technique is a promising technology platform to study candidate protein biomarkers for early detection of cancer. Factors such as inherent variability, protein detectability limitation, and peptide discovery biases among LC/MS/MS platforms have made the classification and prediction of proteomics profiles challenging. Developing proteomics data analysis methods to identify multi-protein biomarker panels for breast cancer diagnosis based on neural networks provides hope for improving both the sensitivity and the specificity of candidate cancer biomarkers for early detection. Results: In our previous method, we developed a Feed Forward Neural Network-based method to build the classifier for plasma samples of breast cancer and then applied the classifier to predict blind dataset of breast cancer. However, the optimal combination C∗ in our previous method was actually determined by applying the trained FFNN on the testing set with the combination. Therefore, in this paper, we applied a three way data split to the Feed Forward Neural Network for training, validation and testing based. We found that the prediction performance of the FFNN model based on the three way data split outperforms our previous method and the prediction performance is improved from (AUC = 0.8706, precision = 82.5%, accuracy = 82.5%, sensitivity = 82.5%, specificity = 82.5% for the testing set) to (AUC = 0.895, precision = 86.84%, accuracy = 85%, sensitivity = 82.5%, specificity = 87.5% for the testing set). Conclusions: Further pathway analysis showed that the top three five-marker panels are associated with complement and coagulation cascades, signaling, activation, and hemostasis, which are consistent with previous findings. We believe the new approach is a better solution for multi-biomarker panel discovery and it can be applied to other clinical proteomics.

AB - Background: In the past several years, there has been increasing interest and enthusiasm in molecular biomarkers as tools for early detection of cancer. Liquid chromatography tandem mass spectrometry (LC/MS/MS) based plasma proteomics profiling technique is a promising technology platform to study candidate protein biomarkers for early detection of cancer. Factors such as inherent variability, protein detectability limitation, and peptide discovery biases among LC/MS/MS platforms have made the classification and prediction of proteomics profiles challenging. Developing proteomics data analysis methods to identify multi-protein biomarker panels for breast cancer diagnosis based on neural networks provides hope for improving both the sensitivity and the specificity of candidate cancer biomarkers for early detection. Results: In our previous method, we developed a Feed Forward Neural Network-based method to build the classifier for plasma samples of breast cancer and then applied the classifier to predict blind dataset of breast cancer. However, the optimal combination C∗ in our previous method was actually determined by applying the trained FFNN on the testing set with the combination. Therefore, in this paper, we applied a three way data split to the Feed Forward Neural Network for training, validation and testing based. We found that the prediction performance of the FFNN model based on the three way data split outperforms our previous method and the prediction performance is improved from (AUC = 0.8706, precision = 82.5%, accuracy = 82.5%, sensitivity = 82.5%, specificity = 82.5% for the testing set) to (AUC = 0.895, precision = 86.84%, accuracy = 85%, sensitivity = 82.5%, specificity = 87.5% for the testing set). Conclusions: Further pathway analysis showed that the top three five-marker panels are associated with complement and coagulation cascades, signaling, activation, and hemostasis, which are consistent with previous findings. We believe the new approach is a better solution for multi-biomarker panel discovery and it can be applied to other clinical proteomics.

UR - http://www.scopus.com/inward/record.url?scp=85018192922&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85018192922&partnerID=8YFLogxK

U2 - 10.1186/1753-6561-7-S7-S10

DO - 10.1186/1753-6561-7-S7-S10

M3 - Article

VL - 7

JO - BMC Proceedings

JF - BMC Proceedings

SN - 1753-6561

M1 - S10

ER -