Evaluation of top-down mass spectral identification with homologous protein sequences

Ziwei Li, Bo He, Qiang Kou, Zhe Wang, Si Wu, Yunlong Liu, Weixing Feng, Xiaowen Liu

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

Background: Top-down mass spectrometry has unique advantages in identifying proteoforms with multiple post-translational modifications and/or unknown alterations. Most software tools in this area search top-down mass spectra against a protein sequence database for proteoform identification. When the species studied in a mass spectrometry experiment lacks its proteome sequence database, a homologous protein sequence database can be used for proteoform identification. The accuracy of homologous protein sequences affects the sensitivity of proteoform identification and the accuracy of mass shift localization.

Results: We tested TopPIC, a commonly used software tool for top-down mass spectral identification, on a top-down mass spectrometry data set of Escherichia coli K12 MG1655, and evaluated its performance using an Escherichia coli K12 MG1655 proteome database and a homologous protein database. The number of identified spectra with the homologous database was about half of that with the Escherichia coli K12 MG1655 database. We also tested TopPIC on a top-down mass spectrometry data set of human MCF-7 cells and obtained similar results.

Conclusions: Experimental results demonstrated that TopPIC is capable of identifying many proteoform spectrum matches and localizing unknown alterations using homologous protein sequences containing no more than 2 mutations.
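As a rough illustration of why the accuracy of the homologous sequence matters: when the database holds a homologous sequence rather than the true one, each amino acid substitution appears as an apparent mass shift equal to the difference between the two residue masses. The sketch below is not taken from TopPIC or from the paper. The function name `substitution_mass_shifts` and the toy peptide sequences are hypothetical, and the code assumes two equal-length sequences that differ only by substitutions, using standard monoisotopic residue masses.

```python
from typing import List, Tuple

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def substitution_mass_shifts(
    observed: str, database: str
) -> List[Tuple[int, str, str, float]]:
    """List (position, database residue, observed residue, mass shift in Da)
    for every position where the two sequences differ.

    Hypothetical helper: handles substitutions only, so both sequences
    must have the same length.
    """
    if len(observed) != len(database):
        raise ValueError("sketch handles substitutions only, not indels")
    shifts = []
    for pos, (obs, db) in enumerate(zip(observed, database)):
        if obs != db:
            # Shift the search tool would have to explain at this residue.
            shifts.append((pos, db, obs, RESIDUE_MASS[obs] - RESIDUE_MASS[db]))
    return shifts


if __name__ == "__main__":
    # Toy example: the sample carries S where the homologous entry has T.
    for pos, db, obs, shift in substitution_mass_shifts("PEPSIDE", "PEPTIDE"):
        print(f"position {pos}: {db}->{obs}, apparent shift {shift:+.5f} Da")
```

Note that a leucine/isoleucine substitution yields a zero shift in this model, so it cannot be detected by mass alone, while two or more substitutions in one protein produce several shifts that a search tool has to explain at once.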

Original language: English (US)
Article number: 494
Journal: BMC Bioinformatics
Volume: 19
DOI: 10.1186/s12859-018-2462-1
State: Published - Dec 28 2018

Keywords

  • Homologous protein database
  • Mass spectrometry
  • Top-down

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Evaluation of top-down mass spectral identification with homologous protein sequences. / Li, Ziwei; He, Bo; Kou, Qiang; Wang, Zhe; Wu, Si; Liu, Yunlong; Feng, Weixing; Liu, Xiaowen.

In: BMC Bioinformatics, Vol. 19, 494, 28.12.2018.

@article{2d17f3a565cd4e89a96e944d7e6092ab,
title = "Evaluation of top-down mass spectral identification with homologous protein sequences",
abstract = "Background: Top-down mass spectrometry has unique advantages in identifying proteoforms with multiple post-translational modifications and/or unknown alterations. Most software tools in this area search top-down mass spectra against a protein sequence database for proteoform identification. When the species studied in a mass spectrometry experiment lacks its proteome sequence database, a homologous protein sequence database can be used for proteoform identification. The accuracy of homologous protein sequences affects the sensitivity of proteoform identification and the accuracy of mass shift localization. Results: We tested TopPIC, a commonly used software tool for top-down mass spectral identification, on a top-down mass spectrometry data set of Escherichia coli K12 MG1655, and evaluated its performance using an Escherichia coli K12 MG1655 proteome database and a homologous protein database. The number of identified spectra with the homologous database was about half of that with the Escherichia coli K12 MG1655 database. We also tested TopPIC on a top-down mass spectrometry data set of human MCF-7 cells and obtained similar results. Conclusions: Experimental results demonstrated that TopPIC is capable of identifying many proteoform spectrum matches and localizing unknown alterations using homologous protein sequences containing no more than 2 mutations.",
keywords = "Homologous protein database, Mass spectrometry, Top-down",
author = "Ziwei Li and Bo He and Qiang Kou and Zhe Wang and Si Wu and Yunlong Liu and Weixing Feng and Xiaowen Liu",
year = "2018",
month = "12",
day = "28",
doi = "10.1186/s12859-018-2462-1",
language = "English (US)",
volume = "19",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Evaluation of top-down mass spectral identification with homologous protein sequences

AU - Li, Ziwei

AU - He, Bo

AU - Kou, Qiang

AU - Wang, Zhe

AU - Wu, Si

AU - Liu, Yunlong

AU - Feng, Weixing

AU - Liu, Xiaowen

PY - 2018/12/28

Y1 - 2018/12/28

N2 - Background: Top-down mass spectrometry has unique advantages in identifying proteoforms with multiple post-translational modifications and/or unknown alterations. Most software tools in this area search top-down mass spectra against a protein sequence database for proteoform identification. When the species studied in a mass spectrometry experiment lacks its proteome sequence database, a homologous protein sequence database can be used for proteoform identification. The accuracy of homologous protein sequences affects the sensitivity of proteoform identification and the accuracy of mass shift localization. Results: We tested TopPIC, a commonly used software tool for top-down mass spectral identification, on a top-down mass spectrometry data set of Escherichia coli K12 MG1655, and evaluated its performance using an Escherichia coli K12 MG1655 proteome database and a homologous protein database. The number of identified spectra with the homologous database was about half of that with the Escherichia coli K12 MG1655 database. We also tested TopPIC on a top-down mass spectrometry data set of human MCF-7 cells and obtained similar results. Conclusions: Experimental results demonstrated that TopPIC is capable of identifying many proteoform spectrum matches and localizing unknown alterations using homologous protein sequences containing no more than 2 mutations.

KW - Homologous protein database

KW - Mass spectrometry

KW - Top-down

UR - http://www.scopus.com/inward/record.url?scp=85059236813&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059236813&partnerID=8YFLogxK

U2 - 10.1186/s12859-018-2462-1

DO - 10.1186/s12859-018-2462-1

M3 - Article

C2 - 30591035

AN - SCOPUS:85059236813

VL - 19

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 494

ER -