Evaluation of top-down mass spectral identification with homologous protein sequences

Ziwei Li, Bo He, Qiang Kou, Zhe Wang, Si Wu, Yunlong Liu, Weixing Feng, Xiaowen Liu

Research output: Contribution to journal, Article, peer-review

3 Scopus citations


Background: Top-down mass spectrometry has unique advantages in identifying proteoforms with multiple post-translational modifications and/or unknown alterations. Most software tools in this area search top-down mass spectra against a protein sequence database for proteoform identification. When no proteome sequence database is available for the species studied in a mass spectrometry experiment, a homologous protein sequence database can be used for proteoform identification instead. The accuracy of the homologous protein sequences affects both the sensitivity of proteoform identification and the accuracy of mass shift localization.

Results: We tested TopPIC, a commonly used software tool for top-down mass spectral identification, on a top-down mass spectrometry data set of Escherichia coli K12 MG1655, and evaluated its performance using an Escherichia coli K12 MG1655 proteome database and a homologous protein database. The number of spectra identified with the homologous database was about half of the number identified with the Escherichia coli K12 MG1655 database. We also tested TopPIC on a top-down mass spectrometry data set of human MCF-7 cells and obtained similar results.

Conclusions: Experimental results demonstrated that TopPIC is capable of identifying many proteoform spectrum matches and localizing unknown alterations using homologous protein sequences containing no more than 2 mutations.
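To illustrate why mutations in a homologous sequence matter for mass shift localization, the following minimal sketch computes the mass difference that a single amino acid substitution introduces between the true sequence and its homolog. It is not TopPIC's algorithm; the function name `substitution_shifts` and the example sequences are hypothetical, and the sketch assumes equal-length sequences that differ only by substitutions. The residue masses are standard monoisotopic values for a small subset of amino acids.

```python
# Monoisotopic residue masses in Da (standard values, subset only).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
    "T": 101.04768,
    "L": 113.08406,
    "D": 115.02694,
    "K": 128.09496,
    "E": 129.04259,
}


def substitution_shifts(true_seq, homolog_seq):
    """List the mass shift implied by each substitution.

    Hypothetical helper: assumes both sequences have the same length and
    differ only by substitutions (no insertions or deletions).
    Returns tuples of (position, homolog_residue, true_residue, shift_in_Da),
    where the shift is the mass that must be added to the homolog residue
    to explain the observed (true) residue.
    """
    if len(true_seq) != len(homolog_seq):
        raise ValueError("sequences must have equal length in this sketch")
    shifts = []
    for pos, (true_res, hom_res) in enumerate(zip(true_seq, homolog_seq)):
        if true_res != hom_res:
            delta = RESIDUE_MASS[true_res] - RESIDUE_MASS[hom_res]
            shifts.append((pos, hom_res, true_res, round(delta, 5)))
    return shifts


# Example: the sample contains serine where the homolog has alanine.
example = substitution_shifts("GASVT", "GAAVT")
```

In this example the search against the homolog would need to account for a shift of about 15.995 Da at position 2 (0-based). Each additional mutation adds another shift that has to be detected and localized, which is consistent with the reported limit of no more than 2 mutations.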

Original language: English (US)
Article number: 494
Journal: BMC Bioinformatics
State: Published - Dec 28 2018


  • Homologous protein database
  • Mass spectrometry
  • Top-down

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
