Improving protein order-disorder classification using charge-hydropathy plots

Fei Huang, Christopher J. Oldfield, Bin Xue, Wei Lun Hsu, Jingwei Meng, Xiaowen Liu, Li Shen, Pedro Romero, Vladimir N. Uversky, A. Dunker

Research output: Contribution to journal (Article)

24 Citations (Scopus)

Abstract

Background: The earliest whole-protein order/disorder predictor (Uversky et al., Proteins, 41: 415-427 (2000)), herein called the charge-hydropathy (C-H) plot, was originally developed using the Kyte-Doolittle (1982) hydropathy scale (Kyte & Doolittle, J. Mol. Biol., 157: 105-132 (1982)). Here, the goal is to determine whether the performance of the C-H plot in separating structured and disordered proteins can be improved by using an alternative hydropathy scale.

Results: Using the performance of the C-H plot as the metric, we compared 19 alternative hydropathy scales and found that the Guy (1985) hydropathy scale (Guy, Biophys. J., 47: 61-70 (1985)) was the best of the tested scales for separating large collections of structured proteins and intrinsically disordered proteins (IDPs) on the C-H plot. Next, we developed a new scale, named IDP-Hydropathy, which further improves the discrimination between structured proteins and IDPs. When the C-H plot was applied to a dataset containing 109 IDPs and 563 non-homologous fully structured proteins, the Kyte-Doolittle (1982) hydropathy scale, the Guy (1985) hydropathy scale, and the IDP-Hydropathy scale gave balanced two-state classification accuracies of 79%, 84%, and 90%, respectively, indicating that a substantial overall improvement is obtained by changing the hydropathy scale. A correlation study shows that IDP-Hydropathy is strongly correlated with other hydropathy scales, suggesting that IDP-Hydropathy probably has only minor contributions from amino acid properties other than hydropathy.

Conclusion: We suggest that IDP-Hydropathy would likely be the best scale to use for any type of algorithm developed to predict protein disorder.
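As a rough illustration of the kind of classifier the abstract describes, the sketch below places a sequence on a C-H plot using the Kyte-Doolittle scale and scores predictions with balanced accuracy (the mean of sensitivity and specificity, which matters here because the dataset has 109 IDPs against 563 structured proteins). Several details are assumptions not stated in the abstract: the function names are hypothetical, hydropathy is rescaled to [0, 1], only K, R, D, and E are treated as charged, and the boundary line R = 2.785 H - 1.151 is the one commonly quoted from Uversky et al. (2000). It is not the authors' implementation, and the IDP-Hydropathy values are not reproduced here.

```python
# Kyte-Doolittle (1982) hydropathy values for the 20 standard residues.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: histidine and the termini are treated as neutral.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def ch_point(sequence, scale=KYTE_DOOLITTLE):
    """Return (mean hydropathy rescaled to [0, 1], absolute mean net charge)."""
    lo, hi = min(scale.values()), max(scale.values())
    residues = [aa for aa in sequence.upper() if aa in scale]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    mean_h = sum((scale[aa] - lo) / (hi - lo) for aa in residues) / len(residues)
    mean_r = abs(sum(CHARGE.get(aa, 0) for aa in residues)) / len(residues)
    return mean_h, mean_r


def predict_disordered(sequence, scale=KYTE_DOOLITTLE,
                       slope=2.785, intercept=-1.151):
    """Call a protein disordered if it lies above the boundary line.

    The default slope and intercept are an assumption (the boundary usually
    attributed to Uversky et al., 2000); a different hydropathy scale would
    need its own fitted boundary.
    """
    mean_h, mean_r = ch_point(sequence, scale)
    return mean_r > slope * mean_h + intercept


def balanced_accuracy(is_disordered, predicted):
    """Mean of sensitivity and specificity for a two-state classifier."""
    pairs = list(zip(is_disordered, predicted))
    n_pos = sum(1 for truth, _ in pairs if truth)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    true_pos = sum(1 for truth, pred in pairs if truth and pred)
    true_neg = sum(1 for truth, pred in pairs if not truth and not pred)
    return 0.5 * (true_pos / n_pos + true_neg / n_neg)


if __name__ == "__main__":
    # Toy sequences: highly charged and hydrophilic vs. purely hydrophobic.
    for name, seq in [("poly-E", "E" * 10), ("poly-L", "L" * 10)]:
        print(name, ch_point(seq), predict_disordered(seq))
```

Passing a different dictionary as `scale` is how an alternative hydropathy scale would be swapped in under this sketch; the boundary parameters would then have to be refit on labeled data rather than reused.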

Original language: English (US)
Article number: S4
Journal: BMC Bioinformatics
Volume: 15
Issue number: 17
DOI: https://doi.org/10.1186/1471-2105-15-S17-S4
State: Published - Dec 16 2014

Fingerprint

Intrinsically Disordered Proteins
Order disorder transitions
Disorder
Charge
Proteins
Protein
Amino Acids
Alternatives

Keywords

  • Intrinsically disordered proteins
  • Natively unstructured or unfolded proteins
  • Structure and disorder prediction
  • Support vector machines

ASJC Scopus subject areas

  • Applied Mathematics
  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications

Cite this

Improving protein order-disorder classification using charge-hydropathy plots. / Huang, Fei; Oldfield, Christopher J.; Xue, Bin; Hsu, Wei Lun; Meng, Jingwei; Liu, Xiaowen; Shen, Li; Romero, Pedro; Uversky, Vladimir N.; Dunker, A.

In: BMC Bioinformatics, Vol. 15, No. 17, S4, 16.12.2014.

Huang, F, Oldfield, CJ, Xue, B, Hsu, WL, Meng, J, Liu, X, Shen, L, Romero, P, Uversky, VN & Dunker, A 2014, 'Improving protein order-disorder classification using charge-hydropathy plots', BMC Bioinformatics, vol. 15, no. 17, S4. https://doi.org/10.1186/1471-2105-15-S17-S4
Huang, Fei ; Oldfield, Christopher J. ; Xue, Bin ; Hsu, Wei Lun ; Meng, Jingwei ; Liu, Xiaowen ; Shen, Li ; Romero, Pedro ; Uversky, Vladimir N. ; Dunker, A. / Improving protein order-disorder classification using charge-hydropathy plots. In: BMC Bioinformatics. 2014 ; Vol. 15, No. 17.
@article{20de3012a1874915ad67d2f214f9c96f,
title = "Improving protein order-disorder classification using charge-hydropathy plots",
abstract = "Background: The earliest whole-protein order/disorder predictor (Uversky et al., Proteins, 41: 415-427 (2000)), herein called the charge-hydropathy (C-H) plot, was originally developed using the Kyte-Doolittle (1982) hydropathy scale (Kyte \& Doolittle, J. Mol. Biol., 157: 105-132 (1982)). Here, the goal is to determine whether the performance of the C-H plot in separating structured and disordered proteins can be improved by using an alternative hydropathy scale. Results: Using the performance of the C-H plot as the metric, we compared 19 alternative hydropathy scales and found that the Guy (1985) hydropathy scale (Guy, Biophys. J., 47: 61-70 (1985)) was the best of the tested scales for separating large collections of structured proteins and intrinsically disordered proteins (IDPs) on the C-H plot. Next, we developed a new scale, named IDP-Hydropathy, which further improves the discrimination between structured proteins and IDPs. When the C-H plot was applied to a dataset containing 109 IDPs and 563 non-homologous fully structured proteins, the Kyte-Doolittle (1982) hydropathy scale, the Guy (1985) hydropathy scale, and the IDP-Hydropathy scale gave balanced two-state classification accuracies of 79{\%}, 84{\%}, and 90{\%}, respectively, indicating that a substantial overall improvement is obtained by changing the hydropathy scale. A correlation study shows that IDP-Hydropathy is strongly correlated with other hydropathy scales, suggesting that IDP-Hydropathy probably has only minor contributions from amino acid properties other than hydropathy. Conclusion: We suggest that IDP-Hydropathy would likely be the best scale to use for any type of algorithm developed to predict protein disorder.",
keywords = "Intrinsically disordered proteins, Natively unstructured or unfolded proteins, Structure and disorder prediction, Support vector machines",
author = "Fei Huang and Oldfield, {Christopher J.} and Bin Xue and Hsu, {Wei Lun} and Jingwei Meng and Xiaowen Liu and Li Shen and Pedro Romero and Uversky, {Vladimir N.} and A. Dunker",
year = "2014",
month = "12",
day = "16",
doi = "10.1186/1471-2105-15-S17-S4",
language = "English (US)",
volume = "15",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "17",

}

TY - JOUR

T1 - Improving protein order-disorder classification using charge-hydropathy plots

AU - Huang, Fei

AU - Oldfield, Christopher J.

AU - Xue, Bin

AU - Hsu, Wei Lun

AU - Meng, Jingwei

AU - Liu, Xiaowen

AU - Shen, Li

AU - Romero, Pedro

AU - Uversky, Vladimir N.

AU - Dunker, A.

PY - 2014/12/16

Y1 - 2014/12/16

AB - Background: The earliest whole-protein order/disorder predictor (Uversky et al., Proteins, 41: 415-427 (2000)), herein called the charge-hydropathy (C-H) plot, was originally developed using the Kyte-Doolittle (1982) hydropathy scale (Kyte & Doolittle, J. Mol. Biol., 157: 105-132 (1982)). Here, the goal is to determine whether the performance of the C-H plot in separating structured and disordered proteins can be improved by using an alternative hydropathy scale. Results: Using the performance of the C-H plot as the metric, we compared 19 alternative hydropathy scales and found that the Guy (1985) hydropathy scale (Guy, Biophys. J., 47: 61-70 (1985)) was the best of the tested scales for separating large collections of structured proteins and intrinsically disordered proteins (IDPs) on the C-H plot. Next, we developed a new scale, named IDP-Hydropathy, which further improves the discrimination between structured proteins and IDPs. When the C-H plot was applied to a dataset containing 109 IDPs and 563 non-homologous fully structured proteins, the Kyte-Doolittle (1982) hydropathy scale, the Guy (1985) hydropathy scale, and the IDP-Hydropathy scale gave balanced two-state classification accuracies of 79%, 84%, and 90%, respectively, indicating that a substantial overall improvement is obtained by changing the hydropathy scale. A correlation study shows that IDP-Hydropathy is strongly correlated with other hydropathy scales, suggesting that IDP-Hydropathy probably has only minor contributions from amino acid properties other than hydropathy. Conclusion: We suggest that IDP-Hydropathy would likely be the best scale to use for any type of algorithm developed to predict protein disorder.

KW - Intrinsically disordered proteins

KW - Natively unstructured or unfolded proteins

KW - Structure and disorder prediction

KW - Support vector machines

UR - http://www.scopus.com/inward/record.url?scp=84961666223&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84961666223&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-15-S17-S4

DO - 10.1186/1471-2105-15-S17-S4

M3 - Article

C2 - 25559583

AN - SCOPUS:84961666223

VL - 15

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - 17

M1 - S4

ER -