Sequence fingerprints distinguish erroneous from correct predictions of intrinsically disordered protein regions

Konda Mani Saravanan, A. Dunker, Sankaran Krishnaswamy

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

More than 60 prediction methods for intrinsically disordered proteins (IDPs) have been developed over the years, many of which are accessible on the World Wide Web. Nearly all of these predictors give balanced accuracies in the ~65%–~80% range. Since predictors are not perfect, further studies are required to uncover the role of amino acid residues in native IDP regions as compared to predicted IDP regions. In the present work, we make use of sequences of 100% predicted IDP regions, false positive disorder predictions, and experimentally determined IDP regions to distinguish the characteristics of native versus predicted IDP regions. A higher occurrence of asparagine is observed in sequences of native IDP regions but not in sequences of false positive predictions of IDP regions. The occurrences of certain combinations of amino acids at the pentapeptide level provide a distinguishing feature in the IDPs with respect to globular proteins. The distinguishing features presented in this paper provide insights into the sequence fingerprints of amino acid residues in experimentally determined as compared to predicted IDP regions. These observations and additional work along these lines should enable improvements in the accuracy of disorder prediction algorithms.
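The two sequence statistics named in the abstract, single-residue occurrence (for example asparagine, one-letter code N) and pentapeptide occurrence, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `residue_fraction` and `kmer_counts` are hypothetical, and the toy sequences are invented purely to show the calls, not taken from the paper's datasets.

```python
from collections import Counter


def residue_fraction(sequences, residue):
    """Return the fraction of all residues in `sequences` equal to `residue`."""
    total = sum(len(seq) for seq in sequences)
    if total == 0:
        return 0.0
    hits = sum(seq.count(residue) for seq in sequences)
    return hits / total


def kmer_counts(sequences, k=5):
    """Count overlapping k-mers within each sequence (k=5 gives pentapeptides).

    Windows never span two sequences; sequences shorter than k contribute nothing.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


if __name__ == "__main__":
    # Invented toy sequences (assumption): stand-ins for a "native IDP" set
    # and a "false positive prediction" set.
    native_like = ["MNNSQNNGSPEE", "SNNQKPNNES"]
    false_positive_like = ["MKKLLEEAAKKL", "GSSPEELLKK"]

    for name, seqs in [("native-like", native_like),
                       ("false-positive-like", false_positive_like)]:
        frac_n = residue_fraction(seqs, "N")
        top = kmer_counts(seqs, k=5).most_common(3)
        print(name, "Asn fraction:", round(frac_n, 3), "top pentapeptides:", top)
```

Comparing such fractions and pentapeptide counts between two sets of regions is one simple way to look for the kind of distinguishing composition the abstract describes; any statistical testing or normalization against globular proteins would have to be added on top.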

Original language: English (US)
Pages (from-to): 1-14
Number of pages: 14
Journal: Journal of Biomolecular Structure and Dynamics
DOI: 10.1080/07391102.2017.1415822
State: Accepted/In press - Dec 28 2017

Fingerprint

Intrinsically Disordered Proteins
Dermatoglyphics
Amino Acids
Asparagine
Amino Acid Sequence

Keywords

  • dipeptide analysis
  • native intrinsic disorder
  • predicted intrinsic disorder
  • secondary structure prediction
  • sequence fingerprint

ASJC Scopus subject areas

  • Structural Biology
  • Molecular Biology

Cite this

Sequence fingerprints distinguish erroneous from correct predictions of intrinsically disordered protein regions. / Saravanan, Konda Mani; Dunker, A.; Krishnaswamy, Sankaran.

In: Journal of Biomolecular Structure and Dynamics, 28.12.2017, p. 1-14.

@article{c19d28b7ec174a9598bb9929815581d9,
title = "Sequence fingerprints distinguish erroneous from correct predictions of intrinsically disordered protein regions",
abstract = "More than 60 prediction methods for intrinsically disordered proteins (IDPs) have been developed over the years, many of which are accessible on the World Wide Web. Nearly all of these predictors give balanced accuracies in the ~65{\%}–~80{\%} range. Since predictors are not perfect, further studies are required to uncover the role of amino acid residues in native IDP regions as compared to predicted IDP regions. In the present work, we make use of sequences of 100{\%} predicted IDP regions, false positive disorder predictions, and experimentally determined IDP regions to distinguish the characteristics of native versus predicted IDP regions. A higher occurrence of asparagine is observed in sequences of native IDP regions but not in sequences of false positive predictions of IDP regions. The occurrences of certain combinations of amino acids at the pentapeptide level provide a distinguishing feature in the IDPs with respect to globular proteins. The distinguishing features presented in this paper provide insights into the sequence fingerprints of amino acid residues in experimentally determined as compared to predicted IDP regions. These observations and additional work along these lines should enable improvements in the accuracy of disorder prediction algorithms.",
keywords = "dipeptide analysis, native intrinsic disorder, predicted intrinsic disorder, secondary structure prediction, sequence fingerprint",
author = "Saravanan, {Konda Mani} and A. Dunker and Sankaran Krishnaswamy",
year = "2017",
month = "12",
day = "28",
doi = "10.1080/07391102.2017.1415822",
language = "English (US)",
pages = "1--14",
journal = "Journal of Biomolecular Structure and Dynamics",
issn = "0739-1102",
publisher = "Adenine Press",

}

TY - JOUR

T1 - Sequence fingerprints distinguish erroneous from correct predictions of intrinsically disordered protein regions

AU - Saravanan, Konda Mani

AU - Dunker, A.

AU - Krishnaswamy, Sankaran

PY - 2017/12/28

Y1 - 2017/12/28

AB - More than 60 prediction methods for intrinsically disordered proteins (IDPs) have been developed over the years, many of which are accessible on the World Wide Web. Nearly all of these predictors give balanced accuracies in the ~65%–~80% range. Since predictors are not perfect, further studies are required to uncover the role of amino acid residues in native IDP regions as compared to predicted IDP regions. In the present work, we make use of sequences of 100% predicted IDP regions, false positive disorder predictions, and experimentally determined IDP regions to distinguish the characteristics of native versus predicted IDP regions. A higher occurrence of asparagine is observed in sequences of native IDP regions but not in sequences of false positive predictions of IDP regions. The occurrences of certain combinations of amino acids at the pentapeptide level provide a distinguishing feature in the IDPs with respect to globular proteins. The distinguishing features presented in this paper provide insights into the sequence fingerprints of amino acid residues in experimentally determined as compared to predicted IDP regions. These observations and additional work along these lines should enable improvements in the accuracy of disorder prediction algorithms.

KW - dipeptide analysis

KW - native intrinsic disorder

KW - predicted intrinsic disorder

KW - secondary structure prediction

KW - sequence fingerprint

UR - http://www.scopus.com/inward/record.url?scp=85039158695&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85039158695&partnerID=8YFLogxK

U2 - 10.1080/07391102.2017.1415822

DO - 10.1080/07391102.2017.1415822

M3 - Article

C2 - 29228892

AN - SCOPUS:85039158695

SP - 1

EP - 14

JO - Journal of Biomolecular Structure and Dynamics

JF - Journal of Biomolecular Structure and Dynamics

SN - 0739-1102

ER -