Predicting Intrinsic Disorder From Amino Acid Sequence

Zoran Obradovic, Kang Peng, Slobodan Vucetic, Predrag Radivojac, Celeste J. Brown, A. Dunker

Research output: Contribution to journal › Article

301 Citations (Scopus)

Abstract

Blind predictions of intrinsic order and disorder were made on 42 proteins subsequently revealed to contain 9,044 ordered residues, 284 disordered residues in 26 segments of length 30 residues or less, and 281 disordered residues in 2 disordered segments of length greater than 30 residues. The accuracies of the six predictors used in this experiment ranged from 77% to 91% for the ordered regions and from 56% to 78% for the disordered segments. The average of the order and disorder predictions ranged from 73% to 77%. The prediction of disorder in the shorter segments was poor, from 25% to 66% correct, while the prediction of disorder in the longer segments was better, from 75% to 95% correct. Four of the predictors were composed of ensembles of neural networks. This enabled them to deal more efficiently with the large asymmetry in the training data through diversified sampling from the significantly larger ordered set and achieve better accuracy on ordered and long disordered regions. The exclusive use of long disordered regions for predictor training likely contributed to the disparity of the predictions on long versus short disordered regions, while averaging the output values over 61-residue windows to eliminate short predictions of order or disorder probably contributed to the even greater disparity for three of the predictors. This experiment supports the predictability of intrinsic disorder from amino acid sequence.
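The abstract names three mechanisms that are easy to misread in prose: reporting accuracy separately for ordered and disordered residues and then averaging the two, averaging predictor outputs over a 61-residue window, and training ensemble members on diversified samples drawn from the much larger ordered set. The following is a minimal Python sketch of those ideas, not the authors' implementation. The function names, the 0/1 label encoding (1 = disordered), the 0.5 decision threshold, the truncation of windows at sequence ends, and the choice to match each ordered sample to the size of the disordered set are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def smooth_scores(scores: Sequence[float], window: int = 61) -> List[float]:
    """Centered moving average of per-residue scores.

    Assumption: windows are truncated at the sequence ends rather than padded.
    """
    half = window // 2
    n = len(scores)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def call_disorder(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Convert scores to labels: 1 = disordered, 0 = ordered (assumed encoding)."""
    return [1 if s >= threshold else 0 for s in scores]


def per_class_accuracy(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float, float]:
    """Return (accuracy on ordered, accuracy on disordered, average of the two)."""
    n_ord = sum(1 for t in truth if t == 0)
    n_dis = sum(1 for t in truth if t == 1)
    if n_ord == 0 or n_dis == 0:
        raise ValueError("both classes must be present in the true labels")
    ok_ord = sum(1 for p, t in zip(pred, truth) if t == 0 and p == 0)
    ok_dis = sum(1 for p, t in zip(pred, truth) if t == 1 and p == 1)
    acc_ord = ok_ord / n_ord
    acc_dis = ok_dis / n_dis
    return acc_ord, acc_dis, (acc_ord + acc_dis) / 2


def balanced_subsets(ordered: Sequence, disordered: Sequence,
                     n_members: int, seed: int = 0) -> List[list]:
    """Build one training set per ensemble member.

    Each set pairs all disordered examples with a different random draw of
    equally many ordered examples (assumed sampling scheme).
    """
    rng = random.Random(seed)
    k = len(disordered)
    return [rng.sample(list(ordered), k) + list(disordered) for _ in range(n_members)]


if __name__ == "__main__":
    # Residue counts taken from the abstract: 9,044 ordered, 284 + 281 disordered.
    truth = [0] * 9044 + [1] * (284 + 281)
    all_ordered = [0] * len(truth)
    print(per_class_accuracy(all_ordered, truth))
```

With the residue counts from the abstract, a trivial predictor that labels every residue as ordered would be right on about 94% of all residues (9,044 of 9,609) while scoring 100% on order, 0% on disorder, and 50% on the average. That is why the averaged figure is the more informative one for a data set this unbalanced. The smoothing function also shows why short disordered segments are hard to recover after window averaging: a segment much shorter than the window contributes only a small fraction of each averaged value.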

Original language: English
Pages (from-to): 566-572
Number of pages: 7
Journal: Proteins: Structure, Function and Genetics
Volume: 53
Issue number: SUPPL. 6
DOI: https://doi.org/10.1002/prot.10532
State: Published - 2003

Fingerprint

Amino Acid Sequence
Amino Acids
Proteins
Experiments
Sampling
Neural networks

Keywords

  • Intrinsically disordered
  • Machine learning
  • Natively unfolded
  • Neural networks
  • Ordinary least squares regression

ASJC Scopus subject areas

  • Genetics
  • Structural Biology
  • Biochemistry

Cite this

Obradovic, Z., Peng, K., Vucetic, S., Radivojac, P., Brown, C. J., & Dunker, A. (2003). Predicting Intrinsic Disorder From Amino Acid Sequence. Proteins: Structure, Function and Genetics, 53(SUPPL. 6), 566-572. https://doi.org/10.1002/prot.10532

@article{a86e2f643765489b891449fe93d6b781,
title = "Predicting Intrinsic Disorder From Amino Acid Sequence",
keywords = "Intrinsically disordered, Machine learning, Natively unfolded, Neural networks, Ordinary least squares regression",
author = "Zoran Obradovic and Kang Peng and Slobodan Vucetic and Predrag Radivojac and Brown, {Celeste J.} and A. Dunker",
year = "2003",
doi = "10.1002/prot.10532",
language = "English",
volume = "53",
pages = "566--572",
journal = "Proteins: Structure, Function and Genetics",
issn = "0887-3585",
publisher = "Wiley-Liss Inc.",
number = "SUPPL. 6",

}

TY - JOUR

T1 - Predicting Intrinsic Disorder From Amino Acid Sequence

AU - Obradovic, Zoran

AU - Peng, Kang

AU - Vucetic, Slobodan

AU - Radivojac, Predrag

AU - Brown, Celeste J.

AU - Dunker, A.

PY - 2003

Y1 - 2003

KW - Intrinsically disordered

KW - Machine learning

KW - Natively unfolded

KW - Neural networks

KW - Ordinary least squares regression

UR - http://www.scopus.com/inward/record.url?scp=0242267500&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0242267500&partnerID=8YFLogxK

U2 - 10.1002/prot.10532

DO - 10.1002/prot.10532

M3 - Article

C2 - 14579347

AN - SCOPUS:0242267500

VL - 53

SP - 566

EP - 572

JO - Proteins: Structure, Function and Genetics

JF - Proteins: Structure, Function and Genetics

SN - 0887-3585

IS - SUPPL. 6

ER -