MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins

Fatemeh Miri Disfani, Wei Lun Hsu, Marcin J. Mizianty, Christopher J. Oldfield, Bin Xue, A. Dunker, Vladimir N. Uversky, Lukasz Kurgan

Research output: Contribution to journal, Article

183 Citations (Scopus)

Abstract

Motivation: Molecular recognition features (MoRFs) are short binding regions located within longer intrinsically disordered regions that bind to protein partners via disorder-to-order transitions. MoRFs are implicated in important processes including signaling and regulation. However, only a limited number of experimentally validated MoRFs are known, which motivates development of computational methods that predict MoRFs from protein chains.

Results: We introduce a new MoRF predictor, MoRFpred, which identifies all MoRF types (α, β, coil and complex). We develop a comprehensive dataset of annotated MoRFs to build and empirically compare our method. MoRFpred utilizes a novel design in which annotations generated by sequence alignment are fused with predictions generated by a Support Vector Machine (SVM), which uses a custom-designed set of sequence-derived features. The features provide information about evolutionary profiles, selected physicochemical properties of amino acids, and predicted disorder, solvent accessibility and B-factors. Empirical evaluation on several datasets shows that MoRFpred outperforms related methods: α-MoRF-Pred, which predicts α-MoRFs, and ANCHOR, which finds disordered regions that become ordered when bound to a globular partner. We show that our predicted (new) MoRF regions have non-random sequence similarity with native MoRFs. We use this observation, along with the fact that predictions with higher probability are more accurate, to identify putative MoRF regions. We also identify a few sequence-derived hallmarks of MoRFs. They are characterized by dips in the disorder predictions and higher hydrophobicity and stability when compared to adjacent (in the chain) residues.

Original language: English
Article number: bts209
Journal: Bioinformatics
Volume: 28
Issue number: 12
DOI: 10.1093/bioinformatics/bts209
State: Published, Jun 2012

Fingerprint

Feature Recognition
Molecular recognition
Disorder
Proteins
Protein
Prediction
Sequence Alignment
Hydrophobic and Hydrophilic Interactions
Amino Acids
Predict
Datasets
Hydrophobicity
Computational methods
Coil
Accessibility
Support vector machines
Annotation

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability
  • Medicine(all)

Cite this

MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins. / Disfani, Fatemeh Miri; Hsu, Wei Lun; Mizianty, Marcin J.; Oldfield, Christopher J.; Xue, Bin; Dunker, A.; Uversky, Vladimir N.; Kurgan, Lukasz.

In: Bioinformatics, Vol. 28, No. 12, bts209, 06.2012.

Disfani, Fatemeh Miri ; Hsu, Wei Lun ; Mizianty, Marcin J. ; Oldfield, Christopher J. ; Xue, Bin ; Dunker, A. ; Uversky, Vladimir N. ; Kurgan, Lukasz. / MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins. In: Bioinformatics. 2012 ; Vol. 28, No. 12.
@article{b2031653d8b0415c8be11ceabb9c299b,
title = "MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins",
abstract = "Motivation: Molecular recognition features (MoRFs) are short binding regions located within longer intrinsically disordered regions that bind to protein partners via disorder-to-order transitions. MoRFs are implicated in important processes including signaling and regulation. However, only a limited number of experimentally validated MoRFs is known, which motivates development of computational methods that predict MoRFs from protein chains. Results: We introduce a new MoRF predictor, MoRFpred, which identifies all MoRF types (α, β, coil and complex). We develop a comprehensive dataset of annotated MoRFs to build and empirically compare our method. MoRFpred utilizes a novel design in which annotations generated by sequence alignment are fused with predictions generated by a Support Vector Machine (SVM), which uses a custom designed set of sequence-derived features. The features provide information about evolutionary profiles, selected physiochemical properties of amino acids, and predicted disorder, solvent accessibility and B-factors. Empirical evaluation on several datasets shows that MoRFpred outperforms related methods: α-MoRF-Pred that predicts α-MoRFs and ANCHOR which finds disordered regions that become ordered when bound to a globular partner. We show that our predicted (new) MoRF regions have non-random sequence similarity with native MoRFs. We use this observation along with the fact that predictions with higher probability are more accurate to identify putative MoRF regions. We also identify a few sequence-derived hallmarks of MoRFs. They are characterized by dips in the disorder predictions and higher hydrophobicity and stability when compared to adjacent (in the chain) residues.",
author = "Disfani, {Fatemeh Miri} and Hsu, {Wei Lun} and Mizianty, {Marcin J.} and Oldfield, {Christopher J.} and Xue, Bin and Dunker, A. and Uversky, {Vladimir N.} and Kurgan, Lukasz",
year = "2012",
month = "6",
doi = "10.1093/bioinformatics/bts209",
language = "English",
volume = "28",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "12",

}

TY - JOUR
T1 - MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins
AU - Disfani, Fatemeh Miri
AU - Hsu, Wei Lun
AU - Mizianty, Marcin J.
AU - Oldfield, Christopher J.
AU - Xue, Bin
AU - Dunker, A.
AU - Uversky, Vladimir N.
AU - Kurgan, Lukasz
PY - 2012/6
Y1 - 2012/6

AB - Motivation: Molecular recognition features (MoRFs) are short binding regions located within longer intrinsically disordered regions that bind to protein partners via disorder-to-order transitions. MoRFs are implicated in important processes including signaling and regulation. However, only a limited number of experimentally validated MoRFs is known, which motivates development of computational methods that predict MoRFs from protein chains. Results: We introduce a new MoRF predictor, MoRFpred, which identifies all MoRF types (α, β, coil and complex). We develop a comprehensive dataset of annotated MoRFs to build and empirically compare our method. MoRFpred utilizes a novel design in which annotations generated by sequence alignment are fused with predictions generated by a Support Vector Machine (SVM), which uses a custom designed set of sequence-derived features. The features provide information about evolutionary profiles, selected physiochemical properties of amino acids, and predicted disorder, solvent accessibility and B-factors. Empirical evaluation on several datasets shows that MoRFpred outperforms related methods: α-MoRF-Pred that predicts α-MoRFs and ANCHOR which finds disordered regions that become ordered when bound to a globular partner. We show that our predicted (new) MoRF regions have non-random sequence similarity with native MoRFs. We use this observation along with the fact that predictions with higher probability are more accurate to identify putative MoRF regions. We also identify a few sequence-derived hallmarks of MoRFs. They are characterized by dips in the disorder predictions and higher hydrophobicity and stability when compared to adjacent (in the chain) residues.

UR - http://www.scopus.com/inward/record.url?scp=84863518014&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863518014&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bts209
DO - 10.1093/bioinformatics/bts209
M3 - Article
C2 - 22689782
AN - SCOPUS:84863518014
VL - 28
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 12
M1 - bts209
ER -