Use of conditional probabilities for determining relationships between amino acid sequence and protein secondary structure

G. E. Arnold, A. Dunker, S. J. Johns, R. J. Douthart

Research output: Contribution to journal (Article)

16 Citations (Scopus)

Abstract

The conditional probability, P(σ|x), is a statement of the probability that the value of σ will be found given the prior information that a value of x has been observed. Here σ represents any one of the secondary structure types, α, β, τ, and ρ for helix, sheet, turn, and random, respectively, and x represents a sequence attribute, including, but not limited to: (1) hydropathy; (2) hydrophobic moments assuming helix and sheet; (3) Richardson and Richardson helical N-cap and C-cap values; (4) Chou-Fasman conformational parameters for helix, P(α), for sheet, P(β), and for turn, P(τ); and (5) Garnier, Osguthorpe, and Robson (GOR) information values for helix, I(α), for sheet, I(β), for turn, I(τ), and for random structure, I(ρ). Plots of P(σ|x) vs. x are demonstrated to provide information about the correlation between structure and attribute, σ and x. The separations between different P(σ|x) vs. x curves indicate the capacity of a given attribute to discriminate between different secondary structural types and permit comparison of different attributes. P(α|x), P(β|x), P(τ|x), and P(ρ|x) vs. x plots show that the most useful attributes for discriminating helix are, in order: hydrophobic moment assuming helix > P(α) >> N-cap > C-cap ≃ I(α) ≃ I(τ). The information value for turns, I(τ), was found to discriminate helix better than turns. Discrimination for sheet was found to be in the following order: I(β) >> P(β) ≃ hydropathy > I(ρ) ≃ hydrophobic moment assuming sheet. Three attributes, at their low values, were found to give significant discrimination for the absence of helix: I(α) ≃ P(α) ≃ hydrophobic moment assuming helix. Also, three other attributes were found to indicate the absence of sheet: P(β) >> I(ρ) ≃ hydropathy. Indications of the absence of σ could be as useful for some applications as the indication of the presence of σ.
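
The abstract does not say how P(σ|x) is estimated, so the following is only a minimal sketch under one assumption: that residues are grouped into fixed-width bins of the attribute x and P(σ|x) is taken as the fraction of residues in each bin that have structure σ. The function name, the bin width, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

# Secondary structure types named in the abstract (alpha, beta, tau, rho).
STATES = ("helix", "sheet", "turn", "random")


def conditional_probabilities(samples, bin_width):
    """Estimate P(state | x) for each bin of x from (x, state) pairs.

    Assumption: a simple histogram estimate, not the authors' procedure.
    """
    counts = defaultdict(Counter)
    for x, state in samples:
        bin_index = int(x // bin_width)  # floor division also handles negative x
        counts[bin_index][state] += 1

    table = {}
    for bin_index, per_state in sorted(counts.items()):
        total = sum(per_state.values())
        # Fraction of residues in this bin observed in each state.
        table[bin_index] = {s: per_state[s] / total for s in STATES}
    return table


# Toy (attribute value, observed structure) pairs, invented for illustration.
samples = [
    (0.2, "helix"), (0.4, "helix"), (0.3, "sheet"), (0.1, "random"),
    (1.1, "helix"), (1.5, "helix"), (1.7, "helix"), (1.9, "turn"),
]
table = conditional_probabilities(samples, bin_width=1.0)
for bin_index, probs in table.items():
    print(bin_index, probs)
```

Plotting each column of such a table against the bin centers would give curves of the P(σ|x) vs. x kind; the wider the gap between the curves over some range of x, the better that attribute separates the structure types there.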

Original language: English (US)
Pages (from-to): 382-399
Number of pages: 18
Journal: Proteins: Structure, Function and Genetics
Volume: 12
Issue number: 4
DOI: 10.1002/prot.340120410
State: Published - 1992
Externally published: Yes

Fingerprint

Secondary Protein Structure
Amino Acid Sequence
Amino Acids
Proteins

Keywords

  • amino acid sequence
  • distributions
  • protein secondary structure

ASJC Scopus subject areas

  • Genetics
  • Structural Biology
  • Biochemistry

Cite this

Use of conditional probabilities for determining relationships between amino acid sequence and protein secondary structure. / Arnold, G. E.; Dunker, A.; Johns, S. J.; Douthart, R. J.

In: Proteins: Structure, Function and Genetics, Vol. 12, No. 4, 1992, p. 382-399.


@article{78b7b6022e7245b3956365fae44fdcc1,
title = "Use of conditional probabilities for determining relationships between amino acid sequence and protein secondary structure",
abstract = "The conditional probability, P(σ|x), is a statement of the probability that the value of σ will be found given the prior information that a value of x has been observed. Here σ represents any one of the secondary structure types, α, β, τ, and ρ for helix, sheet, turn, and random, respectively, and x represents a sequence attribute, including, but not limited to: (1) hydropathy; (2) hydrophobic moments assuming helix and sheet; (3) Richardson and Richardson helical N-cap and C-cap values; (4) Chou-Fasman conformational parameters for helix, P(α), for sheet, P(β), and for turn, P(τ); and (5) Garnier, Osguthorpe, and Robson (GOR) information values for helix, I(α), for sheet, I(β), for turn, I(τ), and for random structure, I(ρ). Plots of P(σ|x) vs. x are demonstrated to provide information about the correlation between structure and attribute, σ and x. The separations between different P(σ|x) vs. x curves indicate the capacity of a given attribute to discriminate between different secondary structural types and permit comparison of different attributes. P(α|x), P(β|x), P(τ|x), and P(ρ|x) vs. x plots show that the most useful attributes for discriminating helix are, in order: hydrophobic moment assuming helix > P(α) >> N-cap > C-cap ≃ I(α) ≃ I(τ). The information value for turns, I(τ), was found to discriminate helix better than turns. Discrimination for sheet was found to be in the following order: I(β) >> P(β) ≃ hydropathy > I(ρ) ≃ hydrophobic moment assuming sheet. Three attributes, at their low values, were found to give significant discrimination for the absence of helix: I(α) ≃ P(α) ≃ hydrophobic moment assuming helix. Also, three other attributes were found to indicate the absence of sheet: P(β) >> I(ρ) ≃ hydropathy. Indications of the absence of σ could be as useful for some applications as the indication of the presence of σ.",
keywords = "amino acid sequence, distributions, protein secondary structure",
author = "Arnold, {G. E.} and Dunker, A. and Johns, {S. J.} and Douthart, {R. J.}",
year = "1992",
doi = "10.1002/prot.340120410",
language = "English (US)",
volume = "12",
pages = "382--399",
journal = "Proteins: Structure, Function and Genetics",
issn = "0887-3585",
publisher = "Wiley-Liss Inc.",
number = "4"
}

TY - JOUR

T1 - Use of conditional probabilities for determining relationships between amino acid sequence and protein secondary structure

AU - Arnold, G. E.

AU - Dunker, A.

AU - Johns, S. J.

AU - Douthart, R. J.

PY - 1992

Y1 - 1992

N2 - The conditional probability, P(σ|x), is a statement of the probability that the value of σ will be found given the prior information that a value of x has been observed. Here σ represents any one of the secondary structure types, α, β, τ, and ρ for helix, sheet, turn, and random, respectively, and x represents a sequence attribute, including, but not limited to: (1) hydropathy; (2) hydrophobic moments assuming helix and sheet; (3) Richardson and Richardson helical N-cap and C-cap values; (4) Chou-Fasman conformational parameters for helix, P(α), for sheet, P(β), and for turn, P(τ); and (5) Garnier, Osguthorpe, and Robson (GOR) information values for helix, I(α), for sheet, I(β), for turn, I(τ), and for random structure, I(ρ). Plots of P(σ|x) vs. x are demonstrated to provide information about the correlation between structure and attribute, σ and x. The separations between different P(σ|x) vs. x curves indicate the capacity of a given attribute to discriminate between different secondary structural types and permit comparison of different attributes. P(α|x), P(β|x), P(τ|x), and P(ρ|x) vs. x plots show that the most useful attributes for discriminating helix are, in order: hydrophobic moment assuming helix > P(α) >> N-cap > C-cap ≃ I(α) ≃ I(τ). The information value for turns, I(τ), was found to discriminate helix better than turns. Discrimination for sheet was found to be in the following order: I(β) >> P(β) ≃ hydropathy > I(ρ) ≃ hydrophobic moment assuming sheet. Three attributes, at their low values, were found to give significant discrimination for the absence of helix: I(α) ≃ P(α) ≃ hydrophobic moment assuming helix. Also, three other attributes were found to indicate the absence of sheet: P(β) >> I(ρ) ≃ hydropathy. Indications of the absence of σ could be as useful for some applications as the indication of the presence of σ.


KW - amino acid sequence

KW - distributions

KW - protein secondary structure

UR - http://www.scopus.com/inward/record.url?scp=0026629597&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0026629597&partnerID=8YFLogxK

U2 - 10.1002/prot.340120410

DO - 10.1002/prot.340120410

M3 - Article

VL - 12

SP - 382

EP - 399

JO - Proteins: Structure, Function and Genetics

JF - Proteins: Structure, Function and Genetics

SN - 0887-3585

IS - 4

ER -