Enrichment of chemical libraries docked to protein conformational ensembles and application to aldehyde dehydrogenase 2

Bo Wang, Cameron D. Buchman, Liwei Li, Thomas Hurley, Samy Meroueh

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Molecular recognition is a complex process that involves a large ensemble of structures of the receptor and ligand. Yet, most structure-based virtual screening is carried out on a single structure typically from X-ray crystallography. Explicit-solvent molecular dynamics (MD) simulations offer an opportunity to sample multiple conformational states of a protein. Here we evaluate our recently developed scoring method SVMSP in its ability to enrich chemical libraries docked to MD structures of seven proteins from the Directory of Useful Decoys (DUD). SVMSP is a target-specific rescoring method that combines machine learning with statistical potentials. We find that enrichment power as measured by the area under the ROC curve (ROC-AUC) is not affected by increasing the number of MD structures. Among individual MD snapshots, many exhibited enrichment that was significantly better than the crystal structure, but no correlation between enrichment and structural deviation from crystal structure was found. We followed an innovative approach by training SVMSP scoring models using MD structures (SVMSPMD). The resulting models were applied to two difficult cases (p38 and CDK2) for which enrichment was not better than random. We found remarkable increase in enrichment power, particularly for p38, where the ROC-AUC increased by 0.30 to 0.85. Finally, we explored approaches for a priori identification of MD snapshots with high enrichment power from an MD simulation in the absence of active compounds. We found that the use of randomly selected compounds docked to the target of interest using SVMSP led to notable enrichment for EGFR and Src MD snapshots. SVMSP rescoring of protein-compound MD structures was applied for the search of small-molecule inhibitors of the mitochondrial enzyme aldehyde dehydrogenase 2 (ALDH2). Rank-ordering of a commercial library of 50000 compounds docked to MD structures of ALDH2 led to five small-molecule inhibitors. Four compounds had IC50s below 5 μM. These compounds serve as leads for the design and synthesis of more potent and selective ALDH2 inhibitors.
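
The abstract reports enrichment as the area under the ROC curve (ROC-AUC). As a rough illustration only, and not the authors' SVMSP pipeline, the sketch below computes ROC-AUC for a ranked library as the fraction of active/decoy pairs in which the active is scored higher. The function name `roc_auc`, the convention that a higher score means a better rank, and the example scores are all assumptions made for this sketch.

```python
from typing import Sequence


def roc_auc(active_scores: Sequence[float], decoy_scores: Sequence[float]) -> float:
    """ROC-AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    Assumes higher score = better predicted binder (an assumption of this sketch).
    Returns 1.0 for perfect separation and 0.5 for random ranking.
    """
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy")

    correctly_ranked = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                correctly_ranked += 1.0   # active ranked above decoy
            elif a == d:
                correctly_ranked += 0.5   # ties count as half
    return correctly_ranked / (len(active_scores) * len(decoy_scores))


# Hypothetical scores for three actives and four decoys.
actives = [0.9, 0.8, 0.4]
decoys = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(actives, decoys)  # 11 of the 12 active/decoy pairs are ranked correctly
print(f"ROC-AUC = {auc:.3f}")
```

On this scale, the "not better than random" cases mentioned above correspond to values near 0.5, and the reported p38 result corresponds to 0.85.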

Original language: English
Pages (from-to): 2105-2116
Number of pages: 12
Journal: Journal of Chemical Information and Modeling
Volume: 54
Issue number: 7
DOI: 10.1021/ci5002026
State: Published - Jul 28 2014

Fingerprint

Aldehyde Dehydrogenase
Aldehydes
Molecular dynamics
Proteins
Crystal structure
Oxidoreductases
Molecular recognition
simulation
Molecules
X ray crystallography
Computer simulation
Learning systems
Screening
Enzymes
Ligands

ASJC Scopus subject areas

  • Chemistry(all)
  • Chemical Engineering(all)
  • Computer Science Applications
  • Library and Information Sciences

Cite this

Enrichment of chemical libraries docked to protein conformational ensembles and application to aldehyde dehydrogenase 2. / Wang, Bo; Buchman, Cameron D.; Li, Liwei; Hurley, Thomas; Meroueh, Samy.

In: Journal of Chemical Information and Modeling, Vol. 54, No. 7, 28.07.2014, p. 2105-2116.

@article{d89a6d453b424400a332a7aa713a1b20,
title = "Enrichment of chemical libraries docked to protein conformational ensembles and application to aldehyde dehydrogenase 2",
abstract = "Molecular recognition is a complex process that involves a large ensemble of structures of the receptor and ligand. Yet, most structure-based virtual screening is carried out on a single structure typically from X-ray crystallography. Explicit-solvent molecular dynamics (MD) simulations offer an opportunity to sample multiple conformational states of a protein. Here we evaluate our recently developed scoring method SVMSP in its ability to enrich chemical libraries docked to MD structures of seven proteins from the Directory of Useful Decoys (DUD). SVMSP is a target-specific rescoring method that combines machine learning with statistical potentials. We find that enrichment power as measured by the area under the ROC curve (ROC-AUC) is not affected by increasing the number of MD structures. Among individual MD snapshots, many exhibited enrichment that was significantly better than the crystal structure, but no correlation between enrichment and structural deviation from crystal structure was found. We followed an innovative approach by training SVMSP scoring models using MD structures (SVMSPMD). The resulting models were applied to two difficult cases (p38 and CDK2) for which enrichment was not better than random. We found remarkable increase in enrichment power, particularly for p38, where the ROC-AUC increased by 0.30 to 0.85. Finally, we explored approaches for a priori identification of MD snapshots with high enrichment power from an MD simulation in the absence of active compounds. We found that the use of randomly selected compounds docked to the target of interest using SVMSP led to notable enrichment for EGFR and Src MD snapshots. SVMSP rescoring of protein-compound MD structures was applied for the search of small-molecule inhibitors of the mitochondrial enzyme aldehyde dehydrogenase 2 (ALDH2). Rank-ordering of a commercial library of 50000 compounds docked to MD structures of ALDH2 led to five small-molecule inhibitors. 
Four compounds had IC50s below 5 μM. These compounds serve as leads for the design and synthesis of more potent and selective ALDH2 inhibitors.",
author = "Bo Wang and Buchman, {Cameron D.} and Liwei Li and Thomas Hurley and Samy Meroueh",
year = "2014",
month = "7",
day = "28",
doi = "10.1021/ci5002026",
language = "English",
volume = "54",
pages = "2105--2116",
journal = "Journal of Chemical Information and Modeling",
issn = "1549-9596",
publisher = "American Chemical Society",
number = "7",

}

TY - JOUR

T1 - Enrichment of chemical libraries docked to protein conformational ensembles and application to aldehyde dehydrogenase 2

AU - Wang, Bo

AU - Buchman, Cameron D.

AU - Li, Liwei

AU - Hurley, Thomas

AU - Meroueh, Samy

PY - 2014/7/28

Y1 - 2014/7/28

N2 - Molecular recognition is a complex process that involves a large ensemble of structures of the receptor and ligand. Yet, most structure-based virtual screening is carried out on a single structure typically from X-ray crystallography. Explicit-solvent molecular dynamics (MD) simulations offer an opportunity to sample multiple conformational states of a protein. Here we evaluate our recently developed scoring method SVMSP in its ability to enrich chemical libraries docked to MD structures of seven proteins from the Directory of Useful Decoys (DUD). SVMSP is a target-specific rescoring method that combines machine learning with statistical potentials. We find that enrichment power as measured by the area under the ROC curve (ROC-AUC) is not affected by increasing the number of MD structures. Among individual MD snapshots, many exhibited enrichment that was significantly better than the crystal structure, but no correlation between enrichment and structural deviation from crystal structure was found. We followed an innovative approach by training SVMSP scoring models using MD structures (SVMSPMD). The resulting models were applied to two difficult cases (p38 and CDK2) for which enrichment was not better than random. We found remarkable increase in enrichment power, particularly for p38, where the ROC-AUC increased by 0.30 to 0.85. Finally, we explored approaches for a priori identification of MD snapshots with high enrichment power from an MD simulation in the absence of active compounds. We found that the use of randomly selected compounds docked to the target of interest using SVMSP led to notable enrichment for EGFR and Src MD snapshots. SVMSP rescoring of protein-compound MD structures was applied for the search of small-molecule inhibitors of the mitochondrial enzyme aldehyde dehydrogenase 2 (ALDH2). Rank-ordering of a commercial library of 50000 compounds docked to MD structures of ALDH2 led to five small-molecule inhibitors. 
Four compounds had IC50s below 5 μM. These compounds serve as leads for the design and synthesis of more potent and selective ALDH2 inhibitors.

AB - Molecular recognition is a complex process that involves a large ensemble of structures of the receptor and ligand. Yet, most structure-based virtual screening is carried out on a single structure typically from X-ray crystallography. Explicit-solvent molecular dynamics (MD) simulations offer an opportunity to sample multiple conformational states of a protein. Here we evaluate our recently developed scoring method SVMSP in its ability to enrich chemical libraries docked to MD structures of seven proteins from the Directory of Useful Decoys (DUD). SVMSP is a target-specific rescoring method that combines machine learning with statistical potentials. We find that enrichment power as measured by the area under the ROC curve (ROC-AUC) is not affected by increasing the number of MD structures. Among individual MD snapshots, many exhibited enrichment that was significantly better than the crystal structure, but no correlation between enrichment and structural deviation from crystal structure was found. We followed an innovative approach by training SVMSP scoring models using MD structures (SVMSPMD). The resulting models were applied to two difficult cases (p38 and CDK2) for which enrichment was not better than random. We found remarkable increase in enrichment power, particularly for p38, where the ROC-AUC increased by 0.30 to 0.85. Finally, we explored approaches for a priori identification of MD snapshots with high enrichment power from an MD simulation in the absence of active compounds. We found that the use of randomly selected compounds docked to the target of interest using SVMSP led to notable enrichment for EGFR and Src MD snapshots. SVMSP rescoring of protein-compound MD structures was applied for the search of small-molecule inhibitors of the mitochondrial enzyme aldehyde dehydrogenase 2 (ALDH2). Rank-ordering of a commercial library of 50000 compounds docked to MD structures of ALDH2 led to five small-molecule inhibitors. 
Four compounds had IC50s below 5 μM. These compounds serve as leads for the design and synthesis of more potent and selective ALDH2 inhibitors.

UR - http://www.scopus.com/inward/record.url?scp=84905025315&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84905025315&partnerID=8YFLogxK

U2 - 10.1021/ci5002026

DO - 10.1021/ci5002026

M3 - Article

VL - 54

SP - 2105

EP - 2116

JO - Journal of Chemical Information and Modeling

JF - Journal of Chemical Information and Modeling

SN - 1549-9596

IS - 7

ER -