Poster Presentation 50th Lorne Proteins Conference 2025

Using machine learning and enhanced sampling to expand the toolbox in structure-based drug discovery (#402)

Albert Ardevol 1, Thomas Coudrat 1, Alex Caputo 1, Matt Dennis 1, Davy Guan 2, Leo Lebrat 1,3, Hendrik Falk 1, Katherine Locock 1, Lewis Blackman 1
  1. CSIRO, Clayton, VIC, Australia
  2. Data61, CSIRO, Eveleigh, NSW, Australia
  3. Data61, CSIRO, Herston, QLD, Australia

Molecular docking and virtual screening are essential for early drug discovery, enabling the rapid evaluation of large compound libraries. However, success in the hit identification phase relies on a series of theoretical assumptions that may not hold in practice. Identifying not only the protein target but also the relevant 3D structure of the protein(s) can be challenging, especially when the binding site is cryptic or exists in an ensemble of metastable states. Integrating this complexity into a virtual screening workflow capable of screening ultra-large compound libraries that cover a comprehensive chemical space requires not only extensive computational resources but also new methods that can effectively overcome past limitations.

To address this, molecular dynamics (MD) methods were incorporated to identify new binding sites, providing a comprehensive view of target flexibility. A machine learning (ML)-enabled pipeline was developed using deep learning and active learning, trained on small initial docking datasets to efficiently predict binding affinities across billions of compounds. Experimental validation of top-ranked compounds confirmed the power of integrating ML models with MD insights, accelerating the identification of promising antivirals and antibacterials. This combined strategy demonstrates that AI-driven screening, paired with MD, can efficiently explore vast chemical spaces to discover new potential therapeutics.
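
The active-learning idea described above (dock a small subset, train a surrogate model on those scores, then dock only the compounds the model ranks best) can be illustrated with a minimal sketch. This is not the authors' pipeline: the ridge-regression surrogate stands in for a deep learning model, `mock_dock` is a hypothetical replacement for a real docking run, the greedy selection rule is an assumption, and the compound features are synthetic random vectors.

```python
import numpy as np


def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression; a toy stand-in for a deep learning surrogate."""
    A = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)


def active_learning_screen(library, dock, n_initial=50, batch_size=50,
                           n_rounds=5, seed=0):
    """Dock a small random subset, then repeatedly retrain the surrogate and
    dock the compounds it predicts to score best (lowest score = best).

    Returns a dict mapping compound index -> docking score for every
    compound that was actually docked.
    """
    rng = np.random.default_rng(seed)
    first = rng.choice(library.shape[0], size=n_initial, replace=False)
    scores = {int(i): float(s) for i, s in zip(first, dock(library[first]))}

    for _ in range(n_rounds):
        # Retrain the surrogate on everything docked so far.
        idx = np.fromiter(scores.keys(), dtype=int)
        y = np.fromiter(scores.values(), dtype=float)
        weights = fit_ridge(library[idx], y)

        # Predict scores for the whole library, skipping docked compounds.
        predicted = library @ weights
        predicted[idx] = np.inf
        batch = np.argsort(predicted)[:batch_size]  # greedy acquisition (assumed)
        for i, s in zip(batch, dock(library[batch])):
            scores[int(i)] = float(s)
    return scores


# Synthetic demo: 2000 hypothetical compounds with 16 random features each.
rng = np.random.default_rng(1)
library = rng.normal(size=(2000, 16))
true_weights = rng.normal(size=16)


def mock_dock(features):
    """Hypothetical docking oracle: a hidden linear function of the features."""
    return features @ true_weights


docked = active_learning_screen(library, mock_dock)
best = min(docked, key=docked.get)
print(f"docked {len(docked)} of {len(library)} compounds; best index {best}")
```

In this toy setting only a fraction of the library is ever docked, while the surrogate scores every compound cheaply. That trade-off is what makes screening at the billion-compound scale tractable in principle.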