CCDC, Exscientia, and Oxford University are collaborating on an automated, quantitative method for informing the design of compound selectivity across protein families.

The amount of structural data on protein drug targets continues to grow. However, successfully mining this data to form testable hypotheses that drive drug discovery can prove challenging. Selectivity for the target protein is a crucial property in the development of new therapeutics. In a recent paper in the Journal of Chemical Information and Modeling, authors from the CCDC, Exscientia, and Oxford University show how an automated process leveraging ‘ensemble hotspot maps’ can identify key structural differences that contribute to the selectivity of a compound for one protein over another.
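As a rough illustration of the ensemble idea, the Python sketch below combines several per-structure maps for each protein into one ensemble map and then subtracts one ensemble from the other to highlight regions that favour the target. It is a minimal sketch and not the published procedure: the use of a per-voxel median, the dictionary representation, the function names, and all scores are assumptions made for this example.

```python
from statistics import median

def ensemble_map(maps):
    """Combine per-structure maps (voxel -> score) by taking the median per voxel.

    A voxel missing from one structure's map is treated as scoring 0.0.
    The median is an assumption of this sketch, chosen to damp outliers.
    """
    voxels = set().union(*maps)
    return {v: median(m.get(v, 0.0) for m in maps) for v in voxels}

def difference_map(target, off_target):
    """Per-voxel difference: positive values favour the target protein."""
    voxels = set(target) | set(off_target)
    return {v: target.get(v, 0.0) - off_target.get(v, 0.0) for v in voxels}

def selective_voxels(diff, threshold):
    """Voxels whose difference score meets a (hypothetical) threshold."""
    return sorted(v for v, d in diff.items() if d >= threshold)

# Made-up scores for three structures of each protein, on two shared voxels.
target_maps = [
    {(0, 0, 0): 12.0, (1, 0, 0): 4.0},
    {(0, 0, 0): 14.0, (1, 0, 0): 5.0},
    {(0, 0, 0): 10.0, (1, 0, 0): 6.0},
]
off_target_maps = [
    {(0, 0, 0): 2.0, (1, 0, 0): 5.0},
    {(0, 0, 0): 3.0, (1, 0, 0): 4.0},
    {(0, 0, 0): 1.0, (1, 0, 0): 6.0},
]

diff = difference_map(ensemble_map(target_maps), ensemble_map(off_target_maps))
selective = selective_voxels(diff, threshold=5.0)
print(selective)
```

In this toy case only the first voxel scores consistently higher in the target protein, so it is the only one flagged as a possible selectivity-determining region.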

The power of hotspot mapping to advance drug design

Hotspot mapping quantifies the propensity for compounds to exploit interactions in a preferred binding site—providing a 3D grid of data to help score and prioritize compounds. The power of this method lies in how it finds key interactions during early-phase drug discovery and then distils the information into easily interpretable results.

Chris Radoux is Head of Structural Bioinformatics at Exscientia and a co-author on the paper. He explained: “Adding hotspot maps early in a drug discovery project can provide a molecular blueprint using the protein structure alone.

“This can be used to help determine how druggable a given pocket of a target protein is and to prioritise fragment starting points for compound design. The highest scoring interactions can then be used to guide computational methods and algorithms.”
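To make the idea of scoring compounds against a 3D grid concrete, the following Python sketch samples a small, made-up grid at each atom position and ranks two hypothetical compounds by their mean score. It does not use the Hotspots API; the function names, grid values, and coordinates are all illustrative assumptions.

```python
from math import floor

def make_grid(shape, fill=0.0):
    """Build an nx x ny x nz grid of scores as nested lists."""
    nx, ny, nz = shape
    return [[[fill] * nz for _ in range(ny)] for _ in range(nx)]

def voxel_index(point, origin, spacing):
    """Map a 3D coordinate to the index of its nearest grid point."""
    return tuple(floor((p - o) / spacing + 0.5) for p, o in zip(point, origin))

def grid_value(grid, index):
    """Return the score at an index, or 0.0 if the index is off the grid."""
    i, j, k = index
    if 0 <= i < len(grid) and 0 <= j < len(grid[0]) and 0 <= k < len(grid[0][0]):
        return grid[i][j][k]
    return 0.0

def score_compound(atoms, grid, origin, spacing):
    """Mean grid score over a compound's atom positions (an assumed scoring rule)."""
    if not atoms:
        return 0.0
    values = [grid_value(grid, voxel_index(a, origin, spacing)) for a in atoms]
    return sum(values) / len(values)

# Hypothetical 4 x 4 x 4 grid with 1.0 spacing and two high-scoring points.
origin = (0.0, 0.0, 0.0)
spacing = 1.0
grid = make_grid((4, 4, 4))
grid[1][1][1] = 10.0
grid[2][1][1] = 6.0

# Hypothetical compounds, each given as a list of atom coordinates.
compounds = {
    "A": [(1.0, 1.0, 1.0), (2.1, 0.9, 1.0)],  # sits on both high-scoring points
    "B": [(0.0, 0.0, 0.0), (3.0, 3.0, 3.0)],  # misses them entirely
}

scores = {name: score_compound(atoms, grid, origin, spacing)
          for name, atoms in compounds.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Averaging over atoms is only one possible choice; a real workflow would typically use separate grids for different interaction types, such as donor, acceptor, and apolar, and match each atom to the appropriate one.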

Leveraging real-world, empirical data for reliability

The hotspot maps are generated with a Python package called ‘Hotspots API’, which leverages the data in the Cambridge Structural Database (CSD) via CCDC’s IsoStar library of interactions. The CSD is the world’s repository for small-molecule organic and metal-organic crystal structures, containing over 1.1 million structures from X-ray and neutron diffraction analyses. IsoStar is a web application that uses the CSD to generate thousands of interactive 3D scatterplots that show the probability of occurrence and spatial characteristics of interactions between pairs of chemical functional groups.

Dr Jason Cole is a Senior Research Fellow at CCDC. He said: “Using CSD data for this type of analysis provides different insights from energy-calculation-based methods, as the interactions observed in the CSD are influenced by more than their strength.”

Study impacts

Exscientia is a global leader in pharmatech, which sits at the interface of advanced AI application and complex drug discovery. The company has implemented hotspot mapping in-house within multiple drug discovery programs and uses it to guide target validation and drug design. In addition, a research team at the University of Cambridge recently reported in Nature how they used fragment hotspot mapping to identify structures that may assist in designing DNA-dependent protein kinase catalytic subunit inhibitors, which show potential as cancer therapeutics.

Read the papers

Mihaela D. Smilova, Peter R. Curran, Chris J. Radoux, Frank von Delft, Jason C. Cole, Anthony R. Bradley, and Brian D. Marsden. J. Chem. Inf. Model. 2022, 62 (2), 284–294. DOI: 10.1021/acs.jcim.1c00823

Curran, P. R. et al. “Hotspots API: A Python Package for the Detection of Small Molecule Binding Hotspots and Application to Structure-Based Drug Design.” J. Chem. Inf. Model. 2020, 1911. DOI:

Liang, S., Thomas, S. E., Chaplin, A. K. et al. “Structural insights into inhibitor regulation of the DNA repair protein DNA-PKcs.” Nature 2022, 601, 643–648. DOI: 10.1038/s41586-021-04274-9