ChEMBL extracts data from the core medicinal chemistry literature and therefore reflects ongoing developments in drug discovery. One area currently attracting high interest is targeted protein degradation: compounds that direct disease-causing proteins to the cell’s degradation machinery. Data on these modalities are already present in ChEMBL and growing rapidly, providing an important data source for the community.
However, new and emerging modalities bring new challenges. Data should be well structured and FAIR, but for newer modalities, controlled vocabularies may not exist or may not adequately cover the breadth of data being reported. Curation effort in this area supports our broader goals of generating bespoke datasets and improving AI-readiness.
We recently had the opportunity to attend the first ISCB UK conference, where we presented our work towards the capture and annotation of diverse targeted protein degraders, including modalities that use the proteasome, lysosome, and heat shock proteins to destroy their targets. We showcased our data, including for Vepdegestrant (CHEMBL5095210), the first approved PROTAC, which is indicated for ER-positive breast cancer.
You can now watch our presentation or view our slides to find out how ChEMBL is progressing in this area and how we are working to understand, capture, and enhance the annotation of these entities. Our first round of curation has produced a dataset of degrader-target interactions, each characterised by associated bioactivity measurements and supported by metadata embedded within the ChEMBL data ecosystem. Our next release (version 37) will have a new MODALITY flag enabling easy extraction of Targeted Protein Degradation data from ChEMBL. Further refinement and (re)structuring of these data are also underway for future releases, which will better support downstream applications such as drug discovery and AI.
If you have ideas about how ChEMBL can address emerging modalities, come and talk to us at our upcoming UGM or fill in our survey. You can also view our other presentations from ISCB UK: a poster on the patent resource SureChEMBL and on our research into chemical probes in the literature.