
Posts

The first PROTAC has been approved: can we find it in ChEMBL?

ChEMBL extracts data from the core medicinal chemistry literature and therefore reflects ongoing developments in drug discovery. One area currently attracting high interest is targeted protein degradation: compounds that direct disease-causing proteins to the cell’s degradation machinery. These modalities are both present and rapidly increasing within ChEMBL, providing an important data source for the community.

Targeted Protein Degradation Data in ChEMBL

However, new and emerging modalities bring new challenges. Data should be well structured and FAIR, but for newer modalities, controlled vocabularies may not exist or adequately cover the breadth of data being reported. Curation effort in this area supports our broader goals towards generating bespoke datasets and improving AI-readiness. We recently had the opportunity to attend the first ISCB UK conference where we presented our work towards the capture and annotatio...
Recent posts

Help Shape the Future of ChEMBL: Take Our User Survey Now!

Dear ChEMBL and SureChEMBL Community,

As you know, we are dedicated to keeping ChEMBL and SureChEMBL world-class resources for the scientific community, but to do that effectively, we need to hear from you. To ensure we are consistently meeting your research needs, we are excited to launch our latest ChEMBL User Survey.

Why Your Feedback Matters

As we plan the next phases and future developments for our platforms, this survey is your opportunity to have a direct impact. By sharing how you interact with our data, you help us understand what works, what doesn't, and what you need next. Your insights will allow us to:

Refine the interface: Make navigating and extracting data smoother and more intuitive.
Prioritize new features: Focus our development efforts on the tools that will best support your drug discovery and research processes.
Evolve with you: Ensure that ChEMBL and SureChEMBL stay in sync with the...

ChEMBL UGM: Speakers confirmed

I'm very excited. We now have a list of confirmed speakers for the upcoming ChEMBL UGM (June 10-11).

Uday Abu-Shehab, University of Vienna
Katie Beckwith, Ignota Labs
Evan Bolton, PubChem
Giovanni Cincilla, Healx
Wei Dai, Queen Mary University of London
Wim Dehaen, University of Chemistry and Technology Prague
Luca Falciola, SCIBILIS
Thierry Hanser, Ixelis
Tobias Harren, Universität Hamburg
Greg Landrum, ETH Zurich
John Mayfield, NextMove Software
John Overington, DrugHunter
Carl Schiebroek, ETH Zurich
Chris Southan, University of Edinburgh
Brandon Walts, SciBite

I think it's going to be a really exciting line-up covering a diverse range of topics. If you still haven't signed up, note that the closing date for in-person registration is 18th May. Note also that we have a limit in numbers for Day Two, and it's first come, first served.

OPSIN v2.9.0 released

Just a quick note to say that Daniel Lowe has released OPSIN v2.9.0, the first release since Oct 2023. This is now available via the EMBL-EBI OPSIN server. The release notes describe a mixture of minor bug fixes and improvements:

Support for IUPAC recommended primed number-letter locants e.g. 2''a
Command-line output now includes warnings e.g. ambiguity
SMILES writer now starts from a * atom if one is present
Added numbering to nicotine
Correct interpretation of locanted perhalo terms and perhaloalkylalkanes
Improved additive bond formation for phosphoryl
Corrected locants on tolyl and assume p-tolyl if unspecified
triazine is now interpreted as 1,3,5-triazine if unspecified
Corrected interpretation of dithiazolium
Fixed rare SMILES writing bug where slashes could be inconsistent
Fixed ylidenethenylidene being parsed as [ylidene][thenylidene] instead of [yliden][ethenylidene]
Fixed bug in spiro superscript inferring when a bridge is length 0

OPSIN vs AI

I recently prepared a few slides on OPSIN for an internal presentation, and was looking for a simple use case. The first thing I tried turned out to be more interesting than I expected. If you visit the OPSIN website, there are three examples provided to illustrate its functionality. Daniel's original website, at the Uni of Cambridge, had 2,4,6-trinitrotoluene (TNT) as the example. With the move to EMBL-EBI and associated rewrite of the frontend, I thought about keeping this but decided that something more biologically relevant would be appropriate. In the end, I compromised by keeping the 2,4,6- as a nod to the original, but used a saccharide instead: 2,4,6-tri-O-methyl-D-glucopyranose. Now click on "Search Google", to do a search using the InChIKey. My attention was drawn to the AI summary results, which I captured at the time (maybe you can tell when?) in the screenshot below: "The string UTLUVTKMAWSZKV-NEIVSKJXSA-N is an InChIKey (International Chemical Identifie...
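
As an aside on the identifier quoted above: an InChIKey has a fixed layout, with a 14-letter block hashing the connectivity, an 8-letter block hashing the remaining layers, a standard/non-standard flag (S or N), a version letter, and a final protonation character. The sketch below only splits a key into those parts and checks its shape; it does not verify that the key corresponds to any real structure. The names `split_inchikey` and `InChIKeyParts` are hypothetical, chosen for illustration, and are not part of OPSIN or ChEMBL.

```python
import re
from typing import NamedTuple


class InChIKeyParts(NamedTuple):
    """Hypothetical container for the fields of an InChIKey."""
    connectivity: str   # 14-letter hash of the connectivity layer
    other_layers: str   # 8-letter hash of the remaining layers
    is_standard: bool   # True if the flag character is 'S'
    version: str        # InChI version letter, e.g. 'A'
    protonation: str    # protonation character, 'N' for neutral


# Shape check only: three hyphen-separated blocks of upper-case letters.
_INCHIKEY_RE = re.compile(r"([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])")


def split_inchikey(key: str) -> InChIKeyParts:
    """Split an InChIKey into its blocks, or raise ValueError if malformed."""
    match = _INCHIKEY_RE.fullmatch(key.strip())
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    connectivity, other_layers, flag, version, protonation = match.groups()
    return InChIKeyParts(connectivity, other_layers, flag == "S", version, protonation)


if __name__ == "__main__":
    # The key quoted in the post excerpt above.
    print(split_inchikey("UTLUVTKMAWSZKV-NEIVSKJXSA-N"))
```

Because the key is a hash, the only reliable way to get from it back to a structure is a lookup in a database that already holds the compound, which is why searching on it is a useful test.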

Second announcement of 2nd ChEMBL User Group Meeting

This is a reminder that the 2nd ChEMBL User Group Meeting will take place on June 10-11 on the Wellcome Genome Campus, Hinxton, near Cambridge, UK. This event is dedicated to building and supporting the ChEMBL and SureChEMBL user communities. This is a two-day event; while hybrid attendance on Day 1 is possible, we really encourage in-person participation to allow you to meet the team, present your work, network, and take part in Day 2, setting the scene for the future direction of the group. The deadline for speaker registration is two weeks from now, on 18th March, so register now. We hope to see you there.