Breaking news 📢
We are excited to announce our new journal article that presents the comprehensive drug data in ChEMBL. The paper describes the state-of-the-art processes to curate and integrate the high-quality drug and clinical candidate drug data. The drug curation processes have been developed over more than 15 years and this is the first time that they have been published.
Published as a 'Perspectives' article in the Journal of Medicinal Chemistry, the paper educates ChEMBL users, helping them to understand the nature of the drug and clinical candidate data and the rationale that underlies curation decisions. Given the increasing reliance on high-quality data in computational drug discovery, AI and machine learning, the integrated nature of the drug data within the ChEMBL bioactivity resource is a critical asset.
This is a bumper week for drug data in ChEMBL! On Monday, our latest ChEMBL 36 release included a major update to all drug and clinical candidate drug data. Some key aspects of the drug data update are:
- EMA:
  - Captures EMA medicinal products up to June 2004
  - First routine inclusion of EMA vaccine components
  - Pref_name convention developed for vaccine components (e.g. “SARS-COV-2 VIRUS, INACTIVATED”)
- FDA:
  - Orange Book updated to Nov 2024 (generic medicinal products)
  - New drugs extended to include all new FDA Novel Molecular Entities and FDA Biological License Applications that were approved in 2023 and 2024
- Withdrawn drugs: now 327 drug forms, an increase of 103 drug forms
- Black box warnings:
  - NLP pipeline updated (spaCy 3.8.2), applied to 2024 and 2025 FDA labels
Further details for drugs and clinical candidate drugs in ChEMBL 36 are given in the release notes.