Skip to main content

ChEMBL 25 and new web interface released

We are pleased to announce the release of ChEMBL 25 and our new web interface. This version of the database, prepared on 10/12/2018 contains:

  • 2,335,417 compound records
  • 1,879,206 compounds (of which 1,870,461 have mol files)
  • 15,504,603 activities
  • 1,125,387 assays
  • 12,482 targets
  • 72,271 documents


Data can be downloaded from the ChEMBL ftp site: ftp://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_25

Please see ChEMBL_25 release notes for full details of all changes in this release: ftp://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_25/chembl_25_release_notes.txt


DATA CHANGES SINCE THE LAST RELEASE

# Deposited Data Sets:

Kuster Lab Chemical Proteomics Drug Profiling (src_id = 48, Document ChEMBL_ID = CHEMBL3991601):
Data have been included from the publication: The target landscape of clinical kinase drugs. Klaeger S, Heinzlmeir S and Wilhelm M et al (2017), Science, 358-6367 (https://doi.org/10.1126/science.aan4368)

# In Vivo Assay Classification:

A classification scheme has been created for in vivo assays. This is stored in the ASSAY_CLASSIFICATION table in the database schema and consists of a three-level classification. Level 1 corresponds to the top-levels of the ATC classification i.e., anatomical system/therapeutic area (e.g., CARDIOVASCULAR SYSTEM, MUSCULO-SKELETAL SYSTEM, NERVOUS SYSTEM). Level 2 provides a more fine-grained classification of the phenotype or biological process being studied (e.g., Learning and Memory, Anti-Obesity Activity, Gastric Function). Level three represents the specific in vivo assay being performed (e.g., Laser Induced Thrombosis, Hypoxia Tolerance Test in Rats, Paw Edema Test) and is assigned a specific ASSAY_CLASS_ID. Individual in vivo assays in the ChEMBL ASSAYS table are mapped to reference in-vivo assays in the ASSAY_CLASSIFICATION table via the ASSAY_CLASS_MAP table. More information about the classification scheme is available in the following publication: https://doi.org/10.1038/sdata.2018.230. The assay classification is available via web services and will be included in the ChEMBL web interface in the near future.

# Updated Data Sets:
Scientific Literature
Patent Bioactivity Data
BindingDB Database (corrections to compound structures)


WEB INTERFACE/WEB SERVICE CHANGES SINCE THE LAST RELEASE

# Web Interface:

The new ChEMBL web interface is now live at https://www.ebi.ac.uk/chembl (this replaces the previous beta version). The old ChEMBL web interface will be retired before the ChEMBL_26 release, but is available on the following URL until then: https://www.ebi.ac.uk/chembl/old. The new interface provides richer search and filtering capabilities. Documentation regarding this new functionality and frequently asked questions are available on our help pages: https://chembl.gitbook.io/chembl-interface-documentation/

# Changes to Web Services:

The Assay web service has been updated to include both assay_parameters and the in vivo assay classification for an assay (where applicable):
https://www.ebi.ac.uk/chembl/api/data/assay

A separate endpoint has also been created for the in vivo assay classification:
https://www.ebi.ac.uk/chembl/api/data/assay_class

The Activity web service has been updated to include activity_properties. The 'published_type', 'published_relation', 'published_value' and 'published_units' fields have also been renamed to 'type', 'relation', 'value' and 'units':
https://www.ebi.ac.uk/chembl/api/data/activity

A new endpoint has been created to retrieve supplementary data associated with an activity measurement (or list of measurements):
https://www.ebi.ac.uk/chembl/api/data/activity_supplementary_data_by_activity


SCHEMA CHANGES SINCE THE LAST RELEASE

# Tables Added:

ASSAY_CLASSIFICATION:
Classification scheme for phenotypic assays e.g., by therapeutic area, phenotype/process and assay type. Can be used to find standard assays for a particular disease area or phenotype e.g., anti-obesity assays. Currently data are available only for in vivo efficacy assays

COLUMN_NAME DATA_TYPE COMMENT
ASSAY_CLASS_ID NUMBER(9,0) Primary key
L1 VARCHAR2(100) High level classification e.g., by anatomical/therapeutic area
L2 VARCHAR2(100) Mid-level classification e.g., by phenotype/biological process
L3 VARCHAR2(1000) Fine-grained classification e.g., by assay type
CLASS_TYPE VARCHAR2(50) The type of assay being classified e.g., in vivo efficacy
SOURCE VARCHAR2(50) Source from which the assay class was obtained

ASSAY_CLASS_MAP:
Mapping table linking assays to classes in the ASSAY_CLASSIFICATION table

COLUMN_NAME DATA_TYPE COMMENT
ASS_CLS_MAP_ID NUMBER Primary key.
ASSAY_ID NUMBER assay_id is the foreign key that maps to the 'assays' table
ASSAY_CLASS_ID NUMBER assay_class_id is the foreign key that maps to the 'assay_classification' table


# Columns Removed:

ACTIVITIES:
PUBLISHED_TYPE (DEPRECATED in ChEMBL_24, now removed, replaced by TYPE)
PUBLISHED_RELATION (DEPRECATED in ChEMBL_24, now removed, replaced by RELATION)
PUBLISHED_VALUE (DEPRECATED in ChEMBL_24, now removed, replaced by VALUE)
PUBLISHED_UNITS (DEPRECATED in ChEMBL_24, now removed, replaced by UNITS)


Funding Acknowledgements:
Work contributing to ChEMBL_25 was funded by the Wellcome Trust, EMBL Member States, Open Targets, National Institutes of Health (NIH) Common Fund, EU Innovative Medicines Initiative (IMI) and EU Framework 7 programmes. Please see https://www.ebi.ac.uk/chembl/funding for more details.

The ChEMBL Team

Comments

Popular posts from this blog

SureChEMBL Available Now

Followers of the ChEMBL group's activities and this blog will be aware of our involvement in the migration of the previously commercially available SureChem chemistry patent system, to a new, free-for-all system, known as SureChEMBL. Today we are very pleased to announce that the migration process is complete and the SureChEMBL website is now online. SureChEMBL provides the research community with the ability to search the patent literature using Lucene-based keyword queries and, much more importantly, chemistry-based queries. If you are not familiar with SureChEMBL, we recommend you review the content of these earlier blogposts here and here . SureChEMBL is a live system, which is continuously extracting chemical entities from the patent literature. The time it takes for a new chemical in the patent literature to become searchable in the SureChEMBL system is 1-2 days (WO patents can sometimes take a bit longer due to an additional reprocessing step). At time of writi

New SureChEMBL announcement

(Generated with DALL-E 3 ∙ 30 October 2023 at 1:48 pm) We have some very exciting news to report: the new SureChEMBL is now available! Hooray! What is SureChEMBL, you may ask. Good question! In our portfolio of chemical biology services, alongside our established database of bioactivity data for drug-like molecules ChEMBL , our dictionary of annotated small molecule entities ChEBI , and our compound cross-referencing system UniChem , we also deliver a database of annotated patents! Almost 10 years ago , EMBL-EBI acquired the SureChem system of chemically annotated patents and made this freely accessible in the public domain as SureChEMBL. Since then, our team has continued to maintain and deliver SureChEMBL. However, this has become increasingly challenging due to the complexities of the underlying codebase. We were awarded a Wellcome Trust grant in 2021 to completely overhaul SureChEMBL, with a new UI, backend infrastructure, and new f

ChEMBL & SureChEMBL anniversary symposium

  In 2024 we celebrate the 15th anniversary of the first public release of the ChEMBL database as well as the 10th anniversary of SureChEMBL. To recognise this important landmark we are organising a two-day symposium to celebrate the work achieved by ChEMBL and SureChEMBL, and look forward to its future.   Save the date for the ChEMBL 15 Year Symposium October 1-2, 2024     Day one will consist of four workshops, a basic ChEMBL drug design workshop; an advanced ChEMBL workshop (EUbOPEN community workshop); a ChEMBL data deposition workshop; and a SureChEMBL workshop. Day two will consist of a series of talks from invited speakers, a few poster flash talks, a local nature walk, as well as celebratory cake. During the breaks, the poster session will be a great opportunity to catch up with other users and collaborators of the ChEMBL resources and chat to colleagues, co-workers and others to find out more about how the database is being used. Lunch and refreshments will be pro

ChEMBL 34 is out!

We are delighted to announce the release of ChEMBL 34, which includes a full update to drug and clinical candidate drug data. This version of the database, prepared on 28/03/2024 contains:         2,431,025 compounds (of which 2,409,270 have mol files)         3,106,257 compound records (non-unique compounds)         20,772,701 activities         1,644,390 assays         15,598 targets         89,892 documents Data can be downloaded from the ChEMBL FTP site:  https://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_34/ Please see ChEMBL_34 release notes for full details of all changes in this release:  https://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_34/chembl_34_release_notes.txt New Data Sources European Medicines Agency (src_id = 66): European Medicines Agency's data correspond to EMA drugs prior to 20 January 2023 (excluding vaccines). 71 out of the 882 newly added EMA drugs are only authorised by EMA, rather than from other regulatory bodies e.g.

RDKit, C++ and Jupyter Notebook

Fancy playing with RDKit C++ API without needing to set up a C++ project and compile it? But wait... isn't C++ a compiled programming language? How this can be even possible? Thanks to Cling (CERN's C++ interpreter) and xeus-cling jupyter kernel is possible to use C++ as an intepreted language inside a jupyter notebook! We prepared a simple notebook showing few examples of RDKit functionalities and a docker image in case you want to run it. With the single requirement of docker being installed in your computer you'll be able to easily run the examples following the three steps below: docker pull eloyfelix/rdkit_jupyter_cling docker run -d -p 9999:9999 eloyfelix/rdkit_jupyter_cling open  http://localhost:9999/notebooks/rdkit_cling.ipynb  in a browser