Wednesday, 18 July 2012

ChEMBL 14 Released

We are pleased to announce the release of ChEMBL_14. This latest version of the ChEMBL database contains:
  • 1,384,479 compound records
  • 1,213,242 distinct compounds
  • 644,734 assays
  • 10,129,256 bioactivities
  • 9,003 targets
  • 46,133 documents
  • 10 data sources
As well as updates to the scientific literature and PubChem data sources, this release also includes data from 2 new sources:
  • DrugMatrix - in vitro pharmacology assays for 870 therapeutic, industrial and environmental chemicals against 132 protein targets.
  • GSK Published Kinase Inhibitor Set - two data sets screening this compound library have been deposited by Nanosyn and the University of North Carolina.
On the interface, we have also added some new compound cross references to Gene Expression Atlas, Drugs of the Future (subset of PubChem), IUPHAR, NIH Clinical Collection and ZINC. On the target report card pages we have added cross references to CanSAR, Gene Ontology, IntAct, InterPro, IUPHAR, MICAD, Reactome and Wikipedia.

You can download the ChEMBL_14 data from our FTP site, but please refer to the chembl_14 release notes for a full list of updates and changes, as well as details on planned schema changes in forthcoming ChEMBL releases.


Egon Willighagen said...

These new links out, how are they found/determined? Are they the same drug, the same chemical graphs, or ...? How is the equivalence determined?

John Overington said...

Largely on the basis of an identical InChI - these are established with UniChem, a very simple lookup of a large number of chemical structures.

We've got a webinar on UniChem soon....

Egon Willighagen said...

I'll join that UniChem webinar!
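
For readers wondering what "on the basis of an identical InChI" looks like in practice, here is a minimal sketch of the idea: group records from different sources by their exact InChI string and treat records sharing a string as cross references. It illustrates the principle only and is not UniChem's actual implementation. The source names, identifiers and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (source name, source-specific identifier, standard InChI).
# The source names and identifiers are made up for illustration.
RECORDS = [
    ("source_a", "A-001", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),  # ethanol
    ("source_b", "B-17", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),   # ethanol again
    ("source_a", "A-002",
     "InChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3"),  # caffeine
    ("source_c", "C-9", "InChI=1S/CH4/h1H4"),                     # methane
]


def build_index(records):
    """Map each InChI string to every (source, identifier) that carries it."""
    index = defaultdict(list)
    for source, identifier, inchi in records:
        index[inchi].append((source, identifier))
    return index


def cross_references(index, source, identifier):
    """Return the other records whose InChI is identical to the given record's."""
    key = (source, identifier)
    for entries in index.values():
        if key in entries:
            return [entry for entry in entries if entry != key]
    return []


if __name__ == "__main__":
    index = build_index(RECORDS)
    print(cross_references(index, "source_a", "A-001"))
```

Because the match is on the exact string, two records link only when their InChIs are character-for-character identical. Any difference in the structure as drawn, such as a different salt form, gives a different InChI and therefore no link.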