ChEMBL 14 Released

We are pleased to announce the release of ChEMBL_14. This latest version of the ChEMBL database contains:

1,384,479 compound records
1,213,242 distinct compounds
644,734 assays
10,129,256 bioactivities
9,003 targets
46,133 documents
10 data sources

As well as updates to the scientific literature and PubChem data sources, this release also includes data from two new sources:

DrugMatrix - in vitro pharmacology assays for 870 therapeutic, industrial and environmental chemicals against 132 protein targets.
GSK Published Kinase Inhibitor Set - two data sets screening this compound library have been deposited by Nanosyn and the University of North Carolina.

On the interface, we have also added some new compound cross references to Gene Expression Atlas, Drugs of the Future (subset of PubChem), IUPHAR, NIH Clinical Collection and ZINC. On the target report card pages we have added cross references to CanSAR, Gene Ontology, IntAct, InterPro, IUPHAR, MICAD, Reactome and Wikipedia.

You can download the ChEMBL_14 data from our FTP site, but please refer to the chembl_14 release notes for a full list of updates and changes, as well as details on planned schema changes in forthcoming ChEMBL releases.

Comments

Egon Willighagen said…

These new links out, how are they found/determined? Are they the same drug, the same chemical graphs, or ...? How is the equivalence determined?

19 July 2012 at 19:44

jpo said…

Largely on the basis of an identical InChI - these are established with UniChem, a very simple lookup of a large number of chemical structures.

We've got a webinar on UniChem soon....

22 July 2012 at 06:17
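
As a rough illustration of the InChI-based matching described in the reply above, here is a minimal Python sketch. It is not UniChem's actual implementation or API: the source names, identifiers, and the functions `build_index` and `cross_references` are hypothetical, and the only assumption taken from the thread is that two records are treated as equivalent when their InChI strings are identical.

```python
from collections import defaultdict

# Hypothetical records: (source name, source identifier, InChI string).
# The source names and identifiers are placeholders, not real database entries.
# The InChI strings are those of ethanol and methane.
RECORDS = [
    ("source_a", "A-001", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),
    ("source_b", "B-042", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),
    ("source_b", "B-043", "InChI=1S/CH4/h1H4"),
]


def build_index(records):
    """Group (source, identifier) pairs under their exact InChI string."""
    index = defaultdict(list)
    for source, identifier, inchi in records:
        index[inchi].append((source, identifier))
    return index


def cross_references(index, inchi, own_source):
    """Return entries from other sources that share an identical InChI."""
    return [(s, i) for s, i in index.get(inchi, []) if s != own_source]


index = build_index(RECORDS)
links = cross_references(index, "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3", "source_a")
print(links)
```

Matching on the exact string keeps the lookup to a single dictionary access, at the cost of missing structures that differ only in how they were drawn or standardised before the InChI was generated.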

I'll join that UniChem webinar!

22 July 2012 at 09:22

The ChEMBL-og
