The guys at the NCI Cactus blog have done a great job of rendering the new ChEMBL schema as released in ChEMBL_09. There are quite a few changes, and we have started to load/curate new data against this schema - so check with us if your analyses rely on something currently in there! Click on the above image for a larger view.
Things will be fairly quiet at ChEMBL Manor next week - we have the annual ChEMBL training course which will keep us busy, and hopefully out of any trouble.
Comments
Now, ChEBI actually has a good mechanism for making the distinction (talk to Janna). What are the plans with ChEMBL in this respect? Will we see this corrected? It clearly affects QSAR modeling, as the assay activities actually relate either to one of the stereoisomers in the racemic mixture, or to a mix of both. At the same time, 3D descriptors have to be calculated from one geometry or the other, which introduces needless uncertainty into the model.
(And, obviously, this also affects how I should represent things in RDF :)
Our initial focus will be on annotating the issues, to aid interpretation and curation.
A further complication we have come across in the past is some of the 'neglected' stereocenters, like sulphones. Finally, an interesting clinical candidate case we recently corresponded with ChemSpider about is flesinoxan - where there are ambiguous links between the +/- and R/S.
I think an interesting area of chemoinformatics science at the moment, with quite a lot more potential, is reduced representation (in contrast to ever more explicit enumeration and calculation). The potential to develop robust landscapes at a lower 'resolution' is quite exciting.
As for calculating 3D descriptors, with undefined stereochemistry in lots of cases, or large numbers of possible enantiomers, coupled with large numbers of tautomers, and the problem of pKa prediction and assignment, I wish you the very best of luck.
Example:
* What does assays.assay_type={A,B,F,U} stand for?
* Is it possible to expose a few example SQL queries somewhere?
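
To make the two questions above concrete, here is a minimal sketch of the kind of example query being asked for, run from Python against a toy in-memory SQLite table. Only `assays.assay_type` and the codes A, B, F, U come from the discussion above; the other columns, the sample rows, the helper names and the code-to-label mapping are assumptions for illustration and should be checked against the official ChEMBL schema documentation.

```python
import sqlite3

# ASSUMPTION: these labels are a guess at what the codes mean; they are not
# taken from this post. Verify them against the ChEMBL schema documentation.
ASSAY_TYPE_LABELS = {
    "A": "ADMET",
    "B": "Binding",
    "F": "Functional",
    "U": "Unassigned",
}


def build_toy_db():
    """Create an in-memory stand-in for the assays table (hypothetical columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assays ("
        "assay_id INTEGER PRIMARY KEY, assay_type TEXT, description TEXT)"
    )
    # Made-up rows, present only so the query has something to count.
    conn.executemany(
        "INSERT INTO assays (assay_type, description) VALUES (?, ?)",
        [
            ("B", "toy binding assay 1"),
            ("B", "toy binding assay 2"),
            ("F", "toy functional assay"),
            ("A", "toy ADMET assay"),
            ("U", "toy unassigned assay"),
        ],
    )
    conn.commit()
    return conn


def count_assays_by_type(conn):
    """Return {assay_type code: number of assays} using a GROUP BY query."""
    rows = conn.execute(
        "SELECT assay_type, COUNT(*) FROM assays "
        "GROUP BY assay_type ORDER BY assay_type"
    ).fetchall()
    return {code: n for code, n in rows}


if __name__ == "__main__":
    conn = build_toy_db()
    for code, n in count_assays_by_type(conn).items():
        label = ASSAY_TYPE_LABELS.get(code, "unknown code")
        print(f"{code} ({label}): {n}")
    conn.close()
```

The same `SELECT ... GROUP BY assay_type` statement should carry over to a real ChEMBL installation, provided the table and column names match the release in use.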