ChEMBL Compound Clean Up

For the last three months, I've been busy working my way through a (sometimes headache-inducing) set of 9,000 ChEMBL compound IDs. These had been highlighted for curation because, for each ChEMBL_id in the list, there were two or more compound keys from the same paper. This implied either that the paper described two compounds that could not be distinguished by their InChI representation, or that two different compounds had somehow been merged together in the database. Each ChEMBL_id was individually checked against the data in the original paper to see whether there were indeed two compound keys for the same structure. The outcome of this check gave rise to one of four cases: The structure(s) was found to be incorrect and was redrawn. The structure was correct for some records but not others, so a new compound was created for those selected records. The structure required the definition of stereochemistry or a salt. The structure was le
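The selection criterion described above (two or more compound keys for one ChEMBL_id within the same paper) can be sketched in a few lines of Python. This is only an illustration: the record layout, the identifiers and the function name `flag_for_curation` are hypothetical and are not taken from the ChEMBL schema.

```python
from collections import defaultdict

# Hypothetical records: (chembl_id, document_id, compound_key).
# The identifiers are made up for illustration only.
records = [
    ("CHEMBL_A", "DOC1", "4a"),
    ("CHEMBL_A", "DOC1", "4b"),   # second key for the same id in the same paper
    ("CHEMBL_B", "DOC1", "7"),
    ("CHEMBL_B", "DOC2", "12"),   # different paper, so not flagged
]


def flag_for_curation(rows):
    """Return {(chembl_id, document_id): sorted compound keys} for every
    pair where one paper contributes two or more distinct compound keys."""
    keys = defaultdict(set)
    for chembl_id, doc_id, compound_key in rows:
        keys[(chembl_id, doc_id)].add(compound_key)
    return {pair: sorted(k) for pair, k in keys.items() if len(k) >= 2}


flagged = flag_for_curation(records)
print(flagged)
```

Each flagged pair would then still need the manual check against the original paper described above; the sketch only builds the worklist.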

New Drug Approvals 2013 - Pt. III - Pomalidomide (Pomalyst™)

ATC Code: L04A (partial)
Wikipedia: Pomalidomide

On February 8th, the FDA approved Pomalidomide (Tradename: Pomalyst; Research Code: CC-4047, IMiD 3), a thalidomide analogue, indicated for the treatment of multiple myeloma in patients who failed to respond to previous therapies (e.g. lenalidomide and bortezomib). Multiple myeloma is a form of blood cancer that primarily affects older adults, and arises from the accumulation of abnormal plasma cells in the bone marrow. These abnormal plasma cells produce large amounts of unneeded antibodies, which are then deposited in various organs, causing renal failure, polyneuropathy and other myeloma-associated symptoms. Pomalidomide, an analogue of thalidomide, is an immunomodulatory agent with antineoplastic activity. As with other thalidomide analogues, the exact mechanism of action is not yet fully understood; however, in vitro assays demonstrated that pomalidomide inhibited proliferation and induced apoptosis of hematopoietic

Save the date: 2nd RDKit UGM, 2-4 October

We'll be organising the 2nd RDKit Users Group Meeting, which will be held from the 2nd until the 4th of October 2013 here at the EMBL-EBI in Hinxton. In addition to two days of talks, tutorials and discussions, the last day will be dedicated to a coding/documentation sprint. Stay tuned for more information, as well as a call for presenters, which will follow over the next few weeks. In the meantime, please go ahead and block the dates in your busy calendars!

George

A couple of weeks ago, I created a Doodle Poll to gauge interest in hosting another series of webinars, after the success of the ones we hosted last year. After a good response, these webinars have now been organised, and anyone interested in signing up can do so here. Most of the webinars will take only 45 minutes and will give a good overview of their topic.

Latest activities on the Activities table in ChEMBL_15

For the recent ChEMBL_15 release, a considerable part of our efforts was focussed on the standardisation and harmonisation of the data in the Activities table. This table holds all the quantitative and qualitative experimental measurements across compounds, assays and targets; needless to say, without it there's no ChEMBL! This is a summary of what we've incorporated so far:

Flag missing data: Records with null published values and null activity comments were flagged as missing.
Standardise activity types and units: Conversion of heterogeneous published activity type descriptions and units to a standard_type and set of standard_units (e.g., for IC50 convert mM/uM/pM measurements to nM).
Flag unusual units: Records with unusual published units for their respective activity types were flagged as 'non standard'. For example, a hypothetical record with IC50 type and units in kg would be flagged!
Convert the log values: The records with activity types
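As a rough illustration of the first three steps in this summary (missing-data flag, conversion of IC50 units to nM, and the 'non standard' unit flag), here is a minimal Python sketch. It is not the ChEMBL pipeline: the function name `standardise_record`, the dictionary keys other than `standard_units`, and the lookup table are assumptions made for the example, and only IC50 with molar units is covered.

```python
# Multiplicative factors that convert a molar concentration unit to nM.
TO_NM = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}

# Hypothetical lookup: which published units are expected for each
# activity type, and how to convert them. Only IC50 is shown here.
EXPECTED_UNITS = {"IC50": TO_NM}


def standardise_record(published_type, published_value, published_units,
                       activity_comment=None):
    """Return a dict with the standardised value, units and a flag.

    flag is "missing", "non standard" or None (nothing to report).
    """
    # Step 1: no published value and no activity comment -> missing.
    if published_value is None and activity_comment is None:
        return {"standard_value": None, "standard_units": None,
                "flag": "missing"}

    factors = EXPECTED_UNITS.get(published_type)
    if published_value is None or factors is None:
        # Nothing to convert (comment-only record or unhandled type).
        return {"standard_value": published_value,
                "standard_units": published_units, "flag": None}

    # Step 3: units that make no sense for this activity type.
    if published_units not in factors:
        return {"standard_value": published_value,
                "standard_units": published_units, "flag": "non standard"}

    # Step 2: convert to the standard unit (nM for IC50).
    return {"standard_value": published_value * factors[published_units],
            "standard_units": "nM", "flag": None}


print(standardise_record("IC50", 2.5, "uM"))
print(standardise_record("IC50", 1.0, "kg"))
print(standardise_record("IC50", None, None))
```

The three calls exercise a normal conversion, the hypothetical IC50-in-kg case mentioned above, and a record with neither a value nor a comment.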

New Drug Approvals 2013 - Pt. II - Mipomersen (Kynamro™)

ATC Code: C10AX11
Wikipedia: Mipomersen

On January 29th, the FDA approved Mipomersen (Tradename: Kynamro; Research Code: ISIS-310312), an oligonucleotide inhibitor of apolipoprotein B-100 (apo B-100) synthesis, indicated as an adjunct to lipid-lowering medications and diet to reduce low density lipoprotein-cholesterol (LDL-C), apolipoprotein B (apo B), total cholesterol (TC), and non-high density lipoprotein-cholesterol (non HDL-C) in patients with homozygous familial hypercholesterolemia (HoFH). Familial hypercholesterolemia is a genetic disorder, characterised by high levels of cholesterol-rich low-density lipoproteins (LDL-C) in the blood. This genetic condition is generally attributed to a mutation in the LDL receptor (LDLR) gene, whose product mediates the endocytosis of LDL-C. Mipomersen is the first antisense oligonucleotide that targets messenger RNA (mRNA) encoding apolipoprotein B-100 (Apo B-100), the principal apolipoprotein of LDL and its metabolic pre

Future Webinars

After the success of the last round of webinars, we have decided to run another set. First, however, we would like to gauge which topics would be most useful. The topics that have been suggested so far are:

ChEMBL Overview - Basic interface walkthrough and searching
ChEMBL Schema - Basic overview and ChEMBL changes
ChEMBL Schema - Changes to ChEMBL target data model
UniChem - Basic overview and searching
Drug and USAN data content

New Drug Approvals 2013 - Pt. I - Alogliptin (Nesina™)

ATC Code: A10BH04
Wikipedia: Alogliptin

On January 25th 2013, the FDA approved Alogliptin (as the benzoate salt; tradename: Nesina; research code: SYR-322, TAK-322; CHEMBL: CHEMBL376359), a dipeptidyl peptidase-4 (DPP-4) inhibitor indicated as an adjunct to diet and exercise to improve glycemic control in adults with type 2 diabetes mellitus (also known as noninsulin-dependent diabetes mellitus (NIDDM)). NIDDM is a chronic disease characterized by high blood glucose. In response to meals, increased concentrations of incretin hormones such as glucagon-like peptide-1 (GLP-1) and glucose-dependent insulinotropic polypeptide (GIP) are released into the bloodstream from the small intestine. These hormones cause insulin release from the pancreatic beta cells in a glucose-dependent manner but are inactivated by the DPP-4 enzyme within minutes. GLP-1 also lowers glucagon secretion from pancreatic alpha cells, reducing hepatic glucose production. In patients with NIDDM, concentrat

Japan - Here I Come (in October)!

I'm out in Japan at the end of October this year, in the week of October 28th 2013, for a scientific conference (the CBI Annual Conference). Japan is one of my favorite places in the whole world, and I have a routine of...

browsing vintage hi-fi shops hunting out high-end capacitor and choke components,
eating eel うなぎ bento,
visiting CD stores (at least the format lives on in Japan, and the Obi Strips and enhanced content make compelling browsing for an obsessive completist like me),
going to Bic Camera 株式会社ビックカメラ,
visiting Akihabara 秋葉原.

My schedule is currently empty for Wednesday 30th and Thursday 31st. I'd be delighted to visit and give a seminar, or maybe run a workshop on ChEMBL, so if you are interested in meeting up, a visit, or an evening meal, let me know.

jpo

Updated Drug Icons

In the recent release of ChEMBL_15, we have revisited the information displayed in the drug icons used in the ChEMBL interface and in the ChEMBL-og New Drug Approvals monographs, and we have made a few changes. The following images show the main changes (in this example, for the case of an oral synthetic small molecule):

1. We have visually separated the ingredient-specific information (icons in green) from the product-specific information (icons in blue).
2. The chirality icon will now also show if the ingredient is dosed as a racemic mixture (an image of two human hands).
3. An extra icon has been added to indicate the marketing status of a drug product. The product can be available as prescription (an image of the letters RX), over-the-counter (an image of the letters OTC) or discontinued (an image of the letters RX with a stripe across it).

In summary... The ingredient icons (in green) display the following information (from left to right) Drug class th

Searching ChEMBL with GO terms

Here is a new little tip/trick for the ChEMBL interface. It's possible to search by GO term. For example, if you wanted to retrieve targets with a particular GO annotation (and then easily get the compounds that bind to these targets), it's really easy to do. So, imagine you wanted targets annotated with GO:0008270 (which is zinc ion binding): type this into the search box, select the "target" search button, and you retrieve the targets that have this GO term assigned. This is really cool! PS One issue is that the leading 0s in the GO term are significant
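Because the leading zeros matter, it can help to normalise a GO identifier before pasting it into the search box. The following Python sketch pads the numeric part to the seven digits used by GO identifiers; the helper name `normalise_go_id` is hypothetical and is not part of the ChEMBL interface.

```python
import re

# Optional "GO:" prefix followed by one to seven digits.
_GO_PATTERN = re.compile(r"^(?:GO:)?(\d{1,7})$", re.IGNORECASE)


def normalise_go_id(term):
    """Return the GO identifier with its numeric part zero-padded to 7 digits.

    Raises ValueError if the input does not look like a GO identifier.
    """
    match = _GO_PATTERN.match(term.strip())
    if match is None:
        raise ValueError(f"not a GO identifier: {term!r}")
    return "GO:" + match.group(1).zfill(7)


print(normalise_go_id("GO:8270"))
```

With this helper, both `GO:8270` and a bare `8270` become `GO:0008270`, the zinc ion binding term used in the example above.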