ChEMBL Resources


Tuesday, 25 September 2012

New Drug Approvals 2012 - Pt. XX - Bosutinib (Bosulif®)

On September 4th, the FDA approved bosutinib (marketed as Bosulif) for the treatment of patients with previously treated Philadelphia Chromosome-Positive (Ph+) Chronic Myelogenous Leukemia (CML - cancer of the white blood cells).

Chronic myelogenous leukemia is one of the four most common types of leukemia and is usually treated with imatinib (CHEMBL941) as an initial therapy. However, approximately one-third of patients do not achieve an optimal response with this standard treatment. In such cases, second generation Tyrosine Kinase Inhibitors are required, but only half of the treated patients show acceptable outcomes. Patients who respond poorly to either of these treatments now have the option of receiving bosutinib (CHEMBL288441) as an alternative therapy.

As suggested by the '-tinib' suffix (USAN stem), bosutinib is a protein kinase inhibitor (ATC:L01XE14). The molecule has a calculated logP of 3.88 and a relative molecular weight of 530.4. It is therefore too heavy to satisfy the rule of five (maximum molecular weight of 500). The drug is taken orally with food, with a recommended dose of 500 mg per day. BOSULIF is available as tablets of 100 and 500 mg.
Canonical SMILES: COc1cc(Nc2c(cnc3cc(OCCCN4CCN(C)CC4)c(OC)cc23)C#N)c(Cl)cc1Cl
Standard InChI: InChI=1S/C26H29Cl2N5O3/c1-32-6-8-33(9-7-32)5-4-10-36-25-13-21-18(11-24(25)35-3)26(17(15-29)16-30-21)31-22-14-23(34-2)20(28)12-19(22)27/h11-14,16H,4-10H2,1-3H3,(H,30,31)
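
The molecular weight quoted above can be checked directly from the formula layer of the InChI. The following is a minimal sketch, assuming rounded standard atomic weights and covering only the elements that occur in this post; the function names are illustrative and are not part of any ChEMBL tooling. It checks only the molecular weight criterion of the rule of five, not the other three.

```python
import re

# Rounded standard atomic weights; only the elements needed for this post.
ATOMIC_WEIGHTS = {
    "C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
    "Cl": 35.45, "F": 18.998, "S": 32.06,
}

def formula_from_inchi(inchi):
    """Return the molecular formula layer of an InChI string.

    Tolerates a missing 'InChI=' prefix.
    """
    body = inchi.split("=", 1)[-1]
    return body.split("/")[1]

def molecular_weight(formula):
    """Sum atomic weights for a simple formula such as 'C26H29Cl2N5O3'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return total

def passes_ro5_weight(weight, limit=500.0):
    """Molecular weight criterion of the rule of five only."""
    return weight <= limit

bosutinib_inchi = (
    "InChI=1S/C26H29Cl2N5O3/c1-32-6-8-33(9-7-32)5-4-10-36-25-13-21-18"
    "(11-24(25)35-3)26(17(15-29)16-30-21)31-22-14-23(34-2)20(28)12-19(22)27"
    "/h11-14,16H,4-10H2,1-3H3,(H,30,31)"
)

formula = formula_from_inchi(bosutinib_inchi)
weight = molecular_weight(formula)
print(formula, round(weight, 2), passes_ro5_weight(weight))
```

The result lands at roughly 530.4 to 530.5 depending on the atomic weight table used, which is above the 500 limit.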

As the drug is metabolized by CYP3A4 (UNIPROT:P08684), it could interact with other compounds acting on the enzyme, such as CYP3A inhibitors or inducers, and with P-glycoprotein inhibitors. Proton pump inhibitors can also decrease the drug concentration in the human body.

The most common adverse reactions (incidence greater than 20%) are diarrhea, nausea, thrombocytopenia, vomiting, abdominal pain, rash, anemia, pyrexia and fatigue.

The product website and the full prescribing information are available online.

Wednesday, 19 September 2012

New Drug Approvals 2012 - Pt. XIX - Enzalutamide (Xtandi™ capsules)

On August 31, the FDA approved Enzalutamide for the treatment of castration-resistant prostate cancer. Prostate cancer affects predominantly men aged 50 years and older and is the sixth most frequent source of cancer-related deaths in men world-wide.

The prostate is a gland located below the bladder that surrounds the urethra and secretes simple sugars, citrate, zinc and other constituents of liquid semen. Prostate cancer in many cases has only mild symptoms, even without treatment. Prostate cancer can be detected by measuring concentrations of the biomarker prostate specific antigen. Its progression stage is assessed by the widely established Gleason grading scheme. In many cases it is sufficient to monitor cancer progression without treatment.
For aggressive tumors, various treatment options are available, including surgery, irradiation, cryosurgery, chemotherapy and hormonal therapy. Hormonal therapy relies on the tumor's dependence on androgen signalling, which can be ablated using the antiandrogens flutamide (CHEMBL806) and bicalutamide (CHEMBL409). However, after about two to three years, many prostate cancers become refractory to hormone therapy, even though they still rely on androgen signalling. These so-called castration-resistant cancers can be treated with docetaxel (CHEMBL92) and, as a second line of defense, the newly approved Enzalutamide.

Enzalutamide and its primary metabolite N-desmethyl enzalutamide competitively inhibit androgen binding to the androgen receptor (Uniprot P10275).

Enzalutamide is a small molecule with molecular weight 464.44 and calculated logP of 3.88. It is practically insoluble in water and is administered in liquid-filled soft gelatin capsules.

IUPAC: 4-{3-[4-cyano-3-(trifluoromethyl)phenyl]-5,5-dimethyl-4-oxo-2-sulfanylideneimidazolidin-1-yl}-2-fluoro-N-methylbenzamide
SMILES: CNC(=O)c1ccc(N2C(=S)N(c3ccc(C#N)c(C(F)(F)F)c3)C(=O)C2(C)C)cc1F

Enzalutamide is administered in a daily dose of 160 mg, which equates to four 40 mg capsules. It has a Cmax of 16.6 µg/mL that is reached after about one hour and is 97% bound to plasma proteins.

Enzalutamide is metabolised primarily by CYP2C8 (P10632) and CYP3A4 (P08684). A major metabolite, N-desmethyl enzalutamide, has bioactivity similar to that of enzalutamide.

Adverse reactions include asthenia/fatigue, back pain, diarrhea and others.

Enzalutamide is marketed by Medivation under the trade name Xtandi.

Tuesday, 18 September 2012

Japanese Webinar Now Available to Watch

For anyone who wasn't able to attend the Japanese language webinar hosted by Kaz Ikeda, we have provided a link to its recording. This webinar covers the basic use of the ChEMBL database with a particular focus on the interface and searching.

The YouTube clip can be found here: Japanese Webinar

Any questions, please feel free to contact

International Chemical Biology Society - Free Membership

On the occasion of the 3rd European Chemical Biology Symposium (ECBS) in Vienna, Rathnam Chaguturu, founding president of the International Chemical Biology Society (ICBS), announced the launch of this new society.

You are cordially invited to become founding members by using the online registration at

ICBS is offering free membership to all chemical biologists until the end of this year (whooooo!).

On October 4-5, ICBS is holding its first official conference in Cambridge, MA.

Monday, 17 September 2012

CINF Session on Bioinf and Chemoinf Data at the ACS National Meeting in New Orleans - April 2013

Abstract submission is now open for the CINF sessions of the ACS National Meeting in New Orleans, LA, next April. Ian Bruno of the CCDC and I are chairing a session on Linkage of Bioinformatic and Chemoinformatic Data.

Check it out, and get those abstracts in!

Saturday, 15 September 2012

Query Privacy in ChEMBL

We have been asked several times for all the user-generated queries of ChEMBL - i.e. the structures sketched into the interface that are then searched against the database. We will not (and in fact, physically can't) share these. Sorry. It is against both our institutional privacy policy and our standard Terms of Use, and we've also engineered the app to avoid 'storing' any of this information wherever possible (e.g. by avoiding /tmp type fluff, minimizing residency time in caches, etc.).

There are clearly some advantages in pooling or analysing website search data - it highlights interesting trends, and a topic that is becoming more interesting to a user community can flag emerging events. It can alert to flu outbreaks (there was a Nature paper from Google on this, don't have the reference handy though - you may be able to find it with Google though.....). There is a huge interest in many sites that I use in tracking and analysing query terms and usage patterns, and in some contexts this is just the thing to do - like when eBay teases me (and surely of all the tortured obsessive souls on the planet, it is just me and me alone) with a rare phosphor or perforation Machin variant I don't have.

The types of query that people perform can clearly also be used to develop ways of improving a website, or specifically the performance of search queries - and for algorithm development this information can be like gold-dust. There are now many chemical fingerprint systems available, and adapting the features/structures of these to typical user queries is really valuable in their development.

There are essentially two distinct aspects to users' expectations/rights of privacy when using a website like ChEMBL.

  • There is a personal privacy issue - 'why is John Overington interested in compounds for the treatment of obesity?'. This is primarily an embarrassment sort of thing ('hey, is this guy a bit chubby?'), or maybe a commercially sensitive thing ('he's interested in obesity stuff; heh, let's raise the price for him', or 'let's show him some adverts for chips', or 'let's contact his rival and let them know he's interested in his weight'). These latter things are behind the feature where you first search for a flight and the price is great, then the next time you look, it's gone up - allegedly.
  • There's a more fundamental IP issue though - the simple disclosure of a search term can be commercially damaging, and potentially stop the development of life-saving therapies. The simplest case is chemical structure and drug patents. The most important patent claim in drug discovery is composition of matter (and don't get all hissy over pharma misusing the patent system, since patents are absolutely essential for the development of new medicines, the treatment of disease, improvement of food supplies, for funding future R&D, for a source of employment, license revenues to Universities, and taxation revenues, etc). A composition of matter claim is a claim to a novel chemical structure that no-one has disclosed before and that is useful for something. If the structure is not novel, then the patent can be readily invalidated.
Hopefully, you'll understand our reasons for maintaining both user and query privacy.

To be extra clear, in case you read the above text as being solely about sharing stuff with third parties - we do not, and cannot, examine users' queries ourselves within the development team here at the EBI.

Your use of ChEMBL is private, and always will be.

Friday, 14 September 2012

New Drug Approvals 2012 - Pt. XVIII - Teriflunomide (Aubagio™)

ATC Code: L04AA13
Wikipedia: Teriflunomide

On September 12th the FDA approved Teriflunomide (tradename AUBAGIO, ChEMBL973), an orally administered drug for the treatment of relapsing forms of Multiple Sclerosis (MS). Teriflunomide inhibits pyrimidine synthesis by blocking dihydroorotate dehydrogenase (DHODH, Uniprot: Q02127), but it is not certain whether this explains the effect of the drug on MS lesions. Teriflunomide inhibits rapidly dividing cells, including the activated T lymphocytes thought to drive the MS disease process. The net effect of the inhibition of DHODH is that lymphocytes cannot accumulate sufficient pyrimidines for DNA synthesis. Additionally, Teriflunomide has been shown to inhibit the activation of nuclear factor kappaB and tyrosine kinases, but at doses higher than needed for the observed anti-inflammatory effects. Teriflunomide is the active metabolite of an already approved drug, Leflunomide (tradename Arava, ChEMBL960), indicated for the treatment of rheumatoid and psoriatic arthritis.

MS is an inflammatory disease characterised by damage to the myelin sheaths surrounding the axons of the brain and spinal cord. This demyelination results in scarring and a broad range of symptoms. The prevalence ranges between 2 and 150 per 100,000 and the disease onset usually occurs in young adults. MS cannot currently be cured and the prognosis is difficult to predict, depending on the subtype of the disease. The United States National Multiple Sclerosis Society characterised four clinical courses, two of which are classified as relapsing forms of MS, namely 'relapsing remitting' and 'progressive relapsing'.

Currently there are six other disease-modifying treatments for MS approved by regulatory agencies. These are: Fingolimod (trade name Gilenya, CHEMBL314854), interferon beta-1a (trade names Avonex, CinnoVex, ReciGen and Rebif, CHEMBL1201562), interferon beta-1b (U.S. trade name Betaseron, in Europe and Japan Betaferon, CHEMBL1201563), glatiramer acetate (trade name Copaxone, CHEMBL1201507), mitoxantrone (trade name Novantrone, CHEMBL58) and natalizumab (trade name Tysabri). Of these drugs, only Fingolimod is orally administered; the others are injected intravenously or subcutaneously. Teriflunomide is hence the second oral treatment option for MS.

Teriflunomide is a small molecule drug with a molecular mass of 270.20 g/mol, an AlogP of 2.09 and 3 rotatable bonds, and it does not violate the rule of 5.
Canonical SMILES: C\C(=C(/C#N)\C(=O)Nc1ccc(cc1)C(F)(F)F)\O
InChI: InChI=1S/C12H9F3N2O2/c1-7(18)10(6-16)11(19)17-9-4-2-8(3-5-9)12(13,14)15/h2-5,18H,1H3,(H,17,19)/b10-7-

The structure of the drug can interconvert between Z and E stereoisomers with the Z enol being the most stable and the active form.

DHODH (EC:, Uniprot: Q02127, PDB: 1D3G, CHEMBL: ChEMBL1966, IntAct: EBI-3928775 ), is a 395 amino acid monomer located at the mitochondrion inner membrane. The protein is a single-pass membrane protein with the catalytic site located in the mitochondrial inter-membrane space.

>sp|Q02127|PYRD_HUMAN Dihydroorotate dehydrogenase (quinone)

The recommended dose of AUBAGIO is 7 mg or 14 mg orally once daily. AUBAGIO can be taken with or without food.

The median time to reach maximum plasma concentrations is between 1 and 4 hours post-dose following oral administration. The half-life is approximately 18 and 19 days after repeated doses of 7 mg and 14 mg, respectively. It takes approximately 3 months to reach steady-state concentrations.
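
The link between a half-life of 18 to 19 days and a time to steady state of about 3 months can be illustrated with a short calculation. This is only a sketch, assuming simple first-order (one-compartment) elimination, which is a textbook simplification and not a model taken from the prescribing information; the function name is illustrative.

```python
def fraction_of_steady_state(days, half_life_days):
    """Fraction of the steady-state level reached after `days` of regular dosing.

    Assumes first-order elimination: the remaining gap to steady state
    halves with every elapsed half-life.
    """
    return 1.0 - 2.0 ** (-days / half_life_days)

# Half-lives quoted above for the 7 mg and 14 mg doses.
for half_life in (18, 19):
    for days in (30, 60, 90):
        fraction = fraction_of_steady_state(days, half_life)
        print(f"t1/2 = {half_life} d, day {days}: {fraction:.1%} of steady state")
```

Under this assumption, 90 days corresponds to about five half-lives, i.e. roughly 96-97% of the steady-state level, which is consistent with the "approximately 3 months" figure.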

Teriflunomide is mainly eliminated through direct biliary excretion of unchanged drug and renal excretion of metabolites.

The drug comes with a boxed warning to alert prescribers to the risk of liver problems, including death, and a risk of birth defects. Physicians are advised to do a blood test for liver function prior to prescribing Teriflunomide and periodically during the course of treatment. Based on animal studies, the drug may cause fetal harm.

The license holder is the Genzyme Corporation and the full prescribing information can be found here.

Monday, 10 September 2012

ChEMBL RESTful Web Service API Release 1.0.5

We are pleased to announce that we have updated the ChEMBL RESTful Web Service API (application programming interface) with some more of the features that you - the ChEMBL users - have requested. 

In particular, we have added support for the:
  • Retrieval of compounds by Canonical SMILES string using HTTP POST *.
  • Retrieval of compounds containing a particular substructure, as given by a Canonical SMILES string using HTTP POST *.
  • Retrieval of a list of compounds similar, at a given cutoff percentage Tanimoto similarity, to one represented by a given Canonical SMILES string using HTTP POST *.
  • Retrieval of larger compound images, as given by a compound ChEMBLID. The retrieved image can be easily re-sized using the 'dimensions' attribute of the endpoint. See the example URLs below.
  • Inclusion of a 'synonyms' property on ChEMBL compound resources. This property will be set for compounds for which there are synonyms available.
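
For readers unfamiliar with the similarity cutoff mentioned in the list above, here is a minimal sketch of how a percentage Tanimoto cutoff works. It assumes fingerprints are represented as sets of 'on' bit positions; the toy fingerprints and function names are made up for illustration, and the fingerprint type actually used by the web service is not stated here.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)

def similar_compounds(query_fp, library, cutoff_percent):
    """Return names from `library` whose similarity to the query meets the cutoff."""
    return [
        name
        for name, fp in library.items()
        if 100.0 * tanimoto(query_fp, fp) >= cutoff_percent
    ]

# Toy fingerprints: each set lists the positions of the bits that are switched on.
library = {
    "compound_a": {1, 2, 3},
    "compound_b": {2, 3, 4},
    "compound_c": {9},
}
query = {1, 2, 3}

print(tanimoto(query, library["compound_b"]))   # 2 shared bits out of 4 in total
print(similar_compounds(query, library, 70))
```

A higher cutoff returns fewer, closer analogues; a lower one widens the net.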

Sample urls:

In addition to the API changes we have also updated the ChEMBL Java client to take advantage of the new features provided by the API. These updates include:
  • Methods to invoke the additional HTTP POST API endpoints (searching for compounds based on SMILES matches, common substructures and similarity to a given percentage Tanimoto similarity). Examples of the new client methods in use are available on the class on the API documentation page.

As always, your feedback and suggestions for improving the API are most welcome. Please e-mail:


*  These additions are in response to a bug in sending SMILES data via the URL - some SMILES strings, such as those containing triple bonds (written with '#'), make use of characters which are reserved in the specification for Uniform Resource Locators (URLs). For API requests involving SMILES, API users can choose either to URL encode their SMILES input before submitting the request to the HTTP GET endpoint, or to use the new HTTP POST endpoint and send the SMILES data in the body of the HTTP request rather than in the URL.
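
The two options in the footnote can be sketched with the Python standard library. The base URL below is a hypothetical placeholder, not the real ChEMBL endpoint, and sending the SMILES as a plain-text request body is likewise an assumption made for illustration; consult the API documentation for the actual endpoint and body format. No request is actually sent.

```python
from urllib.parse import quote, unquote, urlsplit
from urllib.request import Request

# Hypothetical placeholder endpoint, NOT the real ChEMBL URL.
BASE_URL = "https://example.org/api/compounds/smiles/"

# Bosutinib: the nitrile group (C#N) contains the reserved '#' character.
smiles = "COc1cc(Nc2c(cnc3cc(OCCCN4CCN(C)CC4)c(OC)cc23)C#N)c(Cl)cc1Cl"

# Problem: placed raw in a URL, everything after '#' is parsed as a fragment
# and would never reach the server as part of the path.
raw_url = BASE_URL + smiles
print("lost as fragment:", urlsplit(raw_url).fragment)

# Option 1: percent-encode the SMILES before using the GET endpoint.
encoded = quote(smiles, safe="")
get_url = BASE_URL + encoded
print("encoded GET URL:", get_url)

# Option 2: keep the URL free of SMILES and put it in the POST body.
# The plain-text body format is an assumption for this sketch.
post_request = Request(
    BASE_URL,
    data=smiles.encode("utf-8"),
    headers={"Content-Type": "text/plain"},
    method="POST",
)
print(post_request.get_method(), post_request.full_url)
```

Percent-encoding is reversible, so the server can recover the original SMILES exactly; the POST route avoids the encoding step altogether.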

Saturday, 8 September 2012

New Drug Approvals 2012 - Pt. XVII - Linaclotide (Linzess™)

ATC Code: A03A (incomplete)
Wikipedia: Linaclotide

On August 30, the FDA approved Linaclotide (tradename: Linzess; Research Code: MD-1100, ASP-0456), a novel, first-in-class Guanylate Cyclase-C (GC-C) agonist indicated for the treatment in adults of irritable bowel syndrome with constipation (IBS-C) and chronic idiopathic constipation (CIC). CIC is a diagnosis given to people who experience persistent constipation and do not respond to standard treatment. IBS-C is a subtype of IBS characterized by chronic abdominal pain, discomfort, bloating and alteration of bowel habits. Linaclotide exerts its therapeutic action by binding to GC-C, resulting in an increase in both intracellular and extracellular concentrations of cyclic guanosine monophosphate (cGMP). The increase in intracellular cGMP stimulates secretion of chloride and bicarbonate into the intestinal lumen, mainly through activation of the cystic fibrosis transmembrane conductance regulator (CFTR) ion channel, resulting in increased intestinal fluid and accelerated transit. Linaclotide has been shown, in animal models, not only to accelerate gastrointestinal (GI) transit, but also to reduce intestinal pain, which is thought to be mediated by increased extracellular cGMP.

Other treatments for IBS are already on the market. These include antimuscarinic drugs, such as Dicyclomine (approved in 1950; tradename: Bentyl; ChEMBL: CHEMBL1123) and Methantheline (approved in 1951; tradename: Banthine; ChEMBL: CHEMBL1201264); a serotonin agonist, Tegaserod (approved in 2002; tradename: Zelnorm; ChEMBL: CHEMBL1201332); a serotonin antagonist, Alosetron (approved in 2000; tradename: Lotronex; ChEMBL: CHEMBL1110); and a chloride channel activator, Lubiprostone (approved in 2006; tradename: Amitiza; ChEMBL: CHEMBL1201134). While these drugs act by inhibiting the muscarinic action of acetylcholine, by acting on the serotonin receptors of the nervous system in the GI tract, or by activating the chloride channels on the GI epithelial cells, Linaclotide represents the first GC-C agonist to reach the market.

GC-C (ChEMBL: CHEMBL1795197; Uniprot: P25092) is a 1073 amino-acid long enzyme, which has an extracellular ligand binding domain (PFAM: ANF_receptor), a domain similar to that of protein tyrosine kinases (PFAM: Pkinase_Tyr) and an adenylate and guanylate cyclase catalytic domain (PFAM: Guanylate_cyc).

>GUC2C_HUMAN Heat-stable enterotoxin receptor

Linaclotide is an oral peptide drug, comprised of 14 amino acids, with disulfide bonds between cysteines (1-6), (2-10) and (5-13). Linaclotide has a molecular weight of 1526.8 Da.
Name: L-cysteinyl-L-cysteinyl-L-glutamyl-L-tyrosyl-L-cysteinyl-L-cysteinyl-L-asparaginyl-L-prolyl-L-alanyl-L-cysteinyl-L-threonyl-glycyl-L-cysteinyl-L-tyrosine, cyclic (1-6), (2-10), (5-13)-tris (disulfide)
Canonical SMILES: C[C@@H](O)[C@@H]1NC(=O)[C@@H]2CSSC[C@@H]3NC(=O)[C@@H](N)CSSC[C@H](NC(=O)[C@H](CSSC[C@H](NC(=O)CNC1=O)C(=O)N[C@@H](Cc4ccc(O)cc4)C(=O)O)NC(=O)[C@H](Cc5ccc(O)cc5)NC(=O)[C@H](CCC(=O)O)NC3=O)C(=O)N[C@@H](CC(=O)N)C(=O)N6CCC[C@H]6C(=O)N[C@@H](C)C(=O)N2
InChI: InChI=1S/C59H79N15O21S6/c1-26-47(82)69-41-25-101-99-22-38-52(87)65-33(13-14-45(80)81)49(84)66-34(16-28-5-9-30(76)10-6-28)50(85)71-40(54(89)72-39(23-97-96-20-32(60)48(83)70-38)53(88)67-35(18-43(61)78)58(93)74-15-3-4-42(74)56(91)63-26)24-100-98-21-37(64-44(79)19-62-57(92)46(27(2)75)73-55(41)90)51(86)68-36(59(94)95)17-29-7-11-31(77)12-8-29/h5-12,26-27,32-42,46,75-77H,3-4,13-25,60H2,1-2H3,(H2,61,78)(H,62,92)(H,63,91)(H,64,79)(H,65,87)(H,66,84)(H,67,88)(H,68,86)(H,69,82)(H,70,83)(H,71,85)(H,72,89)(H,73,90)(H,80,81)(H,94,95)/t26-,27+,32-,33-,34-,35-,36-,37-,38-,39-,40-,41-,42-,46-/m0/s1
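
A quick consistency check on the disulfide pattern: every bridge has to join two cysteines, and no cysteine can take part in more than one bridge. The sketch below uses the one-letter sequence transcribed from the systematic name above and the pairings given in that name; the function name is illustrative.

```python
# One-letter codes transcribed from the systematic name
# (Cys-Cys-Glu-Tyr-Cys-Cys-Asn-Pro-Ala-Cys-Thr-Gly-Cys-Tyr).
SEQUENCE = "CCEYCCNPACTGCY"

# Pairings as given in the systematic name, using 1-based residue numbers.
BRIDGES = [(1, 6), (2, 10), (5, 13)]

def check_bridges(sequence, bridges):
    """Return a list of problems found in a set of disulfide pairings."""
    problems = []
    positions = [pos for pair in bridges for pos in pair]
    for pos in positions:
        if pos < 1 or pos > len(sequence):
            problems.append(f"position {pos} is outside the {len(sequence)}-residue peptide")
        elif sequence[pos - 1] != "C":
            problems.append(f"position {pos} is {sequence[pos - 1]}, not cysteine")
    if len(set(positions)) != len(positions):
        problems.append("a residue is used in more than one bridge")
    return problems

print(len(SEQUENCE), "residues")
print("cysteines at:", [i for i, aa in enumerate(SEQUENCE, start=1) if aa == "C"])
print("problems:", check_bridges(SEQUENCE, BRIDGES))
```

The same function flags a pairing such as (3-15) immediately, since residue 3 is glutamate and the peptide has only 14 residues.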

The recommended dosage of Linaclotide is 290 mcg orally once daily for IBS-C, and 145 mcg orally once daily for CIC, taken on an empty stomach at least 30 minutes prior to the first meal of the day.

Linaclotide is minimally absorbed, with low systemic availability following oral administration. Concentrations of Linaclotide and its active metabolite in plasma are below the limit of quantitation after oral doses of 145 mcg and 290 mcg. Linaclotide is therefore expected to be minimally distributed to tissues. Linaclotide is metabolised within the GI tract to its active metabolite by loss of the terminal tyrosine moiety. Both Linaclotide and the metabolite are proteolytically degraded within the intestinal lumen to smaller peptides and naturally occurring amino acids. Following the daily administration of 290 mcg of Linaclotide for seven days, about 5% and 3% of the dose were recovered in the feces of fasted and fed subjects, respectively, virtually all as the active metabolite.

The license holder is Ironwood Pharmaceuticals, Inc. and the full prescribing information of Linaclotide can be found here.

Monday, 3 September 2012

Antibody Drugs To Have Reached Clinical Trials By Company

Similar to the previous kinase post, this time for antibody-containing therapeutics. If you'd like the data, let me know....