ChEMBL Resources


Monday, 30 November 2009

For those in the Cambridge UK area....

We have Jean-Louis Reymond visiting us on the morning of Friday the 4th of December. He is going to give a seminar on his group's GDB databases (see previous blog posts; I think this is a very exciting area of science at the moment). If you are interested in coming, please mail me for details and to arrange access to campus.

Again xkcd is the source for the cartoon.

Friday, 27 November 2009

New Drug Approvals - Pt. XXII - Romidepsin (Istodax)

Approved on November 5th 2009 was Romidepsin (trade name Istodax). Romidepsin, previously known by the research codes FK-228, FR-901228 and NSC-630176, is a histone deacetylase (HDAC) inhibitor indicated for the treatment of cutaneous T-cell lymphoma (CTCL) in patients who have received at least one prior systemic therapy. CTCL is a slow-growing cancer of infection-fighting white blood cells called T-lymphocytes. Romidepsin binds directly to the HDAC active site, blocking substrate access. HDACs catalyze the removal of acetyl groups from acetylated lysine residues in histones, resulting in the modulation of gene expression. Romidepsin causes the accumulation of acetylated histones, and induces cell cycle arrest and apoptosis of cancer cells.
Romidepsin is the fourth drug to be approved for CTCL, after Vorinostat (trade name Zolinza), Bexarotene (trade name Targretin) and Denileukin Diftitox (trade name Ontak). Vorinostat is a small-molecule drug which also inhibits HDACs, whereas Bexarotene is a small-molecule retinoid selective for retinoid X receptors and Denileukin Diftitox is an engineered protein combining interleukin-2 and diphtheria toxin, which binds to interleukin-2 receptors and introduces the diphtheria toxin into cells that express those receptors, killing the cells.
Romidepsin is a natural product drug discovered from a fermentation broth of the bacterium Chromobacterium violaceum. It is a cyclic peptide with a molecular weight of 540.71 g.mol-1. It is highly protein bound in plasma (92% to 94%), with alpha1-acid-glycoprotein being the principal binding protein. Romidepsin undergoes extensive metabolism primarily by CYP3A4 with minor contribution from CYP3A5, CYP1A1, CYP2B6 and CYP2C19. It has a terminal half-life of ~3 hours. One of the potential adverse events is the propensity for the compound to increase QT interval.
The recommended dose of Romidepsin is 14 mg.m-2 administered intravenously over a 4-hour period on days 1, 8 and 15 of a 28-day cycle. The full prescribing information can be found here. Romidepsin's chemical name is (1S,4S,7Z,10S,16E,21R)-7-ethylidene-4,21-bis(1-methylethyl)-2-oxa-12,13-dithia-5,8,20,23-tetraazabicyclo[8.7.6]tricos-16-ene-3,6,9,19,22-pentone. It is a cyclic depsipeptide (a peptide in which one or more amide bonds are replaced by ester bonds) with four component amino acids and a beta-hydroxyamide moiety (an amide and a hydroxy functional group separated by two carbon atoms), which collectively form a 16-membered lactone with a disulfide bridge. The disulfide bond is reduced in the cellular environment, releasing the free thiol analogue as the active species.
The manufacturer of Romidepsin is Gloucester Pharmaceuticals, Inc. and the product website is

Thursday, 26 November 2009

FAQ: Is there a license agreement I need to sign for chembl?

There is no need to sign a license agreement for any of the chembl data or applications, nor is there any requirement for payment. The data/software is covered by a Creative Commons licence - Creative Commons Attribution-Share Alike 3.0 Unported License. If you have any detailed questions about licensing, please get in contact with us.

Wednesday, 25 November 2009

Conference: TACBAC 2010 - an update

A brief update of the forthcoming TACBAC 2010 conference. The schedule is now online and it looks very, very good (conflict of interest - I am motivated to say that). But really it does! Have a look for yourself if you don't believe me.

TACBAC 2010 schedule

The picture above is the top hit for "TACBAC" in google images. TACtical BACon.

FAQ: Where can I download StARlite?

Bad news and good news: you cannot download StARlite. StARlite was a registered trademark for a database developed and marketed by Inpharmatica Ltd. Some of the databases and intellectual property of Inpharmatica Ltd. were licensed to the EMBL-EBI. What used to be known as StARlite is now part of chembl. The chembl downloads are accessible from here.

Tuesday, 24 November 2009

Chembldb schema web meeting - 3pm GMT, 27th November 2009

Following unanticipated child illness last week, we will have a web-meeting walkthrough of the chembldb core schema on Friday 27th November at 3pm GMT. The excellent webhuddle will be used, so if you are interested it may be worth checking this out, setting up an account, and checking it works on your machine in advance of the meeting. Mail us if you want the phone number and webhuddle link.

We are aware that the 27th is over the forthcoming Thanksgiving weekend for our friends located in the United States of America, but we will do a further web-meeting the following week, for those unable to make it on the 27th.

Please, please, please use the above link, it is just too complex (for me) to manage emails with all sorts of titles, content and so forth! Also, it causes confusion (for me) if you forward the access details, and then I get messages, phone calls, smoke signals, etc. about the meeting if we need to change things.

I have just joined Google wave, and have a few invitations left if anyone is interested in getting one.

Friday, 20 November 2009

Chembldb interface question

We have received a couple of questions about the 'privacy' of chemical structure searching of our chembldb interface. The root of these questions seems to be 'Is it safe for me to search with a proprietary structure?'. There are probably two components to this - privacy over the route to our servers, and the privacy of the query once it gets on to our site. Basically, we're not interested in what you are searching for, we don't store the queries at all, and have no desire to disclose this sort of data to anyone; however, there are a number of things we could do in the short-term to address some potential corporate/IP concerns.

What should be top priority for chembldb interface chemical searching functionality?
It's perfect, leave it alone!
Add TLS encryption (previously generally referred to as https/ssl)
Distribute code for the interface and make an install package

We would also welcome contact/discussion to help us develop our longer term 'privacy' strategy for chembldb.

The image is from the excellent xkcd.....

EBI Interfaces blog

There are only two issues with website implementation - 'interface' and 'content'. We have a project here at the EMBL-EBI connected with interface and UI principles, and they have a blog; here is a link to that blog.

Wednesday, 18 November 2009

ChEMBL Molecular Interactions

ChEMBL small molecule-protein interactions are now available in PSI-MI TAB and XML formats, thanks to the Proteomics Services group. This dataset includes ChEMBL interactions identified via binding assays with IC50/Ki/EC50/Kd values below 10uM - just under 500,000 interactions in total (with negative/weaker interactions also included in the XML export).

The data can be accessed via the PSICQUIC project (Proteomics Standards Initiative Common QUery InterfaCe), which provides programmatic access to a wide range of molecular interaction databases via SOAP and REST web services. For example, this URL retrieves all ChEMBL interactions relating to imatinib (Gleevec).
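
As a rough illustration of the REST access described above, here is a minimal Python sketch that builds a PSICQUIC-style query URL and parses rows of PSI-MI TAB (MITAB 2.5) output. The base address `PSICQUIC_BASE` is a made-up placeholder, not the real ChEMBL service address, and the function and field names are invented for this example. The `/query/<term>` path and the `firstResult`/`maxResults` paging parameters follow the PSICQUIC REST convention as we understand it, so check them against the service you actually use.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Hypothetical placeholder: substitute the real ChEMBL PSICQUIC service address.
PSICQUIC_BASE = "http://example.org/psicquic/webservices/current/search"

# Illustrative names for the first seven of the 15 tab-separated MITAB 2.5 columns.
MITAB_FIELDS = (
    "id_a", "id_b", "alt_ids_a", "alt_ids_b",
    "aliases_a", "aliases_b", "detection_method",
)


def build_query_url(query, first_result=0, max_results=50, base=PSICQUIC_BASE):
    """Build a URL of the form <base>/query/<query>?firstResult=..&maxResults=.."""
    paging = urlencode({"firstResult": first_result, "maxResults": max_results})
    return f"{base}/query/{quote(query)}?{paging}"


def parse_mitab_line(line):
    """Split one MITAB row on tabs and name its leading columns."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) < 15:
        raise ValueError(f"expected at least 15 columns, got {len(columns)}")
    return dict(zip(MITAB_FIELDS, columns))


def fetch_interactions(query):
    """Download and parse all rows for a query (needs network access)."""
    with urlopen(build_query_url(query)) as response:
        text = response.read().decode("utf-8")
    return [parse_mitab_line(row) for row in text.splitlines() if row.strip()]
```

With a real base address in place, `fetch_interactions("imatinib")` would return one dictionary per interaction row. Keeping the URL builder and the row parser separate means both can be checked without touching the network.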

Mail us if you need more info.

Monday, 16 November 2009

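Welcome to ChEMBL (剣舞瑠)!

The following was originally written in Japanese....
The ChEMBL Team (ケンブルチーム) is based at the European Bioinformatics Institute (EMBL-EBI) and develops databases that provide compound and target information useful for drug discovery research.


The ChEMBL Team has also launched Kinase SARfari (カイネースサファリ), a service specialising in kinases.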


Sunday, 15 November 2009

Conference: Rocky '09

My kids all think I have a really easy life - international travel, 'holidays' all around the world, and they regularly come out with the line 'hard day at the ice-cream factory?' when I say how busy or stressed at work I am. The latest conference we are presenting at does not help, but why oh why do they never hold the conferences I am invited to in somewhere like the Faroe Islands, where the cod fishing is good.

Anyway, we are speaking at Rocky '09, an ISCB conference held in Aspen, CO from 10th to 12th December. It looks a really, really interesting conference. Here is a link to the schedule.

Tuesday, 10 November 2009

New Drug Approvals - Pt. XXI - Ofatumumab (Arzerra)

The latest approval this month, on October 26th, was Ofatumumab (trade name Arzerra). Ofatumumab is a CD20-directed cytolytic monoclonal antibody indicated for the treatment of patients with refractory chronic lymphocytic leukemia (CLL) who have inadequately responded to both Fludarabine and Alemtuzumab. CLL is characterized by an abnormal proliferation of the lymphocytes known as B-cells. B-cells originate in the bone marrow and are involved in fighting infection. In CLL, the DNA of a B-cell is damaged, so the cell cannot produce antibodies to fight infection. Moreover, the damaged cells grow out of control and accumulate in the bone marrow and blood.

Ofatumumab is an IgG1k human monoclonal antibody which binds specifically to both the small and large extracellular loops of CD20. CD20 is a non-glycosylated phosphoprotein expressed on normal B lymphocytes and on B-cell CLL. Since it is not shed from the cell surface, it remains available for antibody binding, and when bound it sends a signal across the membrane to control growth and trigger death of certain tumor cells. The Fab domain of Ofatumumab binds to the CD20 molecule, whereas the Fc domain mediates immune effector functions that result in B-cell lysis.
Ofatumumab has a molecular weight of ca. 149 kDa. The dosing is typically 12 doses: an initial 300 mg dose, followed 1 week later by 2,000 mg weekly for 7 doses, followed 4 weeks later by 2,000 mg every 4 weeks for 4 doses (a 2 g dose is equivalent to ca. 13.4 umol). It has a volume of distribution ranging from 1.7 L to 5.1 L and its elimination occurs through both a target-independent route and a B-cell mediated route. Ofatumumab clearance is approximately 0.01 L/hr and mean half-life is ca. 14 days. The recommended dosage and full prescribing information can be found here.
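
The dosing arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the figures quoted in this post (a molecular weight of about 149 kDa and the three-phase schedule); the variable and function names are invented for the example. With these inputs a 2 g dose works out at roughly 13.4 micromoles.

```python
# Figures taken from the post; names are illustrative only.
MOLECULAR_WEIGHT_G_PER_MOL = 149_000  # ca. 149 kDa

# (dose in mg, number of administrations) for each phase of the schedule
SCHEDULE_MG = [(300, 1), (2000, 7), (2000, 4)]


def mg_to_micromol(dose_mg, mw_g_per_mol):
    """Convert a mass in milligrams to an amount in micromoles."""
    grams = dose_mg / 1000
    return grams / mw_g_per_mol * 1e6


total_doses = sum(count for _, count in SCHEDULE_MG)
total_mg = sum(dose * count for dose, count in SCHEDULE_MG)

print(total_doses)  # 12 doses in the full course
print(total_mg)     # 22300 mg over the full course
print(round(mg_to_micromol(2000, MOLECULAR_WEIGHT_G_PER_MOL), 1))  # 13.4
```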

The license holder is GlaxoSmithKline and the product website is

Web seminar on chembldb schema

We will have a web-meeting walkthrough of the chembldb core schema on Friday 20th November at 3pm GMT. The excellent webhuddle will be used, so if you are interested it may be worth checking this out, setting up an account, and checking it works on your machine in advance of the meeting. Mail us if you want the phone number and webhuddle link.

Recruitment: Data Integration Position for ChEMBL

Details for a new position within ChEMBL, available for a three year period, are now listed on the EMBL recruitment website.

The ChEMBL job is (W/09/087/EBI). Closing date is the 30th November 2009.

The image above is a visualisation of the continental United States of America by distance to the nearest McDonald's restaurant. Further details and attribution are in the image itself.

New Drug Approvals - Pt. XX - Pazopanib (Votrient)

Another drug onto the market this month is Pazopanib, marketed as Votrient, which was approved on October 19th. Pazopanib Hydrochloride (previously known as GW-786034-B) is the sixth drug to be approved for kidney cancer, after Sorafenib (trade name Nexavar), Sunitinib (trade name Sutent), Temsirolimus (trade name Torisel), Everolimus (trade name Afinitor) and Bevacizumab (trade name Avastin). Sorafenib and Sunitinib are both orally dosed small molecule inhibitors of tyrosine protein kinases, which interfere with tumor growth by inhibiting angiogenesis as well as tumor cell proliferation; Temsirolimus and Everolimus are specific inhibitors of mTOR (mammalian target of rapamycin), a serine-threonine kinase, which interfere with the synthesis of proteins that regulate proliferation, growth, and survival of tumor cells; Bevacizumab is a monoclonal antibody that recognizes and blocks VEGF, which is a chemical signal that stimulates angiogenesis.
Pazopanib is a small-molecule drug (molecular weight is 437.5 g.mol-1 for Pazopanib itself and 474.0 g.mol-1 for the HCl salt), fully Rule-of-Five compliant, lipophilic and practically insoluble in aqueous media. It is orally absorbed, has a high plasma protein binding of >99% and is metabolized by CYP3A4 (and therefore has many drug-drug interactions with substrates, inhibitors and inducers of CYP3A4) with minor contribution from CYP1A2 and CYP2C8. Pazopanib has a mean half-life of 30.9 hours and elimination is primarily through feces (>96% of dose). The recommended dosage is 800 mg once daily (equivalent to ca. 1.8 mmol). One of the potential adverse events is the propensity for the compound to increase QT interval. Full prescribing information can be found here. Pazopanib has a boxed warning.
The structure is 5-[[4-[(2,3-dimethyl-2H-indazol-6-yl)methylamino]-2-pyrimidinyl]amino]-2-methylbenzenesulfonamide. Pazopanib is largely planar and mimics the adenine ring of the enzyme cofactor ATP.
Of additional note is the presence of an aryl-sulphonamide (in the bottom left of the image) - these are often weakly acidic.
<InChI="InChI=1/C21H23N7O2S.ClH/c1-13-5-6-15(11-19(13)31(22,29)30)24-21-23-10-9-20(25-21)27(3)16-7-8-17-14(2)28(4)26-18(17)12-16;/h5-12H,1-4H3,(H2,22,29,30)(H,23,24,25);1H" >
The manufacturer of Pazopanib is GlaxoSmithKline and the product website is
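
The molecular weights and molar dose quoted above can be reproduced from the molecular formula in the InChI (C21H23N7O2S, plus HCl for the salt). The sketch below is a rough check only: it uses rounded standard atomic weights and a deliberately simple formula parser (no brackets or charges), and the function and variable names are invented for the example.

```python
import re

# Rounded standard atomic weights in g/mol; only the elements needed here.
ATOMIC_WEIGHTS = {
    "C": 12.011, "H": 1.008, "N": 14.007,
    "O": 15.999, "S": 32.06, "Cl": 35.45,
}


def molecular_weight(formula):
    """Sum atomic weights for a simple formula such as 'C21H23N7O2S'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return total


free_base = molecular_weight("C21H23N7O2S")
hcl_salt = free_base + molecular_weight("HCl")
dose_mmol = 800 / free_base  # mg divided by g/mol gives mmol

print(round(free_base, 1))  # about 437.5
print(round(hcl_salt, 1))   # about 474.0
print(round(dose_mmol, 2))  # about 1.83
```

Note that the molar dose here is computed against the free base weight, which is what gives the ca. 1.8 mmol figure.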

Sunday, 8 November 2009

Guidance for people interested in developing against the chembldb schema/downloads

The chembldb SAR data and an initial front end are now available (see earlier posts). Over the next few months we will be making quite a few changes to the data, the database, and so forth. Most excitingly, there is a large bolus of 'new' data to add. Most of the changes we make will be evolutionary, but there will be a few major things as well.

With this in mind, please mail us if you are building any systems that rely on the data. Firstly, this way we can tell you in advance specifically what we are intending to do, and secondly, you could request features, cross-references, and so forth that could make your life so much easier.

I can't remember where I found the image above, but it is a little bit funny, for those of us that travel in economy/coach.