

Showing posts from 2012

Pipeline Pilot Cambridgeshire UGM

We will be organising the 2nd Cambridgeshire Pipeline Pilot Users Group meeting on Thursday 17th January 2013, at 3pm here at the ChEMBL HQ. This is provided that the Mayans were actually wrong. This is a preliminary agenda for the meeting:

1. Welcome and Host talk: George Papadatos + Gerard van Westen: Cool things with Pipeline Pilot and ChEMBL
2. Peter Woollard (GSK): Using Pipeline Pilot for computational biology capabilities, where it helps the most and where it is less used
3. Richard Carter (ONT): Pipeline Pilot on a memory stick
4. Mike Cherry (Accelrys): Repetitive Data Flow
5. Question and Answer session, including:
   - how people have found the Next Generation Sequencing components and the Text Analytics components
   - using Pipeline Pilot for running command line software on remote Linux servers and retrieving results
6. Adrian Stevens (Accelrys): Upcoming chemistry components in PP9.0

If you

Paper: Mapping small molecule binding data to structural domains

Our interacting domains paper is out in pdf form. Here's the link . %T Mapping small molecule binding data to structural domains %A F.A. Kruger %A R. Rostom %A J.P. Overington %J BMC Bioinformatics %D 2012 %V 13(Suppl 17) %P S11 %O doi:10.1186/1471-2105-13-S17-S11 jpo

Paper: Automated design of ligands to polypharmacological profiles

Another great paper in Nature this week, making extensive use of ChEMBL. It's by our long-term collaborators up at Dundee - Jeremy, Richard and Andrew - well done, great stuff! Basically it combines a knowledge-base of SAR data (ChEMBL), some predictive models for affinity/properties, and extracts a set of reasonable transforms (chemical conversions) from the same knowledge-base. I'll ask Jeremy/Andrew to do a guest post on the ChEMBL-og on the paper - they're probably pretty busy with press-releases, etc. ;) Here's a link to the paper. Have a read, it will keep you busy for a few hours. %A J. Besnard %A G.F. Ruda %A V. Setola %A K. Abecassis %A R.M. Rodriguez %A X.-P. Huang %A S. Norval %A M.F. Sassano %A A.I. Shin %A L.A. Webster %A F.R.C. Simeons %A L. Stojanovski %A A. Prat %A N.G. Seidah %A D.B. Constam %A G.R. Bickerton %A K.D. Read %A W.C. Wetsel %A I.H. Gilbert %A B.L. Roth %A A.L. Hopkins %T Automated design of ligands to polypharmacological profile

New Drug Approvals 2012 - Pt. XXVII - Choline C-11

On September 12, the FDA approved Choline C-11, an intravenous radioactive diagnostic agent to be used as a tracer during a Positron Emission Tomography (PET) scan to help detect sites of recurrent Prostate Cancer (OMIM: 176807; MeSH: D011471). Prostate cancer is the most common cause of death from cancer in men over age 75, and is rarely found in men younger than 40. Unlike many other cancers, prostate cancer usually progresses very slowly. Sometimes the cancer cells may metastasize from the prostate to other parts of the body. Overall, it is estimated to be the sixth leading cause of cancer-related death in men. Choline is a naturally occurring component of the Vitamin-B complex, and is necessary for normal cell structure and signalling. Choline C-11 is a radiolabeled synthetic analog of choline that releases a positron by beta decay, which can be visualised by PET. Choline is rapidly taken up by the prostate cells and this allows the prostate to be imaged

Browsers and Bugs

We had a support email recently saying that some things on the interface (an export function) didn't work with Chrome - we couldn't repeat the issue with the equipment we have here at ChEMBL Towers. But there are a lot of OSs and a lot of browsers out there, and we can't recreate every possible environment - interestingly, Chrome is really popular amongst you people (the image above is a Google Analytics report of a week's access of this very blog). I'm a Safari man myself.... So as a reminder, we love hearing about bugs and issues, we really do, so send them to!

ChEMBL Cross Reference Links Now In UniProt

So, some great news for those of you that use UniProt - there are now links to the corresponding target pages in ChEMBL in there. Here's the link to the list of ChEMBL targets that are in UniProt. And there are links to ChEMBL in the Cross References section. jpo

A 101 Thankyou's!

This week, our ChEMBL NAR Database paper made the milestone of over a hundred citations (in less than a year too). This made us all very, very happy, and for a few moments, we rested our fingers from our keyboards, and used them instead to grasp a mug of coffee/tea; but only for a few seconds, before we got back to mixing and baking and cooking ChEMBL 15 for you all. Here's a link to the current citations of Gaulton et al., NAR Database, 40, D1100-D1107, 2012. Remember this is an Open Access paper. Please keep keeping us happy by using our work, it's probably the biggest satisfaction we can get :)


As part of the ChEMBL group's involvement in the OpenPhacts project, a representative from the ChEMBL team will be attending SWAT4LS next week. As well as hacking and learning about new Semantic technologies, there may be time to catch up with ChEMBL users also attending the workshop. So if you would like to hear about what we are doing with the Semantic Web or RDF, or just have a general chat about ChEMBL, please get in touch.

New Drug Approvals 2012 - Pt. XXV - Tofacitinib citrate (XELJANZ®)

On November 6, the FDA approved Tofacitinib citrate (Trade Name: XELJANZ®; Research code: CP-690550, ChEMBL: CHEMBL221959, PubChem: CID9926791, DrugBank: DB08183, ChemSpider: 8102425) to treat moderately to severely active Rheumatoid Arthritis (RA). It is orally administered and may be used as a monotherapy agent or in combination with non-biologic DMARDs. About 1% of the world-wide population is affected by rheumatoid arthritis. RA affects predominantly women (three times more susceptible than men) and is more frequent between ages 40 and 50, but people of any age can be affected. Other approved drugs in this commercially competitive sector include Adalimumab (Trade Name: Humira, ChEMBL: CHEMBL1201580, DrugBank: DB00051), Etanercept (Trade Name: Enbrel, ChEMBL: CHEMBL1201572, DrugBank: DB00005), Infliximab (Trade Name: Remicade, ChEMBL: CHEMBL1201581, DrugBank: DB00065). IUPAC Name: 3-(4-methyl-3-(met


The testing and QC pixies have been really busy with Open ChEMBL - the OSDODS virtual machine appliance. This has made us aware of the support questions we're likely to get, so we'd like to build our knowledge-base of support issues with a small pilot release before facing a lot of queries about VirtualBox, ifconfig etc. If you are interested in getting early access to Open ChEMBL, and have experience in configuring VMs in heterogeneous environments, please get in touch.

DNDi screens MMV’s open access Malaria Box

The Drugs for Neglected Diseases initiative (DNDi) and Medicines for Malaria Venture (MMV) announce today the identification of three chemical series targeting the treatment of deadly neglected tropical diseases (NTDs), through DNDi’s screening of MMV’s open access Malaria Box. The resulting DNDi screening data are among the first data generated on the Malaria Box to be released into the public domain, exemplifying the potential of openly sharing drug development data for neglected patients. The open access Malaria Box is an MMV initiative launched in December 2011 to catalyse drug discovery for malaria and neglected diseases. It contains 400 molecules, selected by experienced medicinal chemists to offer the broadest chemical diversity possible and is available free of charge. In return, MMV requests that any data gleaned from research on the Malaria Box are shared in the public domain within two years. To date, more than 100 Malaria Boxes have been delivered to over 20 count

SMS-DrugNet Allosteric Regulators Workshop, Edinburgh, December 2012.

For many classically 'undruggable' targets, there is sometimes the prospect of discovering and optimising allosteric regulators; these can offer advantages in more selective target regulation, or improve the drug-like properties of compounds that bind to the allosteric site. However, allosteric regulators are often discovered via serendipity, and many screens are not configured optimally to identify allosteric regulators. As part of the grant we are involved in, there is an Allostery Workshop taking place at the University of Edinburgh on 4th December 2012. The Workshop, sponsored by the British Council, involves an extensive delegation of scientists from Turkey led by Burak Erman, Koc University, Istanbul, and will bring together a diverse range of disciplines including Biology, Chemistry, Computer Science, Informatics, Mathematics and Medicine. The program for the day will include presentations and a poster session. Gerard Van Westen from the group

ChEMBL Virtual Machine

Next week we will be releasing the ChEMBL Virtual Machine. We have referred to it in a previous post and had hoped to make it available this week, but as always with best laid plans.... So, we are using this post to generate some pre-release excitement :) and also to acknowledge the hard work of Rodrigo Ochoa, who worked on this project during his 5-month internship with the ChEMBL group. We will be providing a lot more detail in next week's blog post, but as a quick summary the VM is based on an Ubuntu Linux build and comes preloaded with ChEMBL_14 (in PostgreSQL), RDKit and a web application, which makes use of Marvin and allows users to easily get started with querying the ChEMBL data.

New Drug Approvals 2012 - Pt. XXIV - Ocriplasmin (Jetrea™)

On October 17, the FDA approved Ocriplasmin (tradename: Jetrea; Research Code: Microplasmin), a proteolytic enzyme indicated for the treatment of symptomatic vitreomacular adhesion (VMA). VMA is a condition of the eye that results from the liquefaction of the vitreous gel within the human eye and consequent adhesion to the retina. As the eye ages, the vitreous humor can naturally separate from the retina. However, if the separation is not complete, areas of adhesion can occur. The traction from these adhesion areas on the retinal surface is the underlying pathology of symptomatic VMA, which can lead to ocular damage. Ocriplasmin is the first drug approved to treat this condition and it exerts its therapeutic action by selectively breaking down the three major protein components, fibronectin, laminin and collagen, of the vitreous body and vitreoretinal interface, thereby dissolving the protein matrix responsible for VMA. The only alternative treatment is a surgica

New Drug Approvals 2012 - Pt. XXIII - Omacetaxine mepesuccinate (SYNRIBO™)

ATC code: L01XX40 Wikipedia: Omacetaxine_mepesuccinate On October 22nd 2012 the FDA approved omacetaxine mepesuccinate (research code: CGX-635, trivial name: Homoharringtonine, trademark: Synribo™) for the treatment of chronic or accelerated phase chronic myeloid leukaemia (CML) in adults with resistance to two or more tyrosine kinase inhibitors. Omacetaxine is an old drug identified 35 years ago and known to have activity in CML, but its clinical development was previously halted due to the discovery of BCR-ABL and other targeted kinase inhibitors (PubMed: 21294709). The rapid development of tyrosine kinase inhibitor-resistant tumors has led to the need for agents that can act in these treatment-derived drug-resistant patients. Omacetaxine mepesuccinate has been approved based on observed major cytogenetic response rather than on improvement in disease-related symptoms or increased survival. Omacetaxine mepesuccinate/homoharringtonine is a cephalotaxine

Paper: Mapping small molecule binding data to structural domains

We've just published a paper on mapping the sites of small molecule binding in complex multidomain proteins (pdf here - this link doesn't seem to work at the moment, sorry). The resolution of the mapping is at the level of Pfam domains. We love Pfam, and love it even more that the Pfam team is moving to the EBI this week. The motivation for this work is multifold, and it addresses a pretty big problem in chemogenomics. Firstly, the issue of domain frustration - you search a protein containing a series of distinct domains looking for homologues in ChEMBL. If your protein contains a common and uninteresting domain, something like a zinc finger or EGF domain (our interest is for small molecule binding remember, we're not saying that these domains are completely boring, they're just a lot less interesting from a chemical biology/drug discovery perspective), you'll retrieve a whole bunch of sequence-related, but small-molecule-binding-unrelated, data. It's j

New Drug Approvals 2012 - Pt. XXII - Perampanel (Fycompa™)

ATC Code: N03AX22 Wikipedia: Perampanel On October 22nd 2012 the FDA approved Perampanel (research code: E2007, ER-155055-90, trade name Fycompa, CHEMBL1214124). Perampanel is an orally administered drug to be used as an adjunctive therapy for the treatment of partial-onset seizures with or without secondary generalized seizures in patients with epilepsy. Epileptic seizures are defined as "abnormal excessive or synchronous neuronal activity in the brain". The net symptoms can be very diverse, from severe thrashing movements to a very mild brief loss of awareness. Approximately 4% of the population will have experienced an unprovoked seizure by the age of 80, with a 30-50% chance of repeat in this group. Seizures can last from a few seconds to a state of life-threatening persistent seizure (known as status epilepticus). Approximately 25% of the people suffering from a seizure or status epilepticus will be diagnosed to

Wellcome Trust Courses - Computational Resources For Drug Discovery 2013

Those of you who went on the course we ran this year will know how much fun it was - and from our perspective we're gonna keep on doing it till we get it right! So, once more, there is another chance to attend the course in 2013 - December 9th to 13th 2013 to be precise. So if you are interested, pencil the dates in your diaries now, set an automatic alarm for four months before, and check out the full course details then. Of course, there are lots of other excellent courses in the same series, and the poster is available for download to display on your office wall here.

The First Rule of Security Club is that you do not talk about Security Club

We worry about data security and privacy, a lot. I fret and sweat over this, and it is one of the things (alongside being late with EU reports) that genuinely keeps me awake at night, and that you can never know too much about (again a bit like the EU). We have started to collect examples of security and data privacy issues and vulnerabilities in online chemistry-related resources. The first aim is to build a set of real-world examples and to establish best practice for our own developers. It also allows us to potentially create an environment in which security and privacy matters can be privately discussed without the world being unnecessarily alerted to them, allowing fixes to be made, and generally keeping the online chemistry world a better, safer place. As would be expected for this sort of thing, the list will not be open, and not indexed in Google (if it is right now, we’ve failed at step one!), so if you’re interested in joining the list, and your job involves the buil

Clinical Development Candidate Annotatathon - July 2013

We are thinking of holding an annotatathon for clinical development stage compounds next July, here on campus at the EBI in Hinxton. At this event we will assign/curate efficacy targets for all the clinical stage compounds we have identified by then, simplifying the work by pre-clustering by chemical class/therapeutic area. Data generated during the event will be placed online immediately, and would of course be fully Open (none of this frustrating, online-access-only for us!). If there is interest in taking part, and contributing to this effort, let me know! Depending on the level of interest, I may apply for funding to help with travel/accommodation. If you are interested in funding this, we'd be delighted to help with this.

PubMed² - Experimenting with biomedical literature for tablets and smart phones

We're still playing around with data visualisation, and this week's experiment focuses on the scientific literature and is designed with tablet devices (such as the iPad or the Nexus 7) and smartphones in mind. The application is a re-thinking of PubMed's search interface and you can get to play with it here. Let us know in the comments what you think.

Masters Project - Ion-channel structural pharmacology

We have a position in the group in the area of ion-channel structural pharmacology - mapping known ion-channel modulators to sequences and binding sites. This will be in partnership with Pfizer, and the role will involve time spent both at the EBI and at Pfizer's labs in the Cambridge UK area - so a great opportunity to pick up some industrial experience. If you are interested, please get in touch by December 15th 2012 , when we will shortlist candidates for interview.

Random Notes on Open Drug Discovery/Data Sharing: Part 1

There are some fantastic initiatives in Open Drug Discovery going on at the moment. I, for one, am convinced that we are on the cusp of a large structural change in drug discovery, and like at the beginning of all revolutions, the future is not clear, and we are all a little bit excited and nervous at the same time. One of the commonly quoted benefits of an Open strategy is that it can avoid duplication, and if you avoid duplication, it means that you get to the goal faster and cheaper (since other researchers can explore alternative approaches), and there is no repetition. There, you've just read it, and it's quite seductive isn't it? I've never quite bought this "avoid duplication" argument for the following three reasons. (I should declare my political/philosophical hand here, I have a very deep rooted empathy with the concept of The Free Market. Not the goofy, fudged form that we've had in Western Economies for some time - but that really is a diffe

Paper: Cheminformatics - Communications of the ACM

Here is a review article on cheminformatics, written as an orientation piece for people from a computational sciences background. %T Cheminformatics %A J.K. Wegner %A A. Sterling %A R. Guha %A A. Bender %A J.-L. Faulon %A J. Hastings %A N. O'Boyle %A J. Overington %A H. Van Vlijmen %A E. Willighagen %J Communications of the ACM %V 55 %I 11 %P 65-75 %O DOI:10.1145/2366316.2366334
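
The %-tagged citation above (and the similar ones elsewhere on this page) is easy to pull apart by machine. As a rough sketch only, the Python below splits one such record into fields and turns its DOI into a resolver link. The field labels, the function names `parse_record` and `doi_url`, and the reading of `%I` as an issue number are assumptions made for illustration, inferred from the records on this page rather than from any formal specification.

```python
import re

# Labels for the tags seen in the records on this page (assumed meanings).
FIELD_NAMES = {
    "T": "title", "A": "authors", "J": "journal", "D": "year",
    "V": "volume", "I": "issue", "P": "pages", "O": "other",
}


def parse_record(text):
    """Split a one-line %-tagged citation into a dict of fields."""
    # re.split with a capture group returns [prefix, tag, value, tag, value, ...]
    parts = re.split(r"\s*%([A-Z])\s+", text.strip())
    record = {"authors": []}
    for tag, value in zip(parts[1::2], parts[2::2]):
        name = FIELD_NAMES.get(tag, tag)  # keep unknown tags under their letter
        if name == "authors":
            record["authors"].append(value.strip())  # %A may repeat
        else:
            record[name] = value.strip()
    return record


def doi_url(record):
    """Return an https://doi.org/ link if the 'other' field holds a DOI, else None."""
    other = record.get("other", "")
    if other.lower().startswith("doi:"):
        return "https://doi.org/" + other[4:].strip()
    return None


# The record from the post above, copied as-is.
EXAMPLE = (
    "%T Cheminformatics %A J.K. Wegner %A A. Sterling %A R. Guha %A A. Bender "
    "%A J.-L. Faulon %A J. Hastings %A N. O'Boyle %A J. Overington "
    "%A H. Van Vlijmen %A E. Willighagen %J Communications of the ACM "
    "%V 55 %I 11 %P 65-75 %O DOI:10.1145/2366316.2366334"
)

record = parse_record(EXAMPLE)
print(record["title"], len(record["authors"]), doi_url(record))
```

Authors are collected into a list because `%A` repeats, while every other tag is treated as single-valued; a record with several `%O` lines would need the same list treatment.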

ChEMBL - now with added DOIness

In order to provide ChEMBL users with a persistent and citable link to datasets that have been deposited in ChEMBL we have started registering  DOIs  (Digital Object Identifiers) for these datasets. Many of you will be familiar with the use of DOIs as identifiers for journal articles but they can be used for any document that you want to permanently identify and share with others. By doing this we are providing people with a way of citing a deposited dataset in exactly the same way as you would a scientific publication. We are also hoping that by issuing DOIs for deposited data we will encourage people to contribute additional data to the ChEMBL database as the DOI will provide them with a permanent way to reference their contribution, for example by using the DOI in a subsequent publication. At the moment we have DOIs for four of the deposited datasets in the ChEMBL database.  Two are results from screens on the GSK PKIS set and two are datasets measured as part of