Showing posts from 2013

New Year, New Job? Research Associate in Epidemiology at UCL (ChEMBL related)

As part of a collaboration the ChEMBL group are involved in with UCL, we are looking to appoint to a full-time, three-year position in the Genetic Epidemiology Group, Institute of Cardiovascular Science, one of the component institutes of the UCL Faculty of Population Health Sciences. The appointee will join an exciting programme of work funded by the UCL National Institute of Health Research Biomedical Research Centre, through its High Impact Award scheme. The appointee will apply bioinformatic expertise to the late stage development of a new high density genotyping array designed to support drug target validation and related drug development issues; co-ordinate deployment of the array in a large consortium of highly-phenotyped cohort studies (the University College-London School of Hygiene-Edinburgh-Bristol consortium); undertake statistical analysis of the data; and play a leading role in writing manuscripts reporting findings arising from this work. The post is based at UCL, bu

Conference: CEADD2014 - Modelling water in biological systems, London, March 2014

A one-day conference entitled 'Modelling Water in Biological Systems' will be held at the School of Oriental and African Studies (SOAS) in London on Friday, 28 March, 2014. This meeting, organised by the MGMS, is the latest in the 'Cutting Edge Approaches to Drug Design (CEADD)' series. In recent years, significant progress has been made in probing the role of water molecules in protein-ligand binding. Hydration is a crucial factor in understanding binding modes, ligand affinities and kinetics. Modelling tools are becoming available which may offer new insights in this exciting and evolving area of current research. This conference provides a timely overview of some of the main research avenues in this important field.

ChEMBL Web Service Update 4: A Reminder

This post is to remind users of the ChEMBL Web Services that we will soon be changing the backend to use the new ChEMBL API. Since our initial announcement about the changes, which you can read about here, here and here, we have made some more changes and optimisations, which speed up the services significantly. We thank everyone for feedback to date and urge anyone else who makes use of the ChEMBL Web Services to test the new version. Remember they are simple to test: just use the following temporary base URL and everything should work as if you are using the current live Web Services: We would like to make the change in January, so please get in touch if you have any questions or experience any problems. Once we have made the technology switch and are happy that it is working in the wild as expected, we will be doing a complete review of the functionality offered by the current ChEMBL Web Services. So expect some big changes in 2014.

UniChem: A resource for compound mapping - use in BioMedBridges

UniChem is a simple database and web service for the InChI-based linkage of chemical structures across various resources. It was initially developed under the EU-OPENSCREEN ESFRI as an approach to link screening data from the planned screening collection to other chemistry resources. The development was then extended under the BioMedBridges project, which spans various Biomedical Sciences (BMS) ESFRIs (such as ELIXIR, BBMRI, etc.). It has proved to be remarkably useful to us as well, and will be the future home of regularly updated feeds of compound structures from SureChEMBL, which will allow rapid novelty checking of patent structures across component data sources. A side-effect of this is, of course, that the compounds in any of the BioMedBridges partner ESFRIs immediately have patent data integration. For us, this synergy, and the snowball effect of binding resources together using simple open standards, is one of the great joys of our work! Foll
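To make the idea of InChI-based linkage concrete, here is a minimal sketch in Python. It is not UniChem's implementation or API: the source names and source-local identifiers are placeholders invented for illustration, and the only assumption carried over from the post is that records from different resources are joined on a shared structure key (here, a standard InChIKey).

```python
from collections import defaultdict

# Hypothetical records: (source name, source-local identifier, standard InChIKey).
# Source names and identifiers are made-up placeholders, not real database entries.
RECORDS = [
    ("source_a", "A-001", "KDGFLJKFZUIJMX-UHFFFAOYSA-N"),
    ("source_b", "B-417", "KDGFLJKFZUIJMX-UHFFFAOYSA-N"),
    ("source_b", "B-902", "YOVVNQKCSKSHKT-HNNXBMFYSA-N"),
    ("source_c", "C-77", "YOVVNQKCSKSHKT-HNNXBMFYSA-N"),
]


def build_index(records):
    """Group (source, identifier) pairs by the InChIKey they share."""
    index = defaultdict(set)
    for source, local_id, inchikey in records:
        index[inchikey].add((source, local_id))
    return index


def cross_references(index, source, local_id):
    """Return every other (source, identifier) pair with the same structure key."""
    entry = (source, local_id)
    for members in index.values():
        if entry in members:
            return sorted(members - {entry})
    return []


INDEX = build_index(RECORDS)
print(cross_references(INDEX, "source_a", "A-001"))
```

Because the join key is computed from the structure itself, a new resource can be linked to all the others simply by adding its records to the index, without any pairwise mapping tables.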

Notes from Rita's Talk Yesterday.

Rita gave a talk on her recent drug target work yesterday on campus, and Jenny Cham took notes; aren't they great? jpo

SureChEMBL - Chemical Structure Information in Patents

Today we have announced that we are taking over the running of the SureChem system from Digital Science. We have renamed this SureChEMBL to reflect the history and provenance of the technology and engineering, but also to align it with its new home and future; we like the name, and hope you do. We are delighted that this has happened - Nicko and the team at Digital Science have been great, and the more we have dug in to how it works, the more we have appreciated the design and vision that they had. If there is one consistent piece of feedback we get about ChEMBL it is in encouraging us to add patent data to what we do. So now we have, but because the data from patents is different in detail from that reported in the published literature, we will keep the databases separate, but closely integrated. For those of you that are already SureChem users you will be familiar with the functionality and how it works; but for those that weren't, SureChEMBL takes feeds of full te

A call for new MMV Malaria Box screening data depositions

Last year, MMV released the MMV Malaria Box, a physical set of 400 probe- and drug-like compounds with confirmed anti-malarial activity. The 'Box' has since been distributed to a large number of academic labs around the world, where the compounds are screened against other plasmodia strains and pathogens such as schistosoma and mTB. The assay results have started coming back in the form of data depositions and we, as MMV partners, are doing our best to integrate them with both the malaria-data database and the main ChEMBL one. Recent examples of such MMV Malaria Box screening data depositions include: an mTB screen by the Nathan lab at Cornell; a schistosoma screen by Conor Caffrey and colleagues at UCSF; and a plasmodium apicoplast screen by the DeRisi lab at UCSF, as reported in our post last week. In addition, we curate and integrate the bioactivity data produced by the excellent Open Source Malaria project. The value of sharing screenin

Paper: myChEMBL - a virtual machine implementation of open data and cheminformatic tools

We have just had a paper published in Bioinformatics on myChEMBL - the Linux VM that contains a fully functional version of the ChEMBL database. The paper is here. myChEMBL is available for download at: A warning: it is a fairly big download (ca. 18 GB), so try to do this over a fast, stable connection. Source code is available here: %T myChEMBL: A virtual machine implementation of open data and cheminformatics tools %J Bioinformatics %D 2013 %O DOI:10.1093/bioinformatics/btt666 %A M. Davies %A G. Papadatos %A F. Atkinson %A J.P. Overington

Job: Chemoinformatician at the Karolinska

Some of our collaborators at the Karolinska have a great job available - the advert is here. Division Chemical Biology Consortium Sweden (CBCS) are looking for a highly motivated and talented Cheminformatics Scientist to support and coordinate a wide scope of research informatics applications and data analysis at our Stockholm facilities. Duties The desired candidate will have a demonstrated track record in managing large volumes of scientific data in support of basic research and/or drug discovery projects and should have significant experience with in-house and commercial software solutions that facilitate data capture, analysis and visualization in small molecule research and drug design. Responsibilities include: Evaluation and implementation of a nationally encompassing chemoinformatics system for the SciLifeLab community. Maintenance, configuration, monitoring, and/or troubleshooting scientific applications and underlying software. Partnering and interaction with

Competition Time - Teach-Discover-Treat 2014

Teach-Discover-Treat (TDT) is excited to announce our 2014 Competition. We have four exciting challenges that focus on developing and disseminating computational workflows for drug discovery of neglected diseases with a premium on reproducibility. Three cash prizes - plus partial reimbursement of travel - will be awarded! Winners are required to present their work at the TDT Award symposium during the Fall 2014 ACS National Meeting in San Francisco, California. Create and submit computational workflows that inspire drug discovery activities using freely available software tools. Detailed information about the 2014 Competition can be found here: 2014-competition.html The submission deadline is February 3, 2014. The TDT Steering Committee: Hanneke Jansen, Rommie Amaro, Jane Tseng, Wendy Cornell, Patrick Walters and Emilio Xavier Esposito @TeachDiscoTreat

New Drug Approvals 2013 - Pt. XIX - Ibrutinib (Imbruvica™)

ATC Code: Wikipedia: Ibrutinib On November 13, 2013, the FDA approved Ibrutinib (Imbruvica™) for the treatment of patients with mantle cell lymphoma (MCL) who have received at least one prior therapy. MCL is a subtype of B-cell lymphoma and accounts for 6% of non-Hodgkin's lymphoma cases. In an open-label, multi-center, single-arm trial of 111 previously treated patients, Ibrutinib showed a 65.8% response rate. Ibrutinib is an irreversible inhibitor of the Tyrosine-protein kinase BTK (Uniprot: Q06187; ChEMBL: CHEMBL5251; canSAR target synopsis) and is the first approved targeted BTK inhibitor. It forms a covalent bond with a cysteine residue in the BTK active site via a Michael acceptor mechanism, leading to inhibition of BTK enzymatic activity. Ibrutinib (ChEMBL: CHEMBL1873475; canSAR drug synopsis; also known as CRA-032765 and PCI-32765) has the formula C25H24N6O2 and a molecular weight of 440.50. It is absorbed after oral administration with a median Tma
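As a quick sanity check on the quoted molecular weight, the arithmetic can be sketched in a few lines of Python. The atomic weights below are rounded standard values assumed for this illustration, and the parser is a deliberately simple one that handles only flat formulas such as C25H24N6O2 (no brackets, charges or isotopes).

```python
import re

# Rounded standard atomic weights in g/mol (assumed values for this sketch).
ATOMIC_WEIGHTS = {"C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994}


def molecular_weight(formula):
    """Sum atomic weights for a flat formula such as 'C25H24N6O2'."""
    total = 0.0
    # Each match is an element symbol followed by an optional count.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return total


print(round(molecular_weight("C25H24N6O2"), 2))
```

With these weights the sum comes out within 0.01 of the 440.50 quoted for Ibrutinib; slightly different atomic-weight tables can shift the last decimal place.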

New Drug Approvals 2013 - Pt. XVIII - Obinutuzumab (Gazyva™)

ATC Code: L01XC15 Wikipedia: Obinutuzumab On November 1, 2013 the FDA approved obinutuzumab (Gazyva™) for use in combination with chlorambucil (a nitrogen mustard alkylating agent) for the treatment of patients with previously untreated chronic lymphocytic leukemia (CLL). CLL is the most common type of leukaemia, accounting for 35% of all reported leukaemias (see CRUK CLL page). In a randomized three-arm clinical study, obinutuzumab in combination with chlorambucil improved the progression-free survival (PFS) of patients to 23.0 months compared to 11.1 months for chlorambucil alone. Obinutuzumab (CHEMBL1743048) is a humanized anti-CD20 monoclonal antibody of ca. 150 kDa molecular weight. Its target, the B-lymphocyte antigen CD20, is the product of the gene MS4A1 (Uniprot: P11836; ChEMBL: CHEMBL2058; canSAR target synopsis). The CD20 antigen is expressed on the surface of pre B- and mature B-lymphocytes. Obinutuzumab mediates B-cell lysis

New ChEMBL-NTD Depositions

We are very pleased to announce the release of two new datasets on the ChEMBL-NTD portal. The first dataset is provided by the Drugs for Neglected Diseases initiative (DNDi) and is focused on the selection and optimization of hits from a high-throughput phenotypic screen against Trypanosoma cruzi. The paper describing the dataset in more detail can be accessed here and the data can be downloaded from here. The second dataset is from the DeRisi Lab at UCSF and is focused on the screening of MMV's Malaria Box compounds in Plasmodium falciparum, to understand if anti-malarial compounds target the apicoplast organelle. More details about the dataset can be found here and the data can be downloaded from here. Both datasets will be loaded into the next version of ChEMBL, which is due out early next year. The ChEMBL - Neglected Tropical Disease portal is a repository for Open Access primary screening and medicinal chemistry data directed at neglected diseases. If you wo

RDKit and Raphael.js

The ChEMBL group had the honour of hosting the second RDKit UGM. It was a great way to catch up with the RDKit community, find out what they are working on and learn about new features the toolkit offers. We gave two talks during the meeting, so if you want to know how Clippy can make interacting with different chemical formats on your desktop easier, go here, and if you want to learn about wrapping RDKit up in a RESTful Web Service a.k.a. Beaker (to be described in a future blog post), go here. Many discussions about new features RDKit could offer were had throughout the meeting and one which caught my attention was support for plotting compound images on HTML5 Canvas. Unable to participate in a hackathon held on the final day, I set about hosting my own small hackathon during the weekend (only 1 attendee). The result of this weekend coding effort was a pull request made against the RDKit GitHub repo, introducing the new class called JSONCanvas. Technical Details As a ge

USAN Watch: September 2013

The USANs for September 2013 have recently been published. We actually missed September, due to a switchover in service for the INNs, but now they're here.

USAN | Research Code | InChIKey (Parent) | Drug Class | Therapeutic class | Target
aducanumab | BIIB-037 | n/a | monoclonal antibody | therapeutic | beta amyloid
aptorsen-sodium | OGX-427 | n/a | oligonucleotide | therapeutic | HSP27
asfotase-alfa | ALXN-1215, ENB-0040 | n/a | enzyme | therapeutic | n/a
batefenterol, batefenterol-succinate | GSK-961081A | URWYQGVSPQJGGB-DHUJRADRSA-N | synthetic small molecule | therapeutic | Muscarinic

Paper: The ChEMBL bioactivity database: an update

An update on what has happened to the Wellcome Trust funded database ChEMBL over the past few years has just been published - it seems odd that we've been around long enough to achieve our 2nd NAR Database paper - so much more to do though! This paper contains features and content up to ChEMBL 17. This could put you in a difficult position over which NAR paper to cite in your own publications using ChEMBL; so we suggest both! ;) Oh, and it's Open Access, of course. %J Nucleic Acids Research %D 2013 %P 1–8 %O doi:10.1093/nar/gkt1031 %T The ChEMBL bioactivity database: an update %A A.P. Bento %A A. Gaulton %A A. Hersey %A L.J. Bellis %A J. Chambers %A M. Davies %A F.A. Krueger %A Y. Light %A L. Mak %A S. McGlinchey %A M. Nowotka %A G. Papadatos %A R. Santos %A J.P. Overington jpo

Paper: The Functional Therapeutic Chemical Classification System

Here's an Open Access paper from Samuel in the group. Drug repositioning is the discovery of new indications for compounds that have already been approved and used in a clinical setting. Recently, some computational approaches have been suggested to unveil new opportunities in a systematic fashion, by taking into consideration gene expression signatures or chemical features for instance. We present here a novel method based on knowledge integration using semantic technologies, to capture the functional role of approved chemical compounds. In order to computationally generate repositioning hypotheses, we used the Web Ontology Language (OWL) to formally define the semantics of over 20,000 terms with axioms to correctly denote various modes of action (MoA). Based on an integration of public data, we have automatically assigned over a thousand approved drugs into these MoA categories. The resulting new research resource is called the Functional Therapeutic Chemical C

Magic methyls and magic carpets

A few days ago, there was this post by Derek Lowe, reviewing a recent paper on magic methyls and their occurrence and impact in medicinal chemistry practice. They're called 'magic' because, although methyls are relatively insignificant in terms of size, polarity or lipophilicity, the addition of one to a compound can sometimes have a dramatic impact on its potency - much more than would be attributed to any simple desolvation effects. More generally, the 'magic methyl' phenomenon pops up in discussions about the validity of the molecular similarity principle, descriptors, QSAR - almost everything in the applied Chemoinformatics field - and belongs to the general class of 'activity cliffs'. Methylation is a chemical transformation, and transformations along with their impact on a property of choice can be easily mined and studied using the so-called Matched Molecular Pairs analysis (MMPA). We already have a comprehensive database

New Drug Approvals 2013 - Pt. XVII - Flutemetamol F18 (Vizamyl™)

ATC Code: V09AX04 On October 25th, the FDA approved Flutemetamol F18 (Tradename: Vizamyl; Research Code: [18F]AH110690), a radioactive diagnostic agent, for intravenous (i.v.) use in Positron Emission Tomography (PET) imaging of the brain in adult patients with cognitive impairment who are being evaluated for Alzheimer's disease (AD) and dementia. Alzheimer's disease is a non-treatable, progressively worsening and fatal disease, characterised by a decrease in cognitive functions, such as memory, and is usually associated with an accumulation of β amyloid (Uniprot: P05067) plaques in several brain regions. These deposits are believed to be responsible for cellular damage and ultimately cell death. Flutemetamol F18 is the second approved diagnostic drug to estimate β-amyloid neuritic plaque density, after the approval of Florbetapir F18 in 2012. Like Florbetapir F18, Flutemetamol F18 binds to β amyloid plaques in the brain where the F-18 isotope produces

EU-OPENSCREEN 3rd Stakeholder Meeting, Oslo, Norway

Dear future user, partner, collaborator or supporter! The ESFRI project EU-OPENSCREEN is an academic infrastructure initiative in Chemical Biology to serve your research needs. We are currently preparing the implementation of this pan-European infrastructure of open screening platforms to support basic and applied research. EU-OPENSCREEN will offer access to a unique compound library representing the know-how of European chemists, to a broad range of cutting-edge screening technologies, to valuable tool compounds for research, and to the knowledge that emerges from validated output of hundreds of screens stored and made publicly available in a central database. We cordially invite you to join us in Oslo for an exciting science day where we inform about the progress of the project and the planned services with reports on the design of the joint European Compound Library, the screening services and the database. In particular, we would like to share with you your own experience

Competition Time - Win a Raspberry Pi with ChEMBL - chempi

Here's a free-to-enter competition for a brand new, fully working Raspberry Pi running the brand new chempi implementation. It includes everything you need to get started at home with ChEMBL - a sort of in silico Breaking Bad maybe (hopefully not, thinking about it). It includes everything you need, with the exception of a power supply and ethernet cable. We have run out of creative juices, and cannot think of a suitable poem to mark the release of chempi - so the competition is for you to finish a limerick for us, starting with the line: There once was a hacker with chempi.... Entries must be posted in the comments section. Obscene or defamatory entries will be removed (all comments are moderated, so it may take a few hours for your entry to appear, so do not repost twenty times!). We haven't really decided how to pronounce chempi (with a hard 'k' start or a soft 'sh' start, just as with ChEMBL, both are used in the wild; and also does it rhyme

Tastypie & Chempi

One of the immediate consequences of refactoring our web services using Django, Tastypie and related approaches (as described here) is that we can run them on almost any database backend. Django abstracts communication with the database, and using custom QueryManagers we were able to implement chemistry-specific operations, such as substructure and similarity search, in a database-agnostic manner. This means that, if we want, we can use only Open Source components (such as Postgres and RDKit), or elect to use optimised commercially sourced software as appropriate. However, what if we go one step further and try to use Open Hardware as well? This is exactly what we've just done! We managed to install the full ChEMBL 17 on a Raspberry Pi. Some frequently asked questions (at least those that have been asked internally) and technical details are below: 1. How much space does it take? 12 GB, including OS, data and all relevant software. Unfortunately we used a 32 GB SD card so this
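The database-agnostic idea can be sketched without Django at all. The following is a minimal illustration in Python, not the ChEMBL web services code: the class names, table name and column name are hypothetical, and the second backend uses an invented SQL function purely as a stand-in for a commercial cartridge. The one detail taken from a real system is that the RDKit PostgreSQL cartridge exposes substructure matching through the `@>` operator.

```python
from abc import ABC, abstractmethod


class ChemBackend(ABC):
    """Hypothetical interface: each database backend supplies its own SQL fragment."""

    @abstractmethod
    def substructure_sql(self, column):
        """Return a WHERE-clause fragment with one %s placeholder for the query."""


class RDKitPostgresBackend(ChemBackend):
    def substructure_sql(self, column):
        # The RDKit PostgreSQL cartridge uses @> for substructure matching.
        return f"{column} @> %s"


class PlaceholderCartridgeBackend(ChemBackend):
    def substructure_sql(self, column):
        # 'my_substructure' is an invented function name standing in for
        # whatever a commercial cartridge would provide.
        return f"my_substructure({column}, %s) = 1"


class CompoundQuery:
    """Builds the same logical query regardless of which backend is plugged in."""

    def __init__(self, backend, table="compounds", column="mol"):
        self.backend = backend
        self.table = table
        self.column = column

    def with_substructure(self, smiles):
        where = self.backend.substructure_sql(self.column)
        sql = f"SELECT * FROM {self.table} WHERE {where}"
        # The query structure is passed as a bound parameter, never interpolated.
        return sql, (smiles,)


sql, params = CompoundQuery(RDKitPostgresBackend()).with_substructure("c1ccccc1")
print(sql, params)
```

Calling code only ever talks to `CompoundQuery`, so swapping Postgres plus RDKit for another engine means supplying a different backend object rather than rewriting the service layer, which is the property the post attributes to the custom QueryManagers.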

USAN Watch: October 2013

The USANs for October 2013 have recently been published. We have modified the sourcing of this data - using the new ChEMBL API to automatically parse the documents, extract and validate the mol files for the compounds. So in future, these reports should be more timely, complete and fun!

USAN | Research Code | InChIKey (Parent) | Drug Class | Therapeutic class | Target
alectinib | AF-802; CH-5424802 | KDGFLJKFZUIJMX-UHFFFAOYSA-N | synthetic small molecule | therapeutic | ALK
apitolisib | GDC-0980.1, G-038390, G-038390.1, RG-7422 | YOVVNQKCSKSHKT-HNNXBMFYSA-N | synthetic small molecule | therapeutic | MTOR,PI3K
cimaglermin-alfa | GGF2, rhGGF2 | n/a | protein

New Drug Approvals 2013 - Pt. XVII - Macitentan (Opsumit®)

ATC Code: C02KX (incomplete) Wikipedia: Macitentan ChEMBL: CHEMBL2103873 On October 13th the FDA approved Macitentan (trade name Opsumit®) for the treatment of pulmonary arterial hypertension (PAH). Macitentan is an endothelin receptor antagonist (with affinities to both Endothelin ET-A (ETA) and Endothelin ET-B (ETB) receptor subtypes, similar in mechanism of action to the previously licensed drug Bosentan, CHEMBL957). Target(s) The Endothelin receptors ET-A (ETA, CHEMBL252; Uniprot P25101) and ET-B (ETB, CHEMBL1785; Uniprot P24530) mediate a number of physiological effects via the natural peptide agonist Endothelin-1 (ET1, CHEMBL437472; Uniprot P05305). In addition to normal roles in supporting homeostasis, these effects can include pathologies such as inflammation, vasoconstriction, fibrosis and hypertrophy. Macitentan acts as an antagonist for both receptors with both a high affinity and long residen