

Showing posts from April, 2011

Access to ChEMBL through Pipeline Pilot

We have recently released a set of RESTful Web Services, which give users programmatic access to the ChEMBL data. During the Small Molecule Bioactivity course we hosted earlier in the year, and in a number of subsequent ChEMBL webinars, we've been asked whether it is possible to use these new services in Pipeline Pilot. The short answer is yes. To help users get started we have created a simple protocol, which you can download, use and modify (the license, for those interested, is CC0). Remember to save the file to your disk, since the link itself will look goofy in your browser. The protocol retrieves data for a list of ChEMBL IDs and returns a list of bioactivity and compound data in an HTML table. We would be interested to hear how you get on, so tell us about any changes or enhancements you would make to the protocol; we're convinced it can be improved. Over the coming months you can expect the set of ChEMBL Web Services to grow. We will keep you informed of any
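For readers without Pipeline Pilot, the behaviour described above (look up a list of ChEMBL IDs, then present the results as an HTML table) can be sketched in Python. This is only an illustration and not the downloadable protocol: the `fetch` callable, the `stub_fetch` stand-in and the column names are all hypothetical, and no real web service is called.

```python
from html import escape
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def records_to_html_table(records: List[Record], columns: List[str]) -> str:
    """Render flat records as a minimal HTML table, escaping every value."""
    header = "".join(f"<th>{escape(col)}</th>" for col in columns)
    rows = []
    for record in records:
        cells = "".join(
            f"<td>{escape(str(record.get(col, '')))}</td>" for col in columns
        )
        rows.append(f"<tr>{cells}</tr>")
    return f"<table><tr>{header}</tr>{''.join(rows)}</table>"


def build_report(
    chembl_ids: Iterable[str],
    fetch: Callable[[str], List[Record]],
    columns: List[str],
) -> str:
    """Collect the records returned by `fetch` for each ID and tabulate them."""
    records: List[Record] = []
    for chembl_id in chembl_ids:
        records.extend(fetch(chembl_id))
    return records_to_html_table(records, columns)


def stub_fetch(chembl_id: str) -> List[Record]:
    """Hypothetical stand-in for a web service call; returns made-up data."""
    return [{"chembl_id": chembl_id, "activity_type": "IC50"}]


if __name__ == "__main__":
    print(build_report(["CHEMBL1", "CHEMBL2"], stub_fetch, ["chembl_id", "activity_type"]))
```

Passing the lookup in as a function keeps the table rendering independent of how the data is retrieved, so the stub can later be swapped for a real client.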

ChEMBL Amazon Web Services

We have started to use Amazon Web Services (AWS) for a number of mini projects in the group. As part of this work we have created a ChEMBL Amazon Machine Image (AMI), which we have decided to make publicly available. The ChEMBL AMI is based on the 64-bit Amazon Linux AMI, but additionally comes installed with a MySQL server, which contains the ChEMBL_09 database. The benefit of making this available is that a user can have a MySQL instance up and running in a matter of minutes, and at almost no cost if the service is run on the AWS Free Usage Tier. To create an instance based on the ChEMBL AMI, go through the following steps: Set up an AWS account (note you will need an existing Amazon account or credit card to do this). Log in to the AWS Management Console. Change Region to EU West (Ireland). Currently the ChEMBL AMI is only available to instances running in this region. Go to the EC2 tab and click on the Launch Instance button. The Create Instance Wizard s

Molecular databases and molecule complexity - part 2

Let's have some examples. Benzene (chembl277500) is unambiguous: it has no possibility of forming any tautomers, it cannot become protonated or lose a proton (i.e. act as a base or acid) under anything approaching physiological conditions, it has no stereocenters, and furthermore it has no internal degrees of freedom (it is conformationally rigid). So there is no ambiguity over calculated properties such as logP, molecular weight, etc., and you could take the structure directly from a database and do things like docking with it. Next is pyridine (chembl266158), which has two biological forms. It is still rigid, and has no stereocenters or tautomeric forms; however, it can act as a base, and so can exist in a protonated form. These two forms have different molecular weights, overall charge and many other differences (for example, its molecular dipole). In particular, the binding to a receptor will be very different for these two forms, pyridine can act as a hydrogen bond ac
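To make the pyridine point concrete, here is a small Python sketch that computes the molecular weight and formal charge of the neutral and protonated forms from their elemental composition. The formulas (C5H5N for pyridine, C5H6N with charge +1 for the protonated form) and the rounded standard atomic weights are supplied here for illustration; the tiny mass of the missing electron in the cation is ignored.

```python
from typing import Dict

# Standard atomic weights in g/mol, rounded to three decimals.
ATOMIC_WEIGHTS: Dict[str, float] = {"C": 12.011, "H": 1.008, "N": 14.007}


def molecular_weight(composition: Dict[str, int]) -> float:
    """Sum atomic weights over an element -> count mapping."""
    return sum(ATOMIC_WEIGHTS[element] * count for element, count in composition.items())


# The two forms discussed in the text: same skeleton, one extra proton.
PYRIDINE = {"formula": {"C": 5, "H": 5, "N": 1}, "charge": 0}
PYRIDINIUM = {"formula": {"C": 5, "H": 6, "N": 1}, "charge": 1}

if __name__ == "__main__":
    for name, form in (("pyridine", PYRIDINE), ("pyridinium", PYRIDINIUM)):
        mw = molecular_weight(form["formula"])
        print(f"{name}: MW = {mw:.3f} g/mol, formal charge = {form['charge']:+d}")
```

A single database record that stores only the neutral structure therefore fixes one molecular weight and one charge, even though the other form may be the one that matters in a given setting.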

Molecular databases and molecule complexity - part 1

At one level a database of small molecules seems a really simple thing: a set of identifiers and then a 2D structure. You can then do a bunch of really cool things with this, as the large literature in the area shows. For example, one pretty common thing is to take a library of molecules, then 'dock' them into a protein structure, hopefully to find a novel lead, or maybe even a new use for a drug (or a prediction of a side effect of a known drug). The wide availability of pipeline tools, web services connecting directly to remote databases, and so forth, makes this sort of thing really simple, and arguably too simple. However, there are many challenges with handling normalised 2D chemical data. One thing we have started to think about recently is just how ambiguous a 2D representation of a structure is for typical users interested in the analysis of compound properties, docking, etc. The problem arises from the fact that molecules are 'complex', in that

ChEMBL RESTful Web Service API - Update

We are pleased to announce that we have updated the ChEMBL RESTful Web Service API (application programming interface) with some more of the features that you, the ChEMBL users, requested. In particular, we have added support for: retrieval of results in JSON data format; searching of compounds by Standard InChIKey; and searching of targets by UniProt and RefSeq identifiers. Sample URLs: retrieve a compound record in JSON format; search for a compound based on a given Standard InChIKey; search for a target based on a given UniProt accession; search for a target based on a given RefSeq accession and return it in JSON format. In addition to those new features we have added some example Perl and Python scripts to our
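The four kinds of lookup listed above can be sketched as a small Python URL builder. Everything service-specific here is an assumption made for illustration: `BASE_URL` is a placeholder rather than the real service address, the path layout (`compounds/stdinchikey/...`, `targets/uniprot/...`, `targets/refseq/...`) and the `.json` suffix are hypothetical, and the example scripts mentioned in the post may look quite different.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Placeholder address; replace with the documented service root before use.
BASE_URL = "https://example.org/chembl-api"


def build_url(*path_parts: str, as_json: bool = False) -> str:
    """Join URL-escaped path parts; optionally ask for JSON via a suffix (assumed convention)."""
    path = "/".join(quote(str(part), safe="") for part in path_parts)
    suffix = ".json" if as_json else ""
    return f"{BASE_URL}/{path}{suffix}"


def compound_url(chembl_id: str, as_json: bool = False) -> str:
    return build_url("compounds", chembl_id, as_json=as_json)


def compound_by_inchikey_url(inchikey: str, as_json: bool = False) -> str:
    return build_url("compounds", "stdinchikey", inchikey, as_json=as_json)


def target_by_uniprot_url(accession: str, as_json: bool = False) -> str:
    return build_url("targets", "uniprot", accession, as_json=as_json)


def target_by_refseq_url(accession: str, as_json: bool = False) -> str:
    return build_url("targets", "refseq", accession, as_json=as_json)


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """Download a URL and decode the body as JSON (only works against a real endpoint)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(compound_url("CHEMBL1", as_json=True))
    # BSYNRYMUTXBXSQ-UHFFFAOYSA-N is the Standard InChIKey of aspirin.
    print(compound_by_inchikey_url("BSYNRYMUTXBXSQ-UHFFFAOYSA-N"))
    # P00533 is the UniProt accession of human EGFR.
    print(target_by_uniprot_url("P00533", as_json=True))
```

Keeping URL construction separate from `fetch_json` means the request addresses can be inspected or logged without touching the network.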

MedChemBuzz: Really good Med Chem Blog

Just a short note to highlight a really good blog on med chem and drug discovery - MedChemBuzz - just the thing for your daily 2 hour 45 minute commute to work ;)

New Drug Approvals 2011 - Part XI Gabapentin enacarbil (Horizant™)

ATC code (partial): N03AX. On April 6th, the FDA approved gabapentin enacarbil (tradename Horizant, Research Code: XP-13512, NDA 022399) for the treatment of moderate-to-severe forms of restless legs syndrome (RLS). Patients suffering from RLS experience an urge to move their legs or other limbs. This urge is prompted by a painful or itchy sensation in the corresponding limb. Symptoms are most severe during phases of relaxation. Patients also sometimes have limb jerking during sleep. Gabapentin enacarbil is a prodrug of the anticonvulsant and analgesic gabapentin (CHEMBL940). Much of the processing of gabapentin enacarbil into the active ingredient takes place in enterocytes, upon absorption in the gut. These first-pass modifications are mainly mediated by non-specific carboxylesterases and proceed via several hydrolysis steps to yield gabapentin (the active ingredient), along with carbon dioxide, acetaldehyde and isobutyric acid. Absorption of gabapentin into the

New Drug Approvals 2011 - Pt. X Vandetanib (Zactima™)

ATC code: L01XE12. On the 6th April 2011, the FDA approved Vandetanib (trade name: Zactima™, ATC code: L01XE12, NDA 022405), a multi-kinase inhibitor, for the treatment of symptomatic or progressive medullary thyroid cancer in patients with unresectable locally advanced or metastatic disease (medullary thyroid cancer; CRUK Thyroid cancer; ICD C73). Medullary thyroid cancer is a rare form of thyroid cancer, but is associated with poorer prognosis. While the primary tumor can be successfully removed using surgery and radiotherapy, and thus can have a high 5- and 10-year survival rate (>90%), the metastatic disease remains challenging and has a low 40% survival rate. Medullary thyroid cancer can be a sporadic or hereditary disease, and has complex underlying genetic causes. Approximately 25% of cases are associated with the RET (REarranged during Transfection) proto-oncogene. RET mutations cause Multiple Endocrine Neoplasia type 2 (MEN 2) which increases the r