ChEMBL Resources


Wednesday, 29 June 2011

New Drug Approvals 2011 - Pt. XX azficel-T (laViv™)

On June 21st 2011, the FDA approved azficel-T (trade name: laViv) for the aesthetic treatment of moderate to severe nasolabial fold wrinkles ("smile lines") in adults. laViv is an autologous cell therapeutic consisting of fibroblasts (cells which can produce collagen), which are expanded in vitro from a biopsy of post-auricular tissue and re-injected into the nasolabial folds to improve their cosmetic appearance. With increasing age, nasolabial folds can become more pronounced, owing to habitual facial expressions (e.g. laughing) and a decline in collagen production.

Alternative non-drug treatments include liposuction and facelift. After biopsy, dermal fibroblasts are expanded using standard tissue-culture procedures until a sufficient number of cells for re-injection is obtained. This process takes 11-22 weeks. laViv is provided in two vials of approximately 18 million fibroblasts in 1.2 mL suspension and should be administered in three sessions at 3-6 week intervals by injection at 0.1 mL per linear centimeter of nasolabial fold wrinkle. The mechanism by which laViv works is not known.

In two clinical trials, the efficacy of azficel-T was evaluated by subjects on a five-point Subject Wrinkle Assessment scale and by physicians on a six-point Evaluator Wrinkle Severity Assessment scale; treatment was considered successful if a two-point improvement over the pre-treatment score was achieved. In both studies, the Subject Wrinkle Assessment response for the group using laViv (57%/45%) was significantly better than for the vehicle control group (30%/18%); likewise, the physician assessment indicated an improvement of the condition using laViv (33%/19%, compared with 7% for both control groups). To avoid immune reactions, the identity of donor and recipient has to be assured.

Common adverse reactions are injection-site reactions such as redness, bruising, swelling, and pain. Pediatric safety and efficacy have not been established; the clinical studies lacked sufficient numbers of subjects in geriatric or non-White populations. Efficacy of the product beyond six months has not been established. laViv has been developed by Fibrocell Technologies.

The product website can be found here, full prescribing information, here.

Paper: PSICQUIC and PSISCORE: accessing and scoring molecular interactions

To study proteins in the context of a cellular system, it is essential that the molecules with which a protein interacts are identified and the functional consequence of each interaction is understood. A plethora of resources now exist to capture molecular interaction data from the many laboratories generating such information, but whereas such databases are rich in information, the sheer number and variability of such databases constitutes a substantial challenge in both data access and quality assessment to the researchers interested in a specific biological domain.

The paper is available here, and here is the PSICQUIC registry.

%T PSICQUIC and PSISCORE: accessing and scoring molecular interactions
%A B. Aranda
%A H. Blankenburg  
%A S. Kerrien
%A F.S.L. Brinkman  
%A A. Ceol  
%A E. Chautard  
%A J.M. Dana  
%A J. De Las Rivas  
%A M. Dumousseau  
%A E. Galeota
%A A. Gaulton
%A J. Goll
%A R.E.W. Hancock  
%A R. Isserlin
%A R.C. Jimenez  
%A J. Kerssemakers  
%A J. Khadake
%A D.J. Lynn  
%A M. Michaut  
%A G. O'Kelly
%A K. Ono  
%A S. Orchard  
%A C. Prieto 
%A S. Razick  
%A O. Rigina  
%A L. Salwinski  
%A M. Simonovic  
%A S. Velankar  
%A A. Winter
%A G. Wu  
%A G.D. Bader  
%A G. Cesareni  
%A I.M. Donaldson  
%A D. Eisenberg  
%A G.J. Kleywegt 
%A J. Overington  
%A S. Ricard-Blum  
%A M. Tyers
%A M. Albrecht
%A H. Hermjakob
%J Nature Methods 
%V 8
%P 528–529
%D 2011
%O doi:10.1038/nmeth.1637

Monday, 27 June 2011

New Drug Approvals 2011 - Pt. XIX Belatacept (Nulojix™)

ATC code: L04AA28

On June 15th 2011, the FDA approved Belatacept (trade name: Nulojix; Research Code: BMS-224818), a selective T-cell (lymphocyte) costimulation blocker indicated for prophylaxis of organ rejection in adult patients receiving a kidney transplant. Belatacept is approved for use in combination with other immunosuppressants, specifically basiliximab, mycophenolate mofetil and corticosteroids.

Belatacept is a potent antagonist that inhibits T-lymphocyte activation by binding to the B7-ligands, namely CD80 (Uniprot: P33681; Pfam: PF08205, PF07686) and CD86 (Uniprot: P42081; Pfam: PF07686), present on antigen-presenting cells, and thereby blocking their interaction with CD28 (Uniprot: P10747; Pfam: PF07686), the receptor of these two ligands. This interaction provides a costimulatory signal necessary for full activation of T-lymphocytes. Activated T-cells are the predominant mediators of immunologic rejection. In vitro, Belatacept inhibits T-cell proliferation and the production of the cytokines interleukin-2, interferon-γ, interleukin-4 and TNF-α.

There are some protein structures known for the B7-ligands, CD80 and CD86. Here are two typical entries for CD80 (PDBe:1i8l) and CD86 (PDBe:1i85) in complex with CTLA-4.
Belatacept is derived from Abatacept (trade name: Orencia; approved in 2005 for the treatment of rheumatoid arthritis, ChEMBLID: CHEMBL1201823), a soluble fusion protein that consists of the extracellular domain of the human cytotoxic T-lymphocyte antigen-4 (CTLA-4; Uniprot: P16410; Pfam: PF07686), linked to a modified Fc (hinge-CH2-CH3 domains) portion of human immunoglobulin G1 (CTLA4-Ig). CTLA-4 is similar to the T-cell costimulatory protein CD28, and both molecules bind to CD80 and CD86 on antigen-presenting cells. However, CTLA-4 transmits an inhibitory signal to T-cells, whereas CD28 transmits a stimulatory signal. Although Abatacept binds to the B7-ligands with higher affinity than CD28 does, it never reached the market as an organ transplantation therapy because it does not block the costimulation pathway completely and equally (its affinity for CD86 is 100-fold lower than for CD80). Given this, Belatacept was developed by altering two amino acids in the B7-ligand-binding portion of the Abatacept molecule (a leucine and an alanine were replaced by a glutamic acid and a tyrosine, respectively). These modifications resulted in a 4-fold increase in binding affinity for CD86 and a 2-fold increase in binding affinity for CD80 in comparison to Abatacept. In vitro, this increase in binding affinity for the B7-ligands has also been shown to result in a 10-fold increase in the inhibition of T-cell activation when compared with Abatacept.

Other immunosuppressive therapies to treat transplant rejection are available on the market and these include calcineurin inhibitors, such as Tacrolimus (ChEMBLID: CHEMBL1237096), mTOR inhibitors, such as Everolimus (ChEMBLID: CHEMBL1201755), anti-proliferatives, such as Mycophenolic acid (ChEMBLID: CHEMBL866), corticosteroids, such as Hydrocortisone (ChEMBLID: CHEMBL389621) and antibodies, such as Basiliximab (ChEMBLID: CHEMBL1201439) and Rituximab (ChEMBLID: CHEMBL1201576).

The recommended dosage of Belatacept is a 10 mg/kg intravenous infusion on days 1 (day of transplantation) and 5, and at the end of weeks 2, 4, 8, and 12 after transplantation in the initial phase, followed by a maintenance phase of 5 mg/kg at the end of week 16 after transplantation and every 4 weeks thereafter. The molecular weight of Belatacept is approximately 90 kDa. After a 10 mg/kg intravenous infusion at week 12, Belatacept has a volume of distribution (Vd) of 0.11 L/kg, a systemic clearance (CL) of 0.49 mL/h/kg and a terminal half-life (t1/2) of 9.8 days. The full prescribing information can be found here.
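The dosing schedule described above can be laid out as a short Python sketch, purely as an illustration of the arithmetic. The function name, the example body weight of 70 kg, and the mapping of "end of week N" to day 7 × N are assumptions made here, not taken from the prescribing information.

```python
def belatacept_schedule(weight_kg, maintenance_doses=3):
    """Return (day, dose_mg) pairs for the schedule described in the post.

    Assumption: "end of week N" is mapped to day 7 * N after transplantation.
    """
    # Initial phase: 10 mg/kg on days 1 and 5, then end of weeks 2, 4, 8, 12.
    initial_days = [1, 5] + [7 * week for week in (2, 4, 8, 12)]
    schedule = [(day, 10.0 * weight_kg) for day in initial_days]

    # Maintenance phase: 5 mg/kg at the end of week 16, then every 4 weeks.
    first_maintenance_day = 7 * 16
    for i in range(maintenance_doses):
        schedule.append((first_maintenance_day + 28 * i, 5.0 * weight_kg))
    return schedule


# Hypothetical 70 kg patient, first three maintenance doses only.
for day, dose_mg in belatacept_schedule(70):
    print(f"day {day:3d}: {dose_mg:.0f} mg")
```

For the hypothetical 70 kg patient this gives 700 mg per infusion in the initial phase and 350 mg per infusion in the maintenance phase.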

The license holder is Bristol-Myers Squibb Company and the product website is

Tuesday, 21 June 2011

New Drug Approvals 2011 - Pt. XVIII Ezogabine (Potiga™)

ATC code: N03AX21

On June 10th, the FDA approved ezogabine (trade name Potiga, NDA 022345) to treat seizures associated with epilepsy in adults. However, Potiga still awaits categorisation by the Drug Enforcement Administration (for review under the Controlled Substances Act) before formal marketing can proceed.

Epilepsy is a chronic neurological disorder involving a variety of symptoms caused by abnormal electrical activity in the brain. Episodic bouts ('seizures') can potentially be controlled by medication - however, for around 1 in 3 patients, this cannot be achieved satisfactorily with current medication. Ezogabine (ChEMBLID:41355) represents a novel approach, being the first anticonvulsant to specifically target neuronal potassium channels.

The molecular targets of ezogabine are KCNQ/Kv7 potassium channels; by stabilizing their open conformation, the drug reduces neuronal excitability. It shares its mode of action with the structurally very similar non-opioid analgesic Flupirtine (ChEMBLID:255044). There are numerous other approved anticonvulsant drugs, such as Carbamazepine (ChEMBLID:108) and Lamotrigine (ChEMBLID:741), two sodium channel blockers.

Its name stem, -gab-, designates it a GABA mimetic (γ-Aminobutyric acid, ChEMBL ID 96, the predominant inhibitory neurotransmitter in the mammalian central nervous system). For a substance to be GABAergic, it need not directly compete with GABA or bind to the GABA receptor. However, there is evidence that ezogabine directly interacts with the GABAA receptor, acting as an allosteric agonist that synergistically increases GABA binding, thereby exerting a sedative effect in addition to its action on its primary target, KCNQ.

The main molecular targets of ezogabine are the human KCNQ2 and KCNQ3 potassium channels (UniProt O43526 and O43525, respectively) - according to a patch clamp assay, it has 1.3 µM affinity for the murine KCNQ2 ortholog (see also ref). There are no experimental structures available for members of the KCNQ protein family, although there are X-ray structures for other potassium channels.

Ezogabine (canonical smiles CCOC(=O)Nc1ccc(NCc2ccc(F)cc2)cc1N , standard InChI InChI=1S/C16H18FN3O2/c1-2-22-16(21)20-15-8-7-13(9-14(15)18)19-10-11-3-5-12(17)6-4-11/h3-9,19H,2,10,18H2,1H3,(H,20,21)) has 6 rotatable bonds, a molecular weight of 303.3 Da, 3 hydrogen bond donors, 2 hydrogen bond acceptors, and is thus fully Rule-of-Five compliant.
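As a quick cross-check of the quoted molecular weight, the molecular formula can be read from the formula layer of the standard InChI above and summed using standard atomic weights. This is a minimal sketch; the helper names are made up for illustration and the weight table covers only the elements that occur in ezogabine.

```python
import re

# Standard atomic weights (g/mol), rounded to three decimals.
# Only the elements needed for this example are listed.
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "F": 18.998}

EZOGABINE_INCHI = (
    "InChI=1S/C16H18FN3O2/c1-2-22-16(21)20-15-8-7-13(9-14(15)18)"
    "19-10-11-3-5-12(17)6-4-11/h3-9,19H,2,10,18H2,1H3,(H,20,21)"
)


def formula_from_inchi(inchi):
    """Return the formula layer: the text between the first and second '/'."""
    return inchi.split("/")[1]


def molecular_weight(formula):
    """Sum atomic weights for a simple formula such as 'C16H18FN3O2'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return total


formula = formula_from_inchi(EZOGABINE_INCHI)
print(formula, round(molecular_weight(formula), 1))
```

The result agrees with the 303.3 Da given in the text.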

Ezogabine has moderately high bioavailability (50-60%), a high volume of distribution (6.2 L/kg) and a terminal half-life of 8 to 11 hours. Potiga tablets are administered three times daily. Ezogabine has a number of potentially severe adverse effects, such as urinary retention, and psychiatric symptoms such as new or intensified depression, anxiety, psychosis, and in rare cases suicidal thoughts.

Potiga has been developed by Valeant and will be marketed by GSK.

Full prescribing information will become available at launch of the drug.

Annotation of ChEMBL with compound availability data

We are starting to plan a few things, and one of these is to provide links through to the sources of physically available compounds for ChEMBL. To help us, here's a few questions - I tried to set up an online poll, but lost the will to live with all the spam on polls that is out there.

Here are the questions:

  • Is integration of available compounds in ChEMBL a good idea?
  • Should we integrate available compounds via current informatics resources (e.g. ZINC, ChemSpider)?
  • Should we set up a small set of available compounds from actual suppliers (e.g. NCGC, MolPort, Prestwick, ChemDiv, Tocris, etc.)? If so, which suppliers should we use?

If you want to contribute, feel free to mail us if you have any specific ideas, or can help us out on this.

Thursday, 16 June 2011

Prerelease of Kinase SARfari 4.0

Not Moon Safari but Kinase SARfari! There are a lot of changes to the interface and integration of data, and also oodles more data (104% more) contained in the latest release. These include:

  • Unified SARREGNOs to CHEMBL IDs (also Assay & Doc ids).
  • Updated assays, activities and compounds from ChEMBL_10.
  • Added 30 non-human kinase domains.
  • Calculated site similarity distances and neighbourhood density (ND) scores between all kinase domains.
  • Added Drug Icons into the interface.
  • Added links to ChEMBL Target/Doc/Assay Report Cards.
  • Renamed old sources (drugstore & candistore).
  • Contents: Kinase domains: 989, Kinase bioactivity datapoints: 435,873, Kinase compounds: 51,090.
For the time being, the new version can be found on our dev site. As always, we would appreciate any bug reports, feedback, etc.

Tuesday, 14 June 2011

Recruitment: Group Leaders, EMBL, Heidelberg

There are two group leader positions currently listed at the recruitment pages for the Heidelberg site for EMBL. Due to the goofy web recruitment system we have, I can't give a link to the jobs themselves, but you should be able to find them from here.

Areas of interest for one of these posts include:

  • structural bioinformatics (e.g. modeling of protein complexes and their interactions and/or dynamics in a cellular context).
  • image analysis/visualization (e.g. reading out data from GFP screens, E-tomograms or visualizing a virtual cell atlas).
  • cheminformatics (e.g. chemical-protein-network analysis).
  • systems bioinformatics (e.g. tissue modeling, analysis of network perturbations).
  • transcriptional regulation/epigenetics (e.g. chromatin modification analysis).

The closing deadline is June 19th 2011 - so real soon!!

Sunday, 12 June 2011

Recruitment: New Approaches to the Treatment of Cardiovascular Disease

We are involved in a fascinating collaboration with Prof. Aroon Hingorani from the Clinical Epidemiology Dept of UCL Division of Medicine - working with clinical data (phenotype and GWAS) to identify new approaches to the treatment of cardiovascular disease (drug reuse, patient stratification, etc.). Further details of the position are here. Closing date is June 24th 2011.

Cool passport eh?

Bioinformatics Training Course KDMC11, July 12th - 15th, 2011, Portugal

There is an interesting course being run in Oeiras, Portugal in July.

About one hundred million different chemical compounds have already been synthesized. The number of theoretically possible organic molecules exceeds the number of atoms in the universe. This raises a number of questions, including:
  • Where can one find information about chemical structures and their properties?
  • How can one efficiently retrieve such information?
  • Which molecules, if synthesized, could potentially assist the fight against certain types of cancer?
  • Why is it that some pharmacological targets are considered more promising for the prevention or treatment of Alzheimer's disease than others?
  • Are there ways to better predict ADME(T) properties of synthesized molecules?
  • Why does such a significant proportion of launched drugs originate from structures found in natural products?
  • Why can multi-target pharmacological agents be superior to single-targeted ones?
  • Can new medical applications be found for old drugs?
  • Why is the alliance of chemo- and bioinformatics beneficial to the life sciences, biotech and pharma industries?
  • What are the major challenges facing chemoinformatics now?
During this course participants will learn how to efficiently find answers to these and many other related questions. Attendees will be instructed in the use of the relevant databases and associated software to:
  • Represent compounds and (bio)chemical reactions using chemical information in a computer.
  • Search for information about chemical structures and their properties in public and commercially available databases.
  • Perform similarity searches with an understanding of the advantages and disadvantages of the various methods.
  • Prepare data sets for further (Q)SAR/(Q)SPR analysis, estimating the quality and completeness of the data.
  • Create and validate (Q)SAR/(Q)SPR models for finding and optimization of lead compounds.
  • Use the above techniques for virtual screening and design of chemical compounds with the required properties.
Target audience
Researchers working in life sciences, professionals in the pharmaceutical and biotech industries: organic, medicinal, pharmaceutical chemists, biochemists, molecular biologists, pharmacologists, toxicologists, and others.

Information on all GTPB courses can be found at

Thursday, 9 June 2011

ChEMBL 10 Released

We are pleased to announce the release of ChEMBL_10. This latest version of the ChEMBL database contains:
  • 1,118,566 compound records
  • 1,000,468 distinct compounds
  • 534,391 assays
  • 4,668,202 bioactivities
  • 8,372 targets
  • 40,624 documents
  • 6 data sources
This release of the ChEMBL database contains a subset of the data from the PubChem BioAssay database. Specifically, we have included dose-response endpoints (e.g., IC50, Ki, Potency) from confirmatory assays in PubChem - the aim of this is to integrate data that is comparable to the type and class of data contained within ChEMBL. This subset contains:
  • 333,864 compound records (PubChem Substance entries)
  • 794 assays (PubChem BioAssay assays)
  • 1,473,189 bioactivities (IC50 etc. measurements)
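To put the PubChem subset in proportion, the short sketch below divides each subset figure by the corresponding ChEMBL_10 total. It assumes, as the wording above suggests, that the subset counts are already included in the release totals; the dictionary and function names are illustrative only.

```python
# Figures copied from the ChEMBL_10 release notes above.
CHEMBL_10 = {
    "compound_records": 1_118_566,
    "assays": 534_391,
    "bioactivities": 4_668_202,
}
PUBCHEM_SUBSET = {
    "compound_records": 333_864,
    "assays": 794,
    "bioactivities": 1_473_189,
}


def pubchem_share(total, subset):
    """Fraction of each release total contributed by the PubChem subset."""
    return {key: subset[key] / total[key] for key in subset}


shares = pubchem_share(CHEMBL_10, PUBCHEM_SUBSET)
for key, fraction in shares.items():
    print(f"{key}: {fraction:.2%}")
```

Under that assumption, a very small fraction of the assays accounts for roughly a third of the bioactivities, which is why the 'Activity Source Filter' mentioned below can change result sets substantially.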
You can access the data via the ChEMBL database interface:

Changes to the interface include:
  • 'Activity Source Filter' link has been added to the main search bar to allow users to include/exclude activity sources (e.g. PubChem BioAssay, Literature, ...) in the current working session
You can download the data from the ChEMBL FTP site:

All other ChEMBL resources (e.g. Web Services) are also now connected to ChEMBL_10.

Saturday, 4 June 2011

Safari plugin for ChEMBL data

For those of you who use the Safari browser as part of your normal work, Matt Swain has written a plugin that you might find useful. You simply highlight words within a page, right-click, and then a box pops up with the option to search ChEMBL for that chemical structure (using the selected text as the query). Matt also wrote similar plugins for ChemSpider, PubChem and OPSIN. These, and more, are available here. Thanks Matt!

Thursday, 2 June 2011

Interested in Being at the Forefront of Text Mining for Drug Discovery Competitive Intelligence?

Well, so are we! If you are interested in applying for a postdoctoral fellowship in our group, we would welcome you contacting us for further details. The position would apply text mining approaches across the broad internet to 'discover' interesting disclosures of compound structures, progression through key clinical development or regulatory milestones, evidence of therapeutic efficacy for a compound/mechanism class, etc. This data would be placed into the public domain and integrated with other relevant data sources, including ChEMBL. You will need to have prior experience of entity recognition within free text, great scripting skills in a language such as Perl or Python, web analytics, and internet search engines (but hey, who doesn't know how to use Google?). Additionally, experience of chemical structure extraction (image and/or text-based), ontology use/building and data integration and mining skills would be greatly beneficial.

The fellowship is intended for people wishing to pursue an independent research career, and significant autonomy and responsibility is available.

So, if you are interested, contact us.

Wednesday, 1 June 2011

New Drug Approvals 2011 - Pt. XVII Fidaxomicin (Dificid™)

ATC code (partial): A07A

On May 27th 2011, the FDA approved Fidaxomicin (Tradename: Dificid; Research Code: PAR-101, OPT-80, NDA 201699), a narrow-spectrum macrolide antibacterial drug indicated for the treatment of Clostridium difficile-associated diarrhea (CDAD) in adults. Clostridium difficile (C. difficile) is an anaerobic, spore-forming Gram-positive bacterium, and overgrowth of this species can cause severe diarrhea and other more serious intestinal conditions, such as colitis.

Fidaxomicin is a fermentation product obtained from the Actinomycete Dactylosporangium aurantiacum. It exerts its therapeutic effect by inhibiting the beta subunit of the bacterial enzyme DNA-directed RNA polymerase (RNAP) (UniProt:Q890N5), resulting in the death of C. difficile. Bacterial RNA polymerase is a large (~400 kDa) five-subunit protein, and is the target of the already approved antibiotic rifampicin. Other treatments for CDAD already on the market include antibiotics such as Metronidazole (trade name Flagyl; ChEMBLID: CHEMBL137) and Vancomycin (ChEMBLID: CHEMBL262777). Patients generally respond to these antibiotic therapies; however, there is a risk of recurrent infection associated with these treatments. Fidaxomicin has been shown to be more active in vitro than Vancomycin against C. difficile (minimum inhibitory concentration (MIC) of 0.12 µg/mL and 1.0 µg/mL, respectively) and also more selective, having limited activity in vitro and in vivo against components of the normal gut flora.

There are several known structures of bacterial RNA polymerases in complex with various antibiotics; a typical example is the structure of the Thermus aquaticus RNA polymerase in complex with sorangicin (PDBe:1ynn).

The recommended dose of Fidaxomicin is one 200 mg tablet twice daily for 10 days (equivalent to a daily dose of 380 µmol). At therapeutic doses, Fidaxomicin has minimal systemic absorption, with plasma concentrations of Fidaxomicin and OP-1118, its main and microbiologically active metabolite, in the ng/mL range. The mean terminal half-life (T1/2) of Fidaxomicin and OP-1118 is 11.7 and 11.2 hours, respectively. Fidaxomicin is primarily transformed by hydrolysis at the isobutyryl ester to form OP-1118. Metabolism of Fidaxomicin and formation of OP-1118 are not dependent on cytochrome P450 (CYP) enzymes. Fidaxomicin is mainly excreted in feces, with 92% of the dose recovered as either Fidaxomicin or OP-1118.
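The molar equivalent quoted for the daily dose can be reproduced from the tablet strength and the molecular weight of 1058 Da given for Fidaxomicin in this post. A minimal sketch, with a function name chosen here for illustration:

```python
def daily_dose_micromol(tablet_mg, tablets_per_day, mol_weight_g_per_mol):
    """Convert a daily tablet dose to micromoles."""
    daily_mg = tablet_mg * tablets_per_day
    # mg divided by g/mol gives mmol; multiply by 1000 to obtain µmol.
    return daily_mg / mol_weight_g_per_mol * 1000.0


# One 200 mg tablet twice daily, molecular weight 1058 Da.
dose = daily_dose_micromol(200, 2, 1058)
print(f"{dose:.0f} µmol per day")
```

The value comes out just under 380 µmol, consistent with the rounded figure in the text.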

Fidaxomicin (IUPAC: [(2R,3S,4S,5S,6R)-6-[[(3E,5E,8S,9Z,11S,12R,13E,15E,18S)-12-[(2R,3S,4R,5S)-3,4-dihydroxy-6,6-dimethyl-5-(2-methylpropanoyloxy)oxan-2-yl]oxy-11-ethyl-8-hydroxy-18-[(1R)-1-hydroxyethyl]-9,13,15-trimethyl-2-oxo-1-oxacyclooctadeca-3,5,9,13,15-pentaen-3-yl]methoxy]-4-hydroxy-5-methoxy-2-methyloxan-3-yl]3,5-dichloro-2-ethyl-4,6-dihydroxybenzoate; SMILES: CC[C@H]1\C=C(/C)\[C@@H](O)C\C=C\C=C(/CO[C@H]2
[C@H](O)[C@@H]4O)[C@H](C)O; ChEMBL: CHEMBL485861; PubChem: 46174142) has a molecular weight of 1058 Da, an ALogP of 7.7, seven hydrogen bond donors and 18 acceptors, and thus is not Rule-of-Five compliant. A notable feature is the 18-membered polyene macrolide ring.
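The Rule-of-Five assessment can be sketched as a simple threshold check on the descriptors quoted in these posts, using the usual cut-offs (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen bond donors, at most 10 acceptors). The function below is an illustration rather than the way ChEMBL computes compliance; logP is optional because no value is quoted for ezogabine in its post.

```python
from typing import List, Optional


def rule_of_five_violations(mol_weight: float, h_donors: int, h_acceptors: int,
                            logp: Optional[float] = None) -> List[str]:
    """List the Rule-of-Five criteria that the given descriptors violate.

    If logp is None, the logP criterion is simply not evaluated.
    """
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if logp is not None and logp > 5:
        violations.append("logP > 5")
    if h_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if h_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    return violations


# Descriptor values as quoted in the posts.
fidaxomicin = rule_of_five_violations(1058, 7, 18, logp=7.7)
ezogabine = rule_of_five_violations(303.3, 3, 2)  # no logP quoted

print("Fidaxomicin:", fidaxomicin)
print("Ezogabine:", ezogabine)
```

With these inputs Fidaxomicin exceeds every threshold, while ezogabine passes all the criteria for which the post gives a value.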

The full prescribing information can be found here.

The license holder for Fidaxomicin is Optimer Pharmaceuticals, Inc. and the product website is