Thursday, 9 September 2010

EMBL-EBI Small Molecule Bioactivity Course - Feb 2011


Just posted on the EMBL-EBI website are the first details of the Small Molecule Bioactivity Course. More details will follow, and the agenda still needs sketching out, but the dates and guest lecturers are all set. So, if you would like to take part in an introductory-level course on the use of chemogenomics approaches to understanding biology and supporting healthcare research, keep an eye out for more details, or, in case all the places go, take a chance and register now!

On a related note, I would like to highlight the overall good-all-round-goodness and helpfulness of Noel O'Boyle; in honour, I may even go back to using the Irish form of my surname O'Verington.....

The image above will mean little to most, but a lot to the few who watch UK kids TV.

Friday, 3 September 2010

ChEMBL_06 is live


We're pleased to announce the release, a few moments ago, of ChEMBL_06. This contains an additional 29,142 compound records and 138,348 new bioactivities. We've also done quite a lot of compound cleanup: names, research codes (vide infra), and so forth. A variety of database dumps are available from the public ftp site, and the live web database is now connected to ChEMBL_06.

Additional data in this release include the standard literature data, and also the data from the brilliant Genomics of Drug Sensitivity in Cancer project, coordinated by Ultan McDermott at the Sanger Center (interested readers in the oncology area are also pointed to this previous blog post).

2010 New Drug Approval - Pt. XI - Alcaftadine (Lastacaft)







The summer got in the way of a timely post on this new drug. On 28 July 2010, Alcaftadine was approved for the treatment of patients with allergic conjunctivitis as a 0.25% ophthalmic solution.
This allergic reaction is most familiar in patients with hay fever but can also be caused by other allergens such as dust mites, moulds, perfumes etc. It causes red, itchy and watery eyes.
Allergic conjunctivitis is caused by a type I hypersensitivity reaction of the immune system. Antigenic epitopes of the allergen are detected by IgE antibodies which mediate the excessive activation of mast cells and basophils.  The symptoms of allergic conjunctivitis are mainly caused by the release of histamine from these activated immune cells. Histamine increases the permeability of blood vessels and stimulates the activity of immune cells, through a number of differing histamine receptors.

Alcaftadine and its carboxylic acid metabolite (produced via a non-P450 route) are antagonists of the H1 histamine receptor (Uniprot: P35367) and also inhibit histamine release.

Alcaftadine is administered topically as a 0.25% solution. In a pharmacokinetics study, the plasma Cmax of Alcaftadine is 60 pg/mL and occurs after 15 minutes; the plasma Cmax of the active metabolite is 10 pg/mL and occurs after one hour. Plasma protein binding (ppb) for Alcaftadine is 39.2%, and for the carboxylic acid metabolite is 62.7%. The elimination half-life of the metabolite is approximately 2 hours. The aldehyde is an unusual chemical feature in Alcaftadine, since aldehydes are usually quite reactive; as would be expected, this group is readily metabolized to a carboxylic acid.
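To put the quoted half-life into perspective, here is a minimal sketch of how much of the metabolite would remain over time. It assumes simple first-order (exponential) elimination, which the post does not state; the function name and the chosen time points are illustrative only.

```python
def fraction_remaining(hours_elapsed: float, half_life_hours: float) -> float:
    """Fraction of the starting concentration left, assuming first-order elimination."""
    return 0.5 ** (hours_elapsed / half_life_hours)


# Metabolite half-life of approximately 2 hours, as quoted above.
HALF_LIFE_H = 2.0

# Time points chosen purely for illustration.
for t in (2, 4, 8):
    print(f"after {t} h: fraction remaining = {fraction_remaining(t, HALF_LIFE_H):.4f}")
```

Under that assumption, each further half-life halves what is left, so the exposure from a single drop falls away within a working day.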

The full prescribing information is here.

Adverse reactions may include eye irritation, eye redness, nasopharyngitis, headache and influenza.




IUPAC: 11-(1-methylpiperidin-4-ylidene)-5,6-dihydroimidazo[2,3-b][3]benzazepine-3-carbaldehyde
SMILES: CN1CCC(CC1)=C2c3ccccc3CCn4c(C=O)cnc24
InChI: 1S/C19H21N3O/c1-21-9-6-15(7-10-21)18-17-5-3-2-4-14(17)8-11-22-16(13-23)12-20-19(18)22/h2-5,12-13H,6-11H2,1H3
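The molecular formula sits in the second layer of the InChI above, so the molecular weight can be recovered with a few lines of code. This is a rough sketch using rounded standard atomic weights for just the four elements present; the helper names are made up for this example.

```python
import re

# Rounded standard atomic weights, only for the elements needed here.
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def formula_from_inchi(inchi: str) -> str:
    """Return the molecular-formula layer (the text between the first two '/')."""
    return inchi.split("/")[1]


def molecular_weight(formula: str) -> float:
    """Sum atomic weights for a simple formula such as 'C19H21N3O'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return total


ALCAFTADINE_INCHI = (
    "1S/C19H21N3O/c1-21-9-6-15(7-10-21)18-17-5-3-2-4-14(17)8-11-22-"
    "16(13-23)12-20-19(18)22/h2-5,12-13H,6-11H2,1H3"
)

formula = formula_from_inchi(ALCAFTADINE_INCHI)
print(formula, round(molecular_weight(formula), 1))
```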
 

Alcaftadine was developed by the Janssen Research Foundation and will be marketed in the US under the name Lastacaft by Vistakon Pharmaceuticals.

Monday, 30 August 2010

Innovation and Ownership in Drug Discovery by Country (maybe, perhaps, well maybe not then!)

I've been looking at the Research Code data recently, and here is an interesting plot. It is the counts of Research Codes classified by Country. It is a first, look-see plot, based on currently incomplete data, but I think it is quite interesting nonetheless.


A basic assumption behind the assignment of a distinct research code stem is that each stem reflects an autonomous entity with the aim of discovering drugs. Today the majority of newly founded entities will be funded by private/VC money, and these will be acquired by a larger company once some degree of commercial success, or anticipated commercial potential, has been achieved. Our data is a 'blend' of recent and historical data, and over time the structure and scale of research has changed (a smaller number of companies in the distant past, and a larger number from the mid 1990s onwards as many biotechs were established; there will also be differences across countries).

The way we have collected the research codes (773 of them so far) will focus on clinical stage compounds, and therefore the ability of that company and associated infrastructure to move compounds through into clinical development. In our tables the research code has a 'currently controlling company' assigned to it, and this company has a 'country' assigned to it - this is the location of its corporate headquarters, and to a first approximation will record where the controlling rights/IP is now held (ignoring any specific licensing deals that have been done over specific drugs). Of course, the location of the headquarters does not reflect where the work is, or has been historically, done. Many current companies have multiple research codes, for example Pfizer has 32 distinct historical research codes, and this count will correlate with a number of mergers and acquisitions over time; these mergers will sometimes switch 'ownership' from one country to another.

The distribution follows a classic power-law distribution (80:20 rule, or a whole bunch of other similar names) -  specifically, six countries (of 27) cover 86% of research code stems (the USA, Japan, Germany, France, the UK and Switzerland). To my mind there are a few surprises; for example, the relatively high rank of Japan - this may reflect a complex corporate history of mergers, there are certainly few biotechs in Japan producing clinical candidates; but I just don't know yet. Secondly, Sweden seems lower than I would have expected, but this may be down to mergers transferring 'corporate ownership' from one country to another (Astra and Pharmacia). Conversely, Italy seems higher than I would have initially predicted - but maybe I don't know the history of the industry as well as I should.
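For anyone wanting to repeat the "six countries cover 86%" style of calculation on their own copy of the spreadsheet, a minimal sketch follows. The country labels and counts below are HYPOTHETICAL placeholders, not the data behind the plot; only the calculation itself is the point.

```python
from typing import Mapping


def top_k_share(counts: Mapping[str, int], k: int) -> float:
    """Fraction of all research code stems held by the k highest-ranked countries."""
    ranked = sorted(counts.values(), reverse=True)
    return sum(ranked[:k]) / sum(ranked)


# HYPOTHETICAL counts, invented for illustration; not the blog's data.
example_counts = {
    "Country A": 50,
    "Country B": 25,
    "Country C": 15,
    "Country D": 6,
    "Country E": 4,
}

for k in (1, 2, 3):
    print(f"top {k} countries: {top_k_share(example_counts, k):.2f} of all stems")
```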

Another obvious feature is the low current rank of India and China - although a lot of basic research and outsourcing is done in these territories now, very little of this is currently owned and coordinated by companies headquartered there.

I've given up on trying to use google docs for any of this stuff - it is not that stable for me, and so if anyone is interested in the underlying spreadsheet, mail me....

Friday, 27 August 2010

Current GPCR X-ray structures

As part of resurrecting GPCR SARfari from the ashes, we needed to refresh the protein structure content. There are now a surprisingly large number of distinct X-ray structures of family A, rhodopsin-like GPCRs; of course it is never enough, but large nonetheless. There are five distinct proteins (bovine and squid rhodopsin, human beta-2 adrenergic receptor, turkey beta-1 adrenergic receptor and human Adenosine A2A receptor). These are known in a variety of different liganded states, crystal forms and resolutions, and also with differing numbers of distinct chains within crystallographic asymmetric units. So in total, there are 27 distinct X-ray PDB entries and 45 distinct GPCR domain structures.

Here is a table, as of 27th August 2010.

| PDB code | Ch. | Protein | Ligand | Species | Res. (Å) | Date |
|---|---|---|---|---|---|---|
| 1f88 | A | Rhodopsin | retinal | Bos taurus | 2.8 | 4 Aug 2000 |
| 1f88 | B | Rhodopsin | retinal | Bos taurus | 2.8 | 4 Aug 2000 |
| 1gzm | A | Rhodopsin | retinal | Bos taurus | 2.6 | 20 Nov 2003 |
| 1gzm | B | Rhodopsin | retinal | Bos taurus | 2.6 | 20 Nov 2003 |
| 1hzx | A | Rhodopsin | retinal | Bos taurus | 2.8 | 4 Jul 2001 |
| 1hzx | B | Rhodopsin | retinal | Bos taurus | 2.8 | 4 Jul 2001 |
| 1l9h | A | Rhodopsin | retinal | Bos taurus | 2.6 | 15 May 2002 |
| 1l9h | B | Rhodopsin | retinal | Bos taurus | 2.6 | 15 May 2002 |
| 1u19 | A | Rhodopsin | retinal | Bos taurus | 2.2 | 12 Oct 2004 |
| 1u19 | B | Rhodopsin | retinal | Bos taurus | 2.2 | 12 Oct 2004 |
| 2g87 | A | Rhodopsin | retinal | Bos taurus | 2.6 | 2 Mar 2006 |
| 2g87 | B | Rhodopsin | retinal | Bos taurus | 2.6 | 2 Mar 2006 |
| 2hpy | A | Rhodopsin | retinal | Bos taurus | 2.8 | 18 Jul 2006 |
| 2hpy | B | Rhodopsin | retinal | Bos taurus | 2.8 | 18 Jul 2006 |
| 2ped | A | Rhodopsin | retinal | Bos taurus | 2.9 | 2 Apr 2007 |
| 2ped | B | Rhodopsin | retinal | Bos taurus | 2.9 | 2 Apr 2007 |
| 2i36 | A | Rhodopsin | apo | Bos taurus | 4.1 | 17 Oct 2006 |
| 2i36 | B | Rhodopsin | apo | Bos taurus | 4.1 | 17 Oct 2006 |
| 2i36 | C | Rhodopsin | apo | Bos taurus | 4.1 | 17 Oct 2006 |
| 2i37 | A | Rhodopsin | apo | Bos taurus | 4.1 | 17 Oct 2006 |
| 2i37 | B | Rhodopsin | apo | Bos taurus | 4.1 | 17 Oct 2006 |
| 2i37 | C | Rhodopsin | apo | Bos taurus | 4.1 | 17 Oct 2006 |
| 2j4y | A | Rhodopsin | retinal | Bos taurus | 3.4 | 25 Sep 2007 |
| 2j4y | B | Rhodopsin | retinal | Bos taurus | 3.4 | 25 Sep 2007 |
| 3cap | A | Rhodopsin | apo | Bos taurus | 2.9 | 24 Jun 2008 |
| 3cap | B | Rhodopsin | apo | Bos taurus | 2.9 | 24 Jun 2008 |
| 3c9l | A | Rhodopsin | retinal | Bos taurus | 2.6 | 5 Aug 2008 |
| 3c9m | A | Rhodopsin | retinal | Bos taurus | 3.4 | 16 Feb 2008 |
| 3dqb | A | Rhodopsin | apo | Bos taurus | 3.2 | 23 Sep 2008 |
| 2z73 | A | Rhodopsin | retinal | Todarodes pacificus | 2.5 | 13 May 2008 |
| 2z73 | B | Rhodopsin | retinal | Todarodes pacificus | 2.5 | 13 May 2008 |
| 2ziy | A | Rhodopsin | retinal | Todarodes pacificus | 3.7 | 27 Feb 2008 |
| 2r4r | A | beta-2 adrenergic receptor | apo | Homo sapiens | 3.4 | 6 Nov 2007 |
| 2r4s | A | beta-2 adrenergic receptor | apo | Homo sapiens | 3.4 | 6 Nov 2007 |
| 2rh1 | A | beta-2 adrenergic receptor | Carazolol | Homo sapiens | 2.4 | 30 Oct 2007 |
| 3d4s | A | beta-2 adrenergic receptor | Timolol | Homo sapiens | 2.8 | 17 Jun 2008 |
| 3kj6 | A | beta-2 adrenergic receptor | apo | Homo sapiens | 3.4 | 16 Feb 2010 |
| 3ny8 | A | beta-2 adrenergic receptor | ICI-118551 | Homo sapiens | 2.8 | 11 Aug 2010 |
| 3ny9 | A | beta-2 adrenergic receptor | novel analog of ICI-118551 | Homo sapiens | 2.8 | 11 Aug 2010 |
| 3nya | A | beta-2 adrenergic receptor | Alprenolol | Homo sapiens | 3.2 | 11 Aug 2010 |
| 2vt4 | A | beta-1 adrenergic receptor | Cyanopindolol | Meleagris gallopavo | 2.7 | 24 Jun 2008 |
| 2vt4 | B | beta-1 adrenergic receptor | Cyanopindolol | Meleagris gallopavo | 2.7 | 24 Jun 2008 |
| 2vt4 | C | beta-1 adrenergic receptor | Cyanopindolol | Meleagris gallopavo | 2.7 | 24 Jun 2008 |
| 2vt4 | D | beta-1 adrenergic receptor | Cyanopindolol | Meleagris gallopavo | 2.7 | 24 Jun 2008 |
| 3eml | A | Adenosine A2a receptor | ZM-241385 | Homo sapiens | 2.6 | 14 Oct 2008 |
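As a quick sanity check on the totals quoted above, the table can be condensed into a mapping from PDB code to the chains listed for it and then counted. The dictionary below is transcribed by hand from the table; the variable names are just for this sketch.

```python
# PDB code -> chain identifiers listed in the table (one letter per GPCR domain).
CHAINS_BY_ENTRY = {
    "1f88": "AB", "1gzm": "AB", "1hzx": "AB", "1l9h": "AB", "1u19": "AB",
    "2g87": "AB", "2hpy": "AB", "2ped": "AB", "2i36": "ABC", "2i37": "ABC",
    "2j4y": "AB", "3cap": "AB", "3c9l": "A", "3c9m": "A", "3dqb": "A",
    "2z73": "AB", "2ziy": "A",
    "2r4r": "A", "2r4s": "A", "2rh1": "A", "3d4s": "A", "3kj6": "A",
    "3ny8": "A", "3ny9": "A", "3nya": "A",
    "2vt4": "ABCD",
    "3eml": "A",
}

# Distinct PDB entries, and total chains summed over all entries.
n_entries = len(CHAINS_BY_ENTRY)
n_chains = sum(len(chains) for chains in CHAINS_BY_ENTRY.values())

print(f"{n_entries} PDB entries, {n_chains} GPCR domain structures")
```

The counts agree with the 27 entries and 45 domain structures given in the text.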

Monday, 23 August 2010

More Research Code Stems



Many thanks to those of you who have sent in research code stems! I have updated this page, with about another 70, and the full table should shortly be accessible in chembldb.

Thursday, 19 August 2010

SMR Meeting On Epigenetics - 22nd September 2010


Drat! It's almost as if the SMR committee look at my Google calendar and book all their meetings on days when I'm otherwise occupied.....

Anyway, on Wednesday September 22nd 2010 the SMR are holding a meeting on Epigenetics, one of the hottest current areas of disease biology, at the NHLI, further details here.

Druggability assessment

Here are a couple of references for some work on computer-based target assessment we have been involved in.

%T The Molecular Basis of Predicting Druggability
%A Al-Lazikani, B.
%A Gaulton, A.
%A Paolini, G.
%A Lanfear, J.
%A Overington, J.
%A Hopkins, A.
%I Wiley-VCH Verlag GmbH
%O http://dx.doi.org/10.1002/9783527619368.ch36
%O DOI 10.1002/9783527619368.ch36
%P 1315-1334
%B Bioinformatics - From Genomes to Therapies
%E Lengauer, T.
%O ISBN: 978-3-527-31278-8
%D 2007

%T The Molecular Basis of Predicting Druggability
%A Al-Lazikani, B.
%A Gaulton, A.
%A Paolini, G.
%A Lanfear, J.
%A Overington, J.
%A Hopkins, A.
%I Wiley-VCH Verlag GmbH
%O http://dx.doi.org/10.1002/9783527619375.ch14b
%O DOI 10.1002/9783527619375.ch14b
%P 804-823
%B Chemical Biology: From Small Molecules To Systems Biology and Drug Design
%E Schreiber, S.L., Kapoor, T.M., & Wess, G.
%O ISBN: 978-3-527-31150-7
%D 2007

Tuesday, 17 August 2010

2010 New Drug Approval - Pt. X - Ulipristal Acetate (Ella)

The most recent approval by FDA is Ulipristal Acetate, approved on August 13th 2010 under the trade name Ella. Ulipristal Acetate (previously known by the research code CDB-2914 or VA-2914) is a progesterone agonist/antagonist emergency contraceptive, indicated for prevention of pregnancy following unprotected intercourse or known or suspected contraceptive failure.
This drug is a selective progesterone receptor modulator (SPRM) with antagonist and partial agonist effects (a progesterone agonist/antagonist) at the progesterone receptor (PR, NR3C3) (Uniprot code: P06401). The Progesterone Receptor is a member of a very significant family of proteins for drug discovery, the Nuclear Receptors (NRs), a family of around 50 genes encoding transcription factors; transcription by NRs is usually ligand regulated. Ulipristal Acetate prevents progesterone, the endogenous ligand, from occupying its receptor. Ulipristal Acetate binds in the ligand binding domain (LBD) of PR (PFAM: PF00104).

There are several structures known of PR complexed with ligands; a representative one is PDB: 3D90. Ulipristal Acetate will compete with Levonorgestrel, another progestagen available on the market, which is approved for use up to three days post-intercourse, as opposed to five days in the case of Ulipristal Acetate.
Ulipristal Acetate is a small-molecule, natural product-derived drug (Molecular Weight 475.6 g.mol-1) that is Rule-of-Five compliant and is delivered as a tablet. Ulipristal Acetate is highly bound to plasma proteins (>94%), including high density lipoprotein, alpha-1-acid glycoprotein, and albumin. It is metabolized to mono- and di-demethylated metabolites, mostly by CYP3A4; the mono-demethylated metabolite is pharmacologically active. Ulipristal Acetate shows high affinity for the related nuclear receptor, the glucocorticoid receptor (GR, NR3C1). The terminal half-life of Ulipristal Acetate is ca. 32 hours. The recommended dosage is one tablet (30 mg) taken orally, with or without food, as soon as possible, within 120 hours (five days) after unprotected intercourse or a known or suspected contraceptive failure.
The full prescribing information can be found here.
The structure 17alpha-acetoxy-11beta-(4-N,N-dimethylaminophenyl)-19-norpregna-4,9-diene-3,20-dione is a synthetic progestagen and is thus very similar to progesterone. Like other steroid hormones of this class, Ulipristal Acetate is characterized by its basic 21-carbon skeleton, i.e., four interconnected cyclic hydrocarbons with two methyl branches and a ketone. In this particular case, one of the methyl groups is replaced by a substituted aromatic amine.
NAME="Ulipristal Acetate"
TRADEMARK_NAME="Ella"
ATC_code=NA
SMILES="CC(=O)C1(CCC2C1(CC(C3=C4CCC(=O)C=C4CCC23)C5=CC=C(C=C5)N(C)C)C)OC(=O)C"
InChI="InChI=1S/C30H37NO4/c1-18(32)30(35-19(2)33)15-14-27-25-12-8-21-16-23(34)11-13-24(21)28(25)26(17-29(27,30)3)20-6-9-22(10-7-20)31(4)5/h6-7,9-10,16,25-27H,8,11-15,17H2,1-5H3/t25-,26+,27-,29-,30-/m0/s1"
ChemDraw=Ulipristal_Acetate.cdx
The license holder is Laboratoire HRA Pharma.
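The InChI given above packs several kinds of information into slash-separated layers (formula, connectivity, hydrogens, stereochemistry). Here is a small sketch of pulling those layers apart with plain string handling; the function name is made up for this example, and a real cheminformatics toolkit would be the better choice for anything beyond a quick look.

```python
def inchi_layers(inchi: str) -> dict:
    """Split an InChI string into its version, formula and prefixed layers."""
    if inchi.startswith("InChI="):
        inchi = inchi[len("InChI="):]
    parts = inchi.split("/")
    layers = {"version": parts[0], "formula": parts[1]}
    # Remaining layers start with a one-letter prefix:
    # c = connectivity, h = hydrogens, t/m/s = stereochemistry.
    for part in parts[2:]:
        layers[part[0]] = part[1:]
    return layers


ULIPRISTAL_ACETATE_INCHI = (
    "InChI=1S/C30H37NO4/c1-18(32)30(35-19(2)33)15-14-27"
    "-25-12-8-21-16-23(34)11-13-24(21)28(25)26(17-29(27,30)3)"
    "20-6-9-22(10-7-20)31(4)5/h6-7,9-10,16,25-27H,8,11-15,17H2,"
    "1-5H3/t25-,26+,27-,29-,30-/m0/s1"
)

layers = inchi_layers(ULIPRISTAL_ACETATE_INCHI)
stereocentres = layers["t"].split(",")

print(layers["formula"])
print("defined tetrahedral stereocentres:", len(stereocentres))
```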

Monday, 16 August 2010

MGMS Young Modeller Forum Meeting - December 10 2010, London, UK

The MGMS have their annual Young Modellers Forum (YMF) meeting on December 10th 2010 at SOAS in London. Further details are here....

Saturday, 14 August 2010

Research Code to Company Name Mapping


As part of a long-term project connected with literature and web mining, competitor intelligence, and the history of drug development, here is a spreadsheet of research code stems, company name, country of company, and the name of the company when the research code stem was in use. It is incomplete, but still quite substantial, with over 600 stems documented.

Please feel free to download and use the spreadsheet in whichever way you please. If you find any errors, or can provide a longer list (as long as this is from publicly available sources!), that would be fantastic. Any additions will be credited appropriately.

So, Google docs is not currently working for me, but getting the list onto the blog itself, and therefore getting it indexed up in search engines, etc. was not too painful, so here is a link to an HTML table (sorted by company name).
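For anyone wanting to use the table for text mining, here is a minimal sketch of how a research code might be reduced to its stem before looking it up. The pattern (leading letters, an optional hyphen or space, then digits) is an assumption that suits simple codes such as CDB-2914 or HA22; it will not cope with every format in the spreadsheet, and the function names are invented for this example.

```python
import re
from collections import defaultdict

# Assumed shape of a simple research code: letters, optional '-' or ' ', then digits.
STEM_PATTERN = re.compile(r"^([A-Za-z]+)[- ]?\d")


def stem_of(code: str):
    """Return the upper-cased letter stem of a research code, or None if no match."""
    match = STEM_PATTERN.match(code.strip())
    return match.group(1).upper() if match else None


def group_by_stem(codes):
    """Group research codes under their stems, skipping codes that do not match."""
    groups = defaultdict(list)
    for code in codes:
        stem = stem_of(code)
        if stem is not None:
            groups[stem].append(code)
    return dict(groups)


# Research codes mentioned elsewhere on this blog.
codes = ["CDB-2914", "VA-2914", "AVE-5026", "CAM-3001", "CAT-8015", "HA22"]
print(group_by_stem(codes))
```

The resulting stems could then be joined against the stem column of the spreadsheet to recover the company and country.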

Wednesday, 11 August 2010

USAN Watch - August 2010



The August 2010 USANs have just been published; these are:


| USAN | Research code | Drug Type | Drug Class | Target |
|---|---|---|---|---|
| Alvocidib | Flavopiridol, HL-275, HMR-1275, L86-8275, MDL-107826A, NSC-649890 | Synthetic small molecule | therapeutic | CDK inhibitor |
| Danoprevir | R05190591, RG-7227, ITMN-191 | Synthetic small molecule | therapeutic | HCV Proteinase inhibitor |
| Latrepiridine | | Synthetic small molecule | therapeutic | Complex MOA |
| Lunacalcipol | CTA-018 | Natural product-derived | therapeutic | Vitamin D receptor |
| Mavrilimumab | CAM-3001 | mAb | therapeutic | GMCSFr alpha-chain |
| Moxetumomab pasudotox | CAT-8015, HA22 | mAb | therapeutic | CD-22 |
| Semuloparin sodium | AVE-5026 | Oligosaccharide | therapeutic | Antithrombin III |


It may be of interest to note the USANs "Semuloparin" and "Semuloparin Sodium" in July and August this year. These USANs refer to the same active substance (Semuloparin), one being the sodium salt of the other. There are slight differences between the WHO INN process and the USAN process: INNs do not include the salt/counterion in the name, whereas USANs historically have. Now, for USANs, both the salt and the parent molecule are assigned distinct USANs.

Deadline for ESPOD Project on Malaria Target Discovery Is Approaching....


A reminder that the deadline for the application for the EMBL-EBI/Sanger ESPOD fellowships is fast approaching - including that for the exciting Overington/Rayner malaria project - (the final date for applications is August 15th 2010 in fact). So if you are interested, please send in your completed application!

Tuesday, 3 August 2010

ChEMBL Resources for Drug Discovery Course - Feb 2011


We have penned into our diaries the dates of Monday February 14th 2011 through Friday February 18th 2011 for the second ChEMBL residential training course. This will be held on campus here at Hinxton; registration and details of the sessions to be covered will appear on the EBI website shortly.

2010 was the year of our first course, and we had to prepare a lot of material, etc., but we really enjoyed it, and the 2011 course will be even better.

The image above is from the excellent xkcd.

Monday, 2 August 2010

From One Of Our Collaborators - MoSS+ChEMBL with Bioclipse

“Pharmaceutical Knowledge Retrieval through Reasoning of chEMBL RDF” is the title of my master's thesis, a twenty-week research project performed at the Department of Pharmaceutical Bioscience at Uppsala University (Prof. Wikberg, supervised by Egon Willighagen). The project aims at using the ChEMBL data with a technology that might be new to some: semantic web technologies. The life sciences workbench Bioclipse (doi:10.1186/1471-2105-10-397) has support for several semantic web tools, including RDF, and was used to establish such a connection.

Two aspects were looked at in this study. Firstly, we developed the search functionality for ChEMBL data to use RDF. For this, we took advantage of the RDF-ized ChEMBL knowledgebase (using the data from ChEMBL 02). Secondly, we developed a use case where compounds derived from ChEMBL are analyzed with the substructure mining software MoSS (see the Bioclipse Wiki). Here, we search for common and discriminative substructures within or between kinase families.
Within the context of these two aspects, we developed an application using both the JavaScript and the Wizard functionality in Bioclipse. The wizard shown above illustrates how various searches for compound-protein interactions can be formulated. Results are shown in the "Results table". The user can then select which data to save by moving it to the lower table, which lists the data that will be saved by this wizard.

A second, more application-targeted Wizard was developed that primarily concentrates on retrieving compounds that bind proteins in a certain kinase family with a given activity type (see below). A histogram can be opened to visualize the distribution of activities. Lower and upper bound values can be selected to focus, for example, only on the active compounds. A second, identical wizard page is provided to select a second dataset. This allows the user to set up a between-family data set. The saved data can then be used in the MoSS application to find the common and discriminative substructures (not shown).
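The bound selection and histogram described above can be sketched outside Bioclipse in a few lines. This is not the wizard's code: the records, field names and bin edges below are HYPOTHETICAL, chosen only to show the idea of binning activities and keeping those inside a chosen window.

```python
from collections import Counter

# HYPOTHETICAL activity records (values in nM), invented for illustration.
activities = [
    {"compound": "cmpd-1", "value_nM": 3.0},
    {"compound": "cmpd-2", "value_nM": 45.0},
    {"compound": "cmpd-3", "value_nM": 800.0},
    {"compound": "cmpd-4", "value_nM": 12000.0},
]


def within_bounds(records, lower, upper):
    """Keep records whose activity lies in the closed interval [lower, upper]."""
    return [r for r in records if lower <= r["value_nM"] <= upper]


def histogram(records, bin_edges):
    """Count records per half-open bin [edge_i, edge_i+1)."""
    counts = Counter()
    for r in records:
        for low, high in zip(bin_edges, bin_edges[1:]):
            if low <= r["value_nM"] < high:
                counts[(low, high)] += 1
                break
    return counts


print(histogram(activities, [0, 10, 100, 1000, 100000]))
actives = within_bounds(activities, lower=0.0, upper=100.0)
print([r["compound"] for r in actives])
```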


The benefits of this approach centre on data interoperability: RDF technologies provide uniform, Open Standard access to the ChEMBL data. Using this approach, implementing new search queries is very easy, and does not require one to know anything about the database schema; a common controlled vocabulary (ontology) hides those implementation details. Community standards for such vocabularies are under development, and will help integrate the ChEMBL data with other databases and other applications.
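To give a flavour of why no schema knowledge is needed, here is a toy sketch of querying subject-predicate-object triples by pattern. It is neither SPARQL nor the Bioclipse API, and every identifier in it (the `ex:` names and predicates) is HYPOTHETICAL rather than taken from the ChEMBL RDF vocabulary.

```python
# HYPOTHETICAL triples (subject, predicate, object); not the real ChEMBL RDF.
TRIPLES = [
    ("ex:activity1", "ex:forCompound", "ex:compound1"),
    ("ex:activity1", "ex:onTarget", "ex:kinaseA"),
    ("ex:activity2", "ex:forCompound", "ex:compound2"),
    ("ex:activity2", "ex:onTarget", "ex:kinaseB"),
    ("ex:activity3", "ex:forCompound", "ex:compound3"),
    ("ex:activity3", "ex:onTarget", "ex:kinaseA"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])
    ]


def compounds_for_target(triples, target):
    """Join two patterns: activities on a target, then the compounds they describe."""
    on_target = {s for s, _, _ in match(triples, p="ex:onTarget", o=target)}
    return sorted(o for s, _, o in match(triples, p="ex:forCompound") if s in on_target)


print(compounds_for_target(TRIPLES, "ex:kinaseA"))
```

The query only names predicates from the vocabulary; nothing in it depends on how the underlying tables are laid out.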

Does this sound interesting to you, or would you like to give us feedback? Please send a note to annzi.andersson+chembl@gmail.com. Further details are provided in my blog!

 Sincerely, Annsofie Andersson.