We are going to speak in the Data Integration and Knowledge Management track at the Bio-IT World (Europe) meeting to be held in the beautiful city of Hannover, Germany, October 5th to 7th 2009. Should be a good meeting...
The ChEMBL caravan will always have a place in our hearts, but now we have an office, and we must move on from the pain. It has some walls and a door, with a nice hook for jackets and coats. It is nice, bijou even, and has brought a smile to all our faces. Most importantly it gives us a place to entertain guests and visitors, so if anyone is in the area, please pop by and have a cup o' tea and a slice o' cake.
As a sideline, to fund the tea and cakes, we have a nice T-shirt - XXXL only, minimum order 10 pieces if you are interested.
We are going to speak at the 238th ACS National Meeting, Washington, DC, August 16-20, 2009, on "ChEMBL: Large-scale Mapping of Medicinal Chemistry and Pharmacology Data to Genomes". The Abstract for the talk is:
Although the majority of effective therapeutics are small molecules,
there is relatively little readily accessible public domain data
mapping drugs to their molecular targets. When one considers clinical
trial stage, or discovery stage data, the situation deteriorates
further. However, this type of data is essential for Chemical Biology
experiments, and is crucial for informed target selection in drug
discovery. To address this issue, we have built a series of large
scale databases, known as ChEMBL, that map small molecule structures
to their target genes and also their functional effects. This data
also captures a large amount of human and model organism
pharmacological data, systems often used in pre-clinical validation
and safety pharmacology testing. A variet…
I am on holiday today - sort of. Went to Borders for a Starbucks (product placement hopefully pays well), and while queuing for my Orange Mocha Frappuccino!, I caught sight of the O'Reilly books; one stood out from the crowd - Programming Collective Intelligence. It looks a very cool collection of code (Python) implementing a whole variety of data analysis/machine learning techniques and routines to build smarter, more responsive and adaptive web 2.0 applications. Skimming the pages while I had my caffeine speedball led me to spend my cash.
%A Toby Segaran
%T Programming Collective Intelligence
%O ISBN 978-0596529321
No, this isn't a hotel review, despite the picture above; however since finding this image, I now feel compelled to visit Carlsbad, NM to pose under the sign. Now for the post itself; find below some screenshots for an internal interface for StARlite data developed by our close collaborators at the Institute of Cancer Research in Sutton. This interface for StARlite shows some basic workflow themes that give some ideas as to the potential uses of StARlite 'straight out of the tin'. Several of the views will be incorporated into the EMBL-EBI public web interface ;) Bissan's group is developing an integrated system for cancer chemogenomics, called canSAR.
Compound Searching: What would an SAR database be without a compound sketcher and search mechanism? Well, here is one, implemented with Marvin and the Dotmatics Pinpoint cartridge.
Compound Browsing: A "Top Trumps" view on compounds is quite a useful paradigm for browsing and selecting compounds for further an…
This book was published in 1994 in recognition of the huge influence of Motoo Kimura on theoretical studies of molecular evolution. It is a collection of papers and essays published by Kimura over the period 1955 to 1986. The writing is truly beautiful; the prose, pace and clarity of the text humble me as a supposed native English speaker (as this blog so clearly shows!). If you're not interested in the science at all, just buy it for the masterclass in technical writing inside.
The theme of the book is The Neutral Theory, quite a contentious issue in evolution (essentially, this states that the vast majority of observed mutations at a molecular level are not adaptive; now flame me!). This book changed the way I thought about mutation, protein sequence, structure and function. Forever.
%T Population Genetics, Molecular Evolution, and The Neutral Theory
%A Motoo Kimura
%E J.F. Crow
%O ISBN: 0-226-43562-8
I came across a nice tabular summary of some existing 'public' 'primary' protein-ligand interaction databases that primarily focus on protein-ligand affinity data (so Ki, Kd, IC50, EC50, etc.), that I have reproduced below (many thanks to Helena Strömbergsson, from Uppsala University for the data).
Name          Target Class Focus    Approx size
BindingDB     All                   ~48,000
PDSP          Receptors             ~47,000
BRENDA        Enzymes               ~19,000
BindingMOAD   All                   ~3,500
PDBBind       All                   ~3,500
AffinDB       All                   ~700
PLD           All                   ~500
The comparable number from StARlite (31) is 507,645 (of which 186,370 are better than 100nM) for affinity class end-points. Oh, and we have started a new load...
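To put those numbers side by side, here is a rough Python sketch that works out how many times larger the StARlite count is than each of the sizes quoted above, and what fraction of the StARlite end-points are better than 100nM. The figures are copied straight from this post; the variable and function names are made up for illustration, and the comparison assumes the "approx size" values are counted in roughly the same units as the StARlite figure, which may not be strictly true.

```python
# Approximate sizes as quoted in this post (all values are rounded estimates).
DATABASE_SIZES = {
    "BindingDB": 48_000,
    "PDSP": 47_000,
    "BRENDA": 19_000,
    "BindingMOAD": 3_500,
    "PDBBind": 3_500,
    "AffinDB": 700,
    "PLD": 500,
}

# StARlite figures quoted in this post for affinity class end-points.
STARLITE_TOTAL = 507_645
STARLITE_SUB_100NM = 186_370


def fold_difference(reference: int, other: int) -> float:
    """Return how many times larger `reference` is than `other`."""
    return reference / other


def percent_sub_100nm() -> float:
    """Percentage of StARlite affinity end-points better than 100nM."""
    return 100.0 * STARLITE_SUB_100NM / STARLITE_TOTAL


if __name__ == "__main__":
    for name, size in DATABASE_SIZES.items():
        fold = fold_difference(STARLITE_TOTAL, size)
        print(f"{name:<12} ~{size:>6,}  StARlite is ~{fold:,.1f}x larger")
    print(f"Better than 100nM: {percent_sub_100nm():.1f}% of StARlite end-points")
```

Since the public database sizes are only approximate, the fold differences are best read as orders of magnitude rather than exact ratios.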
The next web-meeting for a walkthrough of the StARlite schema, data model assumptions and sample queries will be at 11am to noon local UK time (so at this time of year GMT/UT) on Friday 20th March. If you wish to take part in this meeting please use this link (do not modify the header of the email in any way!).
The last time we tried a web meeting, my domestic broadband connection could not cope with audio and the slides at the same time, so you will need to dial into a UK land line number; unfortunately, this will not be a freephone number.
Finally, if you can't make this time, we will set up a similar meeting in another few weeks or so.
I have provisionally planned the ChEMBL group retreat for 2009. It will be in Crieff, Scotland, and will be in late September. The format will allow detailed discussion and brainstorming of ideas for the ChEMBL project, and will be themed around the following areas.
Open-Source Drug Discovery and Open Science.
Patent data-mining and indexing.
Auto-curation and in-line predictive model generation.
An Ontology for drug discovery screening cascades.
A web-services primer.
The mornings and evenings will be informal discussions of science, while the afternoons will be fun, fungi, flora and photography (who said alliteration is dead?) oriented walks in the wooded areas around Crieff. The picture above is of a reasonably rare parasite of truffles (a Cordyceps sp.) found in Crieff around the same time of year in 2008. This is the fruiting stage of the fungus (the teleomorph); the non-fruiting body stage (the anamorph) of a closely related fungus is the source of the powerful immunosuppresi…
Just downloaded and synced up Papers for the iPhone from mekentosj. What a great little app: beautiful interface, very snappy performance, and it lets you carry a whole bunch of literature in your pocket, and search and download from your handheld too. What more could a hipster mobile scientist want? (Apart from good 3G coverage, 64GB of memory, free Wi-Fi everywhere, and free journal access, of course).