Posts

Showing posts from July, 2015

Paper: Activity, assay and target data curation and quality in the ChEMBL database

We've just published an Open Access paper in the Journal of Computer-Aided Molecular Design on the curation of bioactivity, assay and target data in ChEMBL, including current practices and future plans. 
Here is the abstract:
The emergence of a number of publicly available bioactivity databases, such as ChEMBL, PubChem BioAssay and BindingDB, has raised awareness about the topics of data curation, quality and integrity. Here we provide an overview and discussion of the current and future approaches to activity, assay and target data curation of the ChEMBL database. This curation process involves several manual and automated steps and aims to: (1) maximise data accessibility and comparability; (2) improve data integrity and flag outliers, ambiguities and potential errors; and (3) add further curated annotations and mappings thus increasing the usefulness and accuracy of the ChEMBL data for all users and modellers in particular. Issues related to activity, assay and target data curati…

ChEMBL python client update

Along with updating the ChEMBL web services to the new 2.x version, we've also updated the Python client library (chembl_webresource_client). The change was backwards compatible, so it's possible that existing users haven't even noticed it.

As we've already provided examples of using the new web services via cURL or the live docs, now is a good time to explain the changes made to the Python client.

First of all, if you haven't installed (or updated) it yet, you can do so from the Python Package Index, for example by running pip install -U chembl_webresource_client.

Now you can access the new functionality by importing the new_client object from the package.

Just as a mild warning: in the 0.8.x versions of the client the new part is called new_client. In 0.9.x it will be renamed to client, and the old part will be renamed to old_client and deprecated. In 1.0.x the old functionality will be removed completely.
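The naming plan above can be summarised in a few lines of Python. This is only an illustrative sketch of the schedule as described in this post: the helper names new_api_name and old_api_status are hypothetical and are not part of chembl_webresource_client, and the assumption that the new part keeps the name client from 1.0.x onwards is ours.

```python
def _major_minor(version):
    """Parse the first two components of a version string such as '0.8.3'."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return major, minor


def new_api_name(version):
    """Return the name given to the new API in a client version, per the plan above.

    Hypothetical helper for illustration; not part of the client library.
    """
    if _major_minor(version) < (0, 8):
        raise ValueError("versions before 0.8 are not covered by the plan")
    if _major_minor(version) < (0, 9):
        return "new_client"
    # 0.9.x renames the new part to "client"; we assume the name is kept afterwards.
    return "client"


def old_api_status(version):
    """Return the planned state of the old API: 'available', 'deprecated' or 'removed'."""
    if _major_minor(version) >= (1, 0):
        return "removed"
    if _major_minor(version) >= (0, 9):
        return "deprecated"  # renamed to old_client in 0.9.x
    return "available"
```

A script that has to run against several client versions could use a check like this to decide which name to import, rather than hard-coding new_client.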

OK, so since we know how to import our new_client object, we can try to do something useful. Let's retrieve some activities. We k…

Biological annotations in SureChEMBL

Termite annotation in action. (Termite not to scale)
SureChEMBL is perhaps the only freely available, large-scale, comprehensive and live resource of chemistry extracted from the patent literature. SureChEMBL automatically annotates, normalises and indexes chemistry found in the full text, images and attachments (i.e. mol files) of patent documents. The next logical step for us was to complement the chemical annotations with biological ones, such as mentions of gene names and classifications, protein classes and disease indications.
As a first step in this direction, we used Termite, provided by SciBite (via funding from OpenPHACTS), to integrate these annotations dynamically into the full-text patent view of the SureChEMBL user interface; in other words, you can now view biological annotations on the fly.
How do I add the annotations and navigate through them?
There is now an additional checkbox underneath the 'Highlight additional recognised chemical terms' checkbox:

Simply c…

myChEMBL + docker

In addition to the myChEMBL 20 VM images released earlier, today we are very happy to release myChEMBL Docker images.

What's docker?
Docker is a new open-source project that automates the deployment of distributed applications. It takes advantage of some cool new features of the modern Linux kernel to run virtual containers, avoiding the overhead of starting and maintaining virtual machines [from Wikipedia].

In contrast to virtual machines, which emulate virtual hardware, docker containers employ the kernel of the host machine so they don't require or include the whole operating system. While still separated from the host, they only add a very thin level of abstraction [ZDNet article].

Why docker?
Docker is an emerging technology; it has become extremely popular over the last year and has been adopted and used by the largest IT companies, such as Red Hat, Canonical and Microsoft. Basically, using this platform you can do three things:
- Build
- Ship
- Run an arbitrarily complex piece of s…

We're recruiting!

Want to join the ChEMBL team?

We are seeking to recruit an experienced Web Application Developer to join the Chemogenomics Team at the European Bioinformatics Institute (EMBL-EBI).

You will develop a series of web-based applications and interfaces for the ChEMBL chemogenomic resources. In collaboration with senior team members you will also have a role in determining and advising on the web development strategy for the chemogenomic resources. In addition you will be involved with the development, maintenance and documentation of these tools and supporting their usage within EMBL-EBI and externally. The position will also involve some requirement gathering and use-case development.

For more information or to apply for this position follow this link:
http://ig14.i-grasp.com//fe/tpl_embl01.asp?newms=jj&id=53807&aid=15470

The ChEMBL team

myChEMBL 20 has landed

We are very pleased to announce that the latest myChEMBL release, based on the ChEMBL 20 database, is now available to download. In addition to the ChEMBL upgrade, you will also find a number of changes and new features:

- Updates in system and Python libraries, including the iPython notebook server
- Upgrade in the web services (data and utils) to match the new functionality provided by the main ChEMBL ones
- Current stable version of RDKit (2015.03)
- Two brand new notebooks, namely an RDKit tutorial and a tutorial on SureChEMBL data mining, increasing the total number of notebooks to 14
- Updates in several other iPython notebooks and the KNIME workflow, in order to take advantage of the new data, models and web services functionality
- Several bug fixes
- A CentOS 7 VM version, in addition to the existing Ubuntu 14.04 one
- New virtualisation technologies, as explained in the section below

Lots of flavours

ChEMBL @ Boston this August

A couple of us will be visiting Boston, MA for the ACS Meeting between 16 and 20 August. We'll be talking about SureChEMBL and ChEMBL. If you'd like to arrange a meeting or seminar, or just go out for drinks and clams, let me know.
George