Two new papers from the group have just been published, both in the Journal of Cheminformatics and, of course, both Open Access.
The first deals with some extensions to UniChem to allow far more flexible searches. The abstract is:
UniChem is a low-maintenance, fast and freely available compound identifier mapping service, recently made available on the Internet. Until now, the criterion of molecular equivalence within UniChem has been on the basis of complete identity between Standard InChIs. However, a limitation of this approach is that stereoisomers, isotopes and salts of otherwise identical molecules are not considered as related. Here, we describe how we have exploited the layered structural representation of the Standard InChI to create new functionality within UniChem that integrates these related molecular forms. The service, called ‘Connectivity Search’ allows molecules to be first matched on the basis of complete identity between the connectivity layer of their corresponding Standard InChIs, and the remaining layers then compared to highlight stereochemical and isotopic differences. Parsing of Standard InChI sub-layers permits mixtures and salts to also be included in this integration process. Implementation of these enhancements required simple modifications to the schema, loader and web application, but none of which have changed the original UniChem functionality or services. The scope of queries may be varied using a variety of easily configurable options, and the output is annotated to assist the user to filter, sort and understand the difference between query and retrieved structures. A RESTful web service output may be easily processed programmatically to allow developers to present the data in whatever form they believe their users will require, or to define their own level of molecular equivalence for their resource, albeit within the constraint of identical connectivity.
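To make the idea of layer-based matching concrete, here is a minimal Python sketch of comparing two Standard InChIs on their connectivity layer and then listing which remaining layers differ. It is not UniChem's implementation and does not call the UniChem web service. The function names `split_layers` and `compare` are hypothetical, the InChI strings are illustrative examples, and the sketch makes simplifying assumptions: it keeps only the first occurrence of each layer prefix, and it does not split multi-component (mixture or salt) InChIs.

```python
def split_layers(inchi):
    """Split a Standard InChI into a dict of layers keyed by prefix letter.

    Assumption: the second '/'-separated field is the formula layer.
    Only the first occurrence of each prefix is kept, so repeated
    sub-layers (e.g. those following the isotopic layer) are ignored.
    """
    if not inchi.startswith("InChI=1S/"):
        raise ValueError("not a Standard InChI: %r" % inchi)
    parts = inchi.split("/")
    layers = {"version": parts[0], "formula": parts[1]}
    for part in parts[2:]:
        if part:
            layers.setdefault(part[0], part[1:])
    return layers


def compare(query, candidate):
    """Return None if the connectivity layers differ; otherwise return a
    sorted list of the other layers in which the two InChIs differ."""
    q, c = split_layers(query), split_layers(candidate)
    if q.get("c") != c.get("c"):
        return None
    keys = (set(q) | set(c)) - {"version", "c"}
    return sorted(k for k in keys if q.get(k) != c.get(k))


# Illustrative InChIs (assumed correct for the named compounds).
L_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
D_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
DIMETHYL_ETHER = "InChI=1S/C2H6O/c1-3-2/h1-2H3"

print(compare(L_ALANINE, D_ALANINE))     # ['m']
print(compare(ETHANOL, DIMETHYL_ETHER))  # None
```

In this sketch the two alanine enantiomers share a connectivity layer and differ only in the `/m` stereo layer, whereas ethanol and dimethyl ether have the same formula but different connectivity and so are not matched.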
%T UniChem: extension of InChI-based compound mapping to salt, connectivity and stereochemistry layers
%A J Chambers
%A M Davies
%A A Gaulton
%A G Papadatos
%A A Hersey
%A JP Overington
%J Journal of Cheminformatics
%D 2014
%V 6:43
%O doi:10.1186/s13321-014-0043-5
%O http://www.jcheminf.com/content/6/1/43

%T A document classifier for medicinal chemistry publications trained on the ChEMBL corpus
%A G Papadatos
%A GJP van Westen
%A S Croset
%A R Santos
%A S Trubian
%A JP Overington
%J Journal of Cheminformatics
%D 2014
%V 6:40
%O doi:10.1186/s13321-014-0040-8
%O http://www.jcheminf.com/content/6/1/40