VEHICLe - virtual exploratory heterocyclic library

An interesting and thought-provoking paper from last year was 'Heteroaromatic Rings of the Future' (subscription required) by Will Pitt and colleagues at UCB. The basic idea of the paper was to exhaustively enumerate, and then analyse, all possible heteroaromatic ring systems that satisfy the following constraints: i) mono- and bicyclic ring systems, ii) only 5- and 6-membered rings, iii) containing only C, N, O, S and H, iv) neutral, v) obeying Hückel's 4n+2 rule of aromaticity, and vi) only exocyclic carbonyls. Heterocycles like these are at the very core of drug discovery and medicinal chemistry.
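As a small illustration of constraint v), Hückel's rule says that the number of pi electrons in the ring system must equal 4n+2 for some non-negative integer n. The sketch below only checks that arithmetic; the function name is made up for this post, and it assumes you already know the pi electron count, which for a real structure would come from a cheminformatics toolkit. It is not the enumeration procedure used in the paper.

```python
def satisfies_huckel(pi_electrons: int) -> bool:
    """Return True when pi_electrons equals 4n + 2 for some integer n >= 0."""
    # 4n + 2 with n >= 0 gives 2, 6, 10, 14, ...
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Hand-counted pi electrons for a few well-known ring systems.
examples = {
    "benzene": 6,         # aromatic
    "indole": 10,         # aromatic, bicyclic
    "cyclobutadiene": 4,  # fails the rule
    "pentalene": 8,       # fails the rule
}

for name, count in examples.items():
    print(name, count, satisfies_huckel(count))
```

Passing the electron count is necessary but not sufficient for inclusion in the library; the other five constraints apply as well.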

The dataset is now available for download from the ChEMBL FTP site, and also as a Google document.

The file contains the following fields:

regid: the id for each distinct ring system
SMILES: the encoded chemical structure of each ring system
Training dataset hits: the count of substructure hits found in the original search of commercial compound catalogues, drugs etc. (as reported in the paper).
Beilstein hits: the count of substructure hits in the Beilstein database at that time (June 2008). Some fields are blank - searching with benzene and other common ring systems would have taken too long.
Pgood: predicted synthetic tractability after training with both the above datasets
Tautomer cluster: tautomeric equivalents are grouped into clusters
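For anyone wanting to work with the file programmatically, here is a minimal Python sketch of how the fields above might be read. It assumes a tab-delimited file whose header row uses exactly the field names listed above; check the downloaded file, since the real delimiter and headers may differ. The three sample rows are invented placeholders for illustration, not real VEHICLe records, and the helper names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows only: ids, counts and Pgood values are invented.
# The tab delimiter and header names are assumptions about the file layout.
SAMPLE = (
    "regid\tSMILES\tTraining dataset hits\tBeilstein hits\tPgood\tTautomer cluster\n"
    "S1\tc1ccccc1\t100\t\t0.99\t1\n"
    "S2\tO=c1cccc[nH]1\t10\t50\t0.90\t2\n"
    "S3\tOc1ccccn1\t0\t5\t0.80\t2\n"
)


def to_int_or_none(text):
    """Convert a count to int, keeping blank fields as None."""
    text = text.strip()
    return int(text) if text else None


def load_rings(handle):
    """Read ring-system records from an open tab-delimited text handle."""
    rings = []
    for row in csv.DictReader(handle, delimiter="\t"):
        rings.append({
            "regid": row["regid"],
            "smiles": row["SMILES"],
            "training_hits": to_int_or_none(row["Training dataset hits"]),
            # Blank for very common ring systems such as benzene.
            "beilstein_hits": to_int_or_none(row["Beilstein hits"]),
            "pgood": float(row["Pgood"]),
            "tautomer_cluster": row["Tautomer cluster"],
        })
    return rings


def group_by_cluster(rings):
    """Map each tautomer cluster id to the regids it contains."""
    clusters = defaultdict(list)
    for ring in rings:
        clusters[ring["tautomer_cluster"]].append(ring["regid"])
    return dict(clusters)


rings = load_rings(io.StringIO(SAMPLE))
clusters = group_by_cluster(rings)
```

To read the real file, replace `io.StringIO(SAMPLE)` with an opened file handle. Blank Beilstein counts are kept as `None` rather than 0, so that "not searched" is not confused with "no hits".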

Will can be contacted at will.pitt (at) ucb.com for a free reprint of the paper, or for further discussion of the work.

We will integrate the VEHICLe ring system regids into ChEMBL at some point in the future.

%T Heteroaromatic Rings of the Future
%A W.R. Pitt
%A D.M. Parry
%A B.G. Perry
%A C.R. Groom
%J J. Med. Chem.
%D 2009
%V 52
%P 2952-2963
%O VEHICLe

The ChEMBL-og
