For the last three months, I've been busy working my way through a (sometimes headache-inducing) list of 9,000 ChEMBL compound ids. These had been flagged for curation because each ChEMBL_id in the list had two or more compound keys from the same paper. This implied that either the paper described two compounds that were indistinguishable by their InChI representation, or different compounds had somehow been merged together in the database. Each ChEMBL_id was checked individually against the data in the original paper to see whether there were indeed two compound keys for the same structure. This check had one of four outcomes: The structure(s) were found to be incorrect and were redrawn. The structure was correct for some records but not others, so a new compound was created for the affected records. The structure required the definition of stereochemistry or a salt. The structure was le
The Organization of Drug Discovery Data