ChEMBL identifiers are stable with respect to the entities they represent. For compounds (with known/defined structures), ChEMBL identifiers represent distinct compound structures, as defined by the standard InChI, e.g., CHEMBL25 represents: InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12). Therefore, two compounds reported in different papers but having the same standard InChI will be assigned the same ChEMBL ID.
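The rule that one standard InChI maps to exactly one ChEMBL ID can be sketched as a simple lookup. This is a toy illustration only, not ChEMBL's actual registration code: the `CompoundRegistry` class, its `register` method and the `first_number` argument are hypothetical names, and the counter is started at 25 purely so that the example reproduces CHEMBL25.

```python
class CompoundRegistry:
    """Toy registry: one ChEMBL-style ID per distinct standard InChI."""

    def __init__(self, first_number=1):
        self._id_by_inchi = {}          # standard InChI -> assigned ID
        self._next_number = first_number

    def register(self, standard_inchi):
        # Reuse the existing ID when this structure is already known.
        if standard_inchi in self._id_by_inchi:
            return self._id_by_inchi[standard_inchi]
        # Otherwise mint a new ID; it is never reassigned afterwards.
        new_id = f"CHEMBL{self._next_number}"
        self._next_number += 1
        self._id_by_inchi[standard_inchi] = new_id
        return new_id


# Standard InChI quoted above for CHEMBL25.
ASPIRIN = "InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)"

registry = CompoundRegistry(first_number=25)  # assumption: start at 25 for the demo
id_from_paper_a = registry.register(ASPIRIN)  # first report of the structure
id_from_paper_b = registry.register(ASPIRIN)  # same structure, different paper
```

Both calls return the same identifier because the lookup key is the standard InChI, not the paper or the compound name.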
These ChEMBL IDs will never be reassigned to a structure with a different standard InChI. However, since compounds may be reported or drawn incorrectly in the literature, it is sometimes necessary to re-link a particular bioactivity measurement to a different ChEMBL ID (that is, to the corrected structure). In this case, the old (incorrect) ChEMBL identifier may be 'downgraded' in the database if no other data link to it. Downgraded compounds are not currently displayed on the live interface, but they are retained in the database and in the ChEMBL ID lookup table, and could be reinstated in future (with the same ChEMBL ID) if new data become available for them.
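The correction and downgrade behaviour described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the ChEMBL schema: the `relink_activity` function, the two dictionaries, the status strings and all of the IDs (`ACT1`, `CHEMBL_X`, and so on) are hypothetical.

```python
def relink_activity(activity_links, status, activity_id, corrected_id):
    """Point one activity at the corrected compound ID.

    activity_links: dict mapping activity ID -> compound ID
    status:         dict mapping compound ID -> "active" or "downgraded"
    Returns the compound ID the activity was linked to before the change.
    """
    old_id = activity_links[activity_id]
    activity_links[activity_id] = corrected_id
    # Receiving data (re)activates an ID, including a downgraded one.
    status[corrected_id] = "active"
    # Downgrade the old ID only if no other data still link to it.
    if old_id != corrected_id and old_id not in activity_links.values():
        status[old_id] = "downgraded"
    return old_id


# Hypothetical IDs, for illustration only.
links = {"ACT1": "CHEMBL_X", "ACT2": "CHEMBL_Y", "ACT3": "CHEMBL_Y"}
status = {"CHEMBL_X": "active", "CHEMBL_Y": "active", "CHEMBL_Z": "active"}

relink_activity(links, status, "ACT1", "CHEMBL_Z")  # CHEMBL_X has no data left
relink_activity(links, status, "ACT2", "CHEMBL_Z")  # CHEMBL_Y still has ACT3
```

In this sketch a downgraded ID is never deleted from `status`, which mirrors the point that the identifier stays in the lookup table and can become active again, unchanged, once new data link to it.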
External identifiers for ChEMBL entities are also recorded in the database, where possible. For example, in addition to ChEMBL IDs and InChI/InChIKeys, all small molecule compounds with defined structures are assigned ChEBI identifiers. Where data are taken from other resources, the original identifiers are also retained (e.g., SIDs and AIDs for PubChem substances and assays, HET codes for PDBe ligands). PubMed identifiers or Digital Object Identifiers (DOIs) are stored for documents, and protein targets are represented by primary accessions from the UniProt database.