Since ChEMBL was first released in 2009, the diversity of data sources and data types in the database has increased significantly. We increasingly deal with more complex assays, such as measurements of drug pharmacokinetic parameters, and with toxicology data sets, such as clinical biochemistry and tissue histopathology data. The current data model and database schema handle these kinds of assays poorly. For example, because parameters such as compound doses or time points could be recorded only against the whole assay, not against individual activity measurements, such experiments were typically split so that a separate assay was created for each compound or time point measured. This is far from ideal, as it scatters a single experiment across many assay records. Another issue is that such experiments frequently measure or derive multiple endpoints from a particular assay (e.g., AUC, Cmax, tmax and t1/2 for a pharmacokinetic study) or produce large amounts of raw data that may need to be associated with summary-level
The Organization of Drug Discovery Data