First, let’s review the bioactivity data included in ChEMBL. We extract bioactivity data directly from seven core medicinal chemistry journals. Some common activity types, such as IC50s, are standardised to allow broad comparisons across assays; the standardised data can be found in the standard_value, standard_relation and standard_units fields. Original data is retained in the database downloads in the value, relation and units fields.
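As a rough illustration of why the standardised fields matter, the minimal sketch below compares records only through standard_value, standard_relation and standard_units. The records are invented for illustration, and the helper name comparable_values is hypothetical; it is not part of any ChEMBL tool or API.

```python
# Invented example rows; the numbers are not real ChEMBL data.
# Each row carries the original fields (value, relation, units)
# alongside the standardised ones described in the text.
records = [
    {"value": 0.05, "relation": "=", "units": "uM",
     "standard_value": 50.0, "standard_relation": "=", "standard_units": "nM"},
    {"value": 120.0, "relation": "=", "units": "nM",
     "standard_value": 120.0, "standard_relation": "=", "standard_units": "nM"},
    {"value": 10.0, "relation": ">", "units": "uM",
     "standard_value": 10000.0, "standard_relation": ">", "standard_units": "nM"},
]


def comparable_values(rows, units="nM"):
    """Return standardised values that are exact measurements in `units`.

    Hypothetical helper: rows with a qualifier such as '>' or '<' are
    skipped because they are bounds rather than point estimates.
    """
    return [
        row["standard_value"]
        for row in rows
        if row["standard_value"] is not None
        and row["standard_units"] == units
        and row["standard_relation"] == "="
    ]


values = comparable_values(records)
print(values)
```

The first two rows were reported in different units (uM and nM) but become directly comparable through the standardised fields, while the third is excluded because its relation marks it as a bound.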
However, we extract all data from a publication, including non-numerical bioactivity and ADME data. Where a result cannot be expressed as a number, the activity comments may be populated during the ChEMBL extraction-curation process to capture the author's overall conclusions.
Similarly, for deposited datasets and subsets of other databases (e.g. DrugMatrix, PubChem), the activity_comments reflect the overall activity conclusions from the depositor (e.g. active, inactive, toxic, non-toxic) and may also take into account other factors such as counter screens and/or controls. Since the criteria used to assign the activity are determined by the depositor, we also provide links back to the original assay data.
Why are the activity comments important?
First, they provide a way of capturing a broad range of relevant bioactivity or ADME data from publications. Second, the activity comments may take into account complexities, such as counter screens and curve fitting, that depositors have already addressed, and so may explain cases where apparently potent activities have been flagged as inactive or inconclusive.
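One way to act on this when analysing data is to check the comment before trusting a potent-looking value. The sketch below is only an illustration under stated assumptions: the rows are invented, the 1000 nM cutoff is an arbitrary example threshold, the helper split_by_comment is hypothetical, and the activity_comments key follows the field name used in this post (check the schema of the release you are using).

```python
# Assumption: an arbitrary potency cutoff chosen for illustration only.
POTENCY_CUTOFF_NM = 1000.0
# Depositor conclusions that should override an apparently potent value.
NEGATIVE_COMMENTS = {"inactive", "inconclusive"}

# Invented example rows; not real ChEMBL data.
rows = [
    {"id": "A", "standard_value": 20.0, "standard_units": "nM",
     "activity_comments": "Inactive"},
    {"id": "B", "standard_value": 35.0, "standard_units": "nM",
     "activity_comments": None},
    {"id": "C", "standard_value": 50000.0, "standard_units": "nM",
     "activity_comments": "inactive"},
    {"id": "D", "standard_value": 80.0, "standard_units": "nM",
     "activity_comments": "Active"},
    {"id": "E", "standard_value": 5.0, "standard_units": "nM",
     "activity_comments": "inconclusive"},
]


def split_by_comment(records, cutoff=POTENCY_CUTOFF_NM):
    """Split apparently potent records into (kept, flagged).

    A record is 'apparently potent' if its standardised value is in nM
    and at or below `cutoff`. It is flagged when the depositor's comment
    marks it as inactive or inconclusive despite that value.
    """
    kept, flagged = [], []
    for rec in records:
        value = rec.get("standard_value")
        if value is None or rec.get("standard_units") != "nM" or value > cutoff:
            continue  # not apparently potent, so not relevant here
        comment = (rec.get("activity_comments") or "").strip().lower()
        if comment in NEGATIVE_COMMENTS:
            flagged.append(rec)
        else:
            kept.append(rec)
    return kept, flagged


kept, flagged = split_by_comment(rows)
print([r["id"] for r in kept], [r["id"] for r in flagged])
```

Flagged records are worth following back to the original assay data rather than discarding outright, since the depositor's criteria for the comment are specific to each dataset.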