Some time ago we showed an example of how a model trained in Python's PyTorch could be run in a C++ backend by exporting it to the ONNX format.
Greg also showed us in his blog post how our multitask neural network model could be used in a very nice KNIME workflow by exporting it to ONNX. That was possible thanks to RDKit's Java bindings and the ONNX Java runtime.
As a refresher, most popular machine learning frameworks can export their models to this format, and many programming languages can load them to run predictions. This certainly is a beautiful example of interoperability!
Here is our demo, along with its source code. Start typing a SMILES into the box and enjoy!
Updated code to generate the model is also available here. This updated code takes advantage of the PyTorch Lightning library.
The ChEMBL database contains bioactivity data that links compounds to their biological targets. Most ChEMBL targets are proteins (~70% in version 27) and these are mapped to their UniProt accessions. On the ChEMBL interface, searches can be performed with either protein names or accessions... but did you know that protein similarity searches are also possible? Here's an example using human Phospholipase DDHD2, a target not found in ChEMBL.

1. On the ChEMBL interface, click 'Enter a Sequence'.
2. Input the FASTA sequence corresponding to human Phospholipase DDHD2 and click 'Search in ChEMBL'.
3. Review the BLAST results, select targets of interest and browse bioactivity data.

The BLAST search identifies the mouse Phospholipase DDHD2 homologue alongside a small number of bioactivity data points and active compounds. ChEMBL's sequence search feature is currently only available through the interface. However, sequence data for prote