Fancy playing with RDKit C++ API without needing to set up a C++ project and compile it? But wait... isn't C++ a compiled programming language? How this can be even possible?
Thanks to Cling (CERN's C++ interpreter) and xeus-cling jupyter kernel is possible to use C++ as an intepreted language inside a jupyter notebook!
We prepared a simple notebook showing few examples of RDKit functionalities and a docker image in case you want to run it.
With the single requirement of docker being installed in your computer you'll be able to easily run the examples following the three steps below: docker pull eloyfelix/rdkit_jupyter_clingdocker run -d -p 9999:9999 eloyfelix/rdkit_jupyter_clingopen http://localhost:9999/notebooks/rdkit_cling.ipynb in a browser
Kuster Lab Chemical Proteomics Drug Profiling (src_id = 48, Document ChEMBL_ID = CHEMBL3991601):
Data have been included from the publication: The target landscape of clinical kinase drugs. Klaeger S, Heinzlmeir S and Wilhelm M et al (2017), Science, 358-6367 (https://doi.org/10.1126/science.aan4368)
FPSim2 is a new tool for fast similarity search on big compound datasets (>100 million) being developed at ChEMBL. We started developing it as we needed a Python3 library able to run either in memory or out-of-core fast similarity searches on such dataset sizes.
It's written in Python/C++ and features: A fast population count algorithm (builtin-popcnt-unrolled) from https://github.com/WojciechMula/sse-popcount using SIMD instructions.Bounds for sub-linear speed-ups from 10.1021/ci600358fA compressed file format with optimised read speed based in PyTables and BLOSCUse of multiple cores in a single search In memory and on disk search modesSimple and easy to use
Source code is available on github and Conda packages are also available for either mac or linux. To install it type:
conda install rdkit -c rdkitconda install fpsim2 -c efelix
Try it with docker (much better performance than binder):
docker pull eloyfelix/fpsim2 docker run -p 9999:9999 eloyfelix/fpsim2 open http:/…
Happy New Year from the ChEMBL Group to all our users and collaborators. Firstly, do you want a new challenge in 2019? If so, we have a position for a bioinformatician in the ChEMBL Team to develop pipelines for identifying links between therapeutic targets, drugs and diseases. You will be based in the ChEMBL team but also work in collaboration with the exciting Open Targets initiative. More details can be found here(closing date 24thJanuary). In case you missed it, we published a paper at the end of last on the latest developments of the ChEMBL database “ChEMBL: towards direct deposition of bioassay data”. You can read it here. Highlights include bioactivity data from patents, human pharmacokinetic data from prescribing information, deposited data from neglected disease screening and data from the IMI funded K4DD project. We have also added a lot of new annotations on the therapeutic targets and indications for clinical candidates and marketed drugs to ChEMBL. Importantly we ha…
But what is a multi-task neural network? In short, it's a kind of neural network architecture that can optimise multiple classification/regression problems at the same time while taking advantage of their shared description. This blogpost gives a great overview of their architecture. All networks in references above implement the hard parameter sharing approach.
So, having a set of activities relating targets and molecules we can train a single neural network as a binary multi-label classifier that will output the probability of activity/inactivity for each of the targets (tasks) for a given query molecule…