Monday, 28 January 2013

UniChem Released

For data managers of chemistry resources, maintaining structure-based links to other chemistry resources can be a tedious chore. The job is all the more burdensome knowing that your counterparts in other chemistry-based resources are essentially duplicating your efforts in order to keep their links to your resource updated.

In an attempt to remove this duplication of effort and automate the processes involved, we have developed UniChem, which is described in a recent publication.

Structure-based links can be obtained from UniChem either via the web interface or via the web services. For automated updating, the web services are often the best choice. The current set of web service methods has been designed to give users several options for obtaining link data. Two possibilities are described below.

One option is to use the following methods. First, query UniChem for all valid src_ids using the ‘GetSrcIds’ method. Then, iterate through this list and retrieve, using the ‘GetSourceInfo’ method, the details you require for each source (e.g. the ‘base-url’ for constructing links). Lastly, iterate through the src_id list once more, this time retrieving all the mappings from your source to each of the other sources, using the ‘GetMapping’ method. Combining the results of the second and third queries gives you all the mappings from your compound identifiers to the URLs for the corresponding compounds in the other sources. These data can be stored locally, then queried and incorporated into a compound page when required. The local tables need to be refreshed periodically, by repeating the above process, to pick up UniChem updates.
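As a rough illustration of this three-step workflow, here is a minimal Python sketch. It does not call UniChem itself: the three callables stand in for whatever client code you write around ‘GetSrcIds’, ‘GetSourceInfo’ and ‘GetMapping’, and the record shapes (a ‘base-url’ entry per source, and hypothetical "from"/"to" keys per mapping pair) are assumptions made for the example, not the documented response format.

```python
from typing import Callable, Dict, List


def build_link_table(
    my_src_id: str,
    get_src_ids: Callable[[], List[str]],
    get_source_info: Callable[[str], Dict[str, str]],
    get_mapping: Callable[[str, str], List[Dict[str, str]]],
) -> Dict[str, List[str]]:
    """Return {my_compound_id: [url_in_other_source, ...]}.

    The callables are placeholders for your own wrappers around the
    GetSrcIds, GetSourceInfo and GetMapping web service methods.
    """
    links: Dict[str, List[str]] = {}
    # Step 1: all valid source identifiers.
    for src_id in get_src_ids():
        if src_id == my_src_id:
            continue  # no need to link a source to itself
        # Step 2: source details; 'base-url' is the key name assumed here.
        base_url = get_source_info(src_id).get("base-url")
        if not base_url:
            continue  # cannot construct links without a base URL
        # Step 3: mappings from our source to this one.
        # "from" and "to" are hypothetical key names for each pair.
        for pair in get_mapping(my_src_id, src_id):
            links.setdefault(pair["from"], []).append(base_url + pair["to"])
    return links
```

The returned dictionary is what would be written to the local tables; rerunning the function on a schedule corresponds to the periodic refresh.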

Alternatively, you may wish to create links more dynamically, using, for example, the ‘GetVerboseSrcCpdIdsFromInchiKey’ method. With this method, a compound web page can be populated with all its links as the page is requested, by querying UniChem on the fly with the InChIKey. This single query returns a list of the sources that contain valid compound links. For each source, a keyed list provides information such as the ‘base-url’. One of the keys (‘src_compound_id’) maps to an array of src_compound_ids. Combining the ‘base-url’ with each of the src_compound_ids gives the required links. See the example of this method in the link immediately above.
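The final combining step can be sketched in a few lines of Python. This is only an illustration under assumptions: it takes the already-parsed response (a list of per-source dictionaries) as input, and the default key names are taken from the description above rather than checked against a live response, so they are exposed as parameters that you should adjust to match what the service actually returns.

```python
from typing import Any, Dict, Iterable, List


def links_from_sources(
    sources: Iterable[Dict[str, Any]],
    base_url_key: str = "base-url",        # assumed key name
    ids_key: str = "src_compound_id",      # assumed key name
) -> List[str]:
    """Build one URL per compound identifier in each returned source."""
    links: List[str] = []
    for source in sources:
        base_url = source.get(base_url_key)
        if not base_url:
            continue  # skip sources for which no link can be constructed
        for compound_id in source.get(ids_key, []):
            links.append(base_url + str(compound_id))
    return links
```

A page handler would fetch the response for the compound's InChIKey, parse it, pass the resulting list to `links_from_sources`, and render the returned URLs.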
