We are exploring ways to link ChEMBL compounds to patents. The implementation could take one of two basic routes:
- Links from the interface to patents (simple and quick to do now that we have UniChem).
- Patent URIs in the database itself (more complex, and more difficult to keep up to date, but arguably more useful).
So, to help our planning for next year, comments and wishes are most welcome.
Comments
As to the source of the patent structures: there are a number of initiatives underway at the moment to text-mine chemical structures from patents. We're currently not free to say what some of these sources are, but one source could be the feed from the EPO team.
These structures would be loaded into UniChem (q.v.), and all the lookups would be done there.
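To make the lookup idea concrete, here is a minimal sketch of structure-based cross-referencing, assuming each source's compounds are registered under a shared structure key such as the standard InChIKey. The `StructureIndex` class, its method names, the source labels and the patent identifiers are all hypothetical placeholders for illustration; this is not the UniChem interface or its data.

```python
from collections import defaultdict


class StructureIndex:
    """Toy in-memory index: structure key -> source -> identifiers (hypothetical)."""

    def __init__(self):
        # inchikey -> source name -> set of identifiers from that source
        self._by_key = defaultdict(lambda: defaultdict(set))
        # (source name, identifier) -> inchikey
        self._key_of = {}

    def load(self, source, identifier, inchikey):
        """Register one identifier from one source under its structure key."""
        self._by_key[inchikey][source].add(identifier)
        self._key_of[(source, identifier)] = inchikey

    def cross_references(self, source, identifier, target_source):
        """Return identifiers in target_source that share the same structure."""
        key = self._key_of.get((source, identifier))
        if key is None:
            return []
        return sorted(self._by_key[key].get(target_source, set()))


index = StructureIndex()
# CHEMBL25 (aspirin) and its standard InChIKey
index.load("chembl", "CHEMBL25", "BSYNRYMUTXBXSQ-UHFFFAOYSA-N")
# Made-up placeholder patent identifiers, not real documents
index.load("patents", "EXAMPLE-PATENT-B", "BSYNRYMUTXBXSQ-UHFFFAOYSA-N")
index.load("patents", "EXAMPLE-PATENT-A", "BSYNRYMUTXBXSQ-UHFFFAOYSA-N")

patent_ids = index.cross_references("chembl", "CHEMBL25", "patents")
print(patent_ids)
```

Keeping the mapping in a separate structure-keyed index means that new patent feeds can be loaded without touching the compound records themselves.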
A big problem with other ways of accessing chemical patent data is shown by your other comments: indirect access through semi-open resources, with a significant onus on the user to ensure they don't violate any explicit or ambiguous usage constraints or licenses.
One of the ideas of patent filings is explicitly to make things easy to find, so that researchers don't waste time recreating other people's IP and can also build on top of it. Current systems do not really allow this.