Should CAS numbers be in ChEMBL and/or UniChem?


A very quick survey to add excitement to either your holiday or your work day! This is not one of those sucker links where a 0.24%-complete progress bar appears on the second page; it's just a simple yes/no question on whether it's a good idea to add CAS registry numbers to ChEMBL and/or UniChem. No promises that we can deliver this, but depending on how you vote, we will consider our options.

Update: Given the multiple channels out there, there are also comments on this on LinkedIn (in ChUG, the "ChEMBL User Group"; why not join if you haven't already?) and a couple on Google+.

Update 2: I'll let the poll run until the end of the week (Friday 8th 2014) and then write something up on the results.

Comments

I would argue against this. The CAS registry number is proprietary and not easy to use. In particular, you are not allowed to collect them, though there is an informal limit of 10k registry numbers. This causes serious licensing issues for ChEMBL: you would have to make it a separate database and release it as separate files. CC-BY-SA does not allow further restrictions such as those imposed on the CAS registry number.
jpo said…
I would disagree with the statement that I'm not allowed to collect them. How can anyone stop me collecting them, from public sources of course? For example, is there a license carve-out for the CAS numbers on Wikipedia? Wikipedia content is CC-BY-SA, so in perfect alignment with the current ChEMBL license. There are lots of other sources of large sets of CAS RNs (NCI resolver, ChemSpider, PubChem, UNII). There are also many in public, non-copyrighted documents: patents, INN/USAN documents, etc. To say that I'm not allowed to do anything with them is just bonkers.

There is a formal limit of 10K if you sign the license (or your organisation does, with the relevant scope).

I'm of mixed view myself as to whether it is worth doing something with ChEMBL - hence the poll, to see what the community thinks. For some of the stuff I'm currently working on (clinical candidate disclosures) they are required, and I have never seen a statement saying I can't reuse them in any document I've come across. The whole idea is that they (CAS RNs) are useful for cross-referencing chemical (and biological) objects with systems that choose to use them.

Sorry for the briefish reply: on holiday, and just back from the beach with wet trunks!