ChEMBL Resources


Sunday, 18 September 2011

Protein Descriptors

Some time ago we asked about protein descriptor services on the web. The long and short of it is that there really wasn't anything that fit our needs, so we wrote one. Given a sequence, it returns a long vector of descriptors: things like hydrophobicity, the fraction of each amino acid, pI, and a whole bunch of other stuff. If there is interest, we could open this up as a web service that returns a JSON or XML object in real time (with the descriptors precalculated for UniProt sequences). There will be licensing issues for some of the descriptors, but I'm sure we can sort something out.
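To give a flavour of what such a descriptor vector might look like, here is a minimal sketch in Python. It is not the code behind our service: the function names (describe_sequence, describe_sequence_json) and the JSON key names are made up for illustration, and it covers only length, fractional amino acid composition, and mean hydrophobicity on the Kyte-Doolittle scale (the GRAVY score). pI and the other descriptors are left out.

```python
import json

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KYTE_DOOLITTLE)


def describe_sequence(sequence):
    """Return a dict of simple bulk descriptors for a protein sequence.

    Hypothetical helper: the key names are illustrative assumptions.
    """
    # Strip whitespace/newlines and normalise to upper case.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError("unsupported residues: " + "".join(sorted(unknown)))

    n = len(seq)
    descriptors = {"length": n}
    # Fraction of each of the 20 standard amino acids.
    for aa in AMINO_ACIDS:
        descriptors["fraction_" + aa] = seq.count(aa) / n
    # Mean Kyte-Doolittle hydropathy over the whole sequence (GRAVY).
    descriptors["gravy"] = sum(KYTE_DOOLITTLE[aa] for aa in seq) / n
    return descriptors


def describe_sequence_json(sequence):
    """Serialise the descriptors as a JSON object, as a web service might."""
    return json.dumps(describe_sequence(sequence), sort_keys=True)


if __name__ == "__main__":
    print(describe_sequence_json("MKTAYIAKQR"))
```

Each entry is a single number per input sequence, so the output can be flattened into a fixed-length vector by reading the keys in sorted order.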


Gerard van Westen said...

Did you obtain these from the AAIndex db?

You could also think of using descriptors that rely on a PCA of the input data you are providing (actually I could help you with some of these, although in the form of a PP component).

George said...

Great idea.

jpo said...

In reply to above...

We are mostly interested in bulk sequence properties at the moment (fractional composition, hydrophobicity, features, etc.), that is, descriptors that give a single number for an input sequence.

There is loads of other stuff that would be cool to add: antigenicity, secondary structure prediction fractions, etc.

The license issues relate to the use of services that are freely available to academics but carry some restrictions on "commercial use". Take TMHMM, for example, where there is a download version of the software for academic institutes to use. I would guess that this license doesn't really cover setting up a derivative service that allows access over web services. This is just one example, not highlighted for any particular reason, but we would need to get permission from a fair number of software providers.

If we do set it up, we want two things: 1) free access for all, without restriction by user type, and 2) compliance with the software licenses and the wishes of the original authors.

Gerard van Westen said...

Ah I see, well in this case you can also have a look at PROFEAT.

The webserver is located here:

However, I cannot tell you anything about the performance of this particular descriptor set. It might serve as a benchmark for your own solution?