Wednesday, 10 August 2011
Descriptors for Protein Sequences?
Does anyone know of a website or web service that calculates a set of descriptors for a protein sequence, analogous to the descriptors that are routinely calculated for small molecules?
Specifically, what I'm looking for is something that returns a large set of descriptors for either a sequence or a given stable identifier (e.g. UniProt ID). The descriptors I'd like back would be things like molecular weight, the number and fraction of each amino acid, hydrophobicity values, complexity/sequence entropy values, the number of transmembrane helices, and the presence of certain features (e.g. a signal sequence or a nuclear localisation sequence). Domain counts would be good as well. Together these would build up a 'fingerprint' for the sequence. I guess with a little bit of thought it would be possible to come up with a fuller list of descriptors, and the above certainly isn't exhaustive, but you get the idea, I'm sure. To be clear, I don't want an annotation service; I just want some numerical/logical feature descriptors.
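To make the idea concrete, here is a minimal Python sketch of the simplest, sequence-only descriptors from the list above. It is an illustration under stated assumptions, not an existing service: the function name `describe` and the output key names are hypothetical, hydrophobicity is taken as the mean Kyte-Doolittle hydropathy (one possible choice of scale), sequence entropy is Shannon entropy over residue frequencies, and molecular weight uses approximate average residue masses. Transmembrane helices, signal sequences and domain counts would need dedicated prediction tools and are not covered.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (an assumed choice; other scales exist).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Approximate average residue masses in daltons (amino acid minus one water).
MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def describe(sequence):
    """Return a flat dict of simple numerical descriptors (hypothetical helper)."""
    seq = sequence.strip().upper()
    if not seq or any(aa not in KD for aa in seq):
        raise ValueError("expected a non-empty sequence of the 20 standard residues")

    n = len(seq)
    counts = Counter(seq)
    desc = {"length": n}

    # Count and fraction of each amino acid, in a fixed order.
    for aa in AMINO_ACIDS:
        desc[f"count_{aa}"] = counts[aa]
        desc[f"frac_{aa}"] = counts[aa] / n

    # Molecular weight: sum of residue masses plus one water.
    desc["mol_weight"] = sum(MASS[aa] for aa in seq) + WATER

    # Mean hydropathy over the whole sequence (often called GRAVY).
    desc["gravy"] = sum(KD[aa] for aa in seq) / n

    # Shannon entropy (bits) of the residue frequency distribution.
    desc["entropy_bits"] = -sum(
        (c / n) * math.log2(c / n) for c in counts.values()
    )
    return desc
```

Calling `describe("MKTAYIAKQR")` returns one flat dictionary per sequence, so a set of sequences maps directly onto rows of a descriptor table, much like a small-molecule fingerprint.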
Does something like this exist and, if not, should it be built?