Negative similarity conservation scores
Issue #370
resolved
We are using blast.pdb() and the sequence analysis tools in the Bio3D package (seqaln, conserv, consensus) to look at homologous protein sequences.
When testing with conserv() and low cutoffs for the consensus sequence and using the "similarity" scoring method, some scores come through as negative. These are observed in the original scoring matrix, before any other calculations / normalizations are done.
Is it reasonable for these similarity scores to be negative? If so, what do the negative scores represent?
Comments (4)
- marked as task
- changed status to resolved
The question has been answered. Will reopen if the user comes back and has more questions.
- changed version to v2.2
Negative values are indeed possible, though not common, with the "similarity" approach, depending on which matrix of residue-residue similarity scores is chosen. For example, the PET91-based similarity matrix (from Jones, Taylor and Thornton 1991) is an update of the MDM78 Dayhoff matrix, normalised such that all maxima are on the diagonal with a score of 10. You can see the different matrices and their occasionally negative values in the "bio3d/inst/matrices/" directory of the source code. What is returned is the average of these 'similarity scores' over all pairwise residue comparisons for that position in your alignment.
Negative values imply that on average your position is quite diverse.
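To make the averaging concrete, here is a minimal sketch in Python (not Bio3D's actual R code). The scores in TOY_SCORES are made up for illustration and are not taken from any file in "bio3d/inst/matrices/". The helper names, and the choice to average over unordered pairs of sequences, are assumptions too. It only shows how a mixed column can average out below zero when off-diagonal scores are negative.

```python
from itertools import combinations

# Hypothetical toy similarity scores (NOT from any Bio3D matrix file).
# Diagonal entries carry the maximum score of 10, as in the normalisation
# described above; some off-diagonal entries are negative.
TOY_SCORES = {
    ("A", "A"): 10, ("W", "W"): 10, ("D", "D"): 10,
    ("A", "W"): -4, ("A", "D"): 1, ("D", "W"): -6,
}

def pair_score(a, b):
    """Look up a symmetric residue-residue score in the toy table."""
    return TOY_SCORES[(a, b)] if (a, b) in TOY_SCORES else TOY_SCORES[(b, a)]

def column_similarity(column):
    """Average score over all unordered residue pairs in one alignment column."""
    pairs = list(combinations(column, 2))
    return sum(pair_score(a, b) for a, b in pairs) / len(pairs)

conserved = column_similarity("AAAA")  # every pair scores 10
diverse = column_similarity("AWDW")    # mostly dissimilar pairs
print(conserved, diverse)              # 10.0 -1.5
```

In this toy case the fully conserved column scores 10, while the mixed column has a negative mean because its dissimilar pairs outweigh the single identical W-W pair.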