Negative similarity conservation scores

Issue #370 resolved
Former user created an issue

We are using blast.pdb() and the sequence analysis tools in the Bio3D package (seqaln, conserv, consensus) to look at homologous protein sequences.

When testing with conserv() and low cutoffs for the consensus sequence and using the "similarity" scoring method, some scores come through as negative. These are observed in the original scoring matrix, before any other calculations / normalizations are done.

Is it reasonable for these similarity scores to be negative? If so what do the negative scores represent?

Comments (4)

  1. Barry Grant

    Negative values are indeed possible, though not common, with the "similarity" approach, depending on which matrix of residue-residue similarity scores is chosen. For example, the PET91-based similarity matrix (from Jones, Taylor and Thornton 1991) is an update of the MDM78 Dayhoff matrix, normalised such that all maxima are on the diagonal with a score of 10. You can see the different matrices and their occasionally negative values in the "bio3d/inst/matrices/" directory of the source code. What is returned is the average of these 'similarity scores' over all pairwise residue comparisons for that position in your alignment.

    A negative value implies that, on average, the residues at that position are quite dissimilar to one another, i.e. the position is quite diverse.
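To make the averaging concrete, here is a minimal sketch of how a mean over all pairwise comparisons in one alignment column can come out negative. It is not Bio3D's code: the scores in TOY_SCORES are made up for illustration (they are not the PET91 values), the function names are hypothetical, and gaps and any later normalisation are ignored.

```python
from itertools import combinations

# Hypothetical toy similarity scores, NOT the real PET91 values.
# Identical residues score 10 (the diagonal maximum); some unlike pairs are negative.
TOY_SCORES = {
    ("A", "A"): 10, ("W", "W"): 10, ("D", "D"): 10,
    ("A", "W"): -4, ("A", "D"): 1, ("D", "W"): -6,
}


def pair_score(a, b):
    """Look up a symmetric score for the residue pair (a, b)."""
    key = (a, b) if (a, b) in TOY_SCORES else (b, a)
    return TOY_SCORES[key]


def column_similarity(column):
    """Average the scores of all pairwise residue comparisons in one column."""
    pairs = list(combinations(column, 2))
    return sum(pair_score(a, b) for a, b in pairs) / len(pairs)


conserved = column_similarity("AAAA")  # every pair is identical
diverse = column_similarity("AWDW")    # mostly dissimilar pairs

print(conserved)  # 10.0
print(diverse)    # -1.5
```

In the second column only one of the six pairs (W with W) is identical, so the negative off-diagonal scores outweigh it and the average drops below zero.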
