Calculating cluster centroid structure
I am trying to cluster my MD trajectory using RMSD distances with hclust. I calculate the mean structure (centroid) of each cluster as follows:
# the mean structure for the cluster #1
mxyz.1 <- colMeans(xyz[grps==1, ])
# output the structure
write.pdb(pdb=pdb, file="clus1.pdb", xyz=mxyz.1)
The problem is that the MD simulation is of a multi-domain protein that shows large inter-domain movements. As a result, the average structure calculated for each cluster has a number of bad contacts, an incorrect sigma skeleton, etc. (for example, a phenyl side chain gets averaged into a distorted configuration).
Is there a better way to calculate the centroid/medoid of each cluster? For example, by picking the representative structure in the cluster that has the minimum sum of squared RMSD distances to the other structures in the same cluster.
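A minimal sketch of the medoid idea described above, in base R. The helper name cluster_medoids is hypothetical, and it assumes a symmetric pairwise RMSD matrix d is already available (for a bio3d trajectory this could probably be obtained with rmsd(xyz, fit=TRUE), but treat that call as an assumption and check it against your bio3d version). This is not necessarily the solution given in the linked issue.

```r
# Hypothetical helper: return the frame index of the medoid of each cluster.
# d    : symmetric matrix of pairwise RMSD values between frames
# grps : cluster label for each frame (e.g. from cutree() on an hclust result)
cluster_medoids <- function(d, grps) {
  d <- as.matrix(d)
  sapply(sort(unique(grps)), function(g) {
    idx <- which(grps == g)
    # sum of squared RMSDs from each member to the other members of its cluster
    ss <- rowSums(d[idx, idx, drop = FALSE]^2)
    # the member with the smallest sum is the representative (medoid) frame
    idx[which.min(ss)]
  })
}

# Toy example: five one-dimensional "frames" in two clusters
pts  <- c(0, 1, 2, 10, 11)
d    <- as.matrix(dist(pts))
grps <- c(1, 1, 1, 2, 2)
med  <- cluster_medoids(d, grps)
print(med)
```

Because the medoid is an actual frame of the trajectory, writing it out (for example with write.pdb(pdb=pdb, file="clus1.pdb", xyz=xyz[med[1], ])) gives a physically sensible structure rather than an averaged one.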
Comments (3)
reporter: Thanks a lot. The solution works fine. I had searched for the issue before posting, but could not find it.
- changed status to resolved
Hi,
Have you checked this issue? Barry has provided a nice solution to the same question there.