Calculating cluster centroid structure

Issue #398 resolved
Karan Kapoor created an issue

I am clustering my MD trajectory with hclust, using RMSD as the distance. I calculate the mean structure (centroid) of each cluster as follows:

# the mean structure for the cluster #1
mxyz.1 <- colMeans(xyz[grps==1, ])

# output the structure
write.pdb(pdb=pdb, file="clus1.pdb", xyz=mxyz.1)

The problem is that the MD simulation is of a multi-domain protein with large inter-domain movements. As a result, the average structure calculated for each cluster shows a number of bad contacts, an incorrect sigma skeleton, etc. (for example, the phenyl group of a side chain gets averaged into an unphysical configuration).

Is there a better way to calculate the centroid/medoid for each cluster? For example, by picking the representative structure in the cluster that has the minimum sum of squared RMSD distances to the other structures in the same cluster.
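
The medoid selection described above can be sketched in base R as follows. This is only a minimal illustration, not the accepted solution from the thread: cluster_medoids is a hypothetical helper name, and it assumes a square matrix of pairwise RMSD values between all frames is already available (presumably the same distances that were passed to hclust) together with the cluster labels in grps.

```r
# Hypothetical helper (not part of any package): for each cluster, return the
# index of the frame with the minimum sum of squared RMSD to the other members.
# rmsd_mat: square matrix of pairwise RMSD values between all frames (assumed)
# grps:     cluster label for each frame, e.g. from cutree()
cluster_medoids <- function(rmsd_mat, grps) {
  rmsd_mat <- as.matrix(rmsd_mat)
  stopifnot(nrow(rmsd_mat) == ncol(rmsd_mat),
            nrow(rmsd_mat) == length(grps))
  sapply(sort(unique(grps)), function(g) {
    inds <- which(grps == g)                  # frames belonging to cluster g
    sub  <- rmsd_mat[inds, inds, drop = FALSE]
    inds[which.min(rowSums(sub^2))]           # frame closest to all others
  })
}

# Toy data: 1-D values standing in for real frames, two obvious clusters
toy     <- c(0, 1, 5, 10, 11, 12)
d       <- as.matrix(dist(toy))
grps    <- c(1, 1, 1, 2, 2, 2)
medoids <- cluster_medoids(d, grps)
```

Because a medoid is an actual frame of the trajectory, writing it out with write.pdb(pdb=pdb, file="clus1.pdb", xyz=xyz[medoids[1], ]) should avoid the averaged-geometry artifacts. Ties are resolved in favor of the first frame, since which.min returns the first minimum.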

Comments (3)

  1. Karan Kapoor reporter

    Thanks a lot. The solution works fine. I had searched for the issue before posting, but could not find it.
