How to tell the difference between two clusters at residue level by PCA?

Issue #800 new
Cheng created an issue

Figure 17 of this link shows how to display the clustering results. Is there a way to tell, structurally, how two clusters differ at the residue level?

It would be something like Figure 14, but restricted to the difference between two particular clusters.

It would also be something like Figure 13, except that Figure 13 shows the total difference along PC1 or PC2. Can this difference be broken down to the residue level?

If this is possible, could you share the exact code? Thank you!

Comments (5)

  1. Xinqiu Yao


    It’s not very clear to me what you are requesting. Could you give an example of this kind of analysis?

  2. Cheng reporter

    Hi, so Figure 17 shows the three clusters based on their relative positions along the PC1 and PC2 axes. PC1 and PC2 are generated by capturing the variance of individual residues, and different residues contribute differently as shown in Figure 14.

    My idea is: can we plot something like Figure 16, but with the axes changed to Residue 1, Residue 2, Residue 3, …, Residue n, so that we can see the relative positions at the residue level?

  3. Xinqiu Yao

    If you want to plot conformers by the relative positions of Residue 1 and Residue 2, for example, you just need to map them using the corresponding Cartesian coordinates, e.g. the x-coordinate of Residue 1 against the x-coordinate of Residue 2. The problem, of course, is that you would have to make a lot of plots, because each residue/atom has three coordinates and there are many residues. That’s why we do PCA, which finds an optimal linear combination of these coordinates that separates conformers the best.

    I am not sure I have answered the question. It would be helpful if you could find a paper or something similar that does the analysis you are suggesting.
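
For illustration, the coordinate mapping described in this comment can be sketched in a few lines of plain Python. This is only a sketch under assumptions that are not from the thread: each conformer is taken to be an already superposed flat row `[x1, y1, z1, x2, y2, z2, ...]` with one atom per residue, and `coordinate_of`, `project` and the toy numbers are hypothetical names and data.

```python
def coordinate_of(conformer, residue, axis):
    """Return one Cartesian coordinate from a flat [x1, y1, z1, x2, ...] row.

    residue is 1-based; axis is "x", "y" or "z".
    """
    return conformer[3 * (residue - 1) + "xyz".index(axis)]


def project(conformers, residue_a, residue_b, axis="x"):
    """Map each conformer to a 2-D point: (axis of residue_a, axis of residue_b)."""
    return [
        (coordinate_of(c, residue_a, axis), coordinate_of(c, residue_b, axis))
        for c in conformers
    ]


# Hypothetical toy input: 3 conformers, 3 residues (9 coordinates each).
conformers = [
    [1.0, 0.0, 0.0, 4.0, 0.0, 0.0, 2.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 6.0, 0.0, 0.0, 2.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 1.0, 0.0],
]

# Points for a scatter plot of Residue 1 x-coordinate vs Residue 2 x-coordinate.
points = project(conformers, 1, 2, axis="x")

# Number of distinct coordinate-vs-coordinate plots one would need in total.
n_coords = len(conformers[0])
n_plots = n_coords * (n_coords - 1) // 2
print(points, n_plots)
```

Even for three residues there are already dozens of coordinate pairs to inspect, and the count grows quadratically with the number of residues, which is the practical reason for looking at a few PCs instead.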

  4. Cheng reporter

    Thank you for the information. I think that “map them using the corresponding Cartesian coordinates” would take the full Cartesian coordinates into account. However, each PC only partially captures the variance of each residue.

    Sorry, I could not find a published example.

    To rephrase my purpose: based on Figure 16, we know the red dots are mostly separated from the blue ones along PC1. But how can we interpret this structurally? For example, which residues contribute most to separating the red dots from the blue dots?
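
One possible way to approach this question, offered as a sketch rather than an established recipe from the thread: a PC score is a linear combination of the Cartesian coordinates, so the gap between the mean PC1 scores of two clusters equals the sum over coordinates of loading times (mean of cluster A minus mean of cluster B). Grouping the x, y and z terms of each residue gives a per-residue contribution, and these contributions add up exactly to the total gap. The example below uses plain Python with hypothetical names (`residue_contributions`, `pc1`, `red`, `blue`) and made-up toy numbers. It assumes superposed conformers stored as flat rows `[x1, y1, z1, x2, ...]` with one atom per residue, and a PC1 loading vector in the same layout taken from whatever PCA was already run.

```python
def column_means(conformers):
    """Mean of each coordinate column over a list of flat coordinate rows."""
    return [sum(col) / len(conformers) for col in zip(*conformers)]


def residue_contributions(loadings, cluster_a, cluster_b):
    """Per-residue share of the gap between two cluster means along one PC.

    loadings  : eigenvector for the PC, laid out [x1, y1, z1, x2, y2, z2, ...]
    cluster_a : superposed conformers of the first cluster, same layout
    cluster_b : superposed conformers of the second cluster, same layout
    """
    mean_a = column_means(cluster_a)
    mean_b = column_means(cluster_b)
    # Each coordinate contributes loading * (difference of cluster means).
    per_coord = [u * (a - b) for u, a, b in zip(loadings, mean_a, mean_b)]
    # Add up the x, y and z terms that belong to the same residue.
    return [sum(per_coord[i:i + 3]) for i in range(0, len(per_coord), 3)]


# Hypothetical toy input: 3 residues, 2 conformers per cluster.
pc1 = [0.6, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0]
red = [
    [1.0, 0.0, 0.0, 4.0, 0.0, 0.0, 2.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 6.0, 0.0, 0.0, 2.0, 1.0, 0.0],
]
blue = [
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 1.0, 0.0],
]

contrib = residue_contributions(pc1, red, blue)
gap = sum(contrib)  # mean PC1 score of red minus mean PC1 score of blue

# Rank residues by the size of their contribution, largest first.
ranked = sorted(range(len(contrib)), key=lambda r: abs(contrib[r]), reverse=True)
for r in ranked:
    print(f"Residue {r + 1}: {contrib[r]:+.2f}")
```

A positive value means the residue pushes the first cluster toward higher PC1 scores than the second, and a negative value means the opposite. Contributions of opposite sign can cancel in the total, so ranking by absolute value is usually more informative than looking at the sum alone.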
