Principal Component Analysis Plot Coloring
In the demo tutorial "Trajectory Analysis with Bio3D", I would like to draw your attention to the principal component analysis plot in Figure 4. You chose the color scheme described there... The continuous color scale (from blue to white to red) indicates that there are periodic jumps between these conformers throughout the trajectory. My question is: which conformations are represented by the blue and red colors? Why are the red and blue points spread on both sides of zero? I ask because I have also generated PCA plots for my protein and its complexes with ligands. I am including the PCA plot figures for all the systems (A, B, C, D). Can you explain this coloring scheme and how the colors spread across the plot?
Comments (10)
-
-
reporter Here is the PCA plot for my protein (A) and its complexes (B, C, D) with ligands. As you mentioned, the coloring is based on simulation time; how does it follow the principal components (PCs)? Can the coloring start from either side of zero, or does it always start from the low-value (negative) side of the PC axis?
-
Yes, I see the oddly smooth transition of colors along PC1. It might suggest that your system has undergone a unidirectional conformational change... What commands exactly did you use to make these plots?
-
reporter What is the role of PC2 here, if everything is along PC1? Can the coloring start from either side of zero, or does it always start from the low-value (negative) side of the PC axis?
I have examined the trajectory movie; it shows very low fluctuations in the overall structure, except for two loop regions that fluctuate strongly.
Here are the PCA commands:
library(bio3d)
dcdfile <- system.file("examples/pet.dcd", package="bio3d")
pdbfile <- system.file("examples/pet.pdb", package="bio3d")
dcd <- read.dcd(dcdfile)
pdb <- read.pdb(pdbfile)
print(pdb)
print(pdb$xyz)
print(dcd)
ca.inds <- atom.select(pdb, elety="CA")
xyz <- fit.xyz(fixed=pdb$xyz, mobile=dcd, fixed.inds=ca.inds$xyz, mobile.inds=ca.inds$xyz)
pc <- pca.xyz(xyz[,ca.inds$xyz])
plot(pc, col=bwr.colors(nrow(xyz)) )
-
In the plot, each point represents a conformational frame from the simulation. The color of each point depends on the sequential order of that frame in the simulation trajectory (equivalent to simulation time), not on the coordinates of the point in the PC1-PC2 plane. That means the "blue" can start from the bottom left, the top right, or wherever the first frame is located. The coloring just shows a trace of the trajectory and has nothing to do with the coordinates themselves.
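A minimal base-R sketch of this point, under stated assumptions: the PC scores below are random placeholders (with Bio3D they would come from pc$z), and colorRampPalette() is only a stand-in that approximates the blue-white-red scale of bwr.colors(). It shows that the colors are assigned from the frame index alone.

```r
# Hypothetical PC scores for 100 frames (placeholders, not real MD data).
set.seed(1)
n_frames <- 100
pc1 <- rnorm(n_frames)
pc2 <- rnorm(n_frames)

# One color per frame, ordered by frame index: first frame blue, last frame red.
# Stand-in for bwr.colors(n_frames); the exact shades may differ.
frame_cols <- colorRampPalette(c("blue", "white", "red"))(n_frames)

# The color vector never looks at pc1 or pc2, so the first (blue) point
# can land anywhere in the PC1-PC2 plane.
first_frame <- c(PC1 = pc1[1], PC2 = pc2[1], frame = 1)

if (interactive()) {
  plot(pc1, pc2, col = frame_cols, pch = 16, xlab = "PC1", ylab = "PC2")
}
```

Because frame_cols is built before the scores are used, shuffling the coordinates would change where the colors appear but not which frame gets which color.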
The plots you attached still seem weird, in both the scattering pattern and coloring. I strongly recommend you check the trajectory and PCA carefully. For example, you may check what motions PC1 and PC2 represent by using the function mktrj() along with VMD. Also, you've mentioned the very flexible loops, which might be the cause of the problem. Remove these loops manually and see what you can get.
You can also attach your dcd and pdb files here (reduce the file sizes if they are too large), and I can check them for you.
-
reporter I have also used GROMACS for the PCA of these proteins. The PC1 vs. PC2 plot is the same there too, although not as polished as in Bio3D. So I got the same plot from two different programs. I have attached an image of the PC1 trajectory file generated by Bio3D.
-
Is this from an MD simulation? The data look very different from what I have seen in many cases. For example, in panel D, the system seems to "jump" near the end of the trajectory from -10 to 80 along PC2, which is nonphysical... Did you use periodic boundary conditions? Have you wrapped all the parts into the proper box?
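One rough way to locate such a jump is to look at the frame-to-frame change in the PC2 scores. The sketch below is base R only and rests on assumptions: pc2 is a made-up score vector that mimics the -10 to 80 jump (with Bio3D it would be pc$z[, 2]), and the threshold of ten times the median step is an arbitrary choice, not a recommended value.

```r
# Hypothetical PC2 scores: 95 frames near -10, then 5 frames near 80.
set.seed(1)
pc2 <- c(rnorm(95, mean = -10, sd = 2), rnorm(5, mean = 80, sd = 2))

# Absolute change in PC2 between consecutive frames.
step <- abs(diff(pc2))

# Arbitrary cutoff: flag steps far larger than the typical step.
threshold <- 10 * median(step)

# diff() element i is the step from frame i to frame i + 1,
# so add 1 to report the frame that the jump lands on.
jump_frames <- which(step > threshold) + 1L
print(jump_frames)
```

Frames flagged this way are the ones worth inspecting in a viewer for unwrapped molecules or imaging artifacts.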
-
reporter Yes, this is the PC1 trajectory from the MD simulation for panel A. I will recheck the simulation data for panel D to rectify this issue.
-
reporter The error has been corrected for panel D.
Thank you very much Dr Xin Qiu Yao for helping me.
-
- changed status to resolved
Hi,
The coloring is based on simulation time, not conformation. In Figure 4, there are two apparent clusters of conformations in the PC1-PC2 panel, located on either side of zero along the PC1 axis. The spread of red and blue points means that during the simulation the system undergoes reversible conformational changes and visits the two clusters multiple times.
The figure you attached here does not display properly. Can you try attaching it again?
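As a rough check of whether a change is reversible or one-way, one can count how often the PC1 score changes sign along the trajectory. This is a base-R sketch with made-up scores (with Bio3D they would be pc$z[, 1]); the two-cluster and drifting trajectories are hypothetical illustrations, not data from Figure 4.

```r
set.seed(42)
n <- 200

# Hypothetical reversible case: the system switches cluster every 50 frames.
state <- rep(c(-1, 1, -1, 1), each = 50)
pc1_reversible <- 20 * state + rnorm(n, sd = 3)

# Hypothetical one-way case: a steady drift from negative to positive PC1.
pc1_oneway <- seq(-20, 20, length.out = n)

# Count sign changes of PC1 between consecutive frames.
count_crossings <- function(scores) sum(diff(sign(scores)) != 0)

crossings_reversible <- count_crossings(pc1_reversible)
crossings_oneway <- count_crossings(pc1_oneway)
```

Several crossings mean early (blue) and late (red) frames appear on both sides of zero, while a single crossing gives a smooth blue-to-red gradient along PC1.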