How to plot the mode arrows stucture pictures with the pca object

Issue #367 resolved
xwind created an issue

I use the following scripts to generate the pca object 'pc'. However I donnot like the illustrate style of 'p1'.

Instead, I'd like view the style of mode arrows figure as the nomal mode analysis done (show as the attached figure, the arrow indicate the mode movement).

So how I got it in the bio3d package from the 'pc' object?

## read DCD files
dcd <- read.dcd('./mdpr.dcd')
pdb <- read.pdb('./avg_mod.pdb')

## fit
ca.inds <- atom.select(pdb, elety = c('CA'))
xyz <- fit.xyz(fixed = pdb$xyz, mobile = dcd,
               fixed.inds = ca.inds$xyz,
               mobile.inds = ca.inds$xyz)

## pca
pc <- pca.xyz(xyz[, ca.inds$xyz])

## picture
p1 <- mktrj.pca(pc, pc = 1, b = pc$au[,1], file = "pc1.pdb")

My sessionInfo:

> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
 [1] LC_CTYPE=zh_CN.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=zh_CN.UTF-8        LC_COLLATE=zh_CN.UTF-8    
 [5] LC_MONETARY=zh_CN.UTF-8    LC_MESSAGES=zh_CN.UTF-8   
 [7] LC_PAPER=zh_CN.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bio3d_2.2-4

loaded via a namespace (and not attached):
[1] compiler_3.2.3 parallel_3.2.3 tools_3.2.3    grid_3.2.3    

Comments (16)

  1. Barry Grant

    Try updating the package to make sure that function is included - you likely have an older version.

  2. Xinqiu Yao

    In the version you have installed, the function is called view.modes(). Try and see if it works. Or, you may install the development version and use pymol.pca(). See here for more detailed instruction on installation.

  3. xwind reporter

    Thanks for your help and I found the view.modes() is what I need.

    Besides, may I ask extra questions about the PCA analysis?

    1. By using functions
    p1 <- mktrj.pca(pc, pc = 1, b = pc$au[,1], file = "pc1.pdb")
    

    What's the beginning structures comes from (coordinate)?

    1. Further, these structures is along the given principal component. So is the mode arrows from the first structure to the last structure similar function as view.modes()?

    Haha, because the view.modes() did not provide the function to adjust the mode arrow size and also cutoff some values too small. So I may use the modevectors in pymol to draw the mode arrows. Maybe the future pymol.pca() may provide more control parameters.

  4. Xinqiu Yao

    Sure. The beginning structure is the mean of the input structures in PCA, and other structures in the trajectory are linear interpolation along the PC from the mean. So, the mode arrows from the first structure to the last structure are indeed the same as shown in view.modes().

    You can set the 'scale' in 'view.modes()' to adjust the arrow size, which is the same as in the new 'pymol.pca()'.

  5. sr_w2022

    I have the same questions as above. Actually, I have used the pymol.pca() function, and get the arrow picture. And my question is, does the arrow indicate the first PC of the pca results?

    My scripts are as below:

    library(bio3d)
    dcd<-read.dcd("proteinCA.dcd")
    pdb<-read.pdb("proteinCA.pdb")
    ca<-atom.select(pdb,elety="CA")
    xyz<-fit.xyz(fixed=pdb$xyz,mobile=dcd,fixed.inds = ca$xyz,mobile.inds=ca$xyz)
    dim(xyz)==dim(dcd)
    pca<-pca.xyz(xyz[,ca$xyz])

    pymol.pca(pca,scale=25,dual="TRUE",file="pc.py")

    If the arrow structure I get isn’t the first PC of the pca, then how can I get that? I have also tried “pymol.pca(pca,pc=1,scale=25,dual="TRUE",file="pc1.py")" or other scripts like "pymol.pca(pca$z[,1,],scale=25,dual="TRUE",file="pc.py")", but neither of them succeeded, can you show me how to do that correctly?

    And I have the second question. I read a paper that showed the motion of pca as arrows and plot their size and color according to their magnitudes of motion and the magnitudes were labeled. I want to do the same thing through Bio3d, how can I do that with their values indicated?


  6. Xinqiu Yao

    Hi,

    By default, what is shown should be PC1 vectors already. If you are not sure, try using mode=1 (instead of ‘pc=1’) in the function call. (See ?pymol.pca for more detail).

    I am not very clear about the second question. Isn’t the figure already what you want (showing arrows with both colors and lengths scaled by the magnitude of the motion)?

  7. sr_w2022

    Thanks, I used mode=1 and get the motion successfully.
    As for the second question, do the arrows in the picture I got showing the absolute magnitude of motion? Because I would like to compare the system between different ligand-bound proteins, and I’m not sure if the result is a relative comparison inside one system or the absolute degree for all systems that I can use to compare. And if there is any way I can get these values, to make a color bar with values labeled as in the second picture?

  8. Xinqiu Yao

    Both arrow size and color are showing “normalized” rather than “absolute” motion. Hence, they are not directly comparable between systems.

    There may be a solution without modifying the code. The following idea is (to my best knowledge) theoretically correct but never tested. If you wish, you can try and let me know:

    There are two kinds of “normalizations” you need to worry about. One is that the arrows directly reflect the eigenvector without considering the eigenvalue (i.e., the variance along that eigenvector is ignored). This means that the sum of squares of arrow lengths for any system is equal to 1 (or the square of the value set by scale in the function call). This can be easily solved by setting scale to the square root of the corresponding eigenvalue or any of its multiplications.

    The second normalization regards the colors and it is scaled by the longest arrow in a system. The trick is to add a dummy atom at some arbitrary position that reflects the maximal arrow length across all systems. For example, if you know that an atom in system A has the longest arrow (which can be easily found by checking the x, y, z components of each atom in the eigenvector), you can copy the eigenvector components for that atom from system A and paste it at the end of eigenvectors of other systems. Also, you need to add a dummy atom in the pca$mean (simply add three arbitrary numbers at the end; these will be the position that you don’t what to show up in the window). These processes will allow the function to scale arrow colors in a consistent manner.

    For the colorbar, you can use R to create one. For example,

    ggplot2::ggplot(data.frame(x=c(0, 0), y=c(0, 0), col=c(0, 1)), aes(x, y, col=col)) + geom_point() + scale_color_gradientn(colors=colorRampPalette(c("blue", "white", "red"))(255))

    You can reset the maximal value to that fits your case (e.g., the maximal arrow length after being scaled by the eigenvalue).

    Hope it may help.

  9. Log in to comment