Different results of PCA from Bio3d and gromacs

Issue #679 resolved
Former user created an issue

Hi everyone,

I have been trying to understand how PCA works, and I came across Bio3D, which is much more user-friendly. I repeated a PCA analysis that I had already performed with GROMACS, but the clusters look different and the values span a much larger range in Bio3D. Please have a look at the plot and tell me where I am going wrong.

Code used for Bio3D:

library(bio3d)
dcd <- read.dcd('prt-mu-gdp-gtp.dcd')
 NATOM = 2696
 NFRAME= 511
 ISTART= 0
 last = 511
 nstep = 511
 nfile = 511
 NSAVE = 1
 NDEGF = 0
 version 24
|================================================| 100%
pdb <- read.pdb('prt-mu-gtp.pdb')
ca.inds <- atom.select(pdb, elety = 'CA')
xyz <- fit.xyz(fixed=pdb$xyz, mobile=dcd,
               fixed.inds=ca.inds$xyz, mobile.inds=ca.inds$xyz)
pc <- pca.xyz(xyz[, ca.inds$xyz])
plot(pc, col=bwr.colors(nrow(xyz)))

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /opt/local/Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base

other attached packages:
[1] bio3d_2.3-4

loaded via a namespace (and not attached):
[1] compiler_3.5.1 parallel_3.5.1 tools_3.5.1
[4] Rcpp_1.0.1 grid_3.5.1
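
For orientation, the PCA step in the script above can be pictured as an eigen-decomposition of the covariance matrix of the fitted coordinates. The base-R sketch below assumes that textbook procedure; `pca_sketch` is a hypothetical helper, not the bio3d implementation. It also illustrates one generic reason score ranges can differ between tools: GROMACS works in nm, while DCD coordinates are conventionally in Å, so the same motion gives scores 10 times larger and eigenvalues 100 times larger in Å.

```r
# Hypothetical helper: textbook PCA on a frames x (3 * natoms) coordinate matrix.
pca_sketch <- function(x) {
  x <- unclass(x)                           # drop any class attribute, keep the matrix
  centered <- sweep(x, 2, colMeans(x))      # subtract the mean structure
  e <- eigen(cov(centered), symmetric = TRUE)
  list(values  = e$values,                  # variance along each PC
       vectors = e$vectors,                 # PC directions
       scores  = centered %*% e$vectors)    # per-frame projections
}

# Synthetic data only: 20 "frames" of 2 "atoms", first in Angstrom, then in nm.
set.seed(1)
ang <- matrix(rnorm(20 * 6), nrow = 20)
nm  <- ang / 10

pa <- pca_sketch(ang)
pn <- pca_sketch(nm)

# Ratio of eigenvalues and of PC1 score ranges between the two unit systems.
print(pa$values[1] / pn$values[1])
print(diff(range(pa$scores[, 1])) / diff(range(pn$scores[, 1])))
```

A unit difference only rescales the plot; it does not change which frames cluster together, so differing clusters point to something else, such as the fitting or the trajectory itself.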

Comments (5)

  1. Xinqiu Yao

    Hi,

    I didn’t spot any problem in the code you posted above. Does your system have a single chain or multiple chains? Is it possible that there are imaging issues in the trajectory - for example, one chain crosses the box boundary but the other does not - that were taken care of by GROMACS but not by Bio3D? You can test this by generating a properly fitted trajectory with GROMACS and then using Bio3D just to do the PCA, and comparing the results. Let me know if that is the case.
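
One rough way to look for imaging problems from within R is to scan the raw (unfitted) trajectory for atoms that move implausibly far between consecutive frames. The sketch below is a base-R illustration under stated assumptions: `find_jumps` is a hypothetical helper, the input is a frames x (3 * natoms) matrix laid out as x1, y1, z1, x2, ... (the layout `read.dcd()` returns), and the 10 Å cutoff is an arbitrary placeholder that should be adapted to the frame spacing.

```r
# Hypothetical helper: report frames in which any atom moved more than `cutoff`
# (same units as the coordinates) relative to the previous frame.
find_jumps <- function(xyz, cutoff = 10) {
  xyz <- unclass(xyz)
  stopifnot(is.matrix(xyz), ncol(xyz) %% 3 == 0, nrow(xyz) >= 2)
  n_atoms <- ncol(xyz) / 3
  sq <- diff(xyz)^2                               # squared per-coordinate steps
  atom_id <- rep(seq_len(n_atoms), each = 3)      # which atom each column belongs to
  per_atom <- sqrt(t(rowsum(t(sq), atom_id)))     # (frames - 1) x natoms distances
  max_disp <- apply(per_atom, 1, max)             # largest step in each transition
  hit <- which(max_disp > cutoff)
  data.frame(frame = hit + 1, max_disp = max_disp[hit])
}

# Synthetic demo: 5 frames, 2 atoms; atom 2 is shifted by 50 along x from frame 3 on.
demo <- matrix(0, nrow = 5, ncol = 6)
demo[3:5, 4] <- 50
jumps <- find_jumps(demo)
print(jumps)
```

With a real trajectory the call would be along the lines of `find_jumps(dcd)` on the matrix returned by `read.dcd()`. A whole-molecule jump across the box is flagged as well, although that kind of jump is removed by the later superposition; atoms wrapped individually are the case that distorts the PCA.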

  2. Aiswarya Pawar

    Hi,

    Sorry for not logging in earlier.

    I have two single-chain simulations, wild type and mutant, which I concatenated into one trajectory. I also took care of the PBC and converted the result into a DCD file. For Bio3D I used the initial PDB structure from the simulation. I don't understand what the issue is here.

    Also, I suspected the concatenation might be the problem, so I ran the analysis on the wild-type simulation only, but I still get similar plots from Bio3D. Am I missing something here?

  3. Aiswarya Pawar

    I tried again and noticed that, while reading the DCD file, I got the following warning:

    Warning message:
    In dcd.header(trj, verbose) :
    Check DCD header data is correct, particulary natom

    Does this have anything to do with the high variation in the data?
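
Since the warning asks for `natom` to be checked, one simple sanity test is to compare the number of atoms implied by the trajectory matrix with the number of atoms in the reference structure. This is a minimal base-R sketch; `same_atom_count` is a hypothetical helper, and it says nothing about what actually triggers the warning inside bio3d.

```r
# Hypothetical helper: does a frames x (3 * natoms) coordinate matrix hold
# the same number of atoms as the reference structure?
same_atom_count <- function(xyz, n_ref) {
  xyz <- unclass(xyz)
  stopifnot(is.matrix(xyz), ncol(xyz) %% 3 == 0)
  n_traj <- ncol(xyz) / 3
  if (n_traj != n_ref) {
    warning(sprintf("trajectory has %d atoms, reference has %d", n_traj, n_ref))
    return(FALSE)
  }
  TRUE
}

# Synthetic demo: 2 frames of 3 atoms checked against a 3-atom reference.
ok <- same_atom_count(matrix(0, nrow = 2, ncol = 9), n_ref = 3)
print(ok)
```

With the objects from the original script, the call would be along the lines of `same_atom_count(dcd, nrow(pdb$atom))`.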

  4. Aiswarya Pawar

    Hi,

    I redid the PBC treatment and it worked. So basically it was an issue with the periodic images.

    Thank you for the correction.
