Mass-weighted PCA analysis – problem with output trajectory

Issue #299 resolved
Michal created an issue

Hi,

I am using Bio3d to perform PCA calculations on RNA molecular dynamics trajectory.

Since I wanted to perform such analysis for different subsets of atoms, I used atomic masses in pca.xyz function.

The analysis works, but resulting pdb trajectory from principal components (atom movements) is giving me very long atomic distances (over scaled?), of over 30A.

(I can also see some minor local distortions (like too many bonds) shown in VMD for non-mass-weighted PCA analysis.)

Is it really crucial to use mass-weighting for essential mode analysis?

In general, my PCA plots look almost similar, except for different PC scales, and sometimes some principal components are mirrored, depending on atom selection.

It is also possible that I have done something wrong with getting mass vector from the structure, for example.

Here are my commands:

pdbref is a reference PDB structure, while

pdb is a trajectory loaded in as a multiframe PDB file (saved with VMD)

selatoms.inds <- atom.select(pdbref, "noh")
xyz <- fit.xyz(fixed=pdbref$xyz, mobile=pdb$xyz,
               fixed.inds=selatoms.inds$xyz,
               mobile.inds=selatoms.inds$xyz)

pc <- pca.xyz(xyz[,selatoms.inds$xyz], use.svd = TRUE)

masses <- atom2mass.pdb(pdb, inds = selatoms.inds)

pcmw <- pca.xyz(xyz[,selatoms.inds$xyz], use.svd = TRUE, mass = masses)

p1 <- mktrj.pca(pcmw, pc=1, b=pcmw$au[,1], file=PDBFileName, 
               resno=pdb$atom$resno[selatoms.inds$atom], 
               elety=pdb$atom$elety[selatoms.inds$atom],
               resid=pdb$atom$resid[selatoms.inds$atom],
)

Thank you for your suggestions,

Michal

Comments (7)

  1. Xinqiu Yao

    Hi,

    In the 'mass-weighted' PCA, the eigenvalues are also scaled by masses, which are usually much larger than those in non-mass-weighted PCA. The mktrj.pca() function uses eigenvalues as a reference for the extent to which the collective motion is generated, which assumes non-mass-weighted PCA. So, to get a reasonable trajectory, reduce the default values for step and mag (e.g. step=0.01, mag=0.1) in mktrj.pca().

    The distortion in the trajectory is normal. Note that the generated file is not a "real" picture how molecules move; It merely represents a linear extrapolation of an 'instantaneous' movement. So, it will become nonphysical if extrapolated too much.

  2. Michal reporter

    Hi,

    Thank you for your reply.

    It helped, but the picture is still not perfect, although is much better. I guess it is just a matter of playing a bit with rescaling parameters.

    How about performing mass-weighted PCA?
    Is it not so important to apply mass scaling for such standard trajectory analysis, then?
    Of course, using the same atom type, like backbone CA or P atoms only, makes life a lot easier, but using more atoms are better for visualization purposes, on the other hand.

    By the way, these mag and step keywords have to be at the very end of mktr.pca parameters, right?

  3. Xinqiu Yao

    Hi,

    From a data mining point of view, you don't have to do mass weighting, and this is probably how many folks do for trajectory analysis. Mass-weighted PCA is more suitable for a direct comparison with normal mode analysis, which usually takes into account masses.

    The function arguments don't have to be at the end as long as you specify them with explicit names. For example mktrj.pca(pcmw, pc=1, mag=0.1, step=0.01) and mktrj.pca(pcmw, step=0.01, pc=1, mag=0.1) give exactly the same results.

    Let me know if you have more questions.

  4. Michal reporter

    OK, I see.

    Thank you for pointing that out and for help.

    According to mag and step keywords, when I was trying to apply them in the longer version of the mktrj.pca() command, written above, any other location than these two keywords given at the end, gave me the following error:

    Error in write.pdb(xyz = coor, file = file, ...):
       argument is missing, with no default. 
    

    Cheers, Michal

  5. Xinqiu Yao

    Hi,

    Can you provide the full command you used to call mktrj.pca()? Also, make sure all arguments have been given with associated names if you want to ignore the positions.

  6. Michal reporter

    Hi,

    I have checked the commands again and I think I have found the problem.

    It is somehow related to the last comma in mktrj.pca() paramater field.

    If the following command is used:

    p1 <- mktrj.pca(pcmw, pc=1, file=PDBFileName,
                   resno=pdb$atom$resno[selatoms.inds$atom], 
                   elety=pdb$atom$elety[selatoms.inds$atom],
                   resid=pdb$atom$resid[selatoms.inds$atom],   
    )
    

    everything is fine, but as soon as I add mag=1, step=0.125 keywords, no matter where, the comma in the last line is causing the mentioned error:

    Error in write.pdb(xyz = coor, file = file, ...) :
      argument is missing, with no default
    

    Removing the last comma solved the problem.

    Hope, that will help.

    Michal,

  7. Log in to comment