Bug: nma.pdbs with fit=FALSE doesn't return rmsip.map, and others

Issue #34 resolved
Xinqiu Yao created an issue
pdbs <- pdbaln(c("1tag", "1as0"))
modes <- nma.pdbs(pdbs, fit=FALSE)
modes$rmsip.map
#NULL

Also, another issue is modes.array doesn't match full.nma$modes

dim(modes$modes.array)
#[1] 939  20   2
dim(modes$full.nma[[1]]$modes)
#[1] 939 939
identical(modes$modes.array[,1,1], modes$full.nma[[1]]$modes[,1])
#[1] FALSE
identical(modes$modes.array[,1,1], modes$full.nma[[1]]$modes[,7])
#[1] FALSE

For the second question, my understanding of modes.array and full.nma might be wrong. Forgive me if it is the case...

Comments (12)

  1. Lars Skjærven

    Exactly. I have switched off RMSIP calculation unless you fit prior to normal mode calculations. since rmsip compares the directionality of the mode vectors, they wont make much sense if the structures are not aligned first. Right? Of course you can claim that the user might have aligned them manually, so perhaps issuing a warning would be better?

    The nma object holds both the eigenvectors (nma$U) and the mode vectors nma$modes. The $U are the raw unmodified eigenvectors. If mass=TRUE, they will be in mass-weighted coordinates. $modes on the other hand are converted to unweighted cartesian coordinates. They are also scaled by the thermal fluctuations (unless temp=NULL). Note that $U = $modes when temp=NULL and mass=FALSE. (see a more thorough explanation here).

    In nma.pdbs, we use the $U since we want to compare the raw (orthogonal) vectors. Moreover, modes.array holds only the top 20 non-trivial modes, while nma$U holds all modes, i.e. also the first six trivial modes. Thus, you should compare in the following way:

    > head(all.modes$modes.array[,1,1])
    [1] -0.06126235  0.10521696 -0.05430943 -0.04789024  0.06667126 -0.03675411
    > head(all.modes$full.nma[[1]]$U[,7])
    [1] -0.06126235  0.10521696 -0.05430943 -0.04789024  0.06667126 -0.03675411
    > identical(all.modes$modes.array[,1,1], all.modes$full.nma[[1]]$U[,7])
    [1] TRUE
    

    I see that this can be confusing, but I think we should return also the trivial modes in the full nma object. While for nma.pdbs I think limiting the size of modes.array is important.

    Hmm.. but perhaps we should return $modes.array also if full=FALSE - so that full only relates to the nma objects ?

    The differences in the dimensions of the matrices relates to reducing the size of the output. i.e. only the first 20 (non-trivial) modes are returned in $modes.array. This is hard coded at the moment...

    > dim(all.modes$full.nma[[1]]$U)
    [1] 288 288
    > dim(all.modes$modes.array[,,1])
    [1] 288  20
    
  2. Xinqiu Yao reporter

    Thanks for the long reply, Lars! You are right. To compare modes we should first fit all the structures. In most time, we analyze a group of PDB structures pre-fitted using the core positions (like in doing PCA). Maybe it is better to have a warning message but still calculate rmsip.map if fit=FALSE instead of returning NULL, isn't it?

    I understand that only top 20 non-trivial modes are printed in modes.array to reduce output size. The variable name, however, is kind of misleading (it is easy to be regarded as an array of nma$modes, right?) Maybe consider changing the name to e.g. U.array (too straightforward?).

  3. Lars Skjærven

    More intuitive names for output of nma.pdbs sounds reasonable. I'd like some input on what you think:

    $fluctuations
    $rmsip.map ==> $rmsip
    $modes.array ==> $U.array
    $full.nma
    

    what is $U and $L anyway?

  4. Barry Grant

    The $U and $L notation come from a great old book by W.J. Karzanowski on multivariate analysis that I got back in my first year of graduate school. So they are because of the way I first coded this and for sentimental reasons.

    I like the $fluctuations and $rmsip names. Are you suggesting that $U is ambiguous - surely everybody has read the Karzanowski book ;-)

    Possible confusion here comes from $modes.array ==> $U.array, as this object contains only the top 20 non-trivial modes. Is there a name that will convey this trimmed down content?

    Whats in $full.nma again?

  5. Lars Skjærven

    absolutely not. I'm sure its only me.. :)

    $nma.full is the full nma objects. so that nma.full1 gives you the nma object for structure 1.

    $U.array: essential subspace - something in that direction...?

  6. Log in to comment