Bug: nma.pdbs with fit=FALSE doesn't return rmsip.map, and others

Lars Skjærven

Exactly. I have switched off the RMSIP calculation unless you fit prior to the normal mode calculation. Since RMSIP compares the directionality of the mode vectors, the values won't make much sense if the structures are not aligned first. Right? Of course, you can argue that the user might have aligned them manually, so perhaps issuing a warning would be better?
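To illustrate why alignment matters, here is a minimal base R sketch. It assumes the usual RMSIP definition (root mean square of all pairwise dot products between two mode sets); `rmsip_sketch` is a hypothetical helper written for this example, not the bio3d `rmsip()` function, and the "modes" are random toy vectors.

```r
# RMSIP between two sets of mode vectors stored as matrix columns.
# 'rmsip_sketch' is a hypothetical helper, not the bio3d rmsip() function.
rmsip_sketch <- function(a, b) {
  overlap <- crossprod(a, b)        # all pairwise dot products
  sqrt(sum(overlap^2) / ncol(a))
}

set.seed(1)
n_atoms <- 4
# Toy orthonormal "modes": 3 vectors in a 12-dimensional (x, y, z per atom) space
modes <- qr.Q(qr(matrix(rnorm(3 * n_atoms * 3), ncol = 3)))

# Rotate every atom's displacement by 90 degrees about z, as if the second
# structure had been left in a different orientation (matrix is filled column-wise).
rot_z <- matrix(c(0, 1, 0,
                  -1, 0, 0,
                  0, 0, 1), nrow = 3)
rot_all <- kronecker(diag(n_atoms), rot_z)
modes_rotated <- rot_all %*% modes

rmsip_same    <- rmsip_sketch(modes, modes)          # identical sets
rmsip_rotated <- rmsip_sketch(modes, modes_rotated)  # same motion, different frame
```

The rotated set describes the same internal motion, yet its RMSIP against the original drops below 1 purely because of the change in orientation.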

The nma object holds both the eigenvectors (nma$U) and the mode vectors (nma$modes). $U holds the raw, unmodified eigenvectors; if mass=TRUE, they are in mass-weighted coordinates. $modes, on the other hand, are converted to unweighted Cartesian coordinates and are also scaled by the thermal fluctuations (unless temp=NULL). Note that $U = $modes when temp=NULL and mass=FALSE. (see a more thorough explanation here).
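The distinction can be sketched in base R with a toy matrix. This is only an illustration under stated assumptions: the Hessian and masses are made up, the conversion shown is the textbook un-weighting by the inverse square root of the masses, and the thermal scaling applied to $modes is not reproduced here.

```r
set.seed(2)
n <- 6
a <- matrix(rnorm(n * n), n)
hessian <- crossprod(a)                 # toy symmetric positive-definite matrix
masses  <- c(12, 12, 14, 14, 16, 16)    # made-up per-coordinate masses
inv_sqrt_m <- diag(1 / sqrt(masses))

# Raw eigenvectors of the mass-weighted matrix: orthonormal by construction
U <- eigen(inv_sqrt_m %*% hessian %*% inv_sqrt_m, symmetric = TRUE)$vectors

# Converted back to unweighted Cartesian displacements: no longer orthonormal
cart <- inv_sqrt_m %*% U

raw_orthonormal  <- isTRUE(all.equal(crossprod(U), diag(n)))
cart_orthonormal <- isTRUE(all.equal(crossprod(cart), diag(n)))
```

Here `raw_orthonormal` is TRUE and `cart_orthonormal` is FALSE, which is the practical reason for comparing the raw vectors rather than the converted ones.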

In nma.pdbs, we use $U since we want to compare the raw (orthogonal) vectors. Moreover, modes.array holds only the top 20 non-trivial modes, while nma$U holds all modes, including the first six trivial modes. Thus, you should compare in the following way:

> head(all.modes$modes.array[,1,1])
[1] -0.06126235  0.10521696 -0.05430943 -0.04789024  0.06667126 -0.03675411
> head(all.modes$full.nma[[1]]$U[,7])
[1] -0.06126235  0.10521696 -0.05430943 -0.04789024  0.06667126 -0.03675411
> identical(all.modes$modes.array[,1,1], all.modes$full.nma[[1]]$U[,7])
[1] TRUE

I see that this can be confusing, but I think the full nma object should also return the trivial modes, while for nma.pdbs I think limiting the size of modes.array is important.

Hmm... but perhaps we should also return $modes.array if full=FALSE, so that full only relates to the nma objects?

The difference in the dimensions of the matrices comes from reducing the size of the output, i.e. only the first 20 (non-trivial) modes are returned in $modes.array. This is hard coded at the moment...

> dim(all.modes$full.nma[[1]]$U)
[1] 288 288
> dim(all.modes$modes.array[,,1])
[1] 288  20
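The relation between the two matrices can be sketched in base R. This uses a random 288 x 288 stand-in rather than real NMA output; `full_index` is a hypothetical helper, and the values 6 and 20 are taken from the description above.

```r
n_trivial <- 6    # trivial (rigid-body) modes at the start of $U
n_keep    <- 20   # non-trivial modes kept in modes.array (hard coded, per the text)

# Toy stand-in for full.nma[[1]]$U, with the same dimensions as above
set.seed(3)
U_full <- matrix(rnorm(288 * 288), nrow = 288)

# Trim to the first 20 non-trivial modes, i.e. columns 7 to 26
U_trim <- U_full[, (n_trivial + 1):(n_trivial + n_keep)]

# Hypothetical helper: column of $U that matches mode k of the trimmed array
full_index <- function(k) k + n_trivial

dims_ok  <- identical(dim(U_trim), c(288L, 20L))
first_ok <- identical(U_trim[, 1], U_full[, full_index(1)])
```

Mode k of the trimmed array therefore lines up with column k + 6 of the full matrix, which is why the comparison above uses column 7.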

2013-10-09T09:30:36+00:00

Xinqiu Yao reporter

Thanks for the long reply, Lars! You are right. To compare modes we should first fit all the structures. Most of the time, we analyze a group of PDB structures pre-fitted on the core positions (as when doing PCA). Maybe it would be better to issue a warning message but still calculate rmsip.map if fit=FALSE, instead of returning NULL?
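A rough sketch of the "warn but still compute" behaviour, in base R. This is not the bio3d source: `rmsip_map_sketch` and its arguments are hypothetical, the warning text is invented, and the mode sets are random orthonormal toy matrices.

```r
# Hypothetical sketch: compute the pairwise RMSIP map regardless of 'fit',
# but warn when the structures were not superposed by this call.
rmsip_map_sketch <- function(mode_list, fit = TRUE) {
  if (!fit) {
    warning("fit=FALSE: RMSIP is only meaningful if the structures were aligned beforehand")
  }
  n <- length(mode_list)
  out <- matrix(1, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      ov <- crossprod(mode_list[[i]], mode_list[[j]])   # pairwise dot products
      out[i, j] <- sqrt(sum(ov^2) / ncol(mode_list[[i]]))
    }
  }
  out
}

set.seed(4)
# Three toy "structures", each with 5 orthonormal mode vectors of length 30
toy <- lapply(1:3, function(i) qr.Q(qr(matrix(rnorm(30 * 5), ncol = 5))))

# The map is still returned; the warning is suppressed here only to keep output clean
res <- suppressWarnings(rmsip_map_sketch(toy, fit = FALSE))
```

With this design a user who aligned the structures manually still gets a result, while everyone else is told why the numbers may be meaningless.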

I understand that only the top 20 non-trivial modes are stored in modes.array to reduce output size. The variable name, however, is kind of misleading (it is easily taken to be an array of nma$modes, right?). Maybe consider changing the name to e.g. U.array (too straightforward?).

2013-10-09T15:59:49+00:00

Lars Skjærven

assigned issue to

Lars Skjærven
edited description

2013-10-10T08:02:27+00:00

Lars Skjærven

More intuitive names for the output of nma.pdbs sound reasonable. I'd like some input on what you think:

$fluctuations
$rmsip.map ==> $rmsip
$modes.array ==> $U.array
$full.nma

What are $U and $L anyway?

2013-10-16T13:15:03+00:00

Xinqiu Yao reporter

Thanks! I thought $U and $L are the eigenvectors and eigenvalues, right?

2013-10-16T13:59:13+00:00

Lars Skjærven

yes, but I never caught that abbreviation.. U stands for

2013-10-16T14:02:06+00:00

Barry Grant

The $U and $L notation comes from a great old book by W.J. Krzanowski on multivariate analysis that I got back in my first year of graduate school. So they are there because of the way I first coded this, and for sentimental reasons.

I like the $fluctuations and $rmsip names. Are you suggesting that $U is ambiguous? Surely everybody has read the Krzanowski book ;-)

Possible confusion here comes from $modes.array ==> $U.array, as this object contains only the top 20 non-trivial modes. Is there a name that will convey this trimmed down content?

What's in $full.nma again?