Score calculations normal modes (SIP,RMSIP)

Issue #570 resolved
Yazhini arangas created an issue

When I run nma.pdbs() on two proteins and calculate scores with sip.enm() and rmsip.enm(), I get much higher scores than when I run nma.pdb() on each structure separately and compare the results with sip() and rmsip() for the same pair of proteins. Why is there a difference?

# separate NMA on each structure
a <- read.pdb('133L.pdb')
b <- read.pdb('g60_133L_maln1.pdb')
m <- nma.pdb(a, subset=16)
n <- nma.pdb(b, subset=16)
x <- sip(m$fluctuations, n$fluctuations)
x1 <- rmsip(m, n)

# ensemble NMA on the aligned pair
ab <- pdbaln(c('133L.pdb', 'g60_133L_maln1.pdb'))
mn <- nma.pdbs(ab, subset=16)
y <- sip.enm(mn)
y1 <- mn$rmsip

The result is always y >> x. Is this because of the fitting operation in pdbaln? Is it appropriate to run the NMA calculations separately and then compare them with these scores (SIP, RMSIP)? And is any normalization needed before calculating the scores when NMA is run separately?

Comments (7)

  1. Lars Skjærven

    hmm...

    # fetch test pdbs
    > ids = c("1rx2_A", "1rg7_A")
    > files = get.pdb(ids, split=T)
    > files
    [1] "./split_chain/1rx2_A.pdb" "./split_chain/1rg7_A.pdb"
    
    # align and superimpose
    > pdbs = pdbaln(files)
    > pdbs$xyz = pdbfit(pdbs, outpath="tmp")
    
    # calc modes
    > m1 = nma(read.pdb("tmp/1rg7_A.pdb_flsq.pdb"))
     Building Hessian...        Done in 0.044 seconds.
     Diagonalizing Hessian...   Done in 0.082 seconds.
    > m2 = nma(read.pdb("tmp/1rx2_A.pdb_flsq.pdb"))
       PDB has ALT records, taking A only, rm.alt=TRUE
     Building Hessian...        Done in 0.042 seconds.
     Diagonalizing Hessian...   Done in 0.083 seconds.
    
    # sip from individual pdb objects
    > sip(m1$fluctuations, m2$fluctuations)
    [1] 0.9630835
    
    # sip from pdbs object
    > modes = nma(pdbs)
    > sip(modes)
               1rx2_A.pdb 1rg7_A.pdb
    1rx2_A.pdb   1.000000   0.963085
    1rg7_A.pdb   0.963085   1.000000
    

    Did you make sure your individual PDBs are aligned (superimposed) before calculating rmsip?

    PS - what does the argument subset=16 give you?

  2. Xinqiu Yao

    Hi,

    I have replied to your email about the same issue. The sip() results look very similar between the two methods. The minor difference arises because structures are fitted in nma.pdbs() but not in separate nma() calls.

    For the rmsip comparison, it is wrong to skip the fitting. So you should always use the nma.pdbs() results, or manually fit the structures first and then run nma() on each one separately.

    Let me know if anything is still unclear.
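
The point above about fitting can be illustrated with a small base R sketch. It does not use bio3d: the helper names (rmsip_sketch, fluct_sketch, sip_sketch), the two-atom toy modes, and the eigenvalues are all hypothetical, and the formulas are the usual textbook forms (RMSIP as the root mean square of the mode-mode dot products, SIP as a normalized squared inner product of fluctuation profiles), assumed here rather than taken from the package source. The sketch compares a set of modes with the same modes expressed in a rotated frame, which mimics structures that were not superimposed.

```r
# RMSIP between two mode sets stored as matrix columns (assumed definition)
rmsip_sketch <- function(U, V) {
  overlap <- crossprod(U, V)              # dot products u_i . v_j
  sqrt(sum(overlap^2) / ncol(U))
}

# Per-atom fluctuations: squared components weighted by 1/eigenvalue,
# summed over modes and over x, y, z (assumed form, constants omitted)
fluct_sketch <- function(U, L) {
  per_coord <- rowSums(sweep(U^2, 2, L, "/"))
  colSums(matrix(per_coord, nrow = 3))    # one column per atom
}

# SIP between two fluctuation profiles (assumed definition)
sip_sketch <- function(a, b) sum(a * b)^2 / (sum(a^2) * sum(b^2))

# Toy data: 2 atoms (6 coordinates), 2 orthonormal modes, made-up eigenvalues
U <- cbind(c(1, 0, 0, 1, 0, 0) / sqrt(2),
           c(0, 1, 0, 0, 0, 0))
L <- c(1, 2)

# Same modes in a frame rotated by 90 degrees about the y axis
theta <- pi / 2
R <- rbind(c( cos(theta), 0, sin(theta)),
           c( 0,          1, 0),
           c(-sin(theta), 0, cos(theta)))
V <- kronecker(diag(2), R) %*% U          # rotate every atom's x, y, z block

rmsip_sketch(U, U)                                # 1
rmsip_sketch(U, V)                                # about 0.707, although the modes are physically identical
sip_sketch(fluct_sketch(U, L), fluct_sketch(V, L))  # 1, fluctuation magnitudes ignore the rotation
```

In this toy case the fluctuation-based SIP is unaffected by the rotation, while RMSIP drops even though nothing about the motion has changed, which is consistent with the advice to fit the structures before comparing mode vectors.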

  3. Yazhini arangas reporter

    Oops. Regarding "what does the argument subset=16 give you?": I actually intended to write keep or subspace, but wrote subset by mistake. Thank you for pointing it out. I have one more question. If I am not wrong, 'keep' will store modes and 'subspace' will store eigenvectors, as given in the help page. If I want to calculate SIP from the result of nma.pdbs() with the option subspace=16 (to store the first 10 non-trivial modes), will only the fluctuations of those 10 modes be considered in the analysis? I am asking because if subspace stores only eigenvectors, what about the eigenvalues? However, I have compared SIP values with and without the subspace option, and both give the same results. (Does this implicitly mean that SIP takes only the first 10 non-trivial modes?)

  4. Lars Skjærven

    Using subspace=10 in nma.pdbs() will store 10 non-trivial eigenvectors. Using keep=16 in nma.pdb() will also store 10 non-trivial eigenvectors (16 modes in total, of which the first 6 are trivial).

    Sorry for the confusing nomenclature and the lack of documentation on this.

    Checking which eigenvalues (stored in $L) are non-zero will also guide you here:

    > modes$L
     [1] 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.013383 0.013933
     [9] 0.022355 0.025518 0.029944 0.033954 0.039717 0.042778 0.046968 0.051500
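
The check described above can be sketched in base R using the eigenvalues printed in this comment. The tolerance tol is an assumption (any small positive threshold that separates the zero eigenvalues from the rest would do), and the variable names are illustrative only.

```r
# Eigenvalues copied from the modes$L output above
L <- c(0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000,
       0.013383, 0.013933, 0.022355, 0.025518, 0.029944, 0.033954,
       0.039717, 0.042778, 0.046968, 0.051500)

tol <- 1e-6                      # assumed threshold for "numerically zero"
nontrivial <- which(L > tol)     # indices of the non-trivial modes

length(L)           # 16 modes stored in total
length(nontrivial)  # 10 non-trivial modes
nontrivial[1]       # 7, the first non-trivial mode follows the six trivial ones
```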
    