Issue with fit.xyz

Issue #355 resolved
Former user created an issue

I have tried this:

pdb <- read.pdb("nw_NOL_neutral_shift.pdb")
> dcd <- read.dcd("sys1_stride20.dcd")
> sele1 <- atom.select(pdb, resno = c(137,142,145,119), chain = "D", elety = "CA")
> sele2 <- atom.select(pdb, resno = 121, elety = "CA")
> sele3 <- atom.select(pdb, resno = 463, elety = "CA")
> sele4 <- combine.select(sele1, sele2, sele3, operator = "+")
>combpdb <- trim.pdb(pdb, sele4)
> inds <- atom.select(combpdb)
> **trj <- fit.xyz(pdb$xyz, mobile = dcd, fixed.inds = inds$xyz, mobile.inds = inds$xyz)**
> cij <- dccm(trj[, inds$xyz])
> plot(cij)

again when i replace the trj object by trj1 as:

# till inds object everything is same as written above code
> **trj1 <- fit.xyz(combpdb$xyz, mobile = dcd, fixed.inds = inds$xyz, mobile.inds = inds$xyz)**
> cij1 <- dccm(trj1[, inds$xyz])
> plot(cij1)

Though my selection is same, i.e. inds, for these two case(trj1 and trj), i am getting two different cij plots so i am really confused which one is right cij matrix for the the selection i made ?

please, i need some suggestion, thanking you in advance,

kk

Comments (11)

  1. Xinqiu Yao

    Hi,

    The selection is basically a group of atomic indices mapping to the input pdb object. In the above codes, your input pdb changed (from 'pdb' to 'combpdb'), and that's why your final results changed even if the same selection object was used.

    By reading your codes, I can't figure out which pdb you should use unless you provide more information. What atoms are in the 'dcd'? Are they equivalent to the atoms in 'pdb'? (A quick check is to print the number of atoms in pdb and dcd, respectively. Are they the same?) What correlation matrix are you going to build, the whole protein CA atoms or the subset specified by 'sele4'?

  2. Kajwal Kumar Patra

    Hello sir

    The atoms in 'pdb' are the same those are in 'dcd'. i want to build the correlation matrix of the subset specified by 'sele4' thats why i have trimed the 'pdb' to 'combpdb', which will only contain 'sele4' atoms. and the 'inds' is chosen as inds <- atom.select(combpdb). the confusion is whether i should use:

    trj <- fit.xyz(pdb$xyz, mobile = dcd, fixed.inds = inds$xyz, mobile.inds = inds$xyz)
    

    or should i use

    trj <- fit.xyz(combpdb$xyz, mobile = dcd, fixed.inds = inds$xyz, mobile.inds = inds$xyz)
    
  3. Xinqiu Yao

    Then, you should use

    trj <- fit.xyz(pdb$xyz, mobile = dcd, fixed.inds = sele4$xyz, mobile.inds = sele4$xyz)
    cij <- dccm(trj[, sele4$xyz])
    
  4. Kajwal Kumar Patra

    Dear sir, i have used

    trj <- fit.xyz(pdb$xyz, mobile = dcd, fixed.inds = sele4$xyz, mobile.inds = sele4$xyz)
    cij <- dccm(trj[, sele4$xyz])
    

    and also

    trj <- fit.xyz(pdb$xyz, mobile = dcd, fixed.inds = inds$xyz, mobile.inds = inds$xyz)
    cij <- dccm(trj[, inds$xyz])
    

    still i am getting two different plots. but both the 'inds' and 'sele4' contains the same atoms with same cordinates. why so??

  5. Xinqiu Yao

    Are you sure they are the same thing? Can you plot each content and show them here? For example print(sele4$atom) and print(inds$atom)

  6. Kajwal Kumar Patra

    print(sele4$atom) [1] 728 19389 23872 23907 23921 24208 24285 24345 24431 24448 24639 25277 25295 25456 25710 26954 [17] 27099 28016 28037

    print(inds$atom) [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

    but the atoms and their coordinates are same for both 'sele4', and 'inds'

    what is wrong in using 'inds' instead of 'sele4' ??

  7. Xinqiu Yao

    'inds' or 'sele4', they are just INDEX! They don't contain any atom coordinate information. When you provide a 'pdb' object (such as you did in calling fit.xyz), these indices are used to extract coordinates from the pdb object.

    For example, if you use 'sele4' in the above fit.xyz(), the coordinates of atoms #728, 19389, ... (which are corresponding to the atoms you selected) will be used. On the other hand, if you use 'inds', the coordinates of atoms #1, 2, 3... (which are NOT the selected atoms) will be used.

    A message you can take home: Always use the 'selection' object along with the 'pdb' from which the 'selection' was made.

  8. Kajwal Kumar Patra

    thank you sir, its been a pleasure to discuss with you.

    i will be bothering you in future,

    kk

  9. Kajwal Kumar Patra

    Hello sir,

    i have tried this

    pdb <- read.pdb("nw_NOL_neutral_shift.pdb")
    dcd <- read.dcd("sys1_stride20.dcd")
    > sele1 <- atom.select(pdb, resno = c(137,142,145,119,117,150,149,218,164,206,233,207,120,311,319,374,375), chain = "D",elety ="CA")
    > sele2 <- atom.select(pdb, resno = 333, chain = "C", elety = "CA")
    > sele3 <- atom.select(pdb, resno = 156, chain = "A", elety = "CA")
    > sele4 <- combine.select(sele1, sele2, sele3, operator = "+")
    trj <- fit.xyz(fixed = pdb$xyz, mobile = dcd, fixed.inds = sele4$xyz, mobile.inds = sele4$xyz)
    > cij <- dccm(trj[, sele4$xyz])
    > plot(cij, contour=FALSE)
    > net <- cna(cij)
    print(sele4$atom)
     [1]   728 19389 23872 23907 23921 24208 24285 24345 24431 24448 24639 25277 25295 25456 25710
    [16] 26954 27099 28016 28037
    > pa <- cnapath(net, from= 728, to=25456, k=50)
    

    it shows error: Error in .Call("R_igraph_get_shortest_paths", graph, as.igraph.vs(graph, : At iterators.c:759 : Cannot create iterator, invalid vertex id, Invalid vertex id

    when i tried

    pa <- cnapath(net, from=1, to=14, k=50)
    

    **No path found. Please check if the network contains isolated parts!

    Warning message: In .Call("R_igraph_get_shortest_paths", graph, as.igraph.vs(graph, : At structural_properties.c:4517 :Couldn't reach some vertices**

    then i wanted to know the resno of selection "sele4", so for that , i tried this

    resno <- pdb$atom[pdb$calpha, "resno"]
    > resno[sele4$atom]
    

    i got [1] 353 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA, obviously which is not right.

    now what is wrong in my cnapth calculation and above 'resno' finding ??

    thanks

  10. Xinqiu Yao

    The first error is apparent: you have built a network for a subset of the pdb and so you cannot use the atom indices that were constructed from original pdb. (The second command for path analysis is correct, i.e. use 1 and 14, respectively, to specify the source and sink nodes).

    The warning means your network probably contains parts that are completely separated from others (no edge connecting them). If the source node belongs to one part and the sink node belongs to the other non-connected part, no path is to be found. This can happen if you have a very "sparsely" connected network when e.g. the cutoff is too high for the input correlation values. You can check the network connectivity by eyes using the function pymol.dccm() (See help(pymol.dccm) for more details). And then, if possibly, use a smaller cutoff (e.g. net <- cna(cij, cutoff.cij=0.3) and try it again.

    The last error, you again mis-used the atom indices: sele4 is constructed based on the original pdb but resno is corresponding to a subset (only C-alpha atoms) of the pdb. Simply pdb$atom[sele4$atom, 'resno'] will give you the righ answer.

    I strongly recommend you go through some basic R learning materials (for example, you can find some useful links here), and then the bio3d tutorials (make sure you understand every step in the tutorials). This could help you to understand the matrix/vector, sub-matrix/sub-vector, etc. those basic R concepts and better use of bio3d in future. Good luck!

  11. Log in to comment