Suboptimal path calculation question

Issue #543 resolved
Kyle Stiers created an issue

Hello,

I have read through previous users' questions about 'source and sink' nodes in the cnapath() function. I am using a Luthey-Schulten filtered network (with the cm() function) and trying to decide appropriate from/to nodes. My main question is, are these nodes simply residue numbers, i.e. if you kept all Ca's for the analysis, is there 1:1 correspondence between nodes and Cas?

In my first attempt I simply set 'from' to the first residue of the protein and 'to' as the final residue. Then, trying to probe modes of communication in allosteric sites used what I believe to be the first Ca of one site as 'from' and the final residue of the distant site as 'to'. Is this valid? It seems to not be, because the paths are quite different between those two results - and there is a sphere drawn at every residue along the path, whereas the HIVP example only has 2 drawn despite "from=32, to=131".

Clarification on this would be greatly appreciated.

A separate note/question is when I run the line:

write.pdb(pdb, b=normalize.vector(node.betweenness), file="node-betweenness.pdb")

I get the error:

Error in write.pdb(pdb, b = normalize.vector(node.cut.betweenness), file = "node-cut-betweenness.pdb") : write.pdb: the lengths of all input vectors != 'length(xyz)/3'.

I tried solving this by using the ca.inds, but then it complains about $ operator with atomic vectors.

Thanks!

Comments (11)

  1. Xinqiu Yao

    Yes, nodes in the network are 1:1 correspondence to C-alpha atoms. Note that residue numbers in the network are renumbered starting from 1 - therefore, the first residue is '1' not the "residue number" shown in PDB, and the same for other residues.

    For the second question, could you provide more detail? For example, how did try to solve the problem by using ca.inds?

  2. Kyle Stiers reporter

    Hi Xin-Qiu,

    Thanks for that clarification, but I feel I'm still missing a slight nuance here. I'm aware that the renumbering occurs but in my situation my Ca selection starts with residue #1 in the PDB file, so the numbering should match up. I guess my main question is why does the HIVP example only show 2 nodes? Was that to simplify visualization for the tutorial, or a consequence of how things were calculated? I'm fixated on this because I'm very interested in seeing if the suboptimal path goes through a bound ligand to connect 2 sites, but takes a different path in the apo state. It looks to me like the HIVP example has other nodes in the path, and they just aren't shown for some reason.

    Because of the error listed above, saying input vectors don't equal length(xyz)/3 I assumed it needed something that would equal the length of the pdb (562) * 3 which equals 1686. This is the exact length of pdb$xyz[,ca.inds$xyz]. However, when I run:

    write.pdb(pdb$xyz[,ca.inds$xyz], b=normalize.vector(node.betweenness), file="node-betweenness.pdb")
    

    I get the error:

    Error in pdb$xyz : $ operator is invalid for atomic vectors
    
  3. Xinqiu Yao

    I see. The nodes except for "source" and "sink" in the figure of tutorial were turned off just for visualization purpose.

    The error seems indicate the pdb object has something wrong. Could you print it (simply type pdb and return in an R session) and see what it looks like? Also, make sure class(pdb) returns a vector containing 'pdb'.

    Let me know if you have further problems.

  4. Kyle Stiers reporter

    '>pdb

     Call:  read.pdb(file = "trjconv_first_frame.pdb")
    
       Total Models#: 1
         Total Atoms#: 95978,  XYZs#: 287934  Chains#: 1  (values: NA)
    
         Protein Atoms#: 8659  (residues/Calpha atoms#: 562)
         Nucleic acid Atoms#: 0  (residues/phosphate atoms#: 0)
    
         Non-protein/nucleic Atoms#: 87319  (residues: 10001)
         Non-protein/nucleic resid values: [ MG (1), SOL (10000) ]
    
       Protein sequence:
          MVKIVTVKTQAYQDQKPGTSGLRKRVKVFQSSANYAENFIQSIISTVEPAQRQEATLVVG
          GDGRFYMKEAIQLIARIAAANGIGRLVIGQNGILSTPAVSCIIRKIKAIGGIILTASHNP
          GGPNGDFGIKFNISNGGPAPEAITDKIFQISKTIEEYAVCPDLKVDLGVLGKQQFDLENK
          FKPFTVEIVDSVEAYATMLRSIFDFSALKELLSGPNRLKIRIDAM...<cut>...TVIT
    
    + attr: atom, xyz, calpha, call
    

    '>class(pdb)

    [1] "pdb" "sse"
    
  5. Xinqiu Yao

    Your pdb looks normal. Then, the error message is weird. Could you also check pdb$xyz and ca.inds (both printing them out and checking their classes. The two should be 'xyz' and 'select' class, respectively).

    Also, make sure length(normalize.vector(node.betweenness)) == length(ca.inds$atom).

    In addition, try spdb <- trim(pdb, inds=ca.inds); write.pdb(spdb, b=normalize.vector(node.betweenness), file="node-betweenness.pdb"). Note that by default, the first argument of write.pdb() should be a 'pdb' object instead of 'xyz'.

  6. Kyle Stiers reporter
    > pdb$xyz
    
       Total Frames#: 1
       Total XYZs#:   287934,  (Atoms#:  95978)
    
        [1]  99.09  51.97  18.82  <...>  76.52  36.76  69.15  [287934] 
    
    + attr: Matrix DIM = 1 x 287934
    > class(pdb$xyz)
    [1] "xyz"    "matrix"
    > ca.inds
    
     Call:  atom.select.pdb(pdb = pdb, elety = "CA")
    
       Atom Indices#: 562  ($atom)
       XYZ  Indices#: 1686  ($xyz)
    
    + attr: atom, xyz, call
    > class(ca.inds)
    [1] "select"
    > length(normalize.vector(node.cut.betweenness)) == length(ca.inds$atom)
    [1] TRUE
    > spdb <- trim(pdb, inds=ca.inds); write.pdb(spdb, b=normalize.vector(node.cut.betweenness), file="node-betweenness.pdb")
    

    Hmm. Looks like it worked when using trim instead of trying to pull them out of the PDB object directly. Is that simply due to the function expecting a PDB class object and how I was doing it was an XYZ object?

    Thanks Xin-Qiu!

  7. Xinqiu Yao

    You can do it using write.pdb(xyz=pdb$xyz[, ca.inds$xyz], ...), but the written pdb will lose sequence information and use 'ALA' to replace it.

  8. Aiswarya Pawar

    Hi,

    I tried to use the above command to write out the pdb file. But i couldnt highlight the regions as shown in the figure, ie the color and weight in the regions(yellow, orange,red). Please can you update what command to be used.

  9. Log in to comment