Warnings from read.dcd() and problems with protein structure network construction

Issue #586 resolved
lwx created an issue

Hi, I am new to bio3d and have run into some problems with protein structure network construction. When I use read.dcd(), I get these warnings:

1: In readChar(trj, nchars=4) : can only read in bytes in a non-UTF-8 MBCS locale
2: In readChar(trj, 80) : can only read in bytes in a non-UTF-8 MBCS locale
3: In readChar(trj, 80) : can only read in bytes in a non-UTF-8 MBCS locale

I don't know whether these warnings will affect my results. Can you tell me what is wrong with my dcd file? Thank you very much.

Another question: when I run dccm() and cna() on my dcd file, cna() takes a very long time, almost a week (I don't remember exactly how long).
What can I do to reduce the computing time? I tried the ncore parameter, but it doesn't seem to help. I also tried cmap() to filter the matrix, but it returned this:

cm <- cmap(trj, dcut = 7, scut = 0, pcut = 0.75, mask.lower = FALSE)
|=============================== 38%
Error: cannot allocate vector of size 298.8 Mb

Thanks.

Yours sincerely, Liu

Comments (9)

  1. Xinqiu Yao

    Hi,

    Can you make an example trajectory with which we can reproduce your errors/warnings? It could be a smaller version of your original trajectory, made by keeping only every few frames.

    I also recommend testing your trajectory file with VMD to see whether it loads successfully.

    For your second question, it is likely that your network has too many edges (cna() normally completes in seconds). One solution is to increase the threshold cutoff.cij. The other is to apply cmap() to filter out long-range edges. The error you saw when calculating cmap probably occurs because your 'trj' is too large. Reduce its size first and try again.

    Here are some old issues related to your second question:
    https://bitbucket.org/Grantlab/bio3d/issues/316/slow-cna-too-slow
    https://bitbucket.org/Grantlab/bio3d/issues/448/very-slow-cna-cij-calculation
    https://bitbucket.org/Grantlab/bio3d/issues/162/q-a-question-about-bio3d-cna
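
To illustrate why an all-atom 'trj' can exhaust memory in cmap(), here is a back-of-the-envelope sketch in base R. It assumes one double-precision value per unique atom pair, which is an assumption about the size of the calculation and not a description of bio3d's internals. The function name pair_mib and the C-alpha count of 550 are hypothetical.

```r
# Rough memory estimate for one vector of pairwise distances.
# Assumption: one 8-byte double per unique atom pair.
pair_mib <- function(n_atoms) {
  n_pairs <- n_atoms * (n_atoms - 1) / 2
  n_pairs * 8 / 1024^2
}

n_all <- 8850  # atom count reported for 1wuh.dcd in this thread
n_ca  <- 550   # hypothetical number of C-alpha atoms, for illustration only

mib_all <- pair_mib(n_all)
mib_ca  <- pair_mib(n_ca)

cat(sprintf("all atoms: %.1f MiB\n", mib_all))
cat(sprintf("CA only:   %.1f MiB\n", mib_ca))
```

Under these assumptions the all-atom estimate is of the same order as the allocation reported in the error, while a C-alpha-only selection needs only a few MiB, which is why trimming 'trj' before calling cmap() is likely to help.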

  2. lwx reporter

    Thank you very much for your reply. I know only a little about VMD and the trajectory. My work is to build the dynamic correlation network and analyze it. The trajectory file xx.dcd comes from my colleague. He used GROMACS and got the xx.xtc file, then converted the xtc to dcd. I remember he said the trajectory was filtered down to 100 frames.

  3. lwx reporter

    About the second question: I tried a computer with 164 GB of RAM and that solved it.

    Thank you very much.

  4. lwx reporter

    Another question: I looked through the three similar issues and saw that, to filter the matrix, you use this code:

    inds <- atom.select(pdb, elety="CA")
    cm <- cmap(trj[, inds$xyz], dcut=10, scut=0, pcut=0.75, mask.lower=FALSE)

    but in the instructions the code is:

    cm <- cmap(trj, dcut = 4.5, scut = 0, pcut = 0.75, mask.lower = FALSE)

    I want to know what difference the [] makes and how it could affect my results. Thank you very much!

  5. Xinqiu Yao

    The dcd you attached reads without problems on my computer:

    > read.dcd('1wuh.dcd')
     NATOM = 8850 
     NFRAME= 101 
     ISTART= 0 
     last  = 101 
     nstep = 101 
     nfile = 101 
     NSAVE = 1 
     NDEGF = 0 
     version 24 
      |=================================================================================================================================================| 100%
    
       Total Frames#: 101
       Total XYZs#:   26550,  (Atoms#:  8850)
    
        [1]  68.89  71.66  52.38  <...>  68.4  35.42  42.83  [2681550] 
    
    + attr: Matrix DIM = 101 x 26550
    

    Is it the one causing problems on your side?

    About []: it selects certain rows/columns (depending on where the comma is) from a matrix. For example, trj[, inds$xyz] selects all rows but only the columns given by the indices inds$xyz (the positions of the C-alpha atoms) from the matrix trj. Please see the user guide for resources on learning R (http://thegrantlab.org/bio3d/user-guide).
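
The indexing described above can be tried on a toy matrix in base R. This is a minimal sketch, not bio3d code: the matrix, the choice of atoms 1 and 3, and the variable names are hypothetical. It relies only on the layout visible in the read.dcd() output above, where each atom contributes three consecutive columns (x, y, z).

```r
# Toy "trajectory": 2 frames (rows) x 3 atoms, stored as x,y,z columns.
trj <- matrix(1:18, nrow = 2, byrow = TRUE,
              dimnames = list(c("frame1", "frame2"), NULL))

# Hypothetical selection: atoms 1 and 3 stand in for the C-alpha atoms.
atom_idx <- c(1, 3)

# Atom i occupies columns 3i-2, 3i-1 and 3i.
xyz_idx <- as.vector(rbind(3 * atom_idx - 2, 3 * atom_idx - 1, 3 * atom_idx))

sub   <- trj[, xyz_idx]           # all rows, only the selected columns
first <- trj[1, , drop = FALSE]   # first row, all columns

print(dim(trj))
print(dim(sub))
```

With an empty slot before the comma every frame is kept; with an empty slot after the comma every coordinate is kept. In the cmap() call, trj[, inds$xyz] therefore passes all frames but only the C-alpha coordinates.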

  6. lwx reporter

    Hello, sorry for the late reply.

    I still have some problems:

    • Why do these two networks differ: the one built with a distance cutoff of 7 Å (constructed with NAPS, another tool) and the one built with a distance cutoff of 7 Å and cutoff.cij = 0 (constructed with bio3d from a trajectory in dcd format)?

    The first network has 3700+ edges and the second has 5700+ edges.

    This is my code:

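    # Read the trajectory and the reference structure
    dcd <- read.dcd('1cza.dcd')
    pdb <- read.pdb('1cza.pdb')
    # Select C-alpha atoms (assumes the pdb and dcd share the same atom order)
    inds <- atom.select(pdb, elety = 'CA')
    # Superpose all frames onto the pdb coordinates using the C-alpha atoms
    trj <- fit.xyz(fixed = pdb$xyz, mobile = dcd, fixed.inds = inds$xyz, mobile.inds = inds$xyz)
    # Cross-correlation and contact map on the C-alpha coordinates only
    cij <- dccm(trj[, inds$xyz])
    cm <- cmap(trj[, inds$xyz], dcut = 7, scut = 0, pcut = 0.75, mask.lower = FALSE)
    # Build the network, using the contact map to filter cij
    net <- cna(cij, cutoff.cij = 0, cm = cm)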
    

    The pdb file has all atoms. As for the dcd file, for convenience I deleted all ligands during the simulation process. Does that have a large effect on my results?

    Or did I make mistakes in my code? Or does it mean that, in the trajectory, the amino acids of the protein are densely packed in at least 75% of the frames? I think the difference is too large to be caused by this. I also checked 1cza.pdb in Chimera and found that some nodes have neighbors farther away than 7 Å. Should I use a dcd file containing only C-alpha atoms?

    • Is there a way to build a structural network using only a distance cutoff in bio3d?
  7. lwx reporter

    Sorry, I made some mistakes.

    The network seems right now. Can you tell me whether my code is correct?
