memory issue with bio3d

Issue #156 resolved
bala.chandramouli created an issue

Hello, I am trying bio3d on a 64-bit machine running Ubuntu 14, with 4 GB of RAM and 8 GB of swap. I am using bio3d 2.1 with R version 3.0.

I installed bio3d and its dependencies successfully. However, I find bio3d extremely memory-intensive and slow when reading a trajectory of ~900 MB (AMBER NetCDF format). Moreover, even with a smaller trajectory (5000 frames, 340 CA atoms per frame), the fit function completely freezes the computer. From the top command I can see that my swap is not used at all. I don't know whether this is an issue with R or with bio3d. Your insights would be of great help to me.

Thanks, Bala

Comments (4)

  1. Lars Skjærven

    Hi Bala, you're right. The current version of Bio3D reads the entire trajectory into memory, so you should be careful with large (multi-GB) trajectory files. A CA-only trajectory of a 340-residue protein shouldn't be a problem, though. Did you try with a clean R session? As an example, this roughly 20,000-frame trajectory takes only about 31 MB:

    > trj = read.ncdf("prod_CA.nc")
    [1] "Reading file prod_CA.nc"
    [1] "Produced by program: cpptraj"
    [1] "File conventions AMBER version 1.0"
    [1] "Frames: 19843"
    [1] "Atoms: 72"
    
    > format(object.size(trj), "Mb")
    [1] "31.4 Mb"
    

    See also this thread for a related issue.

  2. Xinqiu Yao

    Also, remember that by default AMBER stores trajectories in single precision while R uses double precision. This means that a ~900 MB file will take up ~1.8 GB in memory. If you already have large data in your workspace, reading the trajectory may consume all your memory.
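
    The doubling can be checked with simple arithmetic. The sketch below is a rough estimate only: it assumes the coordinates are held as one value per x/y/z component per atom per frame and ignores any object overhead. `estimate_traj_mb` is a hypothetical helper written for this illustration, not a Bio3D function.

    ```r
    # Estimate the size of trajectory coordinates for a given storage precision.
    # estimate_traj_mb is a hypothetical helper, not part of Bio3D.
    estimate_traj_mb <- function(n_frames, n_atoms, bytes_per_value = 8) {
      n_values <- n_frames * n_atoms * 3      # x, y, z per atom per frame
      n_values * bytes_per_value / 1024^2     # bytes -> MB (1 MB = 1024^2 bytes)
    }

    # The smaller case from the report: 5000 frames of 340 CA atoms,
    # held as R doubles (8 bytes per value).
    ca_mb <- estimate_traj_mb(5000, 340)

    # The same coordinates at single precision (4 bytes per value).
    ca_single_mb <- estimate_traj_mb(5000, 340, bytes_per_value = 4)

    cat(sprintf("double: %.1f MB, single: %.1f MB\n", ca_mb, ca_single_mb))
    ```

    Under these assumptions the CA-only case needs on the order of tens of MB, far below the 4 GB of RAM mentioned in the report.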

    As Lars suggested, start a clean R session and try again with the CA-only trajectory. Make sure that a previously saved workspace is not being loaded automatically (type 'ls()' to check that the workspace is empty).
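
    As a minimal sketch of that check, using only base R functions: note that `rm(list = ls())` irreversibly deletes every object in the workspace, so it is only appropriate when nothing there needs to be kept.

    ```r
    # List the objects in the current workspace; character(0) means it is empty.
    ls()

    # Remove all objects from the workspace (irreversible), then ask R to
    # release the freed memory.
    rm(list = ls())
    invisible(gc())

    # Confirm that nothing is left.
    workspace_is_empty <- length(ls()) == 0
    print(workspace_is_empty)
    ```

    Starting R from the shell with `R --vanilla` also skips restoring a saved workspace at startup.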

    Let us know if it still doesn't work. We may also need the exact commands you used that caused the problems.

  3. bala.chandramouli reporter

    Lars, Xin,

    1) Thank you both. First, I checked whether the R session is clean: the ls() command returned character(0), so it seems to be clean. I am new to R as well.

    2) Checking the object size resulted in an error that I don't understand:

    > format(object.size(trj), "Mb")
    Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, : invalid 'trim' argument
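
    The message suggests that "Mb" ended up as the `trim` argument of the default format method instead of being used as the units; why that happened in this session is not clear from the thread. A hedged sketch of an alternative is to name the `units` argument explicitly, or to use `print()` on the object size. The matrix `xyz` below is a made-up stand-in for a trajectory, and whether this avoids the error in the reporter's R installation is not confirmed.

    ```r
    # Stand-in for a trajectory: 5000 frames x (340 atoms * 3 coordinates),
    # stored as doubles. `xyz` is a made-up example object, not thread data.
    xyz <- matrix(0, nrow = 5000, ncol = 340 * 3)

    # Name the units argument explicitly instead of passing it by position.
    print(object.size(xyz), units = "Mb")

    # String form, for R versions that provide a format method for object_size.
    size_str <- format(object.size(xyz), units = "Mb")
    ```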

    3) My original trajectory has ~8000 protein atoms and a size of 900 MB. However, extracting a trajectory of only the CA coordinates made the job faster. Since most of my analysis relies on CA atoms, I think this should be fine for me.

    Thanks, Bala
