read.pdb - maxlines

Issue #30 resolved
Lars Skjærven created an issue

Why not set the default value of maxlines to -1?

Comments (7)

  1. Barry Grant

    Because it could be very slow with big structures, and I would rather not leave a user hanging while waiting for the prompt to come back. Also, I want to encourage folks to use NetCDF for multi-model files.

    There is a danger in having maxlines set as it currently is: a user might continue with a truncated pdb object without noticing the returned warning message. So perhaps it would be better to issue a warning when the file is over a certain number of lines, but have the function continue with the maxlines=-1 setting and leave it to the user to press Ctrl-C?

  2. Lars Skjærven reporter

    I would prefer maxlines=-1. It's not only multi-model files, e.g. GroEL (pdb 1svt: 63k lines). Also, most functions, e.g. read.table and readLines, default to reading the whole file (nrows = -1 and n = -1, respectively).

  3. Lars Skjærven reporter

    A file with 241k lines is read in roughly 20 sec, and Ctrl-C works.

    > raw = readLines("ho.pdb")
    > length(raw)
    [1] 241514
    
    > system.time(read.pdb("ho.pdb", maxlines=-1))
       user  system elapsed 
     21.635   0.068  21.771 
    
    > system.time(read.pdb("ho.pdb", maxlines=-1))
    ^C
    Timing stopped at: 10.762 0.016 10.811 
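
The behaviour suggested in the first comment (warn on large files, but keep reading rather than truncate) could be sketched roughly as follows. This is only an illustration in base R: the helper name `read_lines_warn` and the `warn.lines` threshold are hypothetical and are not part of bio3d or of `read.pdb`.

```r
# Hypothetical helper (not part of bio3d): read every line of a file and
# warn, instead of truncating, when the file is larger than `warn.lines`.
read_lines_warn <- function(file, warn.lines = 50000L) {
  raw <- readLines(file, n = -1L)  # n = -1L reads to the end of the file
  if (length(raw) > warn.lines) {
    # Warn before any slow parsing starts, so the user can still press Ctrl-C.
    warning(sprintf("'%s' has %d lines (more than %d); parsing may be slow",
                    basename(file), length(raw), as.integer(warn.lines)))
  }
  raw
}

# Small demonstration on a temporary file with 10 lines.
tmp <- tempfile(fileext = ".pdb")
writeLines(rep("ATOM", 10), tmp)
raw <- suppressWarnings(read_lines_warn(tmp, warn.lines = 5L))
length(raw)  # all 10 lines are returned, none are dropped
```

With this design the returned object is never silently truncated; the cost is that a very large file keeps the prompt busy until the read finishes or the user interrupts it.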
    