Skipping of lines in user file

Issue #343 resolved
Fabian Metzger created an issue

When one requests to skip more lines than the user file contains, BDSIM enters an infinite loop and cannot start the simulation.

Comments (4)

  1. Laurie Nevay

    Thanks for the proposed fix (branch in git repo). So what is the behaviour now if you have say a file with 10 lines or particles; and you request to skip say 15?

    So we have two parameters so far… nlinesIgnore and nlinesSkip. And we have the behaviour that we loop if we reach the end. But we should not loop if matchDistrFileLength is requested. http://www.pp.rhul.ac.uk/bdsim/manual-develop/model_control.html#userfile

    If we request nlinesIgnore > nlinesInFile, then we should complain or throw an exception.

    If we request nlinesSkip > nlinesInFile… I think we should probably issue a warning if no matchDistrFileLength or if (!matchDistrFileLength) throw an exception also. Don’t you think?

    Similarly with nlinesIgnore + nlinesSkip > nlinesInFile.

  2. Fabian Metzger reporter

    The behavior now is as follows: If we want to skip and/or ignore in total 15 lines and only have 10 lines in the file, we skip 10 lines, start again from the beginning, skip 5 lines, and read the remaining 5 lines. Then, we go back to the beginning and do the same again if more than 5 events have been requested.
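As a rough illustration of the wrap-around behaviour described above, here is a minimal Python sketch. It is not BDSIM code: the function name `read_with_wraparound` and its arguments are hypothetical, and it assumes one reading of "do the same again", namely that every later pass through the file repeats the same effective skip, so only the lines after the wrapped offset are ever used.

```python
from itertools import cycle, islice


def read_with_wraparound(lines, n_skip_total, n_events):
    """Illustrative model only (hypothetical helper, not the BDSIM implementation).

    n_skip_total is the combined number of lines to ignore and skip.
    Skipping wraps to the start of the file when the end is reached,
    so the effective offset is the remainder after whole passes.
    """
    if not lines:
        raise ValueError("file has no lines")
    # Skipping 15 lines of a 10-line file ends at index 5 after one wrap.
    offset = n_skip_total % len(lines)
    usable = lines[offset:]
    # Assumed: each further pass repeats the skip, so only 'usable' is cycled.
    return list(islice(cycle(usable), n_events))


lines = [f"particle{i}" for i in range(10)]
first_five = read_with_wraparound(lines, 15, 5)   # lines 5 to 9 of the file
seven = read_with_wraparound(lines, 15, 7)        # lines 5 to 9, then 5 and 6 again
```

In this model the first five lines of the file are never read, which is the kind of silent behaviour the rest of the thread is concerned about.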

    This includes always both variables (nlinesIgnore and nlinesSkip). So, we don’t issue a warning or throw an exception yet. We probably should discuss this further if we want to do this or not. This would result in BDSIM not simulating in case we request to skip/ignore more lines than we have in the userfile, but do we want this to happen?

    In the case we want to match the distribution file length, we probably need to include another check (I’m not sure if it’s there yet).

  3. Laurie Nevay

    So it’s important not to mix “ignore” and “skip”, although they seem similar. “ignore” is for, say, comments / header info, i.e. lines that cannot be interpreted as coordinates. “skip” is then how far into the ‘data’ to start reading lines. Usually, we might expect ‘ignore’ to be a low number like 1-10 and ‘skip’ to be anywhere from 0 to quite large, but there’s no strict limit.

    Therefore, if nlinesIgnore > len(file), this is definitely an exception. Maybe it’s best to write out the scenarios.

    No matching of file length (looping permitted):

    • nlinesIgnore > len(file) → exception
    • nlinesSkip > len(file)-nlinesIgnore → exception (we’re skipping all the data but then wanting to loop on it again… doesn’t make sense)

    Matching distribution length:

    • previous two conditions… +
    • n events should be len(file) - nlinesIgnore - nlinesSkip which should be > 0 (the userfile has no filter on loading data so this is ok)
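The scenarios listed above could be checked along the following lines. This is a sketch under stated assumptions, not the actual BDSIM code: the function `check_userfile_request` and its argument names are hypothetical, and `ValueError` stands in for whatever exception type BDSIM uses. The comparisons follow the conditions exactly as written in the list.

```python
def check_userfile_request(n_lines_in_file, n_lines_ignore, n_lines_skip,
                           match_file_length, n_events_requested=None):
    """Hypothetical validation helper; returns the number of events to generate."""
    # Ignoring more lines than the file has leaves nothing to interpret.
    if n_lines_ignore > n_lines_in_file:
        raise ValueError("nlinesIgnore exceeds the number of lines in the file")

    n_data_lines = n_lines_in_file - n_lines_ignore
    # Skipping past all the data and then looping over it makes no sense.
    if n_lines_skip > n_data_lines:
        raise ValueError("nlinesSkip exceeds the number of data lines")

    if match_file_length:
        # Events = len(file) - nlinesIgnore - nlinesSkip, which must be > 0.
        n_events = n_data_lines - n_lines_skip
        if n_events <= 0:
            raise ValueError("no data lines left to read")
        return n_events

    # Looping permitted: the requested event count is used unchanged.
    return n_events_requested


n_events = check_userfile_request(10, 2, 3, match_file_length=True)  # 5
```

Note that with looping permitted, the conditions as written still allow nlinesSkip to equal the number of data lines; whether that edge case should also be rejected is not settled in the thread.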

    So, I’ll need to read what you’ve done again, but I think it should basically be not a change to the looping behaviour but a check on variables and issue exceptions…

    We don’t want to read, say, 90% from the end of the file and 10% from the beginning in a scenario where we’re trying to divide up one big coordinate file into N runs of BDSIM, as this would contaminate the statistics / introduce bias.
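To make the "one big file, N runs" scenario concrete, the sketch below computes a non-overlapping skip offset and event count for each run, so that no run wraps back to the start of the file. The helper `split_into_runs` is hypothetical and only does the arithmetic; how the resulting numbers are passed to BDSIM is not shown here.

```python
def split_into_runs(n_data_lines, n_runs):
    """Return a list of (n_skip, n_events) pairs, one per run.

    Hypothetical helper: n_data_lines is the number of coordinate lines
    left after the ignored header lines have been removed.
    """
    if n_runs <= 0:
        raise ValueError("n_runs must be positive")
    if n_runs > n_data_lines:
        raise ValueError("more runs than data lines")

    base, extra = divmod(n_data_lines, n_runs)
    chunks = []
    skip = 0
    for i in range(n_runs):
        # The first 'extra' runs take one additional line each.
        n_events = base + (1 if i < extra else 0)
        chunks.append((skip, n_events))
        skip += n_events
    return chunks


chunks = split_into_runs(10, 3)  # [(0, 4), (4, 3), (7, 3)]
```

Each pair satisfies skip + events <= number of data lines, so every run reads one contiguous block and no line is used twice.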

  4. Laurie Nevay

    This has been fixed properly now. A lot of the file handling among the four input file types has been rewritten and common base classes used. The default behaviour is now to match the file length automatically. You can specify fewer events also. You can generally specify to loop the complete file an integer number of times too. There are specific warnings for bad behaviour that might contaminate your statistics.
