Data out of order

Issue #21 resolved
Former user created an issue

Hi! I ran into a problem while using sas7bdat with Python 3.5: the data in the CSV are scrambled. The first columns are fine, but toward the end of the file the values are mixed up, and some of the zeros that belong in the middle columns have been moved to the end. Any idea what is going on? I think all the data are there, just not where they should be.

Thank you!

Jouko

Comments (7)

  1. Kerby A Shedden

    I don't have a fix now, but the issue appears to be around line 707 in _process_byte_array_with_data.

    The bytes object that comes out of the RLE decompressor is too short: it should be 1488 bytes according to the row length property, but it has only 1282 bytes. Python lets you slice (but not index) beyond the end of a bytes object, returning an empty slice instead of raising an error, which is probably not what the code should be getting here. I suspect that the problem is in the RLE decompressor.

    I need this to work too so I will try to find some time to look into this later in the week.
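
The slicing behaviour described in this comment can be reproduced in isolation. The sketch below is only an illustration: the 1488 and 1282 byte counts come from the comment, while `decompressed`, the slice offsets, and the `check_row_length` helper are hypothetical names and values, not part of sas7bdat.

```python
# Hypothetical stand-in for a row that came out of the decompressor too short.
expected_row_length = 1488      # row length property, per the comment above
decompressed = bytes(1282)      # the decompressor produced only 1282 bytes

# Slicing past the end does not raise; it returns a short or empty result.
tail = decompressed[1400:1408]      # entirely past the end: b''
partial = decompressed[1280:1288]   # straddles the end: only 2 bytes

# Indexing past the end raises IndexError instead.
try:
    decompressed[1400]
    index_failed = False
except IndexError:
    index_failed = True


def check_row_length(row, expected):
    """Hypothetical guard: fail loudly rather than read empty slices."""
    if len(row) != expected:
        raise ValueError(
            "decompressed row has %d bytes, expected %d" % (len(row), expected)
        )
```

A length check of this kind, run right after decompression, would turn silently misplaced values into an immediate error.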

  2. Kerby A Shedden

    By the way, if anyone knows about any documentation for the RLE decompressor please let us know.

  3. Kerby A Shedden

    I see the issue now. Expressions like "end_of_first_byte * 16" (line 121) overflow since end_of_first_byte is a single-byte variable. I have a fixed copy locally and hope to open a PR later today. There are some similar issues in the RDC decompressor.

  4. Kerby A Shedden

    I made a PR for this here:

    https://bitbucket.org/kshedden/sas7bdat/pull-requests/1/rle-overflow-bug-closes-21/diff

    I'm not sure if I did this right, since I haven't used Bitbucket before.

    The issue is not what I stated above (Python doesn't have single-byte ints). The actual problem is that the control bits were incorrectly ignored in one of the control codes. It should be fixed now, but other control codes may have similar issues that only show up with future files.
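
Both points in this comment can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the nibble split, the base run length of 3, and the names `split_control_byte` and `run_length` are hypothetical and are not taken from the sas7bdat source or the actual fix in the PR.

```python
# Elements of a bytes object are ordinary Python ints, so arithmetic on them
# does not wrap around at 256.
value = bytes([200])[0]
python_product = value * 16             # 3200, no overflow
wrapped_product = (value * 16) & 0xFF   # what a true 8-bit variable would hold


def split_control_byte(first_byte):
    """Split a byte into a high-nibble command and low-nibble control bits.

    This layout is an assumption made for illustration only.
    """
    return first_byte & 0xF0, first_byte & 0x0F


def run_length(first_byte, use_low_bits=True):
    """Hypothetical run length: a base of 3 plus the low control bits."""
    _, low_bits = split_control_byte(first_byte)
    base = 3  # hypothetical minimum run length
    return base + (low_bits if use_low_bits else 0)


correct = run_length(0xC5)                      # low bits included
too_short = run_length(0xC5, use_low_bits=False)  # low bits ignored
```

Under this toy encoding, ignoring the low bits shortens every affected run, which is the kind of error that would leave a decompressed row with fewer bytes than its row length property says it should have.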

  5. Jared Hobbs repo owner

    Thanks Kerby! The diff looks good. Please create a pull request here to get your changes into the main repo and we'll cut a new release.

    Jared
