Format misspecified bug

Issue #54 resolved
Yaw Anokwa created an issue

I get "misspecified" errors whenever I try to set a variable format in savReaderWriter 3.4.2.

Here's an example...

import savReaderWriter

kwargs = dict(savFileName='/tmp/date.sav', varNames=['aDate'],
              varTypes={'aDate': 0}, formats={'aDate': 'DATETIME11.1'})
with savReaderWriter.SavWriter(**kwargs) as writer:
    spssDateValue = '1-Jan-1900 00:00:00'
    writer.writerow([spssDateValue])

Traceback (most recent call last):
  File "/Users/yanokwa/Desktop/test.py", line 95, in <module>
    with savReaderWriter.SavWriter(**kwargs) as writer:
  File "/usr/local/lib/python2.7/site-packages/savReaderWriter/savWriter.py", line 223, in __init__
    self.formats = formats
  File "/usr/local/lib/python2.7/site-packages/savReaderWriter/header.py", line 456, in formats
    raise SPSSIOError(msg % (varName, format_), retcode1)
savReaderWriter.error.SPSSIOError: format for 'aDate' misspecified ('DATETIME11.1')

Using a format of DATETIME40 works great.

Comments (10)

  1. Albert-Jan Roskam repo owner

    Hi,

    You need to use SavWriter.spssDateTime to do the conversion. From the docs:

    kwargs = dict(savFileName='/tmp/date.sav', varNames=['aDate'],
                  varTypes={'aDate': 0}, formats={'aDate': 'EDATE40'})
    with SavWriter(**kwargs) as writer:
        spssDateValue = writer.spssDateTime(b'2010-10-25', '%Y-%m-%d')
        writer.writerow([spssDateValue])
    

    Does that resolve your issue?

    Best wishes, Albert-Jan
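
    As background for the conversion mentioned above: SPSS stores dates as the number of seconds since midnight, 14 October 1582. The snippet below is a minimal pure-Python sketch of that arithmetic, not the library's implementation; `to_spss_seconds` and `from_spss_seconds` are hypothetical helper names, and `writer.spssDateTime` remains the call to use when actually writing a file.

```python
from datetime import datetime, timedelta

# Start of the Gregorian calendar, which SPSS uses as its date origin.
SPSS_EPOCH = datetime(1582, 10, 14)


def to_spss_seconds(text, fmt):
    """Parse a date string and return seconds elapsed since the SPSS epoch."""
    delta = datetime.strptime(text, fmt) - SPSS_EPOCH
    return delta.total_seconds()


def from_spss_seconds(seconds):
    """Convert an SPSS date value (seconds since the epoch) back to a datetime."""
    return SPSS_EPOCH + timedelta(seconds=seconds)


if __name__ == "__main__":
    value = to_spss_seconds("2010-10-25", "%Y-%m-%d")
    print(value, from_spss_seconds(value))
```

    The format assigned to the variable (EDATE40, DATETIME40, and so on) only controls how that numeric value is displayed.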

  2. Albert-Jan Roskam repo owner

    (I am not behind my computer now so I can't check if Datetime11.1 is indeed falsely flagged as misspecified)

  3. Yaw Anokwa reporter

    I believe it is falsely flagged. This fails...

    kwargs = dict(savFileName='/tmp/date.sav', varNames=['aDate'],
                  varTypes={'aDate': 0}, formats={'aDate': 'DATETIME11.1'})
    with savReaderWriter.SavWriter(**kwargs) as writer:
        spssDateValue = writer.spssDateTime(b'1900-01-01 00:00:00', '%Y-%m-%d %H:%M:%S')
        writer.writerow([spssDateValue])
    
  4. Yaw Anokwa reporter

    Hi @fomcl! Anything I can do to help you replicate and fix this bug? I'm not familiar with the codebase, but glad to go poking around if you point me in the right direction!

  5. Albert-Jan Roskam repo owner

    Hi,

    Sorry, I am a bit behind with this. I pushed a few commits that fail on Windows, so I wanted to wait with this. I've also got the flu at the moment (darn autumn) :-)

    Anyway, I've only ever used datetimes without fractional seconds, but apparently it is possible to specify some of these formats with them. It seems that the SPSS I/O does not know its own specs [1]. In header.py [2] line 453 retcode is greater than zero. So perhaps:

                is_fractional_datetime = re.match(b"(datetime|dtime|time)[0-9]{,2)[.][0-9])" , format_, re.I)  # line added (untested)
                if retcodes.get(retcode1) == "SPSS_INVALID_PRFOR" and not is_fractional_datetime:
                    # invalid PRint FORmat
                    msg = "format for %r misspecified (%r)"
                    raise SPSSIOError(msg % (varName, format_), retcode1)
    

    Let me know if this works! (and with which Python version and platform)

    Thanks!

    Albert-Jan

    [1] https://www.ibm.com/support/knowledgecenter/SSLVMB_20.0.0/com.ibm.spss.statistics.help/syn_date_and_time_date_time_formats.htm
    [2] https://bitbucket.org/fomcl/savreaderwriter/src/404fa9206f63234041b724d80fffdba9f9d4b6bf/savReaderWriter/header.py?at=master&fileviewer=file-view-default

  6. Yaw Anokwa reporter

    I run an active OSS project, so I totally understand falling behind :)

    I think the regex is missing brackets and doesn't always capture fractional datetime. This might work better:

                is_fractional_datetime = re.match(b"(datetime|dtime|time)[0-9]{1,2}\.[0-9]", format_, re.I)  # line added (untested)
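
    The suggested pattern can be exercised on its own, without the SPSS I/O library. This is only a sketch: the sample format strings are taken from this thread, and `FRACTIONAL_DATETIME` and `is_fractional_datetime` are hypothetical names, not part of header.py.

```python
import re

# Pattern from the comment above, written as a raw bytes literal so that
# "\." is passed through to the regex engine unchanged.
FRACTIONAL_DATETIME = re.compile(rb"(datetime|dtime|time)[0-9]{1,2}\.[0-9]", re.I)


def is_fractional_datetime(format_):
    """Return True if a bytes format string looks like DATETIMEw.d, DTIMEw.d or TIMEw.d."""
    return FRACTIONAL_DATETIME.match(format_) is not None


if __name__ == "__main__":
    for fmt in (b"DATETIME11.1", b"DATETIME40", b"EDATE40"):
        print(fmt, is_fractional_datetime(fmt))
```

    Because `match` anchors at the start of the string, the `time` alternative does not accidentally match the tail of `DATETIME`.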

    With that fix, it does indeed bypass the SPSS_INVALID_PRFOR branch, but it still errors out.

      File "/usr/local/lib/python2.7/site-packages/savReaderWriter/savWriter.py", line 223, in __init__
        self.formats = formats
      File "/usr/local/lib/python2.7/site-packages/savReaderWriter/header.py", line 460, in formats
        checkErrsWarns(msg, retcode1)
      File "/usr/local/lib/python2.7/site-packages/savReaderWriter/error.py", line 120, in checkErrsWarns
        raise SPSSIOError(msg, retcode)
    savReaderWriter.error.SPSSIOError: Problem setting format_ 'DATETIME11.1' for 'aDate' [SPSS_INVALID_PRFOR]
    

    retcode1: 20 and retcode2: 21

    Glad to test on Python 2/3 on Mac and Python 2/3 on Ubuntu once we get this ironed out.

  7. Albert-Jan Roskam repo owner

    Hi,

    Well, it turns out that the SPSS I/O module was correct after all. If you look more closely at the IBM page I cited before, you see that for DATETIMEw.d, the minimum value of w = 22. So DATETIME22.1 is okay, but DATETIME21.1 is not. Heh. :-) So I adjusted the error message a bit so this is hopefully a bit clearer. I also wrote a bunch of unit tests to check this. I tested things with Python 2.7 through 3.5 and PyPy, on Debian Linux 64. Will test it shortly on Windows.
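
    The width rule described above can be checked before a format ever reaches SavWriter. The following is a minimal sketch under stated assumptions: it encodes only the single rule quoted in this comment (DATETIMEw.d needs w >= 22), and `check_datetime_format` is a hypothetical helper, not a savReaderWriter function.

```python
import re

# Assumption: minimum width for DATETIMEw.d as stated in this thread.
MIN_WIDTH_FRACTIONAL_DATETIME = 22

# Splits e.g. "DATETIME11.1" into name="DATETIME", width="11", decimals="1".
_FORMAT_RE = re.compile(r"^([A-Za-z]+)(\d+)(?:\.(\d+))?$")


def check_datetime_format(fmt):
    """Return an error message if fmt breaks the DATETIMEw.d width rule, else None."""
    m = _FORMAT_RE.match(fmt)
    if m is None:
        return "cannot parse format %r" % fmt
    name, width, decimals = m.group(1).upper(), int(m.group(2)), m.group(3)
    if (name == "DATETIME" and decimals is not None
            and width < MIN_WIDTH_FRACTIONAL_DATETIME):
        return "%r: width %d is below the minimum of %d for DATETIMEw.d" % (
            fmt, width, MIN_WIDTH_FRACTIONAL_DATETIME)
    return None  # no rule known to this sketch is violated


if __name__ == "__main__":
    for fmt in ("DATETIME11.1", "DATETIME21.1", "DATETIME22.1", "DATETIME40"):
        print(fmt, "->", check_datetime_format(fmt) or "ok")
```

    Formats other than DATETIMEw.d pass through unchecked here; the SPSS I/O module remains the authority on what is valid.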

    I would really appreciate it if you could run the tests under Mac. I sometimes run the tests on a Hackintosh VM, but this is a very old version. I have seen reports that the I/O libraries may not always be loaded, and problems with the locale.

    But thanks, because I am no longer behind with this now :-)

    Best wishes, Albert-Jan

  8. Yaw Anokwa reporter

    Thanks for the bug fix! Glad to help run the tests, but how do I do that? The more step-by-step commands you can provide the better :)
