Format misspecified bug
I get "misspecified" errors whenever I try to set a format in savReaderWriter 3.4.2.
Here's an example...
import savReaderWriter

kwargs = dict(savFileName='/tmp/date.sav', varNames=['aDate'],
              varTypes={'aDate': 0}, formats={'aDate': 'DATETIME11.1'})
with savReaderWriter.SavWriter(**kwargs) as writer:
    spssDateValue = '1-Jan-1900 00:00:00'
    writer.writerow([spssDateValue])
Traceback (most recent call last):
  File "/Users/yanokwa/Desktop/test.py", line 95, in <module>
    with savReaderWriter.SavWriter(**kwargs) as writer:
  File "/usr/local/lib/python2.7/site-packages/savReaderWriter/savWriter.py", line 223, in __init__
    self.formats = formats
  File "/usr/local/lib/python2.7/site-packages/savReaderWriter/header.py", line 456, in formats
    raise SPSSIOError(msg % (varName, format_), retcode1)
savReaderWriter.error.SPSSIOError: format for 'aDate' misspecified ('DATETIME11.1')
Using a format of DATETIME40 works fine.
Comments (10)
-
repo owner Hi,
You need to use SavWriter.spssDateTime to do the conversion. From the docs:
kwargs = dict(savFileName='/tmp/date.sav', varNames=['aDate'],
              varTypes={'aDate': 0}, formats={'aDate': 'EDATE40'})
with SavWriter(**kwargs) as writer:
    spssDateValue = writer.spssDateTime(b'2010-10-25', '%Y-%m-%d')
    writer.writerow([spssDateValue])
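For background, SPSS stores date and datetime values as a number of seconds since midnight on 14 October 1582, which is why a plain string cannot be written to a date variable. The snippet below is only a rough standalone sketch of that conversion; the helper name to_spss_seconds is hypothetical, and it is an assumption that spssDateTime behaves along these lines internally.

```python
from datetime import datetime

# Start of the Gregorian calendar, the zero point SPSS uses for dates.
SPSS_EPOCH = datetime(1582, 10, 14)


def to_spss_seconds(text, fmt):
    """Hypothetical helper: parse text with a strptime format and
    return the number of seconds elapsed since the SPSS epoch."""
    parsed = datetime.strptime(text, fmt)
    return (parsed - SPSS_EPOCH).total_seconds()


# One day and one second after the epoch.
value = to_spss_seconds('1582-10-15 00:00:01', '%Y-%m-%d %H:%M:%S')
```

In real code, writer.spssDateTime from the example above is the call to use; the sketch only shows why the value written to the file is numeric.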
Does this resolve your issue?
Best wishes, Albert-Jan
-
repo owner (I am not behind my computer now so I can't check if Datetime11.1 is indeed falsely flagged as misspecified)
-
reporter I believe it is falsely flagged. This fails...
kwargs = dict(savFileName='/tmp/date.sav', varNames=['aDate'],
              varTypes={'aDate': 0}, formats={'aDate': 'DATETIME11.1'})
with savReaderWriter.SavWriter(**kwargs) as writer:
    spssDateValue = writer.spssDateTime(b'1900-01-01 00:00:00', '%Y-%m-%d %H:%M:%S')
    writer.writerow([spssDateValue])
-
reporter Hi @fomcl! Anything I can do to help you replicate and fix this bug? I'm not familiar with the codebase, but glad to go poking around if you point me in the right direction!
-
repo owner Hi,
Sorry, I am a bit behind with this. I pushed a few commits that fail on Windows, so I wanted to wait with this. I've also got the flu at the moment (darn autumn) :-)
Anyway, I've only ever used datetimes without fractional seconds, but apparently it is possible to specify some of these formats with them. It seems that the SPSS I/O does not know its own specs [1]. In header.py [2], at line 453, retcode is greater than zero. So perhaps:
is_fractional_datetime = re.match(b"(datetime|dtime|time)[0-9]{,2)[.][0-9])" , format_, re.I)  # line added (untested)
if retcodes.get(retcode1) == "SPSS_INVALID_PRFOR" and not is_fractional_datetime:
    # invalid PRint FORmat
    msg = "format for %r misspecified (%r)"
    raise SPSSIOError(msg % (varName, format_), retcode1)
Let me know if this works! (and with which Python version and platform)
Thanks!
Albert-Jan
[1] https://www.ibm.com/support/knowledgecenter/SSLVMB_20.0.0/com.ibm.spss.statistics.help/syn_date_and_time_date_time_formats.htm
[2] https://bitbucket.org/fomcl/savreaderwriter/src/404fa9206f63234041b724d80fffdba9f9d4b6bf/savReaderWriter/header.py?at=master&fileviewer=file-view-default
-
reporter I'm run an active OSS project, so I totally understand falling behind :)
I think the regex is missing brackets and doesn't always capture fractional datetime. This might work better.
is_fractional_datetime = re.match(b"(datetime|dtime|time)[0-9]{1,2}\.[0-9]" , format_, re.I) # line added (untested)
With that fix, it does indeed bypass the SPSS_INVALID_PRFOR branch, but it still errors out.
  File "/usr/local/lib/python2.7/site-packages/savReaderWriter/savWriter.py", line 223, in __init__
    self.formats = formats
  File "/usr/local/lib/python2.7/site-packages/savReaderWriter/header.py", line 460, in formats
    checkErrsWarns(msg, retcode1)
  File "/usr/local/lib/python2.7/site-packages/savReaderWriter/error.py", line 120, in checkErrsWarns
    raise SPSSIOError(msg, retcode)
savReaderWriter.error.SPSSIOError: Problem setting format_ 'DATETIME11.1' for 'aDate' [SPSS_INVALID_PRFOR]
retcode1: 20 and retcode2: 21
Glad to test on Python 2/3 on Mac and Python 2/3 on Ubuntu once we get this ironed out.
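For anyone following along, the two patterns quoted in this thread can be compared in isolation, without savReaderWriter installed. This is just a quick sketch for Python 3; the helper names compiles and is_fractional_datetime are made up for the illustration and are not part of the library.

```python
import re

# Pattern as first proposed in the thread (kept verbatim, typos included).
proposed = rb"(datetime|dtime|time)[0-9]{,2)[.][0-9])"
# Pattern as corrected in the follow-up comment.
corrected = rb"(datetime|dtime|time)[0-9]{1,2}\.[0-9]"


def compiles(pattern):
    """Return True if the pattern is a valid regular expression."""
    try:
        re.compile(pattern, re.I)
        return True
    except re.error:
        return False


def is_fractional_datetime(format_):
    """Return True if format_ (bytes) starts like DATETIMEw.d, DTIMEw.d or TIMEw.d."""
    return re.match(corrected, format_, re.I) is not None


print(compiles(proposed), compiles(corrected))
print(is_fractional_datetime(b"DATETIME11.1"), is_fractional_datetime(b"DATETIME40"))
```

The first pattern has unbalanced parentheses and does not compile at all, while the corrected one matches formats with a fractional part and ignores plain ones such as DATETIME40.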
-
repo owner Hi,
Well, it turns out that the SPSS I/O module was correct after all. If you look more closely at the IBM page I cited before, you see that for DATETIMEw.d, the minimum value of w is 22. So DATETIME22.1 is okay, but DATETIME21.1 is not. Heh. :-) I adjusted the error message a bit so that this is hopefully clearer. I also wrote a bunch of unit tests to check this. I tested things with Python 2.7 through 3.5 and PyPy, on 64-bit Debian Linux. I will test it on Windows shortly.
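To illustrate the width rule, a format string could be checked before it is handed to SavWriter. The sketch below covers only the DATETIMEw.d case with the minimum of 22 mentioned here; check_datetime_format is a hypothetical helper, not part of savReaderWriter, and other format families are deliberately left unchecked.

```python
import re

# Minimum width for DATETIMEw.d, as stated in this comment (w >= 22).
MIN_DATETIME_FRACTIONAL_WIDTH = 22


def check_datetime_format(format_):
    """Hypothetical helper: return (ok, message) for a format string.

    Only DATETIMEw.d is validated; anything else is passed through.
    """
    m = re.match(r"^datetime(\d+)\.(\d+)$", format_, re.I)
    if m is None:
        return True, "not a DATETIMEw.d format; not checked"
    width = int(m.group(1))
    if width < MIN_DATETIME_FRACTIONAL_WIDTH:
        msg = "%s is too narrow: DATETIMEw.d needs w >= %d"
        return False, msg % (format_, MIN_DATETIME_FRACTIONAL_WIDTH)
    return True, "ok"


for fmt in ("DATETIME11.1", "DATETIME21.1", "DATETIME22.1", "DATETIME40"):
    print(fmt, check_datetime_format(fmt))
```

With this check, DATETIME11.1 from the original report is rejected up front, and DATETIME22.1 passes.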
I would really appreciate it if you could run the tests under Mac. I sometimes run the tests on a Hackintosh VM, but this is a very old version. I have seen reports that the I/O libraries may not always be loaded, and problems with the locale.
But thanks, because I am no longer behind with this now :-)
Best wishes, Albert-Jan
-
repo owner - changed status to resolved
-
reporter Thanks for the bug fix! Glad to help run the tests, but how do I do that? The more step-by-step commands you can provide the better :)