Subtle bug/difference difference in Python 2 and 3 with SavWriter.spssDateTime

Issue #51 resolved
Ivar Refsdal created an issue

Hi,

and thanks for an excellent library.

When porting my program from Python 2 to 3, I discovered a subtle difference when calling SavWriter.spssDateTime:

#python 2:
writer.spssDateTime("2000-01-01", "%Y-%m-%d") ## this will return 13166064000.0

#python 3:
writer.spssDateTime("2000-01-01", "%Y-%m-%d") ## this will return sysmis (aka -1.7976931348623157e+308)

While I now recognize that the first argument to spssDateTime should be binary, Python 2 is happy with a string. spssDateTime in Python 3 on the other hand will throw an exception on datetimeStr.decode("utf-8"), catch the exception and return sysmis.

I therefore propose the following to support both strings and binary in Python 2 and 3:

    def spssDateTime(self, datetimeStr=b"2001-12-08", strptimeFmt="%Y-%m-%d"):
        """ This function converts a date/time string into an SPSS date,
        using a strptime format. See also :ref:`dateformats`"""
        try:
            if type(datetimeStr)==bytes:
                datetimeStr = datetimeStr.decode("utf-8")
            dt = time.strptime(datetimeStr, strptimeFmt)
        except (ValueError, TypeError, AttributeError):
            return self.sysmis
        day, month, year = dt.tm_mday, dt.tm_mon, dt.tm_year
        hour, minute, second = dt.tm_hour, dt.tm_min, dt.tm_sec
        return (self.convertDate(day, month, year) +
                self.convertTime(0, hour, minute, second))

Feedback? Should I make a pull request? It's just a single new line of code and an indentation of the following line.

Best regards Ivar Refsdal

Comments (5)

  1. Albert-Jan Roskam repo owner

    Hi Ivar,

    Thanks for reporting this. Yes, I agree that it not very friendly towards the user if a fault (not an error) like this occurs. I started writing the package in Python 2 (with bytestrings), and later made the code compatible with Python 3. But it is tricky to make each and every function accept both bytestrings and unicode strings. I would be grateful if you could make a pull request. Btw, using isinstance appears to be a tiny bit faster:

    In [1]: %timeit type("s")==bytes
    10000000 loops, best of 3: 92.9 ns per loop
    
    In [2]: %timeit isinstance("s", bytes)
    10000000 loops, best of 3: 76.5 ns per loop
    

    Best wishes, Albert-Jan

  2. Ivar Refsdal reporter

    Hi Albert-Jan,

    thanks for the swift reply. I've submitted a pull request now. Let me know if something needs to be further fixed. Regards, Ivar

  3. Albert-Jan Roskam repo owner

    Hi Ivar, thanks a lot for the pull request. Much appreciated!

    Best wishes, Albert-Jan

  4. Log in to comment