(more) problems with python3
Trying the following code from the documentation under python3 (on Windows) gives me multiple errors:
data = sav.SavReader(spss_file, returnHeader=True, ioUtf8=True, ioLocale="german")
# directly setting 'ioLocale' to, e.g., 'de_DE.cp1252' yields an error
with data:
    allData = data.all()
    print(str(data))
    print(data[2, 3])
allData = np.array(allData)
First, when setting ioLocale=None or ioLocale="de_DE.cp1252", I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-f4dd1e808d2f> in <module>()
1 #spss_file = os.path.join(r"C:\Users\RitschelG\Projekte\SPSS-Test", r"KOPIE_Rep22_Analysedatei.SAV")
2
----> 3 data = sav.SavReader(spss_file, returnHeader=True)
4 # directly setting 'ioLocale' to, e.g., 'de_DE.cp1252' yields an error
5 with data:
C:\Users\AppData\Local\Continuum\Anaconda3\lib\site-packages\savReaderWriter\savReader.py in __init__(self, savFileName, returnHeader, recodeSysmisTo, verbose, selectVars, idVar, rawMode, ioUtf8, ioLocale)
55 """ Constructor. Initializes all vars that can be recycled """
56 super(SavReader, self).__init__(savFileName, b"rb", None,
---> 57 ioUtf8, ioLocale)
58 self.savFileName = savFileName
59 self.returnHeader = returnHeader
C:\Users\AppData\Local\Continuum\Anaconda3\lib\site-packages\savReaderWriter\header.py in __init__(self, savFileName, mode, refSavFileName, ioUtf8, ioLocale)
29 def __init__(self, savFileName, mode, refSavFileName, ioUtf8, ioLocale=None):
30 """Constructor"""
---> 31 super(Header, self).__init__(savFileName, ioUtf8, ioLocale)
32 self.spssio = self.loadLibrary()
33 self.libc = cdll.LoadLibrary(ctypes.util.find_library("c"))
C:\Users\AppData\Local\Continuum\Anaconda3\lib\site-packages\savReaderWriter\generic.py in __init__(self, savFileName, ioUtf8, ioLocale)
37 if not self.encoding_and_locale_set:
38 self.encoding_and_locale_set = True
---> 39 self.ioLocale = ioLocale
40 self.ioUtf8 = ioUtf8
41
C:\Users\AppData\Local\Continuum\Anaconda3\lib\site-packages\savReaderWriter\generic.py in ioLocale(self, localeName)
388 self.setLocale = func(c_int(locale.LC_ALL), c_char_py3k(localeName))
389 if self.setLocale is None:
--> 390 raise ValueError("Invalid ioLocale: %r" % localeName)
391 return self.setLocale
392
ValueError: Invalid ioLocale: 'de_DE.cp1252'
Setting ioLocale="german" seems to work, however.
Second, the print statement in the above code (from the documentation) yields
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-6-ab18e6f08cb1> in <module>()
5 with data:
6 allData = data.all()
----> 7 print(str(data))
8 #print(data[2, 3])
9 allData = np.array(allData)
C:\Users\AppData\Local\Continuum\Anaconda3\lib\site-packages\savReaderWriter\savReader.py in __unicode__(self)
134 """This function returns a conscise file report of the spss data file,
135 For example unicode(SavReader(savFileName))"""
--> 136 return self.getFileReport()
137
138 @property
C:\Users\AppData\Local\Continuum\Anaconda3\lib\site-packages\savReaderWriter\savReader.py in getFileReport(self)
483 for cnt, varName in enumerate(self.varNames):
484 lbl = "string" if self.varTypes[varName] > 0 else "numerical"
--> 485 format_ = self.formats[varName].decode(self.fileEncoding)
486 varName = varName.decode(self.fileEncoding)
487 varlist.append(line % (cnt + 1, varName, format_, lbl))
AttributeError: 'str' object has no attribute 'decode'
The obvious reason is that in Python 3, str objects no longer have a decode method (only bytes objects do).
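For illustration, here is a minimal sketch of a guard that works whether a value arrives as bytes or as str. The helper name to_text is hypothetical and this is not necessarily how savReaderWriter fixed it; it only shows the general Python 3 pattern.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding only when it is bytes.

    Hypothetical helper for illustration; not part of savReaderWriter.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    # Already text (str in Python 3): nothing to decode.
    return value


# A format string may come back as bytes or as str depending on settings.
print(to_text(b"F8.2", "cp1252"))
print(to_text("F8.2", "cp1252"))
```

With such a guard, a line like the one at savReader.py:485 would not fail when self.formats[varName] is already a str.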
Third, the numpy-style data access to the array also fails with the error:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-7-28d52d549cc7> in <module>()
6 allData = data.all()
7 #print(str(data))
----> 8 print(data[2, 3])
9 allData = np.array(allData)
C:\Users\AppData\Local\Continuum\Anaconda3\lib\site-packages\savReaderWriter\savReader.py in __getitem__(self, key)
248 start, stop, step = key.indices(self.nCases)
249 elif is_array_slice:
--> 250 return self._get_array_slice(key, self.nCases, len(self.header))
251 else:
252 key = operator.index(key)
C:\Users\AppData\Local\Continuum\Anaconda3\lib\site-packages\savReaderWriter\savReader.py in _get_array_slice(self, key, nRows, nCols)
340 result = numpy.array(list(records))[key].tolist()
341 if abs(key[1].start - key[1].stop) == 1:
--> 342 return reduce(list.__add__, result) # flatten list if it's one col
343 if is_index:
344 return result[0]
NameError: name 'reduce' is not defined
The obvious reason: in Python 3, reduce is no longer a builtin function; it has to be imported from functools.
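A short sketch of the two usual ways around this, assuming the goal is simply to flatten a list of one-element rows as in the traceback above. The sample data in rows is made up for illustration.

```python
from functools import reduce      # Python 3 location of reduce
from itertools import chain

# Made-up sample: one column selected, so each record is a 1-element list.
rows = [[1.0], [2.0], [3.0]]

# Same expression as in the traceback, once reduce is imported.
flat_reduce = reduce(list.__add__, rows)

# Alternative that avoids reduce and repeated list concatenation.
flat_chain = list(chain.from_iterable(rows))

print(flat_reduce)
print(flat_chain)
```

Importing reduce from functools also works in Python 2.6+, so the same line can serve both versions.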
Comments (8)
-
repo owner -
(I am the original author of this issue. Just had to create an account ...)
Thanks, AJ, for your ultra-fast response!
- ioLocale="de_DE.cp1252": that's the Unix way of specifying a locale, so that won't work on Windows.
  - Yes, but when I don't specify any locale, i.e. ioLocale=None, I still get the error: ValueError: Invalid ioLocale: 'de_DE.cp1252'. Does it mean that the Unix way of specifying a locale, namely 'de_DE.cp1252', gets autoselected?
- Third, the numpy type data access to the array also fails with the error: I fixed that quite some time ago (I can't even remember when but I just checked the code).
  - Just for completeness: I used the pip installation. So, a pretty recent version, right? I will also check with the git version.
- Btw, if you pip install savReaderWriter from the repository you could use SavReaderNp which reads .sav data straight into numpy arrays.
  - Thanks for the information! I didn't know this.
Cheers, Gerhard
-
-
repo owner Hi Gerhard,
The pip installation from PyPI is version 3.3.0, which is fairly old by now. Better to use the repo version. (Note: I usually run the unittests in Debian Linux. My Jenkins CI configuration is kinda messed up :-)
Well, the de_DE.cp1252 thing is weird! I think this is a bug in Python. In my Dutch locale, getlocale returns a Windows-like locale tuple: ('Dutch_Netherlands', '1252'). But in a German (and Spanish, and maybe more) locale, a Unix-like tuple is returned (which is what you mentioned). This causes the SPSS I/O spssSetLocale function to return None, and I raise a ValueError if that happens. I think I should use the return value of the setter (not a typo) to get the locale. I checked this with Python 3.3.2 (activepython), 3.4.2 (miniconda) and 2.7.5 (activepython).
ActivePython 3.3.2.0 (ActiveState Software Inc.) based on
Python 3.3.2 (default, Sep 16 2013, 23:11:39) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getlocale()
(None, None)  # because setlocale has not been called yet
>>> locale.setlocale(locale.LC_ALL, "")
'Dutch_Netherlands.1252'
>>> locale.getlocale()
('Dutch_Netherlands', '1252')
>>> locale.setlocale(locale.LC_ALL, "german")
'German_Germany.1252'
>>> locale.getlocale()
('de_DE', 'cp1252')  # incorrect, unix-like!
>>> locale.setlocale(locale.LC_ALL, "German_Germany.1252")
'German_Germany.1252'
>>> locale.getlocale()
('de_DE', 'cp1252')  # incorrect, unix-like!
>>> locale.setlocale(locale.LC_ALL, "spanish")
'Spanish_Spain.1252'
>>> locale.getlocale()
('es_ES', 'cp1252')  # incorrect, unix-like!
>>> locale.setlocale(locale.LC_ALL, "italian")
'Italian_Italy.1252'
>>> locale.getlocale()
('Italian_Italy', '1252')  # correct!
I am very upset now! :-)
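The idea of using the return value of the setter could look roughly like the sketch below. The function names current_locale_name and set_and_report are hypothetical and this is not the actual savReaderWriter code; it only relies on the documented behaviour that locale.setlocale returns the new setting as a string and, when called without a locale argument, just reports the current one.

```python
import locale


def current_locale_name(category=locale.LC_ALL):
    """Query the current locale string without changing it.

    setlocale() with no second argument only reads the setting and
    returns it as reported by the C runtime, without going through
    getlocale()'s parsing of the name.
    """
    return locale.setlocale(category)


def set_and_report(name, category=locale.LC_ALL):
    """Try to set a locale; return the accepted name, or None if rejected."""
    try:
        return locale.setlocale(category, name)
    except locale.Error:
        return None


previous = current_locale_name()
accepted = set_and_report("C")              # "C" is available on every platform
locale.setlocale(locale.LC_ALL, previous)   # restore the earlier setting
print(previous, accepted)
```

Passing the string returned by setlocale on to other code avoids the round trip through getlocale, which is where the Unix-like ('de_DE', 'cp1252') tuple shows up in the session above.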
-
repo owner getlocale replaced with setlocale, see issue #26 → <<cset df032471df0f>>
-
repo owner Well, I did a quick fix in commit df032471df. I successfully ran the unittests in Python 2.7, 3.3, 3.4 and pypy2.7, but only under Debian Linux.
-
repo owner FYI I just submitted a bug in the Python Bug Tracker: http://bugs.python.org/issue23425
-
repo owner All right, I just checked it. The fix works, but it turned out I introduced another error a while ago. It affected Windows but not Linux. The code now works again in Windows and Linux. I specifically checked if I could open a .sav file while in a German host locale, with (a) ioLocale=None or (b) ioLocale="german" . Both work in Python 2.7 and 3.4 in Win 7 64.
-
repo owner - changed status to resolved
Fixed in commit b225071
Hi,
Thanks for taking the time to report this.
First, when setting ioLocale=None or ioLocale="de_DE.cp1252", I get the following error:
- ioLocale=None: that sounds similar to issue #24. I follow the recommended way to initialize the locale by calling locale.setlocale(locale.LC_ALL, ""). But for some reason that sometimes causes problems. I would be very interested to hear about possible explanations/fixes.
- ioLocale="de_DE.cp1252": that's the Unix way of specifying a locale, so that won't work on Windows.
Second, the print statement in the above code (from the documentation) yields...: Thanks. I thought I had a test case for this but apparently this is not working properly. It might be the @implements_to_string class decorator.
Third, the numpy type data access to the array also fails with the error: I fixed that quite some time ago (I can't even remember when but I just checked the code).
Btw, if you pip install savReaderWriter from the repository you could use SavReaderNp, which reads .sav data straight into numpy arrays.
Best wishes, Albert-Jan