UnicodeDecodeError when loading .sav into pandas
Issue #75
new
I have a large SPSS file that I am attempting to load into a pandas DataFrame. A number of the columns contain special characters, including Chinese and accented characters.
The documentation suggests doing the below:
with SavReader('greetings.sav', ioUtf8=True) as reader:
    for record in reader:
        print(record[-1])
and my code looks like below:
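import pandas as pd
from savReaderWriter import SavReader

rawdata = []
with SavReader('largefile.sav', ioUtf8=True) as reader:
    for record in reader:
        try:
            rawdata.append(record)
        except UnicodeDecodeError:
            # fall back to latin-1 if the record is not valid UTF-8
            r = record.decode('latin-1')
            rawdata.append(r.encode('utf-8'))
data = pd.DataFrame(rawdata)
# use the first row as the column names
data = data.rename(columns=data.loc[0]).iloc[1:]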
Running the above triggers a UnicodeDecodeError on the line "for record in reader:",
which looks like this. How can I get around this?
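My guess is that the exception is raised by the reader's iterator while it decodes a row, not by anything inside the loop body, which would explain why my try/except never fires. Below is a minimal sketch of that behaviour that does not use savReaderWriter at all. The names flaky_reader and load_rows are hypothetical stand-ins I made up for illustration, and I am only assuming that SavReader decodes rows during iteration in a similar way.

```python
def flaky_reader(raw_rows):
    """Hypothetical stand-in for SavReader: decodes each row while iterating."""
    for raw in raw_rows:
        # Decoding happens inside the iterator, not in the caller's loop body.
        yield raw.decode("utf-8")


def load_rows(raw_rows):
    """Collect rows and record where a UnicodeDecodeError surfaces."""
    rows = []
    caught_inside = False
    caught_outside = False
    try:
        for record in flaky_reader(raw_rows):
            try:
                rows.append(record)
            except UnicodeDecodeError:
                # Never reached: append() itself does not decode anything.
                caught_inside = True
    except UnicodeDecodeError:
        # Reached: the error comes from advancing the iterator.
        caught_outside = True
    return rows, caught_inside, caught_outside


# b"caf\xe9" is latin-1 encoded and is not valid UTF-8.
rows, caught_inside, caught_outside = load_rows([b"hello", b"caf\xe9", b"bye"])
print(rows, caught_inside, caught_outside)  # prints ['hello'] False True
```

In this sketch only the outer try/except sees the error, and the generator stops at the bad row, so the remaining rows are lost. If SavReader behaves the same way, wrapping the loop body would not help, and the decoding would have to be handled by the reader itself or on undecoded bytes.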