#63 - savReaderWriter Upgrade Error

Issue #63 closed

Former user created an issue 2017-08-01

Hi,

I just upgraded your package from 3.3.0 to 3.4.2, and when I run my Python code, which worked perfectly on 3.3.0, I get this error:

ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Comments (6)

Albert-Jan Roskam repo owner
Hi,

Thanks for taking the time to report this issue. Can you give some minimal code that reproduces the problem? Thanks!

Albert-Jan
- 2017-08-01T19:19:33+00:00

Rhys John Miller

Hi,

for chunk in pd.read_csv(filePath, chunksize=10**6):
    records = chunk.values
    args = (columnNames, lengthOfFields)

with savReaderWriter.SavWriter(filePath, *args) as writer:
    writer.writerows(records)

2017-08-02T07:51:38+00:00

Albert-Jan Roskam repo owner
Thanks. It seems that numpy.isnan is called with a str value. I am on vacation right now, but I'll fix this when I'm back.
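To illustrate the diagnosis above, here is a minimal sketch of how `numpy.isnan` fails on an object array that mixes floats and strings, which is what `chunk.values` returns for a CSV with both numeric and string columns. The sample data and the `is_missing` helper are hypothetical and are not taken from savReaderWriter.

```python
import numpy as np

# Hypothetical mixed numeric/string rows, similar to DataFrame.values output.
records = np.array([[1.0, "a"], [np.nan, "b"]], dtype=object)


def is_missing(value):
    """Hypothetical helper: True only for float NaN, never for strings."""
    return isinstance(value, float) and np.isnan(value)


raised = False
message = ""
try:
    # Object dtype has no isnan implementation, so this raises TypeError.
    np.isnan(records)
except TypeError as exc:
    raised = True
    message = str(exc)

# Checking the type first avoids the error; NaN becomes None, strings pass through.
cleaned = [[None if is_missing(v) else v for v in row] for row in records]
```

Checking each value's type before calling `numpy.isnan` is only one possible workaround; it is not necessarily how the library handles this case.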

Meanwhile, could you try writer.writerows(chunk)? I modified writerows a while ago so it accepts more datatypes, e.g. pandas.DataFrame and numpy.array.

Best regards, Albert-Jan
- 2017-08-02T12:26:32+00:00
Rhys John Miller
Thanks for replying whilst on holiday! I will let you enjoy yourself! But using the chunk instead gives me the following error:
```
Cannot do inplace boolean setting on mixed-types with a non np.nan value
```
- 2017-08-02T12:34:11+00:00
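This message appears to come from pandas rather than from savReaderWriter: as far as I know, older pandas releases raise it when a boolean-mask assignment such as `df[mask] = value` is applied to a frame with mixed column types and `value` is not NaN. The sketch below is one way to sidestep that path by converting the chunk to plain Python rows before writing. The sample CSV text is hypothetical, and whether this matches what `writerows` needs internally is an assumption.

```python
import io

import pandas as pd

# Hypothetical sample with a missing number and a missing string.
csv_text = "n1,n2,s1,s2\n1,1,a,a\n2,2,b,bb\n3,,c,\n"
df = pd.read_csv(io.StringIO(csv_text))

# Build plain Python rows, mapping each missing cell to None.
# pd.isna works on scalars of any type, so no in-place masking is needed.
rows = [
    [None if pd.isna(value) else value for value in row]
    for row in df.itertuples(index=False, name=None)
]
```

The resulting `rows` list could then be passed to `writer.writerows(rows)` instead of the DataFrame itself.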
Albert-Jan Roskam repo owner
- changed status to closed
added fix and unittests that (hopefully) close issue ~~#63~~

→ <<cset c2e9129a9653>>
- 2017-09-01T18:16:07+00:00

Albert-Jan Roskam repo owner

Hi Rhys,

As you can see, I just added some code to fix the first problem you described. However, I could not reproduce the second issue ("Cannot do inplace boolean setting on mixed-types with a non np.nan value"). I have no idea whether I fixed that because I haven't done much with this project lately. Anyway, could you check whether this solves your problem? Below is the test that I wrote (actually, two).

Thanks!

Albert-Jan

from os.path import join
from tempfile import gettempdir
from io import StringIO
from unittest.case import SkipTest

import pandas as pd

import savReaderWriter as srw


skip = False  # only Pypy

def test_writerows_pd_np_issue63():
    """
    issue #63 "ufunc 'isnan' not supported for the input types"
    Caused by strings that contained NaN values
    """
    if skip:
        raise SkipTest
    buff = StringIO(u"""n1,n2,s1,s2
    1,1,a,a
    2,2,b,bb
    3,,c,""")
    desired = [[1.0, 1.0, b'a', b'a'], 
               [2.0, 2.0, b'b', b'bb'], 
               [3.0, None, b'c', b'']]

    df = pd.read_csv(buff, chunksize=10**6, sep=',').get_chunk()
    arr = df.values
    savFileName = join(gettempdir(), "check.sav")
    kwargs = dict(varNames = list(df),
                  varTypes = dict(n1=0, n2=0, s1=1, s2=2),
                  savFileName = savFileName,
                  ioUtf8=True)

    # numpy
    with srw.SavWriter(**kwargs) as writer:
        writer.writerows(arr)
    with srw.SavReader(savFileName) as reader:
        actual = reader.all(False)
    assert actual == desired, actual

    # pandas
    with srw.SavWriter(**kwargs) as writer:
        writer.writerows(df)
    with srw.SavReader(savFileName) as reader:
        actual = reader.all(False)
    assert actual == desired, actual

2017-09-01T18:23:48+00:00

Assignee: –

Type: bug

Priority: major

Status: closed

Votes: 0

Watchers: None