savReaderWriter Upgrade Error

Issue #63 closed
Former user created an issue

Hi,

I just upgraded your package from 3.3.0 to 3.4.2, and my Python code, which worked perfectly fine on 3.3.0, now fails with this error:

ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Comments (6)

  1. Albert-Jan Roskam repo owner

    Hi,

    Thanks for taking the time to report this issue. Can you give some minimal code that reproduces the problem? Thanks!

    Albert-Jan

  2. Rhys John Miller

    Hi,

    for chunk in pd.read_csv(filePath, chunksize=10**6):
        records = chunk.values
        args = (columnNames, lengthOfFields)

    with savReaderWriter.SavWriter(filePath, *args) as writer:
        writer.writerows(records)
    
  3. Albert-Jan Roskam repo owner

    Thanks. It seems that numpy.isnan is called with a str value. I am on vacation right now, but I'll fix this when I'm back.

    Meanwhile, could you try writer.writerows(chunk)? I modified writerows a while ago so it accepts more datatypes, e.g. pandas.DataFrame and numpy.array.

    Best regards, Albert-Jan
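
The diagnosis above (numpy.isnan being called with a str value) can be reproduced without savReaderWriter. The sketch below is only an illustration: the sample data and the is_missing helper are hypothetical and are not the fix that went into the library. It shows the ufunc rejecting an object-dtype array that mixes floats and strings, and one possible way to guard the check so strings never reach numpy.isnan.

```python
import numpy as np

# Object-dtype rows mixing floats and strings, similar to what DataFrame.values
# yields for a CSV with both numeric and text columns (illustrative data).
rows = np.array([[1.0, "a"], [np.nan, "b"]], dtype=object)

# Calling the ufunc directly on the object array raises the reported TypeError.
try:
    np.isnan(rows)
    error_message = None
except TypeError as exc:
    error_message = str(exc)


def is_missing(value):
    """Hypothetical guard: True only for a float NaN; strings never reach np.isnan."""
    return isinstance(value, (float, np.floating)) and bool(np.isnan(value))


mask = [[is_missing(value) for value in row] for row in rows]
print(error_message)
print(mask)
```

The type check comes first so that numpy.isnan is only ever evaluated on floating-point values.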

  4. Rhys John Miller

    Thanks for replying whilst on holiday! I will let you enjoy yourself! But passing the chunk instead gives me the following error:

    Cannot do inplace boolean setting on mixed-types with a non np.nan value
    
  5. Albert-Jan Roskam repo owner

    Hi Rhys,

    As you can see, I just added some code to fix the first problem you described. However, I could not reproduce the second issue ("Cannot do inplace boolean setting on mixed-types with a non np.nan value"). I have no idea whether I fixed that, because I haven't done much with this project lately. Anyway, could you check whether this solves your problem? Below is the test that I wrote (actually, two):

    Thanks!

    Albert-Jan

    from os.path import join
    from tempfile import gettempdir
    from io import StringIO
    from unittest.case import SkipTest
    
    import pandas as pd
    
    import savReaderWriter as srw
    
    
    skip = False  # only Pypy
    
    def test_writerows_pd_np_issue63():
        """
        issue #63 "ufunc 'isnan' not supported for the input types"
        Caused by strings that contained NaN values
        """
        if skip:
            raise SkipTest
        buff = StringIO(u"""n1,n2,s1,s2
        1,1,a,a
        2,2,b,bb
        3,,c,""")
        desired = [[1.0, 1.0, b'a', b'a'], 
                   [2.0, 2.0, b'b', b'bb'], 
                   [3.0, None, b'c', b'']]
    
        df = pd.read_csv(buff, chunksize=10**6, sep=',').get_chunk()
        arr = df.values
        savFileName = join(gettempdir(), "check.sav")
        kwargs = dict(varNames = list(df),
                      varTypes = dict(n1=0, n2=0, s1=1, s2=2),
                      savFileName = savFileName,
                      ioUtf8=True)
    
        # numpy
        with srw.SavWriter(**kwargs) as writer:
            writer.writerows(arr)
        with srw.SavReader(savFileName) as reader:
            actual = reader.all(False)
        assert actual == desired, actual
    
        # pandas
        with srw.SavWriter(**kwargs) as writer:
            writer.writerows(df)
        with srw.SavReader(savFileName) as reader:
            actual = reader.all(False)
        assert actual == desired, actual
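
The docstring in the test attributes the error to string columns that contain NaN values. As a rough illustration of where those values come from, the sketch below parses the same CSV data with pandas alone (no savReaderWriter involved) and inspects the resulting array. The variable names are made up for this example, and the data lines are written without leading indentation.

```python
from io import StringIO

import pandas as pd

# Same columns as the test above: two numeric, two string, with gaps in row 3.
csv_text = "n1,n2,s1,s2\n1,1,a,a\n2,2,b,bb\n3,,c,\n"
df = pd.read_csv(StringIO(csv_text))
arr = df.values  # mixed column types collapse into a single object array

numeric_gap = arr[2][1]  # empty n2 field
string_gap = arr[2][3]   # empty s2 field
print(arr.dtype)
print(repr(numeric_gap), repr(string_gap))
```

Both gaps come back as a float NaN, so the string column s2 ends up holding a mix of str and float values, which is the situation a missing-value check has to cope with.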
    