#40 Merged at 5c63651
Repository
Deleted repository
Branch
default (778c91283a73)
Repository
pytave
Branch
default

Add isequal method to allow comparing pyobjects (fixes issue #53)

Author
  1. Abhinav Tripathi
Reviewers
Description
  • @pyobject/pyobject.m: Add an 'isequal' method to fix a BIST test and some more tests.
  • Issues #53: Pyobject needs "isqual" method resolved

Comments (32)

  1. Abhinav Tripathi author

    Sorry for the noise (I don't know if you people get mails for each of my commit edits)... I saw that one of the changes I did was already there in pull request #39 so I removed it from here.

    After that, I changed my repo on bitbucket to non-publishing repo (I remember Mike Miller said about doing this sometimes back). Although my commits were shown as drafts on bitbucket, I could not overwrite them (without stripping) if I did hg commit --amend on my pc. There was one benefit that I did on have to change the phase of the commit on my pc.

    1. Mike Miller repo owner

      Good, yes setting the repo to non-publishing simply means that commits will stay in draft phase so you can rewrite them. You will still need to strip them on the Bitbucket side if you rewrite and push a new copy unfortunately.

      There is an open issue to allow you to enable the Changeset Evolution plugin plugin on Bitbucket repos, which will make drafting and rewriting and rebasing a whole lot smoother (plugin needs to be installed and enabled both client-side and server-side for best results).

  2. Mike Miller repo owner

    In the Octave language, isequal performs both a value comparison and a type comparison. isequal is not the same thing as the operator ==. isequal (2, int8(2)) is false because they are not the same type. I think we want to keep that consistent for Python objects.

  3. Abhinav Tripathi author

    For the isequal thing.. Should I do x == y and type(x) == type(y) ?? Because we cannot call object.__eq__ as it gives "NotImplemented" for lists!! Precisely, if a = [1, 2, 3]; b = [1, 2, 3]; then object.__eq__(a,b) is NotImplemented while a == b and type(a) == type(b) is True...

    I will update this as per your other comments.

  4. Mike Miller repo owner

    I think the safest and most robust comparisons would be pycall ("operator.eq", x, y) and pycall ("operator.eq", x.__class__, y.__class__).

    Note that type(x) may be different from x.__class__ in Python 2, the latter is better for that reason.

    In Python 2, some classes define __cmp__ and some classes define __eq__. In Python 3, all classes use __eq__ and __cmp__ is no longer used at all. Calling operator.eq(x, y) takes care of doing the same thing that x == y would do for all versions and all objects without needing to define your own inline function.

  5. Colin Macdonald

    isequal (2, int8(2)) is false

    Citation needed. I've always thought it should work that way, but it doesn't. So when I implemented isequal in Symbolic, I didn't do that.

    1. Mike Miller repo owner

      Good question, we could make that go either way. I'm leaning towards making it false for any non-pyobject arguments. In practice a returned value will either be stored in a pyobject or automatically converted into an Octave type. If two objects are equal, then they both should be pyobjects or they both should have been converted to Octave types.

      For example, an open issue is to leave int objects (in Python 3) as pyobjects. After that change is made, I think that isequal (py.int (2), 2)) should be false, because Octave are intentionally not converting the object into one of the native types.

      Other opinions?

    2. Abhinav Tripathi author

      I have no problem with either approach. But, I would say to let it be true. Mostly because if a user calls a python function and wants to compare a value, he must be little worried about the type.. (ex: if in future we drop all the conversion to octave then too it should work)..

      1. Mike Miller repo owner

        And I would rather not see isequal return true for an implicit conversion of a cell array or a struct for example.

        x = py.dict (pyargs ("a", 1));
        y = struct ("a", 1);
        isequal (x, y)  ## should not be true
        

        The more I think about it the more I lean towards limiting the comparison to arguments that are already pyobjects. I think it's better to be precise and fix cases if necessary when it does not do what the user would expect.

        1. Abhinav Tripathi author

          Sure. In this case it seems fair to return false. But then I think it will get complicated to keep it true for integers or other cases.. Probably better to keep false then......

  6. Abhinav Tripathi author

    I have added the tests and updated this. I can add && isa (varargin{i-1}, "pyobject") && isa (varargin{i}, "pyobject") to res if we decide to keep the isequal false for different types of data.

    1. Mike Miller repo owner

      res = all (strcmp ("pyobject", cellfun ("class", varargin, "uniformoutput", false))) is a little more compact / vectorized way of doing this.

      1. Abhinav Tripathi author

        Thanks for that. I knew about cellfun but using the result into strcmp was new for me. I would have called cellfun twice to do that!

        Well, I've updated the PR now.

  7. Mike Miller repo owner

    I noticed another strange thing going on.

    Some Python objects, e.g. numpy arrays, may return a non-scalar value from the == operator.

    How would you deal with this in Python?

    x = [1, 2, 3]
    y = numpy.array((1, 2, 3))
    if (x == y):
        do_something();
    

    This throws a ValueError.

      1. Mike Miller repo owner

        Right, my point is how do we compute isequal for a numpy array? As it's written here, it's ambiguous.

        • If pycall("operator.eq", x, y) returns a scalar bool, fine.
        • If pycall("operator.eq", x, y) returns a tuple, runtime error. I don't know of a case that does this currently, but it's not impossible.
        • If pycall("operator.eq", x, y) returns a ndarray, it's ok because Octave computes the all truth value of a matrix when using &&, but it might be better to be explicit about this.
        • If pycall("operator.eq", x, y) returns a pyobject, logic error, because pyobjects don't convert to logical appropriately (yet).
    1. Abhinav Tripathi author

      If I do this in Octave:

      >> s = pyeval ("numpy.array ((1, 2, 3))" );
      >> ss = pyeval ("[1, 2, 3]" );
      >> isequal (s, ss)
      ans = 0
      

      It doesn't throw an error for comparing!?
      If a user if using pyexec to compare the 2 values using completely python code then it should probably be his worry to handle that...

      1. Mike Miller repo owner

        That is the third bullet point in my list above.

        My point is not about a user using pyexec for equality. My question is about Python in general, is there a way to compare any two Python objects for equality and always get a scalar boolean value?

        The Data Model docs suggest that if attempts to cast the result of __eq__ to a bool. That looks like

        res = res && pycall ("bool", pycall ("operator.eq", varargin{1}, varargin{i}));
        error: pycall: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
        

        for us. This causes an error when either or both arguments are a numpy array. Does that clarify my concerns?

          1. Mike Miller repo owner

            Please try it and see for yourself. Look at the intermediate value returned by the equality comparison and see how this might cause problems. Either step into the function with the Octave debugger or use something like this debug change to your for loop:

                tmp = pycall ("operator.eq", varargin{1}, varargin{i})
                printf ("Comparison #%d is %sa scalar\n", i-1, {"", "NOT "}{2 - isscalar (tmp)});
                res = res && tmp
            

            Here's what I see:

            >> isequal (pyobject (1), 1, 1, 1)
            tmp = 1
            Comparison #1 is a scalar
            res = 0
            tmp = 1
            Comparison #2 is a scalar
            res = 0
            tmp = 1
            Comparison #3 is a scalar
            res = 0
            ans = 0
            

            That looks good for a simple scalar object. Here's what I see when working on a numpy array:

            >> isequal (pyobject ([1, 2, 3]), [1, 2, 3])
            tmp =
            
              1  1  1
            
            Comparison #1 is NOT a scalar
            res = 0
            ans = 0
            

            This is not good. The only reason there is no error is because Octave automatically does an all() on the value to convert it to a boolean scalar. This is fragile and I would rather take steps to ensure that the equality comparison of two Python objects always returns a scalar.

    1. Mike Miller repo owner

      Can you move the early-return condition before the pycall? That will be more efficient, save a few operations in the case where not all arguments are pyobjects, and save the superfluous extra check on the last iteration.

      Can you also add an %!error test for isequal (pyobject ())?

      Thanks for your patience and rewrites, with those two small changes I can take this.