Issue #8 new

Unintuitive errors on nested queries

Leo Antunes
created an issue

Hi Pierre,

I have facts declared over Python objects, as in this simplified example:

from pyDatalog import pyDatalog

class Something():
    pass

s = Something()

pyDatalog.create_terms('pred')

+pred(s)

This works just fine. However, if the object itself runs a query against pyDatalog while another query is being answered, we get a pretty unhelpful error like this:

<repr(<pyDatalog.pyParser.Query at 0x7f7f9680f2b0>) failed: AttributeError: 'NoneType' object has no attribute 'append'>

One example is an object that defines a rich __eq__ method which checks equality based on facts declared for that object. Whenever the query contains (O1 == O2), we get the error above. I imagine this happens because pyEngine.ask() is called a second time while the original query is still being answered.
Another example is an object whose __str__ uses queries to describe itself (since pyDatalog uses the object's textual representation internally).

To make matters worse, if the query runs inside a function and you only inspect the contents of the variables it used, you won't even see that there was an error. Like this:

query_that_calls_nested_query(X)
print(X.data)
# the query silently failed and we didn't see it

To actually see it, you have to check the return value of the query directly, since the query "masks" the exception as a result.

Ideally, I imagine pyDatalog could support nested queries, but since I see that the code is full of globals, I guess that would be pretty hard to implement. If that assumption is correct, I'd suggest at least two things:

  1. make pyEngine.ask() check if it's a nested call and abort with a descriptive error and a full traceback (as opposed to this "hidden" query error)
  2. add a tip to the documentation about this limitation (sorry if it is already mentioned, but I couldn't find it). Bonus points if you can list exactly which methods of an object must not include queries (at least __str__, __eq__, maybe __contains__?)
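
The first suggestion could look roughly like the sketch below. It is a generic re-entrancy guard, not pyDatalog's actual code: the `ask` function here is a hypothetical stand-in for an engine entry point, and `forbid_reentry`, `NestedQueryError` and the `callback` parameter are names made up for illustration. It also assumes single-threaded use.

```python
import functools


class NestedQueryError(RuntimeError):
    """Raised when a guarded function is re-entered while already running."""


def forbid_reentry(func):
    """Wrap func so that a nested call raises a descriptive error."""
    active = False  # True while a call to func is in progress

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal active
        if active:
            raise NestedQueryError(
                f"{func.__name__}() was called while another call was still "
                "running; nested queries are not supported"
            )
        active = True
        try:
            return func(*args, **kwargs)
        finally:
            # Always release the guard, even if func raised.
            active = False

    return wrapper


@forbid_reentry
def ask(query, callback=None):
    # Hypothetical stand-in for an engine entry point (not pyDatalog's ask).
    # The callback simulates user code such as __eq__ or __str__ that runs
    # while the query is being answered.
    if callback is not None:
        callback()
    return [(query,)]
```

With this guard, a call such as `ask("outer", callback=lambda: ask("inner"))` fails immediately with a `NestedQueryError` and a normal traceback pointing at the nested call, instead of surfacing later as a failed repr.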

Also, feel free to downgrade this to an "enhancement" if you feel it better describes the issue. I was just a bit frustrated after hours of bug hunting ;)

Cheers

PS.: this is not a regression, it affects both 0.14.5 and 0.14.6

Comments (12)

  1. Leo Antunes reporter

    Sure. My first idea was something like this

    from pyDatalog import pyDatalog, Logic
    
    pyDatalog.create_terms('has_some_property, X, Y')
    class Something():
        def __init__(self, prop):
            self.prop = prop
            +has_some_property(self, prop)
        def __eq__(self, other):
            has_some_property(other, X)
            return X.data == self.prop
        def __hash__(self):
            return id(self) # just to demonstrate
    
    s = Something('a')
    

    Now making this query should show you what I mean:

    has_some_property(X, Y)
    

    Of course this can be rewritten so that the equality test is done explicitly in the rules' body, but in my case the test is a bit more complex and would mean a lot of code duplication. So I hope it's understandable why I wanted to do things this way.

    But again: I understand if this is not feasible. My main point is that the error should be more obvious.

  2. Leo Antunes reporter

    Great! That seems to have done it! Apparently it wasn't as hard as I imagined! :)

    As a bonus this seems to allow us to write python-predicates that do nested queries, like this:

    from pyDatalog import pyDatalog
    
    pyDatalog.create_terms('a, X, Y')
    
    @pyDatalog.predicate()
    def b2(x, y):
        a(X, Y)
        yield(X.v(),Y.v())
    pyDatalog.create_terms('b')
    
    +a(1,2)
    b(1, X)
    

    I imagine this doesn't perform particularly well, but I'm sure there are reasons for using it.

    Huge thanks for the quick help!

  3. Leo Antunes reporter

    Oh, wait, it seems there's a regression: after the query returns, the variable used in it seems to contain the value it held in the nested query. Like so:

    pred(X)   # returns 'a'
    X.data    # returns 'b'
    

    This doesn't happen in the contrived example above, but it seems to be the case in my more complex code. I still haven't managed to create a minimal example for the problem though.

  4. Leo Antunes reporter

    Hi Pierre, unfortunately I still haven't managed to find a minimal example. I'm a bit swamped at the moment, so I had to concentrate on other things. As a workaround I'm simply saving the result of the query. That is, the following works just fine and gives the value I expected from X.data in my comment above:

    ans = pred(X)
    ans[0][0]
    

    Whenever I get the time to try and pinpoint this issue further, I'll let you know!

  5. Leo Antunes reporter

    I'm getting a lot of regressions between b1a5df9 and e62fe95 and unfortunately don't have the time right now to bisect them (it may be something wrong on my side). Give me a few more days and I'll try to get back to you about this. Regardless: thanks for all the help!
