Can't JSON serialize sqlalchemy.util.langhelpers.symbol

Issue #3422 resolved
Ken Sheppardson created an issue

Starting with SQLAlchemy 0.9.0, simplejson sees util.symbol objects as integers, causing it to generate invalid JSON.

We're trying to serialize stack traces and function arguments as JSON, and although we've defined a custom JSONEncoder default() method to handle non-standard data types, we're getting invalid JSON like:

...
"args": [
    "<sqlalchemy.orm.attributes.ScalarAttributeImpl object at 0x4966090>",
    "<sqlalchemy.orm.state.InstanceState object at 0x52ca650>",
    "{'_data': {'follower_count': None, 'followers': None, 'following': None, 'locale': u'zh_hans', 'region': None, 'timezone': u'Africa/Cairo', 'utc_offset': Decimal('2.0')}, '_modified': {'locale': u'zh_hans'}, '_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 0x52ca650>, 'locale': u'zh_hans'}",
    symbol('PASSIVE_OFF')
],
...

...because although our custom default() method handles data types simplejson doesn't recognize, simplejson thinks "symbol('PASSIVE_OFF')" is an int and dumps the raw value into the JSON.

To reproduce (using python 2.7.6, sqlalchemy 1.0.4, simplejson 3.6.5):

Python 2.7.6 (default, Sep  9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlalchemy as sa
>>> import json
>>> sym = sa.util.symbol("PASSIVE_OFF")
>>> data = ['str1', 123, sym]
>>> print json.JSONEncoder().encode(data)
["str1", 123, symbol('PASSIVE_OFF')]
>>> print sym
symbol('PASSIVE_OFF')
>>> type(sym)
<class 'sqlalchemy.util.langhelpers.symbol'>
>>> isinstance(sym,int)
True

Python 3 handles this slightly differently: rather than "symbol('PASSIVE_OFF')" in the JSON, we see integers like -4343837943729882877 in the output of encode().
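The Python 3 behaviour can be illustrated without SQLAlchemy. The following is a minimal sketch, assuming a hypothetical Symbol class as a stand-in for util.symbol (an int subclass with a custom repr); it is not the library's implementation. The standard library json encoder serializes int instances, including subclasses, on its own, so default() is only consulted for the value it cannot handle:

```python
import json


class Symbol(int):
    """Hypothetical stand-in for sqlalchemy.util.symbol: a named int subclass."""

    def __new__(cls, name, value):
        obj = super().__new__(cls, value)
        obj.name = name
        return obj

    def __repr__(self):
        return "symbol(%r)" % self.name


calls = []


def default(obj):
    # Record every object the encoder hands to the fallback hook.
    calls.append(obj)
    return repr(obj)


sym = Symbol("PASSIVE_OFF", 42)

# complex is not JSON serializable, so it is routed through default();
# the Symbol instance passes the encoder's int check and never gets there.
encoded = json.dumps(["str1", 123, sym, complex(1, 2)], default=default)
print(encoded)
print(calls)
```

Under Python 3 the symbol should appear in the output as its plain integer value, and calls should contain only the complex number.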

Comments (3)

  1. Mike Bayer repo owner

    IMO this is a bug in Python's json library. This is more than just serializing it correctly; these are constants that are compared using "is", so you need to pull them from the module.

    However, I'm not able to get it to recognize the int subclass; in the script below, "default" fails:

    from sqlalchemy import util
    import sys
    import json
    
    
    def _str_to_symbol(dct):
        if '__sa_symbol__' in dct:
            module = sys.modules[dct['module']]
            return getattr(module, dct['name'])
        else:
            return dct
    
    
    def _symbol_to_str(obj):
        if isinstance(obj, util.symbol):
            return {
                "__sa_symbol__": 'true',
                'module': obj.__module__, 'name': obj.name}
        else:
            return obj
    
    
    def json_loads(string):
        return json.loads(string, object_hook=_str_to_symbol)
    
    
    def json_dumps(obj):
        return json.dumps(obj, default=_symbol_to_str)
    
    
    from sqlalchemy.orm import attributes
    sym = attributes.ATTR_WAS_SET
    
    dumped = json_dumps([1, 2, sym])
    loaded = json_loads(dumped)
    assert loaded[2] is attributes.ATTR_WAS_SET
    

    Though I can't see why "need to be able to JSON serialize the internals of the library" has to be a supported use case. If these constants were just simple ints, your program would still be failing, as these need to be the same int value; it's not safe to rely on "1 is 1" in Python; I've seen this fail on PyPy under cloudy circumstances.

  2. Ken Sheppardson reporter

    Mike,

    > Though I can't see why "need to be able to JSON serialize the internals of the library" has to be a supported use case.

    For context, this came up here at Rollbar, where we're trying to capture as much app context information as is possible/practical when an exception is thrown.

    We have a hack/workaround for our issue (https://github.com/rollbar/pyrollbar/issues/60)

    ...and I've created a simplejson issue (https://github.com/simplejson/simplejson/issues/118)

    I'm not sure if I can really add anything else.

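Since default() is not consulted for int subclasses, as the first comment found, one possible approach is to walk the data and replace symbols before handing it to the encoder. This is only a sketch under stated assumptions: Symbol is a hypothetical stand-in for util.symbol, replace_symbols is an illustrative helper name, and this is not necessarily the workaround used in the linked pyrollbar issue. The tagged-dict shape follows the "__sa_symbol__" convention from the script in the first comment.

```python
import json


class Symbol(int):
    """Hypothetical stand-in for sqlalchemy.util.symbol: a named int subclass."""

    def __new__(cls, name, value):
        obj = super().__new__(cls, value)
        obj.name = name
        return obj


def replace_symbols(obj):
    """Recursively swap Symbol instances for a tagged dict before encoding."""
    if isinstance(obj, Symbol):
        return {"__sa_symbol__": "true", "name": obj.name}
    if isinstance(obj, dict):
        # Only values are converted here; symbol keys are left untouched.
        return {key: replace_symbols(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [replace_symbols(item) for item in obj]
    return obj


payload = {"args": ["str1", 123, Symbol("PASSIVE_OFF", 42)]}
encoded = json.dumps(replace_symbols(payload), sort_keys=True)
print(encoded)
```

The cost is an extra pass over the data, but it does not depend on how a particular JSON library treats int subclasses.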