- changed status to wontfix
Get UnicodeEncodeError when printing repr of a unicode string
Issue #2495
resolved
In 0.8.0b1 I get this error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 26: ordinal not in range(128)
when trying to print the representation of a unicode string.
The input data it fails on is:
"u'Clinton Hits \\u2018Black Helicopters\\u2019 Crowd to Push Sea Treaty'"
My model class:
########################################################################
class CERssEntry(Base):
    """
    CERssEntry Class
    Create some class properties before initialization
    """
    __tablename__ = "rssentries"
    id = Column(Integer, primary_key=True)
    title = Column(Unicode(255), nullable=False)
    link = Column(Unicode(255), nullable=False)
    published = Column(DateTime, nullable=False)
    rssfeed_id = Column(Integer, ForeignKey('rssfeeds.id'))
    # creates a bidirectional relationship
    # from CERssEntry to CERssFeed it's Many-to-One
    # from CERssFeed to CERssEntry it's One-to-Many
    rss_feed = relation(CERssFeed, backref=backref('rssentries', order_by=id))

    #----------------------------------------------------------------------
    def __init__(self, title, link, published):
        """Constructor"""
        self.title = title
        self.link = link
        self.published = published

    def __repr__(self):
        return "<CERssEntry('%s', '%s', '%s')>" % (self.title, self.link, self.published)
In def __repr__(self): it fails to print self.title.
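For reference, a minimal sketch of the failure, written in Python 3 syntax so it runs on a current interpreter (the report itself is Python 2, where print does the encoding implicitly). The "x" and "y" values for link and published are placeholders, not data from the report. Encoding the '%s'-formatted string to ASCII fails at the left quotation mark, which sits at offset 26, the same position the traceback names:

```python
# Title from the report; "x" and "y" are placeholder values (assumptions).
title = "Clinton Hits \u2018Black Helicopters\u2019 Crowd to Push Sea Treaty"
formatted = "<CERssEntry('%s', '%s', '%s')>" % (title, "x", "y")

try:
    # Stand-in for what an ASCII-only stdout has to do with the text.
    formatted.encode("ascii")
    failure = None
except UnicodeEncodeError as exc:
    failure = exc

if failure is not None:
    # The offset of the first character that ASCII cannot represent.
    print(failure.encoding, failure.start)
```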
Workaround:
in order to get it to print the title I needed to do:
def __repr__(self):
    return "<CERssEntry('%s', '%s', '%s')>" % (repr(self.title.__repr__()), self.link, self.published)
note the change from:
self.title
to:
repr(self.title.__repr__())
Hope this helps.
Comments (6)
-
repo owner -
Account Deleted I made your suggested changes and it still fails.
-
repo owner Here's a script:
class CERssEntry(object):
    def __init__(self, title, link, published):
        self.title = title
        self.link = link
        self.published = published

    def repr_one(self):
        return "<CERssEntry('%s', '%s', '%s')>" % (self.title, self.link, self.published)

    def repr_two(self):
        return "<CERssEntry(%r, %r, %r)>" % (self.title, self.link, self.published)

entry = CERssEntry(title=u'Clinton Hits \u2018Black Helicopters\u2019 Crowd to Push Sea Treaty', link="x", published="y")

try:
    print entry.repr_one()
except UnicodeEncodeError:
    print "yup!"

print entry.repr_one().encode('utf-8')
print entry.repr_two()
output:
classics-MacBook-Pro:sqlalchemy classic$ python test.py
yup!
<CERssEntry('Clinton Hits ‘Black Helicopters’ Crowd to Push Sea Treaty', 'x', 'y')>
<CERssEntry(u'Clinton Hits \u2018Black Helicopters\u2019 Crowd to Push Sea Treaty', 'x', 'y')>
-
Account Deleted I get different output: no failure, and I'm not sure why:
[(Sat May 26 14:47:39) ~/tmp](sidha@varga)% python test.py
<CERssEntry('Clinton Hits ‘Black Helicopters’ Crowd to Push Sea Treaty', 'x', 'y')>
<CERssEntry('Clinton Hits ‘Black Helicopters’ Crowd to Push Sea Treaty', 'x', 'y')>
<CERssEntry(u'Clinton Hits \u2018Black Helicopters\u2019 Crowd to Push Sea Treaty', 'x', 'y')>
anyway, I've settled on the workaround for now.
-
repo owner Your default encoding might be utf-8, but that would mean your original symptom shouldn't be occurring...
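One way to see which encodings are in play on a given machine is a sketch like the one below. The helper name describe_encodings is hypothetical. Under Python 2, sys.getdefaultencoding() is normally "ascii", and print uses sys.stdout.encoding when that is set, so two machines can behave differently for the same script:

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings that influence how text gets printed."""
    return {
        # Encoding the interpreter attached to stdout (may be None when piped).
        "stdout": getattr(sys.stdout, "encoding", None),
        # Interpreter-wide fallback used for implicit conversions.
        "default": sys.getdefaultencoding(),
        # Encoding suggested by the user's locale settings.
        "locale": locale.getpreferredencoding(False),
    }


for name, value in sorted(describe_encodings().items()):
    print("%-8s %s" % (name, value))
```

Running this in both shells and comparing the "stdout" line should show whether the terminal setup explains the differing results.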
-
repo owner - removed milestone
Removing milestone: 0.8.0b1 (automated comment)
This seems basically like you're failing to encode the string before printing it. Python 2 has to encode a unicode string to bytes in order to print it, and when the output encoding falls back to "ascii" (the default), any non-ASCII character raises UnicodeEncodeError.
Try encoding the string explicitly before you print it, for example with .encode('utf-8').
However, the spirit of repr() is that you're producing Python code, so really the repr() should just format each field with %r.
There's no SQLAlchemy bug here AFAICT.
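The two suggestions in this comment can be sketched as follows. This is a Python 3 adaptation, not the code originally posted: in Python 3, %r no longer escapes non-ASCII characters, so the sketch uses %a (which applies ascii()) to get the escaped output that %r gave for unicode strings in Python 2. The display method name is hypothetical, and the class is a plain object rather than the mapped model.

```python
class CERssEntry(object):
    def __init__(self, title, link, published):
        self.title = title
        self.link = link
        self.published = published

    def __repr__(self):
        # %a applies ascii(), escaping non-ASCII characters much as
        # %r did for unicode strings in Python 2.
        return "<CERssEntry(%a, %a, %a)>" % (self.title, self.link, self.published)

    def display(self):
        # Hypothetical helper: human-readable text, left unencoded so the
        # caller decides which encoding to use.
        return "<CERssEntry('%s', '%s', '%s')>" % (self.title, self.link, self.published)


entry = CERssEntry(
    title="Clinton Hits \u2018Black Helicopters\u2019 Crowd to Push Sea Treaty",
    link="x",
    published="y",
)

# Suggestion 1: encode explicitly instead of relying on stdout's encoding.
encoded = entry.display().encode("utf-8")

# Suggestion 2: a repr made of escaped literals is plain ASCII everywhere.
ascii_safe = repr(entry)
print(ascii_safe)
```

The escaped repr is the safer default for logs and debugging output, since it never depends on the terminal's encoding. The explicit encode is the better fit when the text is meant for people to read.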