Moin crashes on Cyrillic URLs (internal links)

buzmakov created an issue

When I try to open a page whose name contains Cyrillic characters, such as http://localhost:8080/Домашняя, moin2 raises a "UnicodeEncodeError". Full stack trace:

UnicodeEncodeError
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

Traceback (most recent call last)
File "/home/makov/opt/moin-2.0/env/lib/python2.7/site-packages/flask/app.py", line 1518, in __call__
return self.wsgi_app(environ, start_response)
File "/home/makov/opt/moin-2.0/env/lib/python2.7/site-packages/flask/app.py", line 1506, in wsgi_app
response = self.make_response(self.handle_exception(e))
File "/home/makov/opt/moin-2.0/env/lib/python2.7/site-packages/flask/app.py", line 1504, in wsgi_app
response = self.full_dispatch_request()
File "/home/makov/opt/moin-2.0/env/lib/python2.7/site-packages/flask/app.py", line 1264, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/makov/opt/moin-2.0/env/lib/python2.7/site-packages/flask/app.py", line 1262, in full_dispatch_request
rv = self.dispatch_request()
File "/home/makov/opt/moin-2.0/env/lib/python2.7/site-packages/flask/app.py", line 1248, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/makov/opt/moin-2.0/MoinMoin/apps/frontend/views.py", line 197, in show_item
flaskg.user.addTrail(item_name)
File "/home/makov/opt/moin-2.0/MoinMoin/user.py", line 652, in addTrail
item_name = getInterwikiName(item_name)
File "/home/makov/opt/moin-2.0/MoinMoin/util/interwiki.py", line 135, in getInterwikiName
return "{0}:{1}".format(app.cfg.interwikiname, item_name)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
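
The last frame of the traceback formats a unicode item name into a byte-string template. On Python 2 that forces an implicit conversion of the argument with the default ASCII codec, which is where the exception comes from. A minimal sketch of that conversion, made explicit so it also runs on Python 3; the item name "Дом" is a hypothetical example, not the name from the report:

```python
# -*- coding: utf-8 -*-
# Sketch only: "Дом" is a hypothetical item name chosen for illustration.
item_name = u"Дом"

# On Python 2, "{0}:{1}".format(wiki_name, item_name) with a byte-string
# template must turn item_name into bytes, and it uses the ASCII codec.
# The explicit encode() below performs that same conversion.
try:
    item_name.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

# The exception records which codec failed and on which character range.
print(error)
```

With a unicode template (u"{0}:{1}") no such conversion to bytes is needed, so the encode step never happens.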

Comments (10)

  1. Thomas Waldmann

    fix:

    diff -r adc727efd464 MoinMoin/util/interwiki.py
    --- a/MoinMoin/util/interwiki.py        Wed Dec 07 19:41:49 2011 +0100
    +++ b/MoinMoin/util/interwiki.py        Sat Dec 10 18:18:36 2011 +0100
    @@ -132,7 +132,7 @@
         :rtype: unicode
         :returns: wiki_name:item_name
         """
    -    return "{0}:{1}".format(app.cfg.interwikiname, item_name)
    +    return u"{0}:{1}".format(app.cfg.interwikiname, item_name) 
    

    Problem: str.format is used this way in a great many places, so this needs a global review and fix.

  2. Anonymous

    With the above fix and this one, I am able to use unicode in item_name:

    diff -r 5606fc37570b MoinMoin/signalling/log.py
    --- a/MoinMoin/signalling/log.py        Sun Apr 22 13:26:48 2012 +0200
    +++ b/MoinMoin/signalling/log.py        Sun Apr 22 15:07:54 2012 +0200
    @@ -15,7 +15,7 @@
     @item_displayed.connect_via(ANY)
     def log_item_displayed(app, item_name):
         wiki_name = app.cfg.interwikiname
    -    logging.info("item {0}:{1} displayed".format(wiki_name, item_name))
    +    logging.info(u"item {0}:{1} displayed".format(wiki_name, item_name))
    
    
  3. Ivan Gavrilov

    The log_item_modified function should be changed as well.

    diff -r ac1059572d80 MoinMoin/signalling/log.py
    --- a/MoinMoin/signalling/log.py        Fri Jun 29 16:04:11 2012 +0200
    +++ b/MoinMoin/signalling/log.py        Sat Jul 07 15:55:16 2012 +0400
    @@ -15,9 +15,9 @@
     @item_displayed.connect_via(ANY)
     def log_item_displayed(app, item_name):
         wiki_name = app.cfg.interwikiname
    -    logging.info("item {0}:{1} displayed".format(wiki_name, item_name))
    +    logging.info(u"item {0}:{1} displayed".format(wiki_name, item_name))
     
     @item_modified.connect_via(ANY)
     def log_item_modified(app, item_name):
         wiki_name = app.cfg.interwikiname
    -    logging.info("item {0}:{1} modified".format(wiki_name, item_name))
    +    logging.info(u"item {0}:{1} modified".format(wiki_name, item_name))
    

    FYI:

    $ grep -R --include="*.py" "\.format" |wc -l

    461

  4. Thomas Waldmann

    Note to whoever fixes this: you need to know how unicode vs. str works in Python to fully understand the problem.

    But you don't need to know Cyrillic. Testing can be done with any non-ASCII character. :)
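
As a rough illustration of the testing remark above, the check below runs a unicode template against item names in several scripts. The function get_interwiki_name and the wiki name "MyWiki" are hypothetical stand-ins that take the wiki name as a parameter instead of reading app.cfg.interwikiname; this is not the MoinMoin function itself.

```python
# -*- coding: utf-8 -*-

def get_interwiki_name(wiki_name, item_name):
    """Hypothetical stand-in for the patched function.

    The template is a unicode literal, so formatting never needs to
    convert a unicode item name into bytes.
    """
    return u"{0}:{1}".format(wiki_name, item_name)


# Any non-ASCII character exercises the same code path, not only Cyrillic.
samples = [u"Домашняя", u"Café", u"日本語"]
results = [get_interwiki_name(u"MyWiki", name) for name in samples]

for name, result in zip(samples, results):
    # Each result keeps the item name intact after the wiki prefix.
    assert result == u"MyWiki:" + name
```

The same loop could be pointed at the logging handlers once they use unicode templates as well.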
