Issue #19 resolved

Disappearing newline characters

Seraphim Mellos
created an issue

Hello,

I just found what appears to be a bug in polib which affects all versions from 0.6 onward. In msgids, there are cases where trailing newlines are removed from the pofile when using the __str__() method to convert the loaded PO/POT back to text. Here is an example:

I have a POT file with a single PO entry containing this:

{{{

#, python-format
msgid ""
"There was an error running your transaction for the following reason: %s\n"
msgstr ""

}}}

However when I load it in the python interpreter I get these results:

{{{

>>> for entry in pot:
...     entry.msgid
...
u'There was an error running your transaction for the following reason: %s\n'

}}}

which seems correct but:

{{{

>>> print pot.__str__()
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR Red Hat, Inc.
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR EMAIL@ADDRESS, YEAR.
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: test\n"
"Report-Msgid-Bugs-To: testing@example.com\n"
"POT-Creation-Date: 2011-02-10 11:42-0500\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME EMAIL@ADDRESS\n"
"Language-Team: LANGUAGE LL@li.org\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Language: \n"
"Plural-Forms: nplurals=INTEGER; plural=EXPRESSION;\n"

#, python-format
msgid ""
"There was an error running your transaction for the following reason: "
"%s "
msgstr ""

}}}

in which you can see that the trailing '\n' in the msgid is missing (a space appears in its place). I've tried it with 0.6 and 0.6.2 and both are affected, so it probably has to do with the changes from 0.5.5 to 0.6.

If you need any more info let me know.

Cheers

Comments (10)

  1. David Jean Louis repo owner

    The problem is in the textwrap module :(

    >>> import textwrap
    >>> textwrap.wrap('"There was an error running your transaction for the following reason: %s\n"')
    ['"There was an error running your transaction for the following reason:', '%s "']
    

    I'm really starting to hate this textwrap module...

  2. Seraphim Mellos reporter

    It seems that wrap() converts all whitespace characters (including newlines) to spaces and there's no option to change this behavior. One way would be to trick the wrapper by renaming newline characters to "something" and turn them back into newlines after the wrapping but if that "something" is longer than two characters, then this will affect the wrapping as well.

    All in all, I don't think that a clean solution using textwrap is possible :/

  3. Seraphim Mellos reporter

    This should work but it's like the conversion thing I mentioned which means it'll affect the wrapping (by adding more characters in a sentence). But I guess this isn't such a big deal after all :P

  4. David Jean Louis repo owner

    Well, this seems to work, but the problem, as you noted, is that the wrapwidth is wrong since the escape() call adds an extra character to the string...

    I'm not sure what the best solution is, but I'd say this is better than losing characters... Another solution is to revert to no wrapping at all, because this is starting to be a nightmare :(

  5. Seraphim Mellos reporter

    I'd say that wrapping the escaped sentence and then converting back is the way to go. Wrapping usually occurs at spaces or hyphens, so the escaped sequences won't mess with the wrapping that much unless a sentence has way too many of them. Anyway, the decision is up to you :)

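For reference, the escape-then-wrap idea discussed in the comments can be sketched with the standard library alone. This is a minimal illustration and not polib's actual code: the `escape` helper below is a hypothetical stand-in for the `escape()` call mentioned above, and `wrap_escaped` and the width of 78 are assumptions made for the example. Passing `drop_whitespace=False` and `break_long_words=False` to `textwrap.wrap` keeps every character of the escaped text, so the wrapped pieces can be joined back without loss.

```python
import textwrap


def escape(text):
    """Hypothetical helper: replace special characters with PO-style escapes."""
    return (text.replace('\\', '\\\\')
                .replace('\t', '\\t')
                .replace('\r', '\\r')
                .replace('\n', '\\n')
                .replace('"', '\\"'))


def wrap_escaped(text, width=78):
    """Escape first, then wrap, so no raw newline is left for textwrap to collapse."""
    return textwrap.wrap(escape(text), width,
                         drop_whitespace=False,   # keep spaces at line boundaries
                         break_long_words=False)  # never split inside a word


msgid = 'There was an error running your transaction for the following reason: %s\n'

# Wrapping the raw text: the newline becomes a space and is then dropped.
naive = textwrap.wrap(msgid, 78)

# Wrapping the escaped text: the backslash-n pair survives as ordinary characters.
safe = wrap_escaped(msgid, 78)

print(naive)
print(safe)
```

As noted in the comments, the escaped text is longer than the original, so each escape sequence counts as two characters against the wrap width and lines may break slightly earlier than they would for the unescaped string.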