IOError on reading obsolete "previous msgid" entries
I was trying to load a PO file from GNOME at http://l10n.gnome.org/POT/evolution.master/evolution.master.ca.po (attached), and I got the following error:
{{{
#!python
dpm@el-far:~$ ipython
In [1]: import polib
In [2]: po = polib.pofile('/home/dpm/evolution.master.ca.po')
IOError                                   Traceback (most recent call last)
/home/dpm/<ipython console> in <module>()

/home/dpm/polib.py in pofile(pofile, kwargs)
    100     file (optional, default: False).
    101     """
--> 102     return _pofile_or_mofile(pofile, 'pofile', kwargs)
    103
    104 # }}}

/home/dpm/polib.py in _pofile_or_mofile(f, type, **kwargs)
     71         check_for_duplicates=kwargs.get('check_for_duplicates', False)
     72     )
---> 73     instance = parser.parse()
     74     instance.wrapwidth = kwargs.get('wrapwidth', 78)
     75     return instance

/home/dpm/polib.py in parse(self)
   1261             else:
   1262                 raise IOError('Syntax error in po file %s (line %s)' % \
-> 1263                         (self.instance.fpath, i))
   1264
   1265         if self.current_entry:

IOError: Syntax error in po file /home/dpm/evolution.master.ca.po (line 23110)
}}}
It seems polib is crashing on the following entry, in particular at the {{{#~| msgid ""}}} line:
{{{
#, fuzzy
#~| msgid ""
#~| "Error on %s\n"
#~| "%s"
#~ msgid ""
#~ "Error on %s: %s\n"
#~ "%s"
#~ msgstr ""
#~ "S'ha produït un error en %s:\n"
#~ "%s"
}}}
Looking at other files on the GNOME l10n site, I can see more instances of {{{#~|}}}. These seem to be generated automatically by a gettext tool (probably msgmerge) when marking "previous msgid" fuzzy entries as obsolete.
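For anyone wanting to check whether a given PO file contains such lines, a rough sketch of a scan is shown here. The function name `find_previous_msgid_lines` is made up for illustration and is not part of polib. The sample text is the beginning of the entry quoted above.

```python
def find_previous_msgid_lines(po_text):
    """Return (line_number, line) pairs for lines starting with '#~|'.

    Hypothetical helper for illustration only; not part of polib.
    """
    hits = []
    for number, line in enumerate(po_text.splitlines(), start=1):
        if line.startswith("#~|"):
            hits.append((number, line))
    return hits


# Raw string so that the "\n" sequences stay literal, as they are in a PO file.
SAMPLE = r'''#, fuzzy
#~| msgid ""
#~| "Error on %s\n"
#~| "%s"
#~ msgid ""
#~ "Error on %s: %s\n"
#~ "%s"
'''

for number, line in find_previous_msgid_lines(SAMPLE):
    print(number, line)
```

The reported line numbers are 1-based, so they can be compared directly with the line number given in the exception message.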
The PO file documentation at http://www.gnu.org/software/gettext/manual/gettext.html#PO-Files says nothing about the format of obsolete entries, so I understand that the docs leave a wee bit too much room for guessing when implementing a parser.
In any case, if it's generated by a gettext tool, it would be good if polib either ignored {{{#~|}}} lines or treated them as obsolete entries instead of raising an exception.
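Until the parser handles these lines, one possible workaround is to drop them before the file reaches polib. This is a minimal sketch of the "ignore" option, assuming it is acceptable to lose the previous msgid of obsolete entries. The function name `strip_previous_msgid_lines` is hypothetical and this is not how polib itself works.

```python
def strip_previous_msgid_lines(po_text):
    """Return po_text without the lines that start with '#~|'.

    Hypothetical pre-processing step, not part of polib. It discards the
    previous msgid of obsolete entries and leaves all other lines untouched.
    """
    kept = [
        line
        for line in po_text.splitlines(keepends=True)
        if not line.startswith("#~|")
    ]
    return "".join(kept)


# An obsolete entry of the kind reported in this ticket.
SAMPLE = (
    '#~| msgid "Attendees"\n'
    '#~ msgid "Attendee_s"\n'
    '#~ msgstr "Participante_s"\n'
)

cleaned = strip_previous_msgid_lines(SAMPLE)
print(cleaned, end="")
```

The cleaned text could then be written to a temporary file and loaded with {{{polib.pofile}}} as in the session above. Whether that parses successfully depends on the polib version, so treat it as something to try, not a guaranteed fix.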
Thanks!
Comments (7)
- changed title to IOError on reading obsolete "previous msgid" entries
-
On the other hand, it seems that msgmerge, or whatever generates the {{{#~|}}} entries, tends to create mismatched msgids without a msgstr. I've observed this in the Catalan (above), Spanish and Simplified Chinese versions of that same PO file:
Spanish:
#~| msgid "Attendees"
#~ msgid "Attendee_s"
#~ msgstr "Participante_s"
Simplified Chinese:
#~| msgid "An error occurred while printing"
#~ msgid "An error occurred while sending."
#~ msgstr "发送时出现了一个错误。"
So I'm wondering whether, to simplify things, obsolete previous msgid entries (i.e. those starting with {{{#~|}}}) should be ignored altogether.
-
- changed status to open
Hi David!
Sorry for the delay...
I'm not sure either what the best solution is. Ignoring them seems the easiest thing to do. Would you mind working on a patch for this? If not, no problem, but it may take some time until I get to it.
regards,
-- David
-
- changed status to resolved