Name parsing gets confused when a part of the name starts with a character in braces
Consider these examples:
>>> from pybtex.database import Person
>>> print Person('Charles Darwin')
Person(u'Darwin, Charles')
OK. But what if I want the abbreviation to be "Ch. Darwin" instead of "C. Darwin"?
>>> print Person('{Ch}arles Darwin')
Person(u'{Ch}arles Darwin')
Too bad, "{Ch}arles" is interpreted as a last name. You can say I'm trying to game the system, but what other option is there with special characters?
>>> print Person('{\AA}ke Jonsson')
Person(u'Jonsson, {\\AA}ke')
>>> print Person(u'{Å}ke Jonsson')
Person(u'{\xc5}ke Jonsson')
It works fine with TeX-encoded characters, but these are converted early when a .bib file is processed, so the second form is what the name splitter actually sees. At least that is what I observe when using the sphinxcontrib-bibtex plugin. In any case, in my opinion, for the purpose of name parsing:
{Ch} should be considered as an uppercase letter
{Å} should be considered as an uppercase letter
{\relax D} could be considered as a lowercase letter (in case I want a "De" particle to be parsed as a "von" part).
May I suggest the following change in the is_von_name function in database/__init__.py, which seems to do what I want?
if brace_level == 0 and char.isalpha():
    return char.islower()
elif brace_level == 1 and char.startswith('\\'):
    return special_char_islower(char)
elif brace_level == 1 and char.isalpha():
    return char.islower()
(the last two lines are my addition)
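To make the intended behaviour concrete, here is a minimal standalone sketch of the rule I am asking for. It does not use pybtex at all: starts_lowercase is a hypothetical helper, and judging a control sequence by the case of its command name is my own simplifying assumption, not necessarily what pybtex's special_char_islower does.

```python
def starts_lowercase(name_part):
    """Return True if the first case-bearing character of name_part is lowercase.

    Hypothetical helper for illustration only; not part of pybtex.
    """
    brace_level = 0
    i = 0
    while i < len(name_part):
        char = name_part[i]
        if char == '{':
            brace_level += 1
        elif char == '}':
            brace_level -= 1
        elif char == '\\' and brace_level == 1:
            # Assumption: a control sequence such as \AA or \relax is judged
            # by the case of its command name.
            j = i + 1
            while j < len(name_part) and name_part[j].isalpha():
                j += 1
            command = name_part[i + 1:j]
            if command:
                return command[0].islower()
            # Non-letter command such as \' : skip the backslash, keep scanning.
            i = j
            continue
        elif brace_level <= 1 and char.isalpha():
            # Plain letter, either unbraced or inside one level of braces.
            return char.islower()
        i += 1
    return False


for part in ['{Ch}arles', u'{\u00c5}ke', '{\\AA}ke', '{\\relax D}e', 'von']:
    print(part, starts_lowercase(part))
```

With this rule, "{Ch}arles" and "{Å}ke" count as starting with an uppercase letter, so they would no longer be mistaken for a "von" part, while "{\relax D}e" counts as lowercase.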
Comments (2)
Note: It seems braces anywhere in the name are problematic, not just if a name starts with a brace.
Possibly related: https://github.com/mcmtroffaes/sphinxcontrib-bibtex/issues/105
Consider the following bib entry, which was giving me problems when using sphinx + sphinxcontrib-bibtex: