The unicodedata module itself is doing this correctly; it just seems to be the unicode methods that cause problems.
I'm not sure what you mean. For example, unicodedata/unicodedb_5_2_0.py will crash with an IndexError (a segfault after translation) if you call isspace(x) with a value of x outside the official Unicode range. So I would say instead that unicodedata is not doing anything about this by itself.
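To illustrate the kind of failure meant here, a minimal sketch follows. It is not the actual code from unicodedb_5_2_0.py: the flat table and the names `_SPACE_TABLE`, `isspace_unchecked` and `isspace_checked` are hypothetical, and the real generated tables are laid out differently. It only shows how an unguarded table lookup raises IndexError for a value past 0x10FFFF, and how an explicit range check avoids that.

```python
MAX_CODE_POINT = 0x10FFFF  # highest code point in the official Unicode range

# Hypothetical stand-in for a generated database table: one flag per code point.
_SPACE_TABLE = bytearray(MAX_CODE_POINT + 1)
for _cp in (0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20, 0x85, 0xA0):
    _SPACE_TABLE[_cp] = 1


def isspace_unchecked(code):
    # No range check: a value above MAX_CODE_POINT indexes past the end of
    # the table and raises IndexError when run untranslated.
    return bool(_SPACE_TABLE[code])


def isspace_checked(code):
    # Reject anything outside the Unicode range before touching the table.
    if not 0 <= code <= MAX_CODE_POINT:
        return False
    return bool(_SPACE_TABLE[code])
```

Whether an out-of-range value should return False or raise a proper exception is a design choice; the sketch assumes returning False is acceptable.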
Carl Friedrich Bolz-Tereick
I am saying that a lot of the functions in the applevel unicodedata module handle characters that are too large without crashing: