Named HTML entities are neater and easier to read than numeric entities. And because they consist only of ASCII characters, they're safer to use in multiple contexts than raw Unicode characters and their various encodings (UTF-8 and such).
This module converts numeric HTML entities and Unicode characters that fall outside the normal ASCII range into named entities.
Under Python 2:

    from namedentities import named_entities

    u = u'both em\u2014and–dashes…'
    print named_entities(u)

Under Python 3:

    from namedentities import named_entities

    u = 'both em\u2014and–dashes…'
    print(named_entities(u))  # same result

Or, using the six cross-version compatibility library, code that runs under either version:

    from namedentities import named_entities
    import six

    u = six.u('both em\u2014and–dashes…')
    six.print_(named_entities(u))  # same result
- Doesn't attempt to encode <, >, or & (or their numerical equivalents) to avoid interfering with HTML escaping.
- This is basically a packaging of Ian Beck's work. Thank you, Ian!
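To make the conversion described above concrete, here is a minimal sketch of the general idea using only the Python 3 standard library. It is not the code this module ships. The names `to_named_sketch` and `_name_for` are hypothetical, and falling back to a decimal numeric entity for characters that have no named entity is an assumption made here so that the output stays pure ASCII.

```python
import re
from html.entities import codepoint2name  # stdlib map: code point -> entity name

# Matches numeric entities, either hexadecimal (&#x2014;) or decimal (&#8212;).
_NUMERIC_ENTITY = re.compile(r'&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));')


def _name_for(codepoint, fallback):
    """Return '&name;' for a non-ASCII code point that has a name, else fallback."""
    name = codepoint2name.get(codepoint)
    # Code points below 128 (which include <, >, & and ") are never renamed,
    # so ordinary HTML escaping is left untouched.
    if name is None or codepoint < 128:
        return fallback
    return '&' + name + ';'


def to_named_sketch(text):
    """Hypothetical helper: rewrite numeric entities and non-ASCII characters
    as named entities where a name exists."""
    def replace_numeric(match):
        hex_digits, dec_digits = match.groups()
        codepoint = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Unnamed or ASCII-range numeric entities are kept exactly as written.
        return _name_for(codepoint, match.group(0))

    text = _NUMERIC_ENTITY.sub(replace_numeric, text)
    # Assumption: unnamed non-ASCII characters become decimal numeric entities.
    return ''.join(
        _name_for(ord(ch), '&#{};'.format(ord(ch))) if ord(ch) > 127 else ch
        for ch in text
    )


print(to_named_sketch('both em\u2014and&#x2013;dashes&#8230;'))
```

Because code points below 128 are skipped in both passes, `&#60;` and a bare `&` come back unchanged, which mirrors the first note in the list above.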
pip install namedentities
To easy_install under a specific Python version (3.3 in this example):
python3.3 -m easy_install namedentities
(You may need to prefix these with "sudo " to authorize installation.)