python-clinic / Doc / library / locale.rst

:mod:`locale` --- Internationalization services

The :mod:`locale` module opens access to the POSIX locale database and functionality. The POSIX locale mechanism allows programmers to deal with certain cultural issues in an application, without requiring the programmer to know all the specifics of each country where the software is executed.

The :mod:`locale` module is implemented on top of the :mod:`_locale` module, which in turn uses an ANSI C locale implementation if available.

The :mod:`locale` module defines the following exception and functions:


>>> import locale
>>> loc = locale.getlocale() # get current locale
# use German locale; name might vary with platform
>>> locale.setlocale(locale.LC_ALL, 'de_DE')
>>> locale.strcoll('f\xe4n', 'foo') # compare a string containing an umlaut
>>> locale.setlocale(locale.LC_ALL, '') # use user's preferred locale
>>> locale.setlocale(locale.LC_ALL, 'C') # use default (C) locale
>>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale

Background, details, hints, tips and caveats

The C standard defines the locale as a program-wide property that may be relatively expensive to change. On top of that, some implementation are broken in such a way that frequent locale changes may cause core dumps. This makes the locale somewhat painful to use correctly.

Initially, when a program is started, the locale is the C locale, no matter what the user's preferred locale is. There is one exception: the :data:`LC_CTYPE` category is changed at startup to set the current locale encoding to the user's preferred locale encoding. The program must explicitly say that it wants the user's preferred locale settings for other categories by calling setlocale(LC_ALL, '').

It is generally a bad idea to call :func:`setlocale` in some library routine, since as a side effect it affects the entire program. Saving and restoring it is almost as bad: it is expensive and affects other threads that happen to run before the settings have been restored.

If, when coding a module for general use, you need a locale independent version of an operation that is affected by the locale (such as certain formats used with :func:`time.strftime`), you will have to find a way to do it without using the standard library routine. Even better is convincing yourself that using locale settings is okay. Only as a last resort should you document that your module is not compatible with non-C locale settings.

The only way to perform numeric operations according to the locale is to use the special functions defined by this module: :func:`atof`, :func:`atoi`, :func:`.format`, :func:`.str`.

There is no way to perform case conversions and character classifications according to the locale. For (Unicode) text strings these are done according to the character value only, while for byte strings, the conversions and classifications are done according to the ASCII value of the byte, and bytes whose high bit is set (i.e., non-ASCII bytes) are never converted or considered part of a character class such as letter or whitespace.

For extension writers and programs that embed Python

Extension modules should never call :func:`setlocale`, except to find out what the current locale is. But since the return value can only be used portably to restore it, that is not very useful (except perhaps to find out whether or not the locale is C).

When Python code uses the :mod:`locale` module to change the locale, this also affects the embedding application. If the embedding application doesn't want this to happen, it should remove the :mod:`_locale` extension module (which does all the work) from the table of built-in modules in the :file:`config.c` file, and make sure that the :mod:`_locale` module is not accessible as a shared library.

Access to message catalogs

The locale module exposes the C library's gettext interface on systems that provide this interface. It consists of the functions :func:`gettext`, :func:`dgettext`, :func:`dcgettext`, :func:`textdomain`, :func:`bindtextdomain`, and :func:`bind_textdomain_codeset`. These are similar to the same functions in the :mod:`gettext` module, but use the C library's binary format for message catalogs, and the C library's search algorithms for locating message catalogs.

Python applications should normally find no need to invoke these functions, and should use :mod:`gettext` instead. A known exception to this rule are applications that link with additional C libraries which internally invoke :c:func:`gettext` or :func:`dcgettext`. For these applications, it may be necessary to bind the text domain, so that the libraries can properly locate their message catalogs.