:mod:`gettext` --- Multilingual internationalization services
=============================================================

.. module:: gettext
   :synopsis: Multilingual internationalization services.

.. moduleauthor:: Barry A. Warsaw <barry@python.org>
.. sectionauthor:: Barry A. Warsaw <barry@python.org>

**Source code:** :source:`Lib/gettext.py`

The :mod:`gettext` module provides internationalization (I18N) and localization
(L10N) services for your Python modules and applications. It supports both the
GNU ``gettext`` message catalog API and a higher level, class-based API that may
be more appropriate for Python files. The interface described below allows you
to write your module and application messages in one natural language, and
provide a catalog of translated messages for running under different natural
languages.

Some hints on localizing your Python modules and applications are also given.
24
GNU :program:`gettext` API
25
--------------------------
27
The :mod:`gettext` module defines the following API, which is very similar to
28
the GNU :program:`gettext` API. If you use this API you will affect the
29
translation of your entire application globally. Often this is what you want if
30
your application is monolingual, with the choice of language dependent on the
31
locale of your user. If you are localizing a Python module, or if your
32
application needs to switch languages on the fly, you probably want to use the
33
class-based API instead.
36
.. function:: bindtextdomain(domain, localedir=None)

Bind the *domain* to the locale directory *localedir*. More concretely,
:mod:`gettext` will look for binary :file:`.mo` files for the given domain using
the path (on Unix): :file:`{localedir}/{language}/LC_MESSAGES/{domain}.mo`, where
*language* is searched for in the environment variables :envvar:`LANGUAGE`,
:envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and :envvar:`LANG` respectively.

If *localedir* is omitted or ``None``, then the current binding for *domain* is
returned. [#]_
48
.. function:: bind_textdomain_codeset(domain, codeset=None)

Bind the *domain* to *codeset*, changing the encoding of strings returned by the
:func:`gettext` family of functions. If *codeset* is omitted, then the current
binding is returned.
55
.. function:: textdomain(domain=None)
57
Change or query the current global domain. If *domain* is ``None``, then the
58
current global domain is returned, otherwise the global domain is set to
59
*domain*, which is returned.
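
As a minimal sketch of how the binding and the global domain can be set and
then queried, consider the following. The domain name and the directory are
placeholders chosen for illustration; no catalog needs to exist at that path
for these calls to succeed.

```python
import gettext

# Placeholder values for illustration only.
LOCALE_DIR = '/path/to/my/language/directory'

# Setting a binding returns the directory that is now bound to the domain.
bound_dir = gettext.bindtextdomain('myapplication', LOCALE_DIR)

# Omitting localedir queries the current binding instead of changing it.
queried_dir = gettext.bindtextdomain('myapplication')

# textdomain() with an argument sets the global domain and returns it;
# without an argument it only reports the current global domain.
current_domain = gettext.textdomain('myapplication')
reported_domain = gettext.textdomain()
```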
62
.. function:: gettext(message)
64
Return the localized translation of *message*, based on the current global
65
domain, language, and locale directory. This function is usually aliased as
66
:func:`_` in the local namespace (see examples below).
69
.. function:: lgettext(message)
71
Equivalent to :func:`gettext`, but the translation is returned in the
72
preferred system encoding, if no other encoding was explicitly set with
73
:func:`bind_textdomain_codeset`.
76
.. function:: dgettext(domain, message)
78
Like :func:`gettext`, but look the message up in the specified *domain*.
81
.. function:: ldgettext(domain, message)
83
Equivalent to :func:`dgettext`, but the translation is returned in the
84
preferred system encoding, if no other encoding was explicitly set with
85
:func:`bind_textdomain_codeset`.
88
.. function:: ngettext(singular, plural, n)
90
Like :func:`gettext`, but consider plural forms. If a translation is found,
91
apply the plural formula to *n*, and return the resulting message (some
92
languages have more than two plural forms). If no translation is found, return
93
*singular* if *n* is 1; return *plural* otherwise.

The plural formula is taken from the catalog header. It is a C or Python
expression that has a free variable *n*; the expression evaluates to the index
of the plural in the catalog. See
`the GNU gettext documentation <https://www.gnu.org/software/gettext/manual/gettext.html>`__
for the precise syntax to be used in :file:`.po` files and the
formulas for a variety of languages.
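
The behaviour when no translation is found can be sketched as follows. The
domain name ``no_such_domain_example`` is an assumption made for this example:
it is deliberately a domain for which no catalog is expected to exist, so that
the lookup falls back to the untranslated *singular* and *plural* strings.

```python
import gettext


def describe(n):
    """Return a count message, picking singular or plural from n."""
    # With no catalog for the (hypothetical) domain, dngettext() returns
    # the singular string when n is 1 and the plural string otherwise.
    template = gettext.dngettext(
        'no_such_domain_example', '%d file', '%d files', n)
    return template % n


messages = [describe(n) for n in (0, 1, 5)]
```

Note that zero takes the plural form in this fallback, which matches English
usage but not every language; a real catalog supplies its own plural formula.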
103
.. function:: lngettext(singular, plural, n)
105
Equivalent to :func:`ngettext`, but the translation is returned in the
106
preferred system encoding, if no other encoding was explicitly set with
107
:func:`bind_textdomain_codeset`.
110
.. function:: dngettext(domain, singular, plural, n)
112
Like :func:`ngettext`, but look the message up in the specified *domain*.
115
.. function:: ldngettext(domain, singular, plural, n)
117
Equivalent to :func:`dngettext`, but the translation is returned in the
118
preferred system encoding, if no other encoding was explicitly set with
119
:func:`bind_textdomain_codeset`.

Note that GNU :program:`gettext` also defines a :func:`dcgettext` function, but
this was deemed not useful and so it is currently unimplemented.

Here's an example of typical usage for this API::

   import gettext
   gettext.bindtextdomain('myapplication', '/path/to/my/language/directory')
   gettext.textdomain('myapplication')
   _ = gettext.gettext
   # ...
   print(_('This is a translatable string.'))

Class-based API
---------------

The class-based API of the :mod:`gettext` module gives you more flexibility and
greater convenience than the GNU :program:`gettext` API. It is the recommended
way of localizing your Python applications and modules. :mod:`gettext` defines
a "translations" class which implements the parsing of GNU :file:`.mo` format
files, and has methods for returning strings. Instances of this "translations"
class can also install themselves in the built-in namespace as the function
:func:`_`.
147
.. function:: find(domain, localedir=None, languages=None, all=False)

This function implements the standard :file:`.mo` file search algorithm. It
takes a *domain*, identical to what :func:`textdomain` takes. Optional
*localedir* is as in :func:`bindtextdomain`. Optional *languages* is a list of
strings, where each string is a language code.

If *localedir* is not given, then the default system locale directory is used.
[#]_ If *languages* is not given, then the following environment variables are
searched: :envvar:`LANGUAGE`, :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and
:envvar:`LANG`. The first one returning a non-empty value is used for the
*languages* variable. The environment variables should contain a colon-separated
list of languages, which will be split on the colon to produce the expected list
of language code strings.
162
:func:`find` then expands and normalizes the languages, and then iterates
163
through them, searching for an existing file built of these components:
165
:file:`{localedir}/{language}/LC_MESSAGES/{domain}.mo`

The first such file name that exists is returned by :func:`find`. If no such
file is found, then ``None`` is returned. If *all* is true, it returns a list
of all file names, in the order in which they appear in the languages list or
the environment variables.
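
The search can be tried out against a throwaway directory tree. The sketch
below is illustrative only: the ``demo`` domain, the helper
``make_demo_tree`` and the empty placeholder files are assumptions made for
this example, relying on the fact that :func:`find` only checks whether a
file exists and does not read it.

```python
import gettext
import os
import tempfile


def make_demo_tree(root, domain, languages):
    """Create empty placeholder .mo files (hypothetical helper)."""
    for lang in languages:
        directory = os.path.join(root, lang, 'LC_MESSAGES')
        os.makedirs(directory, exist_ok=True)
        with open(os.path.join(directory, domain + '.mo'), 'wb'):
            pass


root = tempfile.mkdtemp()
make_demo_tree(root, 'demo', ['de', 'fr'])

# The first existing file for the requested languages is returned.
first = gettext.find('demo', localedir=root, languages=['de', 'fr'])

# With all=True every match is returned, in language order.
every = gettext.find('demo', localedir=root, languages=['de', 'fr'], all=True)

# No catalog exists for this language, so the result is None.
missing = gettext.find('demo', localedir=root, languages=['ja'])
```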
173
.. function:: translation(domain, localedir=None, languages=None, class_=None, fallback=False, codeset=None)

Return a :class:`Translations` instance based on the *domain*, *localedir*,
and *languages*, which are first passed to :func:`find` to get a list of the
associated :file:`.mo` file paths. Instances with identical :file:`.mo` file
names are cached. The actual class instantiated is *class_* if
provided, otherwise :class:`GNUTranslations`. The class's constructor must
take a single :term:`file object` argument. If provided, *codeset* will change
the charset used to encode translated strings in the :meth:`lgettext` and
:meth:`lngettext` methods.

If multiple files are found, later files are used as fallbacks for earlier ones.
To allow setting the fallback, :func:`copy.copy` is used to clone each
translation object from the cache; the actual instance data is still shared with
the cache.
189
If no :file:`.mo` file is found, this function raises :exc:`OSError` if
190
*fallback* is false (which is the default), and returns a
191
:class:`NullTranslations` instance if *fallback* is true.
193
.. versionchanged:: 3.3
194
:exc:`IOError` used to be raised instead of :exc:`OSError`.
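
The effect of *fallback* can be seen with a locale directory that contains no
catalogs at all. This is a small sketch, not a recommended pattern; the domain
name ``no_such_domain`` and the empty temporary directory are assumptions made
for the example.

```python
import gettext
import tempfile

empty_dir = tempfile.mkdtemp()  # contains no .mo files

# Without fallback, a missing catalog is reported as an OSError.
try:
    gettext.translation('no_such_domain', localedir=empty_dir,
                        languages=['de'])
except OSError as exc:
    error = exc
else:
    error = None

# With fallback=True, a NullTranslations instance is returned instead,
# and messages pass through untranslated.
null = gettext.translation('no_such_domain', localedir=empty_dir,
                           languages=['de'], fallback=True)
untranslated = null.gettext('Hello')
```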
197
.. function:: install(domain, localedir=None, codeset=None, names=None)

This installs the function :func:`_` in Python's builtins namespace, based on
*domain*, *localedir*, and *codeset* which are passed to the function
:func:`translation`.
203
For the *names* parameter, please see the description of the translation
204
object's :meth:`~NullTranslations.install` method.
206
As seen below, you usually mark the strings in your application that are
207
candidates for translation, by wrapping them in a call to the :func:`_`
208
function, like this::
210
print(_('This string will be translated.'))

For convenience, you want the :func:`_` function to be installed in Python's
builtins namespace, so it is easily accessible in all modules of your
application.
217
The :class:`NullTranslations` class
218
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
220
Translation classes are what actually implement the translation of original
221
source file message strings to translated message strings. The base class used
222
by all translation classes is :class:`NullTranslations`; this provides the basic
223
interface you can use to write your own specialized translation classes. Here
224
are the methods of :class:`NullTranslations`:
227
.. class:: NullTranslations(fp=None)

Takes an optional :term:`file object` *fp*, which is ignored by the base class.
Initializes "protected" instance variables *_info* and *_charset* which are set
by derived classes, as well as *_fallback*, which is set through
:meth:`add_fallback`. It then calls ``self._parse(fp)`` if *fp* is not
``None``.
235
.. method:: _parse(fp)
237
No-op'd in the base class, this method takes file object *fp*, and reads
238
the data from the file, initializing its message catalog. If you have an
239
unsupported message catalog file format, you should override this method
240
to parse your format.
243
.. method:: add_fallback(fallback)
245
Add *fallback* as the fallback object for the current translation object.
246
A translation object should consult the fallback if it cannot provide a
247
translation for a given message.
250
.. method:: gettext(message)

If a fallback has been set, forward :meth:`gettext` to the fallback.
Otherwise, return *message* unchanged. Overridden in derived classes.
256
.. method:: lgettext(message)
258
If a fallback has been set, forward :meth:`lgettext` to the fallback.
259
Otherwise, return the translated message. Overridden in derived classes.
262
.. method:: ngettext(singular, plural, n)

If a fallback has been set, forward :meth:`ngettext` to the fallback.
Otherwise, return *singular* if *n* is 1 and *plural* otherwise. Overridden in
derived classes.
268
.. method:: lngettext(singular, plural, n)
270
If a fallback has been set, forward :meth:`lngettext` to the fallback.
271
Otherwise, return the translated message. Overridden in derived classes.
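
To illustrate how a derived class can build on this interface, here is a
minimal sketch of a dictionary-backed translation class. The class name
``DictTranslations`` and its constructor argument are hypothetical and not
part of the module; the sketch only relies on :meth:`add_fallback` and on the
base class :meth:`gettext` forwarding to the fallback when one is set.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical translation class backed by a plain dictionary."""

    def __init__(self, catalog):
        super().__init__()          # no file object: _parse() is not called
        self._catalog = dict(catalog)

    def gettext(self, message):
        if message in self._catalog:
            return self._catalog[message]
        # The base class consults the fallback if one was added,
        # and otherwise returns the message unchanged.
        return super().gettext(message)


primary = DictTranslations({'hello': 'bonjour'})
secondary = DictTranslations({'goodbye': 'au revoir'})
primary.add_fallback(secondary)

results = [primary.gettext(word) for word in ('hello', 'goodbye', 'thanks')]
```

The first word is found in the primary catalog, the second only in the
fallback, and the third in neither, so it is returned as is.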

.. method:: info()

Return the "protected" :attr:`_info` variable.
279
.. method:: charset()
281
Return the "protected" :attr:`_charset` variable, which is the encoding of
282
the message catalog file.
285
.. method:: output_charset()

Return the "protected" :attr:`_output_charset` variable, which defines the
encoding used to return translated messages in :meth:`lgettext` and
:meth:`lngettext`.
292
.. method:: set_output_charset(charset)
294
Change the "protected" :attr:`_output_charset` variable, which defines the
295
encoding used to return translated messages.
298
.. method:: install(names=None)

This method installs :meth:`self.gettext` into the built-in namespace,
binding it to ``_``.
303
If the *names* parameter is given, it must be a sequence containing the
304
names of functions you want to install in the builtins namespace in
305
addition to :func:`_`. Supported names are ``'gettext'`` (bound to
306
:meth:`self.gettext`), ``'ngettext'`` (bound to :meth:`self.ngettext`),
307
``'lgettext'`` and ``'lngettext'``.
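
As a brief sketch of what installation does, the following uses a bare
:class:`NullTranslations` instance, so every message comes back untranslated.
Choosing ``'ngettext'`` as the extra name is just an example.

```python
import gettext

translations = gettext.NullTranslations()

# Bind _ (always) and ngettext (requested via names) in the builtins
# namespace, making them visible from every module.
translations.install(names=['ngettext'])

greeting = _('Hello')
label = ngettext('%d item', '%d items', 3) % 3
```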

Note that this is only one way, albeit the most convenient way, to make
the :func:`_` function available to your application. Because it affects
the entire application globally, and specifically the built-in namespace,
localized modules should never install :func:`_`. Instead, they should use
this code to make :func:`_` available to their module::

   import gettext
   t = gettext.translation('mymodule', ...)
   _ = t.gettext

This puts :func:`_` only in the module's global namespace and so only
affects calls within this module.
323
The :class:`GNUTranslations` class
324
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
326
The :mod:`gettext` module provides one additional class derived from
327
:class:`NullTranslations`: :class:`GNUTranslations`. This class overrides
328
:meth:`_parse` to enable reading GNU :program:`gettext` format :file:`.mo` files
329
in both big-endian and little-endian format.
331
:class:`GNUTranslations` parses optional meta-data out of the translation
332
catalog. It is convention with GNU :program:`gettext` to include meta-data as
333
the translation for the empty string. This meta-data is in :rfc:`822`\ -style
334
``key: value`` pairs, and should contain the ``Project-Id-Version`` key. If the
335
key ``Content-Type`` is found, then the ``charset`` property is used to
336
initialize the "protected" :attr:`_charset` instance variable, defaulting to
337
``None`` if not found. If the charset encoding is specified, then all message
338
ids and message strings read from the catalog are converted to Unicode using
339
this encoding, else ASCII encoding is assumed.
341
Since message ids are read as Unicode strings too, all :meth:`*gettext` methods
342
will assume message ids as Unicode strings, not byte strings.

The entire set of key/value pairs is placed into a dictionary and set as the
"protected" :attr:`_info` instance variable.

If the :file:`.mo` file's magic number is invalid, or if other problems occur
while reading the file, instantiating a :class:`GNUTranslations` class can raise
:exc:`OSError`.
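
The metadata handling can be explored without any external tools by building
a tiny catalog in memory. The sketch below is an illustration under stated
assumptions: ``build_mo`` is a hypothetical helper written for this example
(real catalogs are normally produced by :program:`msgfmt`), and the messages
and header values are made up. It writes the little-endian :file:`.mo` layout,
with the metadata stored as the translation of the empty string.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Serialize msgid -> msgstr pairs as little-endian .mo bytes.

    Hypothetical helper for this sketch only.
    """
    ids = b''
    strs = b''
    entries = []
    for msgid in sorted(messages):
        key = msgid.encode('utf-8')
        value = messages[msgid].encode('utf-8')
        entries.append((len(key), len(ids), len(value), len(strs)))
        ids += key + b'\0'
        strs += value + b'\0'
    count = len(entries)
    key_start = 7 * 4 + 16 * count   # 28-byte header plus two index tables
    value_start = key_start + len(ids)
    key_table = []
    value_table = []
    for key_len, key_off, value_len, value_off in entries:
        key_table += [key_len, key_start + key_off]
        value_table += [value_len, value_start + value_off]
    # magic, version, count, offsets of both tables, hash size and offset
    header = struct.pack('<7I', 0x950412de, 0, count,
                         7 * 4, 7 * 4 + 8 * count, 0, 0)
    tables = struct.pack('<%dI' % (4 * count), *(key_table + value_table))
    return header + tables + ids + strs


catalog_bytes = build_mo({
    # Metadata: the translation of the empty string.
    '': ('Project-Id-Version: demo 1.0\n'
         'Content-Type: text/plain; charset=UTF-8\n'
         'Plural-Forms: nplurals=2; plural=(n != 1);\n'),
    'hello': 'hallo',
    # Plural entries join the forms with a NUL byte.
    'There is %(num)d file\0There are %(num)d files':
        'Es gibt %(num)d Datei\0Es gibt %(num)d Dateien',
})

cat = gettext.GNUTranslations(io.BytesIO(catalog_bytes))

info = cat.info()          # keys are stored in lower case
charset = cat.charset()
simple = cat.gettext('hello')
plural = cat.ngettext('There is %(num)d file',
                      'There are %(num)d files', 3) % {'num': 3}
```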
351
The following methods are overridden from the base class implementation:
354
.. method:: GNUTranslations.gettext(message)
356
Look up the *message* id in the catalog and return the corresponding message
357
string, as a Unicode string. If there is no entry in the catalog for the
358
*message* id, and a fallback has been set, the look up is forwarded to the
359
fallback's :meth:`gettext` method. Otherwise, the *message* id is returned.
362
.. method:: GNUTranslations.lgettext(message)
364
Equivalent to :meth:`gettext`, but the translation is returned as a
365
bytestring encoded in the selected output charset, or in the preferred system
366
encoding if no encoding was explicitly set with :meth:`set_output_charset`.
369
.. method:: GNUTranslations.ngettext(singular, plural, n)
371
Do a plural-forms lookup of a message id. *singular* is used as the message id
372
for purposes of lookup in the catalog, while *n* is used to determine which
373
plural form to use. The returned message string is a Unicode string.

If the message id is not found in the catalog, and a fallback is specified, the
request is forwarded to the fallback's :meth:`ngettext` method. Otherwise, when
*n* is 1 *singular* is returned, and *plural* is returned in all other cases.

Here is an example::

   import os
   from gettext import GNUTranslations

   n = len(os.listdir('.'))
   cat = GNUTranslations(somefile)  # somefile: an open binary .mo file object
   message = cat.ngettext(
       'There is %(num)d file in this directory',
       'There are %(num)d files in this directory',
       n) % {'num': n}
389
.. method:: GNUTranslations.lngettext(singular, plural, n)

Equivalent to :meth:`ngettext`, but the translation is returned as a
bytestring encoded in the selected output charset, or in the preferred system
encoding if no encoding was explicitly set with :meth:`set_output_charset`.
396
Solaris message catalog support
397
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Solaris operating system defines its own binary :file:`.mo` file format, but
since no documentation can be found on this format, it is not supported at this
time.
404
The Catalog constructor
405
^^^^^^^^^^^^^^^^^^^^^^^
407
.. index:: single: GNOME

GNOME uses a version of the :mod:`gettext` module by James Henstridge, but this
version has a slightly different API. Its documented usage was::

   import gettext
   cat = gettext.Catalog(domain, localedir)
   _ = cat.gettext
   print(_('hello world'))
417
For compatibility with this older module, the function :func:`Catalog` is an
418
alias for the :func:`translation` function described above.
420
One difference between this module and Henstridge's: his catalog objects
421
supported access through a mapping API, but this appears to be unused and so is
422
not currently supported.
425
Internationalizing your programs and modules
426
--------------------------------------------
428
Internationalization (I18N) refers to the operation by which a program is made
429
aware of multiple languages. Localization (L10N) refers to the adaptation of
430
your program, once internationalized, to the local language and cultural habits.
431
In order to provide multilingual messages for your Python programs, you need to
432
take the following steps:
434
#. prepare your program or module by specially marking translatable strings
436
#. run a suite of tools over your marked files to generate raw messages catalogs
438
#. create language specific translations of the message catalogs
440
#. use the :mod:`gettext` module so that message strings are properly translated
442
In order to prepare your code for I18N, you need to look at all the strings in
443
your files. Any string that needs to be translated should be marked by wrapping
444
it in ``_('...')`` --- that is, a call to the function :func:`_`. For example::

   filename = 'mylog.txt'
   message = _('writing a log message')
   fp = open(filename, 'w')
   fp.write(message)
   fp.close()

In this example, the string ``'writing a log message'`` is marked as a candidate
for translation, while the strings ``'mylog.txt'`` and ``'w'`` are not.
455
There are a few tools to extract the strings meant for translation.
456
The original GNU :program:`gettext` only supported C or C++ source
457
code but its extended version :program:`xgettext` scans code written
458
in a number of languages, including Python, to find strings marked as
459
translatable. `Babel <http://babel.pocoo.org/>`__ is a Python
460
internationalization library that includes a :file:`pybabel` script to
461
extract and compile message catalogs. François Pinard's program
462
called :program:`xpot` does a similar job and is available as part of
463
his `po-utils package <http://po-utils.progiciels-bpi.ca/>`__.

(Python also includes pure-Python versions of these programs, called
:program:`pygettext.py` and :program:`msgfmt.py`; some Python distributions
will install them for you. :program:`pygettext.py` is similar to
:program:`xgettext`, but only understands Python source code and
cannot handle other programming languages such as C or C++.
:program:`pygettext.py` supports a command-line interface similar to
:program:`xgettext`; for details on its use, run ``pygettext.py
--help``. :program:`msgfmt.py` is binary compatible with GNU
:program:`msgfmt`. With these two programs, you may not need the GNU
:program:`gettext` package to internationalize your Python
applications.)

:program:`xgettext`, :program:`pygettext`, and similar tools generate
:file:`.po` files that are message catalogs. They are structured
human-readable files that contain every marked string in the source
code, along with a placeholder for the translated versions of these
strings.

Copies of these :file:`.po` files are then handed over to the
individual human translators who write translations for every
supported natural language. They send back the completed
language-specific versions as a :file:`<language-name>.po` file that's
compiled into a machine-readable :file:`.mo` binary catalog file using
the :program:`msgfmt` program. The :file:`.mo` files are used by the
:mod:`gettext` module for the actual translation processing at
run-time.
492
How you use the :mod:`gettext` module in your code depends on whether you are
493
internationalizing a single module or your entire application. The next two
494
sections will discuss each case.
497
Localizing your module
498
^^^^^^^^^^^^^^^^^^^^^^
500
If you are localizing your module, you must take care not to make global
501
changes, e.g. to the built-in namespace. You should not use the GNU ``gettext``
502
API but instead the class-based API.

Let's say your module is called "spam" and the module's various natural language
translation :file:`.mo` files reside in :file:`/usr/share/locale` in GNU
:program:`gettext` format. Here's what you would put at the top of your
module::

   import gettext
   t = gettext.translation('spam', '/usr/share/locale')
   _ = t.gettext
514
Localizing your application
515
^^^^^^^^^^^^^^^^^^^^^^^^^^^
517
If you are localizing your application, you can install the :func:`_` function
518
globally into the built-in namespace, usually in the main driver file of your
519
application. This will let all your application-specific files just use
520
``_('...')`` without having to explicitly install it in each file.

In the simple case then, you need only add the following bit of code to the main
driver file of your application::

   import gettext
   gettext.install('myapplication')

If you need to set the locale directory, you can pass it into the
:func:`install` function::

   import gettext
   gettext.install('myapplication', '/usr/share/locale')
535
Changing languages on the fly
536
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If your program needs to support many languages at the same time, you may want
to create multiple translation instances and then switch between them
explicitly, like so::

   import gettext

   lang1 = gettext.translation('myapplication', languages=['en'])
   lang2 = gettext.translation('myapplication', languages=['fr'])
   lang3 = gettext.translation('myapplication', languages=['de'])

   # start by using language1
   lang1.install()

   # ... time goes by, user selects language 2
   lang2.install()

   # ... more time goes by, user selects language 3
   lang3.install()
558
Deferred translations
559
^^^^^^^^^^^^^^^^^^^^^
561
In most coding situations, strings are translated where they are coded.
562
Occasionally however, you need to mark strings for translation, but defer actual
563
translation until later. A classic example is::
565
animals = ['mollusk',
574
Here, you want to mark the strings in the ``animals`` list as being
575
translatable, but you don't actually want to translate them until they are
578
Here is one way you can handle this situation::
580
def _(message): return message
582
animals = [_('mollusk'),

This works because the dummy definition of :func:`_` simply returns the string
unchanged. And this dummy definition will temporarily override any definition
of :func:`_` in the built-in namespace (until the :keyword:`del` statement). Take
care, though, if you have a previous definition of :func:`_` in the local
namespace.
600
Note that the second use of :func:`_` will not identify "a" as being
601
translatable to the :program:`gettext` program, because the parameter
602
is not a string literal.
604
Another way to handle this is with the following example::
606
def N_(message): return message
608
animals = [N_('mollusk'),
618
In this case, you are marking translatable strings with the function
619
:func:`N_`, which won't conflict with any definition of :func:`_`.
620
However, you will need to teach your message extraction program to
621
look for translatable strings marked with :func:`N_`. :program:`xgettext`,
622
:program:`pygettext`, ``pybabel extract``, and :program:`xpot` all
623
support this through the use of the :option:`-k` command-line switch.
624
The choice of :func:`N_` here is totally arbitrary; it could have just
625
as easily been :func:`MarkThisStringForTranslation`.
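
The :func:`N_` approach can be sketched end to end as follows. Everything
beyond :func:`N_` itself is an assumption made for illustration: the list
contents, the ``UpperTranslations`` class (a stand-in for a real catalog that
simply upper-cases messages so the effect is visible), and the helper
``translated_animals``.

```python
import gettext


def N_(message):
    """Mark a string for extraction without translating it."""
    return message


# Marked at definition time, but stored untranslated.
ANIMALS = [N_('mollusk'), N_('python')]


class UpperTranslations(gettext.NullTranslations):
    """Hypothetical stand-in for a real catalog: upper-cases messages."""

    def gettext(self, message):
        return message.upper()


def translated_animals(translations):
    """Translate the marked strings only when they are needed."""
    _ = translations.gettext
    return [_(animal) for animal in ANIMALS]


shown = translated_animals(UpperTranslations())
```

Because translation happens inside ``translated_animals``, the same list can
be rendered with a different translation object each time it is displayed.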
631
The following people contributed code, feedback, design suggestions, previous
632
implementations, and valuable experience to the creation of this module:
638
* Juan David Ibáñez Palomar
650
.. rubric:: Footnotes
652
.. [#] The default locale directory is system dependent; for example, on Red Hat Linux
653
it is :file:`/usr/share/locale`, but on Solaris it is :file:`/usr/lib/locale`.
654
The :mod:`gettext` module does not try to support these system dependent
655
defaults; instead its default is :file:`sys.prefix/share/locale`. For this
656
reason, it is always best to call :func:`bindtextdomain` with an explicit
657
absolute path at the start of your application.
659
.. [#] See the footnote for :func:`bindtextdomain` above.