@node Locales, Message Translation, Character Set Handling, Top
@c %MENU% The country and language can affect the behavior of library functions
@chapter Locales and Internationalization

Different countries and cultures have varying conventions for how to
communicate. These conventions range from very simple ones, such as the
format for representing dates and times, to very complex ones, such as
the language spoken.

@cindex internationalization
@dfn{Internationalization} of software means programming it to be able
to adapt to the user's preferred conventions. In @w{ISO C},
internationalization works by means of @dfn{locales}. Each locale
specifies a collection of conventions, one convention for each purpose.
The user chooses a set of conventions by specifying a locale (via
environment variables).

All programs inherit the chosen locale as part of their environment.
Provided the programs are written to obey the choice of locale, they
will follow the conventions preferred by the user.

@menu
* Effects of Locale::           Actions affected by the choice of
                                 locale.
* Choosing Locale::             How the user specifies a locale.
* Locale Categories::           Different purposes for which you can
                                 select a locale.
* Setting the Locale::          How a program specifies the locale
                                 with library functions.
* Standard Locales::            Locale names available on all systems.
* Locale Information::          How to access the information for the locale.
* Formatting Numbers::          A dedicated function to format numbers.
* Yes-or-No Questions::         Check a response against the locale.
@end menu

@node Effects of Locale, Choosing Locale, , Locales
@section What Effects a Locale Has

Each locale specifies conventions for several purposes, including the
following:

@itemize @bullet
@item
What multibyte character sequences are valid, and how they are
interpreted (@pxref{Character Set Handling}).

@item
Classification of which characters in the local character set are
considered alphabetic, and upper- and lower-case conversion conventions
(@pxref{Character Handling}).

@item
The collating sequence for the local language and character set
(@pxref{Collation Functions}).

@item
Formatting of numbers and currency amounts (@pxref{General Numeric}).

@item
Formatting of dates and times (@pxref{Formatting Calendar Time}).

@item
What language to use for output, including error messages
(@pxref{Message Translation}).

@item
What language to use for user answers to yes-or-no questions
(@pxref{Yes-or-No Questions}).

@item
What language to use for more complex user input.
(The C library doesn't yet help you implement this.)
@end itemize

Some aspects of adapting to the specified locale are handled
automatically by the library subroutines. For example, all your program
needs to do in order to use the collating sequence of the chosen locale
is to use @code{strcoll} or @code{strxfrm} to compare strings.

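As a minimal sketch of the collation case, the following sorts an array of strings with @code{qsort} and a comparison callback built on @code{strcoll}. The names @code{compare_by_locale} and @code{sort_words} are made up for this illustration and are not library functions. The order you get depends on the locale in effect for @code{LC_COLLATE}; in the @samp{C} locale it matches plain byte order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort callback (hypothetical helper): order two strings by the
   collating sequence of the current LC_COLLATE locale.  */
static int
compare_by_locale (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcoll (*sa, *sb);
}

/* Sort COUNT strings in place using the locale's collating sequence.  */
void
sort_words (const char **words, size_t count)
{
  assert (words != NULL || count == 0);
  qsort (words, count, sizeof *words, compare_by_locale);
}
```

A program that wants the user's collating sequence rather than the default one would select the locale from the environment before sorting.
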

Other aspects of locales are beyond the comprehension of the library.
For example, the library can't automatically translate your program's
output messages into other languages. The only way you can support
output in the user's preferred language is to program this more or less
by hand. The C library does, however, provide functions that make it
easy to handle translations for multiple languages.

This chapter discusses the mechanism by which you can modify the current
locale. The effects of the current locale on specific library functions
are discussed in more detail in the descriptions of those functions.

@node Choosing Locale, Locale Categories, Effects of Locale, Locales
@section Choosing a Locale

The simplest way for the user to choose a locale is to set the
environment variable @code{LANG}. This specifies a single locale to use
for all purposes. For example, a user could specify a hypothetical
locale named @samp{espana-castellano} to use the standard conventions of
most of Spain.

The set of locales supported depends on the operating system you are
using, and so do their names. We can't make any promises about what
locales will exist, except for one standard locale called @samp{C} or
@samp{POSIX}. Later we will describe how to construct locales.
@comment (@pxref{Building Locale Files}).

@cindex combining locales
A user also has the option of specifying different locales for different
purposes, in effect choosing a mixture of multiple locales.

For example, the user might specify the locale @samp{espana-castellano}
for most purposes, but specify the locale @samp{usa-english} for
currency formatting. This might make sense if the user is a
Spanish-speaking American, working in Spanish, but representing monetary
amounts in US dollars.

Note that both locales @samp{espana-castellano} and @samp{usa-english},
like all locales, would include conventions for all of the purposes to
which locales apply. However, the user can choose to use each locale
for a particular subset of those purposes.

@node Locale Categories, Setting the Locale, Choosing Locale, Locales
@section Categories of Activities that Locales Affect
@cindex categories for locales
@cindex locale categories

The purposes that locales serve are grouped into @dfn{categories}, so
that a user or a program can choose the locale for each category
independently. Here is a table of categories; each name is both an
environment variable that a user can set, and a macro name that you can
use as an argument to @code{setlocale}.

@vtable @code
@item LC_COLLATE
This category applies to collation of strings (functions @code{strcoll}
and @code{strxfrm}); see @ref{Collation Functions}.

@item LC_CTYPE
This category applies to classification and conversion of characters,
and to multibyte and wide characters;
see @ref{Character Handling}, and @ref{Character Set Handling}.

@item LC_MONETARY
This category applies to formatting monetary values; see @ref{General Numeric}.

@item LC_NUMERIC
This category applies to formatting numeric values that are not
monetary; see @ref{General Numeric}.

@item LC_TIME
This category applies to formatting date and time values; see
@ref{Formatting Calendar Time}.

@item LC_MESSAGES
This category applies to selecting the language used in the user
interface for message translation (@pxref{The Uniforum approach};
@pxref{Message catalogs a la X/Open}) and contains regular expressions
for affirmative and negative responses.

@item LC_ALL
This is not a category; it is only a macro that you can use
with @code{setlocale} to set a single locale for all purposes. Setting
this environment variable overwrites all selections by the other
@code{LC_*} variables or @code{LANG}.

@item LANG
If this environment variable is defined, its value specifies the locale
to use for all purposes except as overridden by the variables above.
@end vtable

When developing the message translation functions it was felt that the
functionality provided by the variables above is not sufficient. For
example, it should be possible to specify more than one locale name.
Take a Swedish user who speaks German better than English, and a program
whose messages are output in English by default. It should be possible
to specify that the first choice of language is Swedish, the second
German, and, if that also fails, English. This is
possible with the variable @code{LANGUAGE}. For further description of
this GNU extension see @ref{Using gettextized software}.

@node Setting the Locale, Standard Locales, Locale Categories, Locales
@section How Programs Set the Locale

A C program inherits its locale environment variables when it starts up.
This happens automatically. However, these variables do not
automatically control the locale used by the library functions, because
@w{ISO C} says that all programs start by default in the standard @samp{C}
locale. To use the locales specified by the environment, you must call
@code{setlocale}. Call it as follows:

@smallexample
setlocale (LC_ALL, "");
@end smallexample

@noindent
to select a locale based on the user's choice of the appropriate
environment variables.

@cindex changing the locale
@cindex locale, changing
You can also use @code{setlocale} to specify a particular locale, for
general use or for a specific category.

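One plausible use of a per-category call, sketched here under the assumption that the program reads or writes machine-readable numbers, is to follow the user's locale for everything except @code{LC_NUMERIC}. The function name @code{use_locale_keep_c_numbers} is invented for this example.

```c
#include <locale.h>
#include <stddef.h>

/* Follow the environment for all categories, then force LC_NUMERIC
   back to "C" so that the decimal point stays ".".
   Returns 0 if both calls succeeded, -1 otherwise.  */
int
use_locale_keep_c_numbers (void)
{
  int status = setlocale (LC_ALL, "") != NULL ? 0 : -1;

  if (setlocale (LC_NUMERIC, "C") == NULL)
    status = -1;
  return status;
}
```

The second call only touches the one category, so the other selections made by the first call stay in effect.
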

The symbols in this section are defined in the header file @file{locale.h}.

@deftypefun {char *} setlocale (int @var{category}, const char *@var{locale})
The function @code{setlocale} sets the current locale for category
@var{category} to @var{locale}. A list of all the locales the system
provides can be created by running the command @samp{locale -a}.

If @var{category} is @code{LC_ALL}, this specifies the locale for all
purposes. The other possible values of @var{category} specify a
single purpose (@pxref{Locale Categories}).

You can also use this function to find out the current locale by passing
a null pointer as the @var{locale} argument. In this case,
@code{setlocale} returns a string that is the name of the locale
currently selected for category @var{category}.

The string returned by @code{setlocale} can be overwritten by subsequent
calls, so you should make a copy of the string (@pxref{Copying and
Concatenation}) if you want to save it past any further calls to
@code{setlocale}. (The standard library is guaranteed never to call
@code{setlocale} itself.)

You should not modify the string returned by @code{setlocale}. It might
be the same string that was passed as an argument in a previous call to
@code{setlocale}. One requirement is that the @var{category} must be
the same in the call that returned the string and in the call where the
string is passed back as the @var{locale} argument.

When you read the current locale for category @code{LC_ALL}, the value
encodes the entire combination of selected locales for all categories.
In this case, the value is not just a single locale name. In fact, we
don't make any promises about what it looks like. But if you specify
the same ``locale name'' with @code{LC_ALL} in a subsequent call to
@code{setlocale}, it restores the same combination of locale selections.

To be sure you can use the returned string encoding the currently selected
locale at a later time, you must make a copy of the string. It is not
guaranteed that the returned pointer remains valid over time.

When the @var{locale} argument is not a null pointer, the string returned
by @code{setlocale} reflects the newly-modified locale.

If you specify an empty string for @var{locale}, this means to read the
appropriate environment variable and use its value to select the locale
for @var{category}.

If a nonempty string is given for @var{locale}, then the locale of that
name is used if possible.

If you specify an invalid locale name, @code{setlocale} returns a null
pointer and leaves the current locale unchanged.
@end deftypefun

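Because a failed call leaves the locale unchanged, a program can check the return value and simply carry on in whatever locale was already in effect. The following is a small sketch of that pattern; @code{init_locale} is a hypothetical helper name, and the warning text is only an example.

```c
#include <locale.h>
#include <stdio.h>

/* Adopt the locale named by the environment.  If the environment names
   a locale this system does not provide, warn and keep the locale that
   was already in effect.  Returns the name now selected for LC_ALL.  */
const char *
init_locale (void)
{
  const char *name = setlocale (LC_ALL, "");

  if (name == NULL)
    {
      fputs ("warning: unsupported locale, keeping the current one\n",
             stderr);
      /* A null LOCALE argument only queries the current setting.  */
      name = setlocale (LC_ALL, NULL);
    }
  return name;
}
```

Remember that the returned pointer may be overwritten by later calls to @code{setlocale}, so copy the string if you need it afterwards.
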

Here is an example showing how you might use @code{setlocale} to
temporarily switch to a new locale.

@smallexample
#include <stddef.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

void
with_other_locale (char *new_locale,
                   void (*subroutine) (int),
                   int argument)
@{
  char *old_locale, *saved_locale;

  /* @r{Get the name of the current locale.} */
  old_locale = setlocale (LC_ALL, NULL);

  /* @r{Copy the name so it won't be clobbered by @code{setlocale}.} */
  saved_locale = strdup (old_locale);
  if (saved_locale == NULL)
    abort ();  /* @r{Out of memory.} */

  /* @r{Now change the locale and do some stuff with it.} */
  setlocale (LC_ALL, new_locale);
  (*subroutine) (argument);

  /* @r{Restore the original locale.} */
  setlocale (LC_ALL, saved_locale);
  free (saved_locale);
@}
@end smallexample

@strong{Portability Note:} Some @w{ISO C} systems may define additional
locale categories, and future versions of the library will do so. For
portability, assume that any symbol beginning with @samp{LC_} might be
defined in @file{locale.h}.

@node Standard Locales, Locale Information, Setting the Locale, Locales
@section Standard Locales

The only locale names you can count on finding on all operating systems
are these three standard ones:

@table @code
@item "C"
This is the standard C locale. The attributes and behavior it provides
are specified in the @w{ISO C} standard. When your program starts up, it
initially uses this locale by default.

@item "POSIX"
This is the standard POSIX locale. Currently, it is an alias for the
standard C locale.

@item ""
The empty name says to select a locale based on environment variables.
@xref{Locale Categories}.
@end table

Defining and installing named locales is normally a responsibility of
the system administrator at your site (or the person who installed the
GNU C library). It is also possible for the user to create private
locales. All this will be discussed later when describing the tool to
do so.
@comment (@pxref{Building Locale Files}).

If your program needs to use something other than the @samp{C} locale,
it will be more portable if you use whatever locale the user specifies
with the environment, rather than trying to specify some non-standard
locale explicitly by name. Remember, different machines might have
different sets of locales installed.

@node Locale Information, Formatting Numbers, Standard Locales, Locales
@section Accessing Locale Information

There are several ways to access locale information. The simplest
way is to let the C library itself do the work. Several of the
functions in this library implicitly access the locale data, and use
what information is provided by the currently selected locale. This is
how the locale model is meant to work normally.

As an example take the @code{strftime} function, which is meant to nicely
format date and time information (@pxref{Formatting Calendar Time}).
Part of the standard information contained in the @code{LC_TIME}
category is the names of the weekdays and months. Instead of requiring
the programmer to take care of providing the translations, the
@code{strftime} function does this all by itself. @code{%A}
in the format string is replaced by the appropriate weekday
name of the locale currently selected by @code{LC_TIME}. This is an
easy example, and wherever possible functions do things automatically
in this way.

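To make the @code{strftime} case concrete, here is a short sketch that asks for the weekday name of a given time. The helper name @code{weekday_name} is an assumption of this example. The function itself contains nothing locale-specific; the language of the result follows whatever locale is selected for @code{LC_TIME} when it is called.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Store the full weekday name for WHEN (interpreted as UTC) in BUF,
   using the current LC_TIME locale.  Returns the number of bytes
   stored, not counting the terminating null, or 0 on failure.  */
size_t
weekday_name (time_t when, char *buf, size_t size)
{
  struct tm *tm;

  assert (buf != NULL && size > 0);
  tm = gmtime (&when);
  if (tm == NULL)
    return 0;
  return strftime (buf, size, "%A", tm);
}
```

In a program that never calls @code{setlocale}, the @samp{C} locale is in effect and the English names are produced.
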

But there are quite often situations when there is simply no function
to perform the task, or it is simply not possible to do the work
automatically. For these cases it is necessary to access the
information in the locale directly. To do this the C library provides
two functions: @code{localeconv} and @code{nl_langinfo}. The former is
part of @w{ISO C} and therefore portable, but has a brain-damaged
interface. The second is part of the Unix interface and is portable
insofar as the system follows the Unix standards.

@menu
* The Lame Way to Locale Data::   ISO C's @code{localeconv}.
* The Elegant and Fast Way::      X/Open's @code{nl_langinfo}.
@end menu

@node The Lame Way to Locale Data, The Elegant and Fast Way, ,Locale Information
@subsection @code{localeconv}: It is portable but @dots{}

Together with the @code{setlocale} function the @w{ISO C} people
invented the @code{localeconv} function. It is a masterpiece of poor
design. It is expensive to use, not extensible, and not generally
usable as it provides access to only @code{LC_MONETARY} and
@code{LC_NUMERIC} related information. Nevertheless, if it is
applicable to a given situation it should be used since it is very
portable. The function @code{strfmon} formats monetary amounts
according to the selected locale using this information.

@cindex monetary value formatting
@cindex numeric value formatting
@deftypefun {struct lconv *} localeconv (void)
The @code{localeconv} function returns a pointer to a structure whose
components contain information about how numeric and monetary values
should be formatted in the current locale.

You should not modify the structure or its contents. The structure might
be overwritten by subsequent calls to @code{localeconv}, or by calls to
@code{setlocale}, but no other function in the library overwrites this
value.
@end deftypefun

@deftp {Data Type} {struct lconv}
@code{localeconv}'s return value is of this data type. Its elements are
described in the following subsections.
@end deftp

If a member of the structure @code{struct lconv} has type @code{char},
and the value is @code{CHAR_MAX}, it means that the current locale has
no value for that parameter.

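The @code{CHAR_MAX} convention means that callers have to supply their own default for every @code{char} member they read. A minimal sketch, using the @code{frac_digits} member and a hypothetical helper named @code{money_frac_digits}:

```c
#include <limits.h>
#include <locale.h>

/* Number of fractional digits to print for a monetary amount in local
   format, or FALLBACK if the current locale leaves it unspecified.  */
int
money_frac_digits (int fallback)
{
  const struct lconv *lc = localeconv ();

  return lc->frac_digits == CHAR_MAX ? fallback : lc->frac_digits;
}
```

In the @samp{C} locale this member is unspecified, so the function simply hands back the fallback it was given.
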

@menu
* General Numeric::             Parameters for formatting numbers and
                                 currency amounts.
* Currency Symbol::             How to print the symbol that identifies an
                                 amount of money (e.g. @samp{$}).
* Sign of Money Amount::        How to print the (positive or negative) sign
                                 for a monetary amount, if one exists.
@end menu

@node General Numeric, Currency Symbol, , The Lame Way to Locale Data
@subsubsection Generic Numeric Formatting Parameters

These are the standard members of @code{struct lconv}; there may be
others.

@table @code
@item char *decimal_point
@itemx char *mon_decimal_point
These are the decimal-point separators used in formatting non-monetary
and monetary quantities, respectively. In the @samp{C} locale, the
value of @code{decimal_point} is @code{"."}, and the value of
@code{mon_decimal_point} is @code{""}.
@cindex decimal-point separator

@item char *thousands_sep
@itemx char *mon_thousands_sep
These are the separators used to delimit groups of digits to the left of
the decimal point in formatting non-monetary and monetary quantities,
respectively. In the @samp{C} locale, both members have a value of
@code{""} (the empty string).

@item char *grouping
@itemx char *mon_grouping
These are strings that specify how to group the digits to the left of
the decimal point. @code{grouping} applies to non-monetary quantities
and @code{mon_grouping} applies to monetary quantities. Use either
@code{thousands_sep} or @code{mon_thousands_sep} to separate the digit
groups.
@cindex grouping of digits

Each member of these strings is to be interpreted as an integer value of
type @code{char}. Successive numbers (from left to right) give the
sizes of successive groups (from right to left, starting at the decimal
point). The last member is either @code{0}, in which case the previous
member is used over and over again for all the remaining groups, or
@code{CHAR_MAX}, in which case there is no more grouping; put
another way, any remaining digits form one large group without
separators.

For example, if @code{grouping} is @code{"\04\03\02"}, the correct
grouping for the number @code{123456787654321} is @samp{12}, @samp{34},
@samp{56}, @samp{78}, @samp{765}, @samp{4321}. This uses a group of 4
digits at the end, preceded by a group of 3 digits, preceded by groups
of 2 digits (as many as needed). With a separator of @samp{,}, the
number would be printed as @samp{12,34,56,78,765,4321}.

A value of @code{"\03"} indicates repeated groups of three digits, as
normally used in the U.S.

In the standard @samp{C} locale, both @code{grouping} and
@code{mon_grouping} have a value of @code{""}. This value specifies no
grouping at all.

@item char int_frac_digits
@itemx char frac_digits
These are small integers indicating how many fractional digits (to the
right of the decimal point) should be displayed in a monetary value in
international and local formats, respectively. (Most often, both
members have the same value.)

In the standard @samp{C} locale, both of these members have the value
@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
what to do when you find this value; we recommend printing no
fractional digits. (This locale also specifies the empty string for
@code{mon_decimal_point}, so printing any fractional digits would be
confusing!)
@end table

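The grouping rules described above for @code{grouping} and @code{mon_grouping} can be sketched as a small routine that walks the digit string from right to left. This is only an illustration: @code{group_digits} is a made-up name, the 128-byte scratch buffer is an arbitrary limit, and a real formatter would take the grouping string and separator from @code{localeconv} rather than from its arguments.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Insert SEP between groups of DIGITS (a string of decimal digits)
   as directed by GROUPING.  The result is stored in OUT, which has
   room for OUT_SIZE bytes.  Returns 0, or -1 if a buffer is too small.  */
int
group_digits (const char *digits, const char *grouping, const char *sep,
              char *out, size_t out_size)
{
  size_t remaining, sep_len, len;
  size_t size = 0;              /* previous group size; 0 means none yet */
  char tmp[128];                /* scratch space, filled from the right */
  size_t pos = sizeof tmp;

  assert (digits != NULL && grouping != NULL && sep != NULL && out != NULL);
  remaining = strlen (digits);
  sep_len = strlen (sep);

  while (remaining > 0)
    {
      size_t n;

      if (*grouping == CHAR_MAX || (*grouping == '\0' && size == 0))
        n = remaining;          /* no more grouping: one large group */
      else
        {
          if (*grouping != '\0')
            size = (size_t) *grouping++;  /* next explicit group size */
          /* At the terminating 0 the previous size is reused.  */
          n = size < remaining ? size : remaining;
        }
      if (pos < n + sep_len)
        return -1;
      pos -= n;
      memcpy (tmp + pos, digits + remaining - n, n);
      remaining -= n;
      if (remaining > 0)
        {
          pos -= sep_len;
          memcpy (tmp + pos, sep, sep_len);
        }
    }

  len = sizeof tmp - pos;
  if (len + 1 > out_size)
    return -1;
  memcpy (out, tmp + pos, len);
  out[len] = '\0';
  return 0;
}
```

Called with the digits @code{123456787654321}, the grouping @code{"\04\03\02"} and a separator of @samp{,}, this reproduces the grouping worked out by hand in the description of @code{grouping}.
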

@node Currency Symbol, Sign of Money Amount, General Numeric, The Lame Way to Locale Data
@subsubsection Printing the Currency Symbol
@cindex currency symbols

These members of the @code{struct lconv} structure specify how to print
the symbol to identify a monetary value, the international analog of
@samp{$} for US dollars.

Each country has two standard currency symbols. The @dfn{local currency
symbol} is used commonly within the country, while the
@dfn{international currency symbol} is used internationally to refer to
that country's currency when it is necessary to indicate the country
unambiguously.

For example, many countries use the dollar as their monetary unit, and
when dealing with international currencies it's important to specify
that one is dealing with (say) Canadian dollars instead of U.S. dollars
or Australian dollars. But when the context is known to be Canada,
there is no need to make this explicit; dollar amounts are implicitly
assumed to be in Canadian dollars.

@table @code
@item char *currency_symbol
The local currency symbol for the selected locale.

In the standard @samp{C} locale, this member has a value of @code{""}
(the empty string), meaning ``unspecified''. The ISO standard doesn't
say what to do when you find this value; we recommend you simply print
the empty string as you would print any other string pointed to by this
variable.

@item char *int_curr_symbol
The international currency symbol for the selected locale.

The value of @code{int_curr_symbol} should normally consist of a
three-letter abbreviation determined by the international standard
@cite{ISO 4217 Codes for the Representation of Currency and Funds},
followed by a one-character separator (often a space).

In the standard @samp{C} locale, this member has a value of @code{""}
(the empty string), meaning ``unspecified''. We recommend you simply print
the empty string as you would print any other string pointed to by this
variable.

@item char p_cs_precedes
@itemx char n_cs_precedes
@itemx char int_p_cs_precedes
@itemx char int_n_cs_precedes
These members are @code{1} if the @code{currency_symbol} or
@code{int_curr_symbol} strings should precede the value of a monetary
amount, or @code{0} if the strings should follow the value. The
@code{p_cs_precedes} and @code{int_p_cs_precedes} members apply to
positive amounts (or zero), and the @code{n_cs_precedes} and
@code{int_n_cs_precedes} members apply to negative amounts.

In the standard @samp{C} locale, all of these members have a value of
@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
what to do when you find this value. We recommend printing the
currency symbol before the amount, which is right for most countries.
In other words, treat all nonzero values alike in these members.

The members with the @code{int_} prefix apply to the
@code{int_curr_symbol} while the other two apply to
@code{currency_symbol}.

@item char p_sep_by_space
@itemx char n_sep_by_space
@itemx char int_p_sep_by_space
@itemx char int_n_sep_by_space
These members are @code{1} if a space should appear between the
@code{currency_symbol} or @code{int_curr_symbol} strings and the
amount, or @code{0} if no space should appear. The
@code{p_sep_by_space} and @code{int_p_sep_by_space} members apply to
positive amounts (or zero), and the @code{n_sep_by_space} and
@code{int_n_sep_by_space} members apply to negative amounts.

In the standard @samp{C} locale, all of these members have a value of
@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
what you should do when you find this value; we suggest you treat it as
1 (print a space). In other words, treat all nonzero values alike in
these members.

The members with the @code{int_} prefix apply to the
@code{int_curr_symbol} while the other two apply to
@code{currency_symbol}. There is one peculiarity with the
@code{int_curr_symbol}, though. Since all legal values contain a space
at the end of the string, one either prints this space (if the currency
symbol must appear in front and must be separated) or has to avoid
printing this character at all (especially when at the end of the
string).
@end table

@node Sign of Money Amount, , Currency Symbol, The Lame Way to Locale Data
@subsubsection Printing the Sign of a Monetary Amount

These members of the @code{struct lconv} structure specify how to print
the sign (if any) of a monetary value.

@table @code
@item char *positive_sign
@itemx char *negative_sign
These are strings used to indicate positive (or zero) and negative
monetary quantities, respectively.

In the standard @samp{C} locale, both of these members have a value of
@code{""} (the empty string), meaning ``unspecified''.

The ISO standard doesn't say what to do when you find this value; we
recommend printing @code{positive_sign} as you find it, even if it is
empty. For a negative value, print @code{negative_sign} as you find it
unless both it and @code{positive_sign} are empty, in which case print
@samp{-} instead. (Failing to indicate the sign at all seems rather
unreasonable.)

@item char p_sign_posn
@itemx char n_sign_posn
@itemx char int_p_sign_posn
@itemx char int_n_sign_posn
These members are small integers that indicate how to
position the sign for nonnegative and negative monetary quantities,
respectively. (The string used by the sign is what was specified with
@code{positive_sign} or @code{negative_sign}.) The possible values are
as follows:

@table @code
@item 0
The currency symbol and quantity should be surrounded by parentheses.

@item 1
Print the sign string before the quantity and currency symbol.

@item 2
Print the sign string after the quantity and currency symbol.

@item 3
Print the sign string right before the currency symbol.

@item 4
Print the sign string right after the currency symbol.

@item CHAR_MAX
``Unspecified''. All of these members have this value in the standard
@samp{C} locale.
@end table

The ISO standard doesn't say what you should do when the value is
@code{CHAR_MAX}. We recommend you print the sign after the currency
symbol.

The members with the @code{int_} prefix apply to the
@code{int_curr_symbol} while the other two apply to
@code{currency_symbol}.
@end table

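The recommendations for unspecified sign information can be written down as two small helpers. This is a sketch of the fallbacks suggested in this section, not something the library does for you; the names @code{effective_negative_sign} and @code{effective_n_sign_posn} are invented here. The value 4 is the standard @code{struct lconv} code for ``sign right after the currency symbol''.

```c
#include <limits.h>
#include <locale.h>

/* Sign string to print for a negative monetary amount: the locale's
   negative_sign, or "-" when both sign strings are empty.  */
const char *
effective_negative_sign (const struct lconv *lc)
{
  if (lc->negative_sign[0] == '\0' && lc->positive_sign[0] == '\0')
    return "-";
  return lc->negative_sign;
}

/* Sign position for negative amounts in local format.  An unspecified
   value is mapped to 4, the sign right after the currency symbol.  */
int
effective_n_sign_posn (const struct lconv *lc)
{
  return lc->n_sign_posn == CHAR_MAX ? 4 : lc->n_sign_posn;
}
```

Both helpers expect the pointer returned by @code{localeconv} and do not modify the structure.
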

@node The Elegant and Fast Way, , The Lame Way to Locale Data, Locale Information
@subsection Pinpoint Access to Locale Data

When writing the X/Open Portability Guide the authors realized that the
@code{localeconv} function is not enough to provide reasonable access to
locale information. The information which was meant to be available
in the locale (as later specified in the POSIX.1 standard) requires more
ways to access it. Therefore the @code{nl_langinfo} function
was introduced.

@deftypefun {char *} nl_langinfo (nl_item @var{item})
The @code{nl_langinfo} function can be used to access individual
elements of the locale categories. Unlike the @code{localeconv}
function, which returns all the information, @code{nl_langinfo}
lets the caller select what information it requires. This is very
fast and it is not a problem to call this function multiple times.

A second advantage is that in addition to the numeric and monetary
formatting information, information from the
@code{LC_TIME} and @code{LC_MESSAGES} categories is available.

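As a brief sketch of this pinpoint access, the following looks up one full weekday name through the X/Open items @code{DAY_1} (Sunday) to @code{DAY_7} (Saturday) from @file{langinfo.h}. The wrapper name @code{weekday_from_locale} and its 0-to-6 numbering are assumptions of this example.

```c
#include <langinfo.h>
#include <stddef.h>

/* Return the unabbreviated name of weekday WDAY (0 = Sunday through
   6 = Saturday) in the current LC_TIME locale, or NULL if WDAY is out
   of range.  The result must not be modified by the caller.  */
const char *
weekday_from_locale (int wday)
{
  static const nl_item items[] =
    { DAY_1, DAY_2, DAY_3, DAY_4, DAY_5, DAY_6, DAY_7 };

  if (wday < 0 || wday > 6)
    return NULL;
  return nl_langinfo (items[wday]);
}
```

Only the one requested string is fetched, which is the contrast with @code{localeconv} drawn in the text.
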

The type @code{nl_item} is defined in @file{nl_types.h}. The argument
@var{item} is a numeric value defined in the header @file{langinfo.h}.
The X/Open standard defines the following values:

@vtable @code
@item CODESET
@code{nl_langinfo} returns a string with the name of the coded character
set used in the selected locale.

@item ABDAY_1
@itemx ABDAY_2
@itemx ABDAY_3
@itemx ABDAY_4
@itemx ABDAY_5
@itemx ABDAY_6
@itemx ABDAY_7
@code{nl_langinfo} returns the abbreviated weekday name. @code{ABDAY_1}
corresponds to Sunday.
@item DAY_1
@itemx DAY_2
@itemx DAY_3
@itemx DAY_4
@itemx DAY_5
@itemx DAY_6
@itemx DAY_7
Similar to @code{ABDAY_1} etc., but here the return value is the
unabbreviated weekday name.
@item ABMON_1
@itemx ABMON_2
@itemx ABMON_3
@itemx ABMON_4
@itemx ABMON_5
@itemx ABMON_6
@itemx ABMON_7
@itemx ABMON_8
@itemx ABMON_9
@itemx ABMON_10
@itemx ABMON_11
@itemx ABMON_12
The return value is the abbreviated name of the month. @code{ABMON_1}
corresponds to January.
@item MON_1
@itemx MON_2
@itemx MON_3
@itemx MON_4
@itemx MON_5
@itemx MON_6
@itemx MON_7
@itemx MON_8
@itemx MON_9
@itemx MON_10
@itemx MON_11
@itemx MON_12
Similar to @code{ABMON_1} etc., but here the month names are not abbreviated.
Here the first value @code{MON_1} also corresponds to January.
@item AM_STR
@itemx PM_STR
The return values are strings which can be used in the representation of time
as an hour from 1 to 12 plus an am/pm specifier.

Note that in locales which do not use this time representation
these strings might be empty, in which case the am/pm format
cannot be used at all.
@item D_T_FMT
The return value can be used as a format string for @code{strftime} to
represent time and date in a locale-specific way.
@item D_FMT
The return value can be used as a format string for @code{strftime} to
represent a date in a locale-specific way.
@item T_FMT
The return value can be used as a format string for @code{strftime} to
represent time in a locale-specific way.
@item T_FMT_AMPM
The return value can be used as a format string for @code{strftime} to
represent time in the am/pm format.

Note that if the am/pm format does not make any sense for the
selected locale, the return value might be the same as the one for
@code{T_FMT}.
@item ERA
The return value represents the era used in the current locale.

Most locales do not define this value. An example of a locale which
does define this value is the Japanese one. In Japan, the traditional
representation of dates includes the name of the era corresponding to
the then-emperor's reign.

Normally it should not be necessary to use this value directly.
Specifying the @code{E} modifier in their format strings causes the
@code{strftime} functions to use this information. The format of the
returned string is not specified, and therefore you should not assume
knowledge of it on different systems.
@item ERA_YEAR
The return value gives the year in the relevant era of the locale.
As for @code{ERA} it should not be necessary to use this value directly.
@item ERA_D_T_FMT
This return value can be used as a format string for @code{strftime} to
represent dates and times in a locale-specific era-based way.
@item ERA_D_FMT
This return value can be used as a format string for @code{strftime} to
represent a date in a locale-specific era-based way.
@item ERA_T_FMT
This return value can be used as a format string for @code{strftime} to
represent time in a locale-specific era-based way.
@item ALT_DIGITS
The return value is a representation of up to @math{100} values used to
represent the values @math{0} to @math{99}. As for @code{ERA} this
value is not intended to be used directly, but instead indirectly
through the @code{strftime} function. When the modifier @code{O} is
used in a format which would otherwise use numerals to represent hours,
minutes, seconds, weekdays, months, or weeks, the appropriate value for
the locale is used instead.
796
@item INT_CURR_SYMBOL
797
The same as the value returned by @code{localeconv} in the
798
@code{int_curr_symbol} element of the @code{struct lconv}.
799
@item CURRENCY_SYMBOL
801
The same as the value returned by @code{localeconv} in the
802
@code{currency_symbol} element of the @code{struct lconv}.
804
@code{CRNCYSTR} is a deprecated alias still required by Unix98.
805
@item MON_DECIMAL_POINT
806
The same as the value returned by @code{localeconv} in the
807
@code{mon_decimal_point} element of the @code{struct lconv}.
808
@item MON_THOUSANDS_SEP
809
The same as the value returned by @code{localeconv} in the
810
@code{mon_thousands_sep} element of the @code{struct lconv}.
812
The same as the value returned by @code{localeconv} in the
813
@code{mon_grouping} element of the @code{struct lconv}.
815
The same as the value returned by @code{localeconv} in the
816
@code{positive_sign} element of the @code{struct lconv}.
818
The same as the value returned by @code{localeconv} in the
819
@code{negative_sign} element of the @code{struct lconv}.
820
@item INT_FRAC_DIGITS
The same as the value returned by @code{localeconv} in the
@code{int_frac_digits} element of the @code{struct lconv}.
@item FRAC_DIGITS
The same as the value returned by @code{localeconv} in the
@code{frac_digits} element of the @code{struct lconv}.
@item P_CS_PRECEDES
The same as the value returned by @code{localeconv} in the
@code{p_cs_precedes} element of the @code{struct lconv}.
@item P_SEP_BY_SPACE
The same as the value returned by @code{localeconv} in the
@code{p_sep_by_space} element of the @code{struct lconv}.
@item N_CS_PRECEDES
The same as the value returned by @code{localeconv} in the
@code{n_cs_precedes} element of the @code{struct lconv}.
@item N_SEP_BY_SPACE
The same as the value returned by @code{localeconv} in the
@code{n_sep_by_space} element of the @code{struct lconv}.
@item P_SIGN_POSN
The same as the value returned by @code{localeconv} in the
@code{p_sign_posn} element of the @code{struct lconv}.
@item N_SIGN_POSN
The same as the value returned by @code{localeconv} in the
@code{n_sign_posn} element of the @code{struct lconv}.
@item INT_P_CS_PRECEDES
The same as the value returned by @code{localeconv} in the
@code{int_p_cs_precedes} element of the @code{struct lconv}.
@item INT_P_SEP_BY_SPACE
The same as the value returned by @code{localeconv} in the
@code{int_p_sep_by_space} element of the @code{struct lconv}.
@item INT_N_CS_PRECEDES
The same as the value returned by @code{localeconv} in the
@code{int_n_cs_precedes} element of the @code{struct lconv}.
@item INT_N_SEP_BY_SPACE
The same as the value returned by @code{localeconv} in the
@code{int_n_sep_by_space} element of the @code{struct lconv}.
@item INT_P_SIGN_POSN
The same as the value returned by @code{localeconv} in the
@code{int_p_sign_posn} element of the @code{struct lconv}.
@item INT_N_SIGN_POSN
The same as the value returned by @code{localeconv} in the
@code{int_n_sign_posn} element of the @code{struct lconv}.

@item DECIMAL_POINT
@itemx RADIXCHAR
The same as the value returned by @code{localeconv} in the
@code{decimal_point} element of the @code{struct lconv}.

The name @code{RADIXCHAR} is a deprecated alias still used in Unix98.
@item THOUSANDS_SEP
@itemx THOUSEP
The same as the value returned by @code{localeconv} in the
@code{thousands_sep} element of the @code{struct lconv}.

The name @code{THOUSEP} is a deprecated alias still used in Unix98.
@item GROUPING
The same as the value returned by @code{localeconv} in the
@code{grouping} element of the @code{struct lconv}.
@item YESEXPR
The return value is a regular expression which can be used with the
@code{regcomp} function to recognize a positive response to a yes/no
question. The GNU C library provides the @code{rpmatch} function for
easier handling in applications.
@item NOEXPR
The return value is a regular expression which can be used with the
@code{regcomp} function to recognize a negative response to a yes/no
question.
@item YESSTR
The return value is a locale-specific translation of the positive response
to a yes/no question.

Using this value is deprecated since it is a very special case of
message translation, and is better handled by the message
translation functions (@pxref{Message Translation}).
@item NOSTR
The return value is a locale-specific translation of the negative response
to a yes/no question. What is said for @code{YESSTR} is also true here.

The use of this symbol is deprecated. Instead message translation
should be used.
@end vtable

The file @file{langinfo.h} defines a lot more symbols but none of them
is official. Using them is not portable, and the format of the
return values might change. Therefore we recommend that you not use
them.

Note that the return value for any valid argument can be used
in all situations (with the possible exception of the am/pm time formatting
codes). If the user has not selected any locale for the
appropriate category, @code{nl_langinfo} returns the information from the
@code{"C"} locale. It is therefore possible to use this function as
shown in the example below.

If the argument @var{item} is not valid, a pointer to an empty string is
returned.
@end deftypefun

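The correspondence between @code{nl_langinfo} items and the members of
@code{struct lconv} described above can be checked with a minimal
sketch like the following. The helper name
@code{langinfo_matches_lconv} is hypothetical and not part of any
library. The sketch only compares the numeric items @code{RADIXCHAR}
and @code{THOUSEP}, and it assumes that no other thread changes the
locale between the calls.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: return 1 if the numeric items reported by
   nl_langinfo agree with the corresponding struct lconv members of
   the current locale, and 0 otherwise.  */
int
langinfo_matches_lconv (void)
{
  const struct lconv *lc = localeconv ();

  return (strcmp (nl_langinfo (RADIXCHAR), lc->decimal_point) == 0
          && strcmp (nl_langinfo (THOUSEP), lc->thousands_sep) == 0);
}
```

If the program has not called @code{setlocale}, both interfaces report
the values of the @code{"C"} locale.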
An example of @code{nl_langinfo} usage is a function which has to
print a given date and time in a locale-specific way. At first one
might think that, since @code{strftime} internally uses the locale
information, writing something like the following is enough:

@smallexample
size_t
i18n_time_n_data (char *s, size_t len, const struct tm *tp)
@{
  return strftime (s, len, "%X %D", tp);
@}
@end smallexample

One might assume that, because the format contains no weekday or month
names, it is internationally usable. Wrong! The output produced is
something like @code{"hh:mm:ss MM/DD/YY"}. This format is only
recognizable in the USA. Other countries use different formats.
Therefore the function should be rewritten like this:

@smallexample
size_t
i18n_time_n_data (char *s, size_t len, const struct tm *tp)
@{
  return strftime (s, len, nl_langinfo (D_T_FMT), tp);
@}
@end smallexample

Now it uses the date and time format of the locale
selected when the program runs. If the user selects the locale
correctly there should never be a misunderstanding over the time and
date format.

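As a sketch of how a caller might combine locale selection with the
@code{D_T_FMT} approach shown above, consider the following helper.
The name @code{format_time_in_locale} and its error convention are
assumptions made for this illustration. Note that @code{setlocale}
changes the locale of the whole process, so a real program would
normally call it once at startup rather than on every formatting call.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: select LOCALE for the LC_TIME category and
   format *TP into S using that locale's date-and-time format.
   Returns 0 on success, or -1 if the locale is not available or the
   result does not fit into LEN bytes.  */
int
format_time_in_locale (const char *locale, const struct tm *tp,
                       char *s, size_t len)
{
  if (setlocale (LC_TIME, locale) == NULL)
    return -1;

  /* strftime returns 0 when the buffer is too small.  */
  if (strftime (s, len, nl_langinfo (D_T_FMT), tp) == 0)
    return -1;

  return 0;
}
```

Passing the empty string as @var{locale} selects the locale named by
the user's environment variables.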
@node Formatting Numbers, Yes-or-No Questions, Locale Information, Locales
@section A dedicated function to format numbers

We have seen that the structure returned by @code{localeconv} as well as
the values given to @code{nl_langinfo} allow you to retrieve the various
pieces of locale-specific information to format numbers and monetary
amounts. We have also seen that the underlying rules are quite complex.

Therefore the X/Open standards introduce a function which uses such
locale information, making it easier for the user to format
numbers according to these rules.

@deftypefun ssize_t strfmon (char *@var{s}, size_t @var{maxsize}, const char *@var{format}, @dots{})
The @code{strfmon} function is similar to the @code{strftime} function
in that it takes a buffer, its size, a format string,
and values to write into the buffer as text in a form specified
by the format string. Like @code{strftime}, the function
also returns the number of bytes written into the buffer.

There are two differences: @code{strfmon} can take more than one
argument, and, of course, the format specification is different. Like
@code{strftime}, the format string consists of normal text, which is
output as is, and format specifiers, which are indicated by a @samp{%}.
Immediately after the @samp{%}, you can optionally specify various flags
and formatting information before the main formatting character, in a
similar way to @code{printf}:

Immediately following the @samp{%} there can be one or more of the
following flags:
@table @asis
@item @samp{=@var{f}}
The single byte character @var{f} is used for this field as the numeric
fill character. By default this character is a space character.
Filling with this character is only performed if a left precision
is specified. It is not used to pad to the given field width.
@item @samp{^}
The number is printed without grouping the digits according to the rules
of the current locale. By default grouping is enabled.
@item @samp{+}, @samp{(}
At most one of these flags can be used. They select the format used to
represent the sign of a currency amount. By default, and if
@samp{+} is given, the locale equivalent of @math{+}/@math{-} is used. If
@samp{(} is given, negative amounts are enclosed in parentheses. The
exact format is determined by the values of the @code{LC_MONETARY}
category of the locale selected at program runtime.
@item @samp{!}
The output will not contain the currency symbol.
@item @samp{-}
The output will be formatted left-justified instead of right-justified if
it does not fill the entire field width.
@end table

The next part of a specification is an optional field width. If no
width is specified @math{0} is taken. During output, the function first
determines how much space is required. If it requires at least as many
characters as given by the field width, it is output using as much space
as necessary. Otherwise, it is extended to use the full width by
filling with the space character. The presence or absence of the
@samp{-} flag determines the side at which such padding occurs. If
present, the spaces are added at the right making the output
left-justified, and vice versa.

So far the format looks familiar, being similar to the @code{printf} and
@code{strftime} formats. However, the next two optional fields
introduce something new. The first one is a @samp{#} character followed
by a decimal digit string. The value of the digit string specifies the
number of @emph{digit} positions to the left of the decimal point (or
equivalent). This does @emph{not} include the grouping character when
the @samp{^} flag is not given. If the space needed to print the number
does not fill the whole width, the field is padded at the left side with
the fill character, which can be selected using the @samp{=} flag and by
default is a space. For example, if the left precision is 6, the number
is @math{123}, the fill character is @samp{*}, and no grouping
applies, the result will be @samp{***123}.

The second optional field starts with a @samp{.} (period) and consists
of another decimal digit string. Its value describes the number of
characters printed after the decimal point. The default is selected
from the current locale (@code{frac_digits}, @code{int_frac_digits},
@pxref{General Numeric}). If the exact representation needs more digits
than given by this precision, the displayed value is rounded. If the
number of fractional digits is selected to be zero, no decimal point is
printed.

As a GNU extension, the @code{strfmon} implementation in the GNU libc
allows an optional @samp{L} next as a format modifier. If this modifier
is given, the argument is expected to be a @code{long double} instead of
a @code{double} value.

Finally, the last component is a format specifier. There are three
specifiers defined:

@table @asis
@item @samp{i}
Use the locale's rules for formatting an international currency value.
@item @samp{n}
Use the locale's rules for formatting a national currency value.
@item @samp{%}
Place a @samp{%} in the output. There must be no flag, width
specifier or modifier given, only @samp{%%} is allowed.
@end table

As for @code{printf}, the function reads the format string
from left to right and uses the values passed to the function following
the format string. The values are expected to be either of type
@code{double} or @code{long double}, depending on the presence of the
modifier @samp{L}. The result is stored in the buffer pointed to by
@var{s}. At most @var{maxsize} characters are stored.

The return value of the function is the number of characters stored in
@var{s}, not including the terminating null byte. If the number of
characters stored would exceed @var{maxsize}, the function returns
@math{-1} and the content of the buffer @var{s} is unspecified. In this
case @code{errno} is set to @code{E2BIG}.
@end deftypefun

A few examples should make clear how the function works. It is
assumed that all the following pieces of code are executed in a program
which uses the USA locale (@code{en_US}). The simplest
form of the format is this:

@smallexample
strfmon (buf, 100, "@@%n@@%n@@%n@@", 123.45, -567.89, 12345.678);
@end smallexample

@noindent
The output produced is

@smallexample
"@@$123.45@@-$567.89@@$12,345.68@@"
@end smallexample

We can notice several things here. First, the widths of the output
numbers are different, which is no surprise since we have not specified
a width in the format string. Second, the third number is printed
using thousands separators. The thousands separator for the
@code{en_US} locale is a comma. The number is also rounded.
@math{.678} is rounded to @math{.68} since the format does not specify a
precision and the default value in the locale is @math{2}. Finally,
note that the national currency symbol is printed since @samp{%n} was
used, not @samp{%i}. The next example shows how we can align the output.

@smallexample
strfmon (buf, 100, "@@%=*11n@@%=*11n@@%=*11n@@", 123.45, -567.89, 12345.678);
@end smallexample

@noindent
The output this time is:

@smallexample
"@@    $123.45@@   -$567.89@@ $12,345.68@@"
@end smallexample

Two things stand out. Firstly, all fields have the same width (eleven
characters) since this is the width given in the format and since no
number required more characters to be printed. The second important
point is that the fill character is not used. This is correct since the
white space was not used to achieve a precision given by a @samp{#}
modifier, but instead to fill to the given width. The difference
becomes obvious if we now add a left precision.

@smallexample
strfmon (buf, 100, "@@%=*11#5n@@%=*11#5n@@%=*11#5n@@",
         123.45, -567.89, 12345.678);
@end smallexample

@noindent
The output is

@smallexample
"@@ $***123.45@@-$***567.89@@ $12,345.68@@"
@end smallexample

Here we can see that all the currency symbols are now aligned, and that
the space between the currency sign and the number is filled with the
selected fill character. Note that although the left precision is
selected to be @math{5} and @math{123.45} has three digits left of the
decimal point, the space is filled with three asterisks. This is
correct since, as explained above, the left precision does not include
the positions used to store thousands separators: five digit positions
plus one separator position leave three positions to fill. One last
example should explain the remaining features.

@smallexample
strfmon (buf, 100, "@@%=0(16#5.3i@@%=0(16#5.3i@@%=0(16#5.3i@@",
         123.45, -567.89, 12345.678);
@end smallexample

@noindent
This rather complex format string produces the following output:

@smallexample
"@@ USD 000123.450 @@(USD 000567.890)@@ USD 12,345.678 @@"
@end smallexample

The most noticeable change is the alternative way of representing
negative numbers. In financial circles this is often done using
parentheses, and this is what the @samp{(} flag selected. The fill
character is now @samp{0}. Note that this @samp{0} character is not
regarded as a numeric zero, and therefore the first and second numbers
are not printed using a thousands separator. Since we used the format
specifier @samp{i} instead of @samp{n}, the international form of the
currency symbol is used. This is a four-character string, in this case
@code{"USD "}. The last point is that since the precision right of the
decimal point is selected to be three, the first and second numbers are
printed with an extra zero at the end and the third number is printed
without rounding.

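To try the format strings discussed above, a small sketch like the
following can be used. The helper names are hypothetical, and the
locale name @code{"en_US.UTF-8"} is an assumption: the locale may be
installed under a different name or not at all, in which case the
output follows whatever locale is in effect and differs from the
results shown above.

```c
#include <locale.h>
#include <monetary.h>
#include <stddef.h>
#include <sys/types.h>

/* Try to select the locale assumed by the examples.  The locale name
   is an assumption; returns 1 on success and 0 if it is unavailable.  */
int
select_example_locale (void)
{
  return setlocale (LC_MONETARY, "en_US.UTF-8") != NULL;
}

/* Format the three sample amounts used in the examples with FORMAT,
   which must contain exactly three conversions taking a double.
   Returns what strfmon returns.  */
ssize_t
format_samples (char *buf, size_t size, const char *format)
{
  return strfmon (buf, size, format, 123.45, -567.89, 12345.678);
}
```

For instance, after a successful call to @code{select_example_locale},
calling @code{format_samples} with the first format string from above
should reproduce the first result.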
@node Yes-or-No Questions, , Formatting Numbers, Locales
@section Yes-or-No Questions

Some non-GUI programs ask a yes-or-no question. If the messages
(especially the questions) are translated into foreign languages, be
sure that you localize the answers too. It would be a very bad habit to
ask a question in one language and request the answer in another, often
English.

The GNU C library contains @code{rpmatch} to give applications easy
access to the corresponding locale definitions.

@deftypefun int rpmatch (const char *@var{response})
The function @code{rpmatch} checks whether the string in @var{response}
is a valid yes-or-no answer and, if so, which one. The
check uses the @code{YESEXPR} and @code{NOEXPR} data in the
@code{LC_MESSAGES} category of the currently selected locale. The
return value is as follows:

@table @code
@item 1
The user entered an affirmative answer.

@item 0
The user entered a negative answer.

@item -1
The answer matched neither the @code{YESEXPR} nor the @code{NOEXPR}
regular expression.
@end table

This function is not standardized, but besides GNU libc it is available
at least in the IBM AIX library.
@end deftypefun

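On systems that lack @code{rpmatch}, a similar check can be sketched
with the POSIX regular expression functions and the @code{YESEXPR} and
@code{NOEXPR} items of @code{nl_langinfo}. The following is only an
illustration and not the library's implementation. The names
@code{matches_item} and @code{yes_no_match} are hypothetical, and the
sketch assumes that the locale data contains extended regular
expressions.

```c
#include <langinfo.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if RESPONSE matches the regular expression stored in the
   locale under ITEM, and 0 otherwise (including compile errors).  */
static int
matches_item (nl_item item, const char *response)
{
  regex_t re;
  int result;

  if (regcomp (&re, nl_langinfo (item), REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  result = regexec (&re, response, 0, NULL, 0) == 0;
  regfree (&re);
  return result;
}

/* Hypothetical rpmatch-like function: 1 for an affirmative answer,
   0 for a negative one, -1 if neither expression matches.  */
int
yes_no_match (const char *response)
{
  if (matches_item (YESEXPR, response))
    return 1;
  if (matches_item (NOEXPR, response))
    return 0;
  return -1;
}
```

Compiling the expressions on every call keeps the sketch short; a real
program might cache them until the locale changes.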
The @code{rpmatch} function would normally be used like this:

@smallexample
/* @r{Use a safe default.} */
_Bool doit = false;
/* @r{Prepare the @code{getline} call.} */
char *line = NULL;
size_t len = 0;

fputs (gettext ("Do you really want to do this? "), stdout);
fflush (stdout);
while (getline (&line, &len, stdin) >= 0)
  @{
    /* @r{Check the response.} */
    int res = rpmatch (line);
    if (res >= 0)
      @{
        /* @r{We got a definitive answer.} */
        if (res > 0)
          doit = true;
        break;
      @}
  @}
/* @r{Free what @code{getline} allocated.} */
free (line);
@end smallexample

Note that the loop continues until end of file or a read error is
detected or until a definitive (positive or negative) answer is read.
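
The same loop can be packaged as a reusable function. The following is
a minimal sketch under a few assumptions: the name @code{confirm} and
its parameters are hypothetical, the question is passed in already
translated, and the streams are supplied by the caller so that the
function can also be used with input other than @code{stdin}.

```c
#define _GNU_SOURCE             /* For getline and rpmatch.  */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: print QUESTION to OUT and read lines from IN
   until rpmatch recognizes an answer.  Returns true only for an
   affirmative answer; end of file or a read error yields false.  */
bool
confirm (FILE *in, FILE *out, const char *question)
{
  bool doit = false;            /* Use a safe default.  */
  char *line = NULL;
  size_t len = 0;

  fputs (question, out);
  fflush (out);
  while (getline (&line, &len, in) >= 0)
    {
      int res = rpmatch (line);
      if (res >= 0)
        {
          /* We got a definitive answer.  */
          doit = res > 0;
          break;
        }
    }
  /* Free what getline allocated.  */
  free (line);
  return doit;
}
```

A typical call would pass @code{stdin}, @code{stdout}, and the result
of @code{gettext} for the question text.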