@node Locales, Message Translation, Character Set Handling, Top
@c %MENU% The country and language can affect the behavior of library functions
@chapter Locales and Internationalization
Different countries and cultures have varying conventions for how to
communicate. These conventions range from very simple ones, such as the
format for representing dates and times, to very complex ones, such as
the language spoken.
@cindex internationalization
@cindex locales
@dfn{Internationalization} of software means programming it to be able
to adapt to the user's favorite conventions. In @w{ISO C},
internationalization works by means of @dfn{locales}. Each locale
specifies a collection of conventions, one convention for each purpose.
The user chooses a set of conventions by specifying a locale (via
environment variables).
All programs inherit the chosen locale as part of their environment.
Provided the programs are written to obey the choice of locale, they
will follow the conventions preferred by the user.
@menu
* Effects of Locale::           Actions affected by the choice of
                                 locale.
* Choosing Locale::             How the user specifies a locale.
* Locale Categories::           Different purposes for which you can
                                 select a locale.
* Setting the Locale::          How a program specifies the locale
                                 with library functions.
* Standard Locales::            Locale names available on all systems.
* Locale Names::                Format of system-specific locale names.
* Locale Information::          How to access the information for the locale.
* Formatting Numbers::          A dedicated function to format numbers.
* Yes-or-No Questions::         Check a Response against the locale.
@end menu
@node Effects of Locale, Choosing Locale, , Locales
@section What Effects a Locale Has
Each locale specifies conventions for several purposes, including the
following:
@itemize @bullet
@item
What multibyte character sequences are valid, and how they are
interpreted (@pxref{Character Set Handling}).
@item
Classification of which characters in the local character set are
considered alphabetic, and upper- and lower-case conversion conventions
(@pxref{Character Handling}).
@item
The collating sequence for the local language and character set
(@pxref{Collation Functions}).
@item
Formatting of numbers and currency amounts (@pxref{General Numeric}).
@item
Formatting of dates and times (@pxref{Formatting Calendar Time}).
@item
What language to use for output, including error messages
(@pxref{Message Translation}).
@item
What language to use for user answers to yes-or-no questions
(@pxref{Yes-or-No Questions}).
@item
What language to use for more complex user input.
(The C library doesn't yet help you implement this.)
@end itemize
Some aspects of adapting to the specified locale are handled
automatically by the library subroutines. For example, all your program
needs to do in order to use the collating sequence of the chosen locale
is to use @code{strcoll} or @code{strxfrm} to compare strings.
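
As a minimal sketch of this, the following sorts an array of strings
with @code{qsort}, using @code{strcoll} as the comparison. The names
@code{compare_by_locale} and @code{sort_words} are hypothetical helpers
introduced only for illustration; they are not library functions.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: orders two strings by the collating
   sequence of the locale currently selected for LC_COLLATE.  */
static int
compare_by_locale (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcoll (*sa, *sb);
}

/* Sort WORDS (an array of COUNT strings) in the current collation
   order.  sort_words is a hypothetical helper for this example.  */
void
sort_words (const char **words, size_t count)
{
  qsort (words, count, sizeof *words, compare_by_locale);
}
```

A program would normally call @code{setlocale (LC_ALL, "")} once at
startup before sorting; without that call the @samp{C} locale is in
effect and @code{strcoll} orders strings the way @code{strcmp} does.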
Other aspects of locales are beyond the comprehension of the library.
For example, the library can't automatically translate your program's
output messages into other languages. The only way you can support
output in the user's favorite language is to program this more or less
by hand. The C library provides functions to handle translations for
multiple languages easily.
This chapter discusses the mechanism by which you can modify the current
locale. The effects of the current locale on specific library functions
are discussed in more detail in the descriptions of those functions.
@node Choosing Locale, Locale Categories, Effects of Locale, Locales
@section Choosing a Locale
The simplest way for the user to choose a locale is to set the
environment variable @code{LANG}. This specifies a single locale to use
for all purposes. For example, a user could specify a hypothetical
locale named @samp{espana-castellano} to use the standard conventions of
most of Spain.
The set of locales supported depends on the operating system you are
using, and so do their names, except that the standard locale called
@samp{C} or @samp{POSIX} always exists. @xref{Locale Names}.
In order to force the system to always use the default locale, the
user can set the @code{LC_ALL} environment variable to @samp{C}.
@cindex combining locales
A user also has the option of specifying different locales for
different purposes---in effect, choosing a mixture of multiple
locales. @xref{Locale Categories}.
For example, the user might specify the locale @samp{espana-castellano}
for most purposes, but specify the locale @samp{usa-english} for
currency formatting. This might make sense if the user is a
Spanish-speaking American, working in Spanish, but representing monetary
amounts in US dollars.
Note that both locales @samp{espana-castellano} and @samp{usa-english},
like all locales, would include conventions for all of the purposes to
which locales apply. However, the user can choose to use each locale
for a particular subset of those purposes.
@node Locale Categories, Setting the Locale, Choosing Locale, Locales
@section Locale Categories
@cindex categories for locales
@cindex locale categories
The purposes that locales serve are grouped into @dfn{categories}, so
that a user or a program can choose the locale for each category
independently. Here is a table of categories; each name is both an
environment variable that a user can set, and a macro name that you can
use as the first argument to @code{setlocale}.
The contents of the environment variable (or the string in the second
argument to @code{setlocale}) must be a valid locale name.
@xref{Locale Names}.
@vtable @code
@item LC_COLLATE
@standards{ISO, locale.h}
This category applies to collation of strings (functions @code{strcoll}
and @code{strxfrm}); see @ref{Collation Functions}.
@item LC_CTYPE
@standards{ISO, locale.h}
This category applies to classification and conversion of characters,
and to multibyte and wide characters;
see @ref{Character Handling}, and @ref{Character Set Handling}.
@item LC_MONETARY
@standards{ISO, locale.h}
This category applies to formatting monetary values; see @ref{General Numeric}.
@item LC_NUMERIC
@standards{ISO, locale.h}
This category applies to formatting numeric values that are not
monetary; see @ref{General Numeric}.
@item LC_TIME
@standards{ISO, locale.h}
This category applies to formatting date and time values; see
@ref{Formatting Calendar Time}.
@item LC_MESSAGES
@standards{XOPEN, locale.h}
This category applies to selecting the language used in the user
interface for message translation (@pxref{The Uniforum approach};
@pxref{Message catalogs a la X/Open}) and contains regular expressions
for affirmative and negative responses.
@item LC_ALL
@standards{ISO, locale.h}
This is not a category; it is only a macro that you can use
with @code{setlocale} to set a single locale for all purposes. When set
as an environment variable, @code{LC_ALL} overrides all selections made
by the other @code{LC_*} variables or @code{LANG}.
@item LANG
@standards{ISO, locale.h}
If this environment variable is defined, its value specifies the locale
to use for all purposes except as overridden by the variables above.
@end vtable
@vindex LANGUAGE
When developing the message translation functions it was felt that the
functionality provided by the variables above is not sufficient. For
example, it should be possible to specify more than one locale name.
Take a Swedish user who speaks German better than English, and a program
whose messages are output in English by default. It should be possible
to specify that the first choice of language is Swedish, the second
German, and, if this also fails, to fall back to English. This is
possible with the variable @code{LANGUAGE}. For further description of
this GNU extension see @ref{Using gettextized software}.
@node Setting the Locale, Standard Locales, Locale Categories, Locales
@section How Programs Set the Locale
A C program inherits its locale environment variables when it starts up.
This happens automatically. However, these variables do not
automatically control the locale used by the library functions, because
@w{ISO C} says that all programs start by default in the standard @samp{C}
locale. To use the locales specified by the environment, you must call
@code{setlocale}. Call it as follows:
@smallexample
setlocale (LC_ALL, "");
@end smallexample
@noindent
to select a locale based on the user's choice of the appropriate
environment variables.
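
The return value of that call is worth checking, since the environment
may name a locale that is not installed. The sketch below wraps the call
in a hypothetical helper named @code{init_locale}; the helper name and
the wording of the warning are assumptions made for this example.

```c
#include <locale.h>
#include <stdio.h>

/* Adopt the locale named by the environment.  Returns 0 on success
   and -1 if setlocale rejects the name, in which case the program
   keeps the locale it already had (the "C" locale at startup).
   init_locale is a hypothetical helper name.  */
int
init_locale (void)
{
  if (setlocale (LC_ALL, "") == NULL)
    {
      fputs ("warning: cannot set locale from environment\n", stderr);
      return -1;
    }
  return 0;
}
```

Continuing in the @samp{C} locale after a failure is one reasonable
policy; a program could equally treat the failure as fatal.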
@cindex changing the locale
@cindex locale, changing
You can also use @code{setlocale} to specify a particular locale, for
general use or for a specific category.
@pindex locale.h
The symbols in this section are defined in the header file @file{locale.h}.
@deftypefun {char *} setlocale (int @var{category}, const char *@var{locale})
@standards{ISO, locale.h}
@safety{@prelim{}@mtunsafe{@mtasuconst{:@mtslocale{}} @mtsenv{}}@asunsafe{@asuinit{} @asulock{} @ascuheap{} @asucorrupt{}}@acunsafe{@acuinit{} @acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
@c Uses of the global locale object are unguarded in functions that
@c ought to be MT-Safe, so we're ruling out the use of this function
@c once threads are started. It takes a write lock itself, but it may
@c return a pointer loaded from the global locale object after releasing
@c the lock, or before taking it.
@c setlocale @mtasuconst:@mtslocale @mtsenv @asuinit @ascuheap @asulock @asucorrupt @acucorrupt @acsmem @acsfd @aculock
@c libc_rwlock_wrlock @asulock @aculock
@c libc_rwlock_unlock @aculock
@c getenv LOCPATH @mtsenv
@c malloc @ascuheap @acsmem
@c free @ascuheap @acsmem
@c new_composite_name ok
@c setdata ok
@c setname ok
@c _nl_find_locale @mtsenv @asuinit @ascuheap @asulock @asucorrupt @acucorrupt @acsmem @acsfd @aculock
@c getenv LC_ALL and LANG @mtsenv
@c _nl_load_locale_from_archive @ascuheap @acucorrupt @acsmem @acsfd
@c sysconf _SC_PAGE_SIZE ok
@c _nl_normalize_codeset @ascuheap @acsmem
@c isalnum_l ok (C locale)
@c isdigit_l ok (C locale)
@c malloc @ascuheap @acsmem
@c tolower_l ok (C locale)
@c open_not_cancel_2 @acsfd
@c fxstat64 ok
@c close_not_cancel_no_status ok
@c __mmap64 @acsmem
@c calculate_head_size ok
@c __munmap ok
@c compute_hashval ok
@c qsort dup @acucorrupt
@c rangecmp ok
@c malloc @ascuheap @acsmem
@c strdup @ascuheap @acsmem
@c _nl_intern_locale_data @ascuheap @acsmem
@c malloc @ascuheap @acsmem
@c free @ascuheap @acsmem
@c _nl_expand_alias @ascuheap @asulock @acsmem @acsfd @aculock
@c libc_lock_lock @asulock @aculock
@c bsearch ok
@c alias_compare ok
@c strcasecmp ok
@c read_alias_file @ascuheap @asulock @acsmem @acsfd @aculock
@c fopen @ascuheap @asulock @acsmem @acsfd @aculock
@c fsetlocking ok
@c feof_unlocked ok
@c fgets_unlocked ok
@c isspace ok (locale mutex is locked)
@c extend_alias_table @ascuheap @acsmem
@c realloc @ascuheap @acsmem
@c realloc @ascuheap @acsmem
@c fclose @ascuheap @asulock @acsmem @acsfd @aculock
@c qsort @ascuheap @acsmem
@c alias_compare dup
@c libc_lock_unlock @aculock
@c _nl_explode_name @ascuheap @acsmem
@c _nl_find_language ok
@c _nl_normalize_codeset dup @ascuheap @acsmem
@c _nl_make_l10nflist @ascuheap @acsmem
@c malloc @ascuheap @acsmem
@c free @ascuheap @acsmem
@c __argz_stringify ok
@c __argz_count ok
@c __argz_next ok
@c _nl_load_locale @ascuheap @acsmem @acsfd
@c open_not_cancel_2 @acsfd
@c __fxstat64 ok
@c close_not_cancel_no_status ok
@c mmap @acsmem
@c malloc @ascuheap @acsmem
@c read_not_cancel ok
@c free @ascuheap @acsmem
@c _nl_intern_locale_data dup @ascuheap @acsmem
@c munmap ok
@c __gconv_compare_alias @asuinit @ascuheap @asucorrupt @asulock @acsmem@acucorrupt @acsfd @aculock
@c __gconv_read_conf @asuinit @ascuheap @asucorrupt @asulock @acsmem@acucorrupt @acsfd @aculock
@c (libc_once-initializes gconv_cache and gconv_path_envvar; they're
@c never modified afterwards)
@c __gconv_load_cache @ascuheap @acsmem @acsfd
@c getenv GCONV_PATH @mtsenv
@c open_not_cancel @acsfd
@c __fxstat64 ok
@c close_not_cancel_no_status ok
@c mmap @acsmem
@c malloc @ascuheap @acsmem
@c __read ok
@c free @ascuheap @acsmem
@c munmap ok
@c __gconv_get_path @asulock @ascuheap @aculock @acsmem @acsfd
@c getcwd @ascuheap @acsmem @acsfd
@c libc_lock_lock @asulock @aculock
@c malloc @ascuheap @acsmem
@c strtok_r ok
@c libc_lock_unlock @aculock
@c read_conf_file @ascuheap @asucorrupt @asulock @acsmem @acucorrupt @acsfd @aculock
@c fopen @ascuheap @asulock @acsmem @acsfd @aculock
@c fsetlocking ok
@c feof_unlocked ok
@c getdelim @ascuheap @asucorrupt @acsmem @acucorrupt
@c isspace_l ok (C locale)
@c add_alias
@c isspace_l ok (C locale)
@c toupper_l ok (C locale)
@c add_alias2 dup @ascuheap @acucorrupt @acsmem
@c add_module @ascuheap @acsmem
@c isspace_l ok (C locale)
@c toupper_l ok (C locale)
@c strtol ok (@mtslocale but we hold the locale lock)
@c tfind __gconv_alias_db ok
@c __gconv_alias_compare dup ok
@c calloc @ascuheap @acsmem
@c insert_module dup @ascuheap
@c __tfind ok (because the tree is read only by then)
@c __gconv_alias_compare dup ok
@c insert_module @ascuheap
@c free @ascuheap
@c add_alias2 @ascuheap @acucorrupt @acsmem
@c detect_conflict ok, reads __gconv_modules_db
@c malloc @ascuheap @acsmem
@c tsearch __gconv_alias_db @ascuheap @acucorrupt @acsmem [exclusive tree, no @mtsrace]
@c __gconv_alias_compare ok
@c free @ascuheap
@c __gconv_compare_alias_cache ok
@c find_module_idx ok
@c do_lookup_alias ok
@c __tfind ok (because the tree is read only by then)
@c __gconv_alias_compare ok
@c strndup @ascuheap @acsmem
@c strcasecmp_l ok (C locale)
The function @code{setlocale} sets the current locale for category
@var{category} to @var{locale}.
If @var{category} is @code{LC_ALL}, this specifies the locale for all
purposes. The other possible values of @var{category} specify a
single purpose (@pxref{Locale Categories}).
You can also use this function to find out the current locale by passing
a null pointer as the @var{locale} argument. In this case,
@code{setlocale} returns a string that is the name of the locale
currently selected for category @var{category}.
The string returned by @code{setlocale} can be overwritten by subsequent
calls, so you should make a copy of the string (@pxref{Copying Strings
and Arrays}) if you want to save it past any further calls to
@code{setlocale}. (The standard library is guaranteed never to call
@code{setlocale} itself.)
You should not modify the string returned by @code{setlocale}. It might
be the same string that was passed as an argument in a previous call to
@code{setlocale}. If you later pass a returned string back as the
@var{locale} argument, the @var{category} must be the same as in the
call that returned the string.
When you read the current locale for category @code{LC_ALL}, the value
encodes the entire combination of selected locales for all categories.
If you specify the same ``locale name'' with @code{LC_ALL} in a
subsequent call to @code{setlocale}, it restores the same combination
of locale selections.
To be sure you can use the returned string encoding the currently selected
locale at a later time, you must make a copy of the string. It is not
guaranteed that the returned pointer remains valid over time.
When the @var{locale} argument is not a null pointer, the string returned
by @code{setlocale} reflects the newly-modified locale.
If you specify an empty string for @var{locale}, this means to read the
appropriate environment variable and use its value to select the locale
for @var{category}.
If a nonempty string is given for @var{locale}, then the locale of that
name is used if possible.
The effective locale name (either the second argument to
@code{setlocale}, or if the argument is an empty string, the name
obtained from the process environment) must be a valid locale name.
@xref{Locale Names}.
If you specify an invalid locale name, @code{setlocale} returns a null
pointer and leaves the current locale unchanged.
@end deftypefun
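
The query and failure behavior described above can be sketched as
follows. Both @code{try_locale} and @code{copy_locale_name} are
hypothetical helpers written for this illustration, not part of the
library.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Try to select LOCALE for CATEGORY.  Returns 1 if setlocale accepted
   the name and 0 if it returned a null pointer, in which case the
   current locale is unchanged.  Hypothetical helper.  */
int
try_locale (int category, const char *locale)
{
  return setlocale (category, locale) != NULL;
}

/* Copy the name of the locale currently selected for CATEGORY into
   BUF, which has room for SIZE bytes.  Passing a null pointer as the
   second argument of setlocale only queries the locale.  Returns 0 on
   success and -1 if the name is unavailable or does not fit.
   Hypothetical helper.  */
int
copy_locale_name (int category, char *buf, size_t size)
{
  const char *name = setlocale (category, NULL);
  if (name == NULL || strlen (name) >= size)
    return -1;
  strcpy (buf, name);   /* The copy survives later setlocale calls.  */
  return 0;
}
```

Copying into a caller-supplied buffer avoids holding on to the pointer
returned by @code{setlocale}, which later calls may overwrite.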
Here is an example showing how you might use @code{setlocale} to
temporarily switch to a new locale.
@smallexample
#include <stddef.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

void
with_other_locale (char *new_locale,
                   void (*subroutine) (int),
                   int argument)
@{
  char *old_locale, *saved_locale;

  /* @r{Get the name of the current locale.} */
  old_locale = setlocale (LC_ALL, NULL);

  /* @r{Copy the name so it won't be clobbered by @code{setlocale}.} */
  saved_locale = strdup (old_locale);
  if (saved_locale == NULL)
    /* @r{@code{fatal} stands for the program's own error handler.} */
    fatal ("Out of memory");

  /* @r{Now change the locale and do some stuff with it.} */
  setlocale (LC_ALL, new_locale);
  (*subroutine) (argument);

  /* @r{Restore the original locale.} */
  setlocale (LC_ALL, saved_locale);
  free (saved_locale);
@}
@end smallexample
@strong{Portability Note:} Some @w{ISO C} systems may define additional
locale categories, and future versions of the library will do so. For
portability, assume that any symbol beginning with @samp{LC_} might be
defined in @file{locale.h}.
@node Standard Locales, Locale Names, Setting the Locale, Locales
@section Standard Locales
The only locale names you can count on finding on all operating systems
are these three standard ones:
@table @code
@item "C"
This is the standard C locale. The attributes and behavior it provides
are specified in the @w{ISO C} standard. When your program starts up, it
initially uses this locale by default.
@item "POSIX"
This is the standard POSIX locale. Currently, it is an alias for the
standard C locale.
@item ""
The empty name says to select a locale based on environment variables.
@xref{Locale Categories}.
@end table
Defining and installing named locales is normally a responsibility of
the system administrator at your site (or the person who installed
@theglibc{}). It is also possible for the user to create private
locales. All this will be discussed later when describing the tool to
do so.
@comment (@pxref{Building Locale Files}).
If your program needs to use something other than the @samp{C} locale,
it will be more portable if you use whatever locale the user specifies
with the environment, rather than trying to specify some non-standard
locale explicitly by name. Remember, different machines might have
different sets of locales installed.
@node Locale Names, Locale Information, Standard Locales, Locales
@section Locale Names
The following command prints a list of locales supported by the
system:
@pindex locale
@smallexample
locale -a
@end smallexample
@strong{Portability Note:} With the notable exception of the standard
locale names @samp{C} and @samp{POSIX}, locale names are
system-specific.
Most locale names follow XPG syntax and consist of up to four parts:
@smallexample
@var{language}[_@var{territory}[.@var{codeset}]][@@@var{modifier}]
@end smallexample
All parts except the first may be omitted. If the fully specified
locale is not found, less specific ones are looked for.
The various parts will be stripped off, in the following order:
@enumerate
@item
codeset
@item
normalized codeset
@item
territory
@item
modifier
@end enumerate
For example, the locale name @samp{de_AT.iso885915@@euro} denotes a
German-language locale for use in Austria, using the ISO-8859-15
(Latin-9) character set, and with the Euro as the currency symbol.
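
To make the four-part syntax concrete, here is a rough sketch of a
parser that splits a name into its parts. The type
@code{struct locale_parts}, the functions @code{copy_span} and
@code{parse_locale_name}, and the 32-byte field size are all
assumptions made for this illustration; the library's own lookup code
is not shown here.

```c
#include <stddef.h>
#include <string.h>

/* The four parts of an XPG-style locale name.  Missing parts are
   empty strings.  Hypothetical type, for illustration only.  */
struct locale_parts
{
  char language[32];
  char territory[32];
  char codeset[32];
  char modifier[32];
};

/* Copy the bytes in [BEGIN, END) into DST, truncating to fit.  */
static void
copy_span (char *dst, size_t dstsize, const char *begin, const char *end)
{
  size_t n = (size_t) (end - begin);
  if (n >= dstsize)
    n = dstsize - 1;
  memcpy (dst, begin, n);
  dst[n] = '\0';
}

/* Split NAME of the form language[_territory[.codeset]][@modifier].
   Returns 0 on success and -1 if the language part is empty.  */
int
parse_locale_name (const char *name, struct locale_parts *out)
{
  const char *end = name + strlen (name);
  const char *at = strchr (name, '@');
  const char *body_end = at != NULL ? at : end;
  const char *dot = memchr (name, '.', (size_t) (body_end - name));
  const char *pre_dot_end = dot != NULL ? dot : body_end;
  const char *us = memchr (name, '_', (size_t) (pre_dot_end - name));
  const char *lang_end = us != NULL ? us : pre_dot_end;

  memset (out, 0, sizeof *out);
  if (lang_end == name)
    return -1;

  copy_span (out->language, sizeof out->language, name, lang_end);
  if (us != NULL)
    copy_span (out->territory, sizeof out->territory, us + 1, pre_dot_end);
  if (dot != NULL)
    copy_span (out->codeset, sizeof out->codeset, dot + 1, body_end);
  if (at != NULL)
    copy_span (out->modifier, sizeof out->modifier, at + 1, end);
  return 0;
}
```

The sketch only splits the string; it does not check whether the
resulting locale is installed, which only @code{setlocale} can tell you.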
In addition to locale names which follow XPG syntax, systems may
provide aliases such as @samp{german}. Neither kind of name may
contain the slash character @samp{/}.
If the locale name starts with a slash @samp{/}, it is treated as a
path relative to the configured locale directories; see @code{LOCPATH}
below. The specified path must not contain a component @samp{..}, or
the name is invalid, and @code{setlocale} will fail.
@strong{Portability Note:} POSIX suggests that if a locale name starts
with a slash @samp{/}, it is resolved as an absolute path. However,
@theglibc{} treats it as a relative path under the directories listed
in @code{LOCPATH} (or the default locale directory if @code{LOCPATH}
is unset).
Locale names which are longer than an implementation-defined limit are
invalid and cause @code{setlocale} to fail.
As a special case, locale names used with @code{LC_ALL} can combine
several locales, reflecting different locale settings for different
categories. For example, you might want to use a U.S. locale with ISO
A4 paper format, so you set @code{LANG} to @samp{en_US.UTF-8}, and
@code{LC_PAPER} to @samp{de_DE.UTF-8}. In this case, the
@code{LC_ALL}-style combined locale name is
@smallexample
LC_CTYPE=en_US.UTF-8;LC_TIME=en_US.UTF-8;LC_PAPER=de_DE.UTF-8;@dots{}
@end smallexample
followed by other category settings not shown here.
@vindex LOCPATH
The path used for finding locale data can be set using the
@code{LOCPATH} environment variable. This variable lists the
directories in which to search for locale definitions, separated by a
colon @samp{:}.
The default path for finding locale data is system specific. A typical
value for the @code{LOCPATH} default is:
@smallexample
/usr/share/locale
@end smallexample
The value of @code{LOCPATH} is ignored by privileged programs for
security reasons, and only the default directory is used.
@node Locale Information, Formatting Numbers, Locale Names, Locales
@section Accessing Locale Information
There are several ways to access locale information. The simplest
way is to let the C library itself do the work. Several of the
functions in this library implicitly access the locale data, and use
what information is provided by the currently selected locale. This is
how the locale model is meant to work normally.
As an example take the @code{strftime} function, which is meant to nicely
format date and time information (@pxref{Formatting Calendar Time}).
Part of the standard information contained in the @code{LC_TIME}
category is the names of the months and weekdays. Instead of requiring
the programmer to take care of providing the translations, the
@code{strftime} function does this all by itself. @code{%A}
in the format string is replaced by the appropriate weekday
name of the locale currently selected by @code{LC_TIME}. This is an
easy example, and wherever possible functions do things automatically
in this way.
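
A small sketch of the @code{%A} case follows. The helper name
@code{weekday_name} is hypothetical; it fills in only the
@code{tm_wday} member of @code{struct tm}, on the assumption that this
is the only member @code{%A} consults.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Write the full weekday name for day-of-week WDAY (0 = Sunday) into
   BUF, using the names of the locale selected for LC_TIME.  Returns
   the number of bytes written, not counting the terminating null
   byte, or 0 if BUF is too small.  Hypothetical helper.  */
size_t
weekday_name (int wday, char *buf, size_t size)
{
  struct tm tm;
  memset (&tm, 0, sizeof tm);
  tm.tm_wday = wday;
  return strftime (buf, size, "%A", &tm);
}
```

In the @samp{C} locale, @code{weekday_name (4, buf, sizeof buf)} stores
@samp{Thursday}; after a successful @code{setlocale (LC_TIME, "")} the
same call yields the name in the user's language.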
But there are quite often situations when there is simply no function
to perform the task, or it is simply not possible to do the work
automatically. For these cases it is necessary to access the
information in the locale directly. To do this the C library provides
two functions: @code{localeconv} and @code{nl_langinfo}. The former is
part of @w{ISO C} and therefore portable, but has a brain-damaged
interface. The latter is part of the Unix interface and is portable
insofar as the system follows the Unix standards.
@menu
* The Lame Way to Locale Data:: ISO C's @code{localeconv}.
* The Elegant and Fast Way:: X/Open's @code{nl_langinfo}.
@end menu
@node The Lame Way to Locale Data, The Elegant and Fast Way, ,Locale Information
@subsection @code{localeconv}: It is portable but @dots{}
Together with the @code{setlocale} function the @w{ISO C} people
invented the @code{localeconv} function. It is a masterpiece of poor
design. It is expensive to use, not extensible, and not generally
usable as it provides access to only @code{LC_MONETARY} and
@code{LC_NUMERIC} related information. Nevertheless, if it is
applicable to a given situation it should be used since it is very
portable. The function @code{strfmon} formats monetary amounts
according to the selected locale using this information.
@pindex locale.h
@cindex monetary value formatting
@cindex numeric value formatting
@deftypefun {struct lconv *} localeconv (void)
@standards{ISO, locale.h}
@safety{@prelim{}@mtunsafe{@mtasurace{:localeconv} @mtslocale{}}@asunsafe{}@acsafe{}}
@c This function reads from multiple components of the locale object,
@c without synchronization, while writing to the static buffer it uses
@c as the return value.
The @code{localeconv} function returns a pointer to a structure whose
components contain information about how numeric and monetary values
should be formatted in the current locale.
You should not modify the structure or its contents. The structure might
be overwritten by subsequent calls to @code{localeconv}, or by calls to
@code{setlocale}, but no other function in the library overwrites this
value.
@end deftypefun
@deftp {Data Type} {struct lconv}
@standards{ISO, locale.h}
@code{localeconv}'s return value is of this data type. Its elements are
described in the following subsections.
@end deftp
If a member of the structure @code{struct lconv} has type @code{char},
and the value is @code{CHAR_MAX}, it means that the current locale has
no value for that parameter.
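
The @code{CHAR_MAX} convention can be handled as in the sketch below,
which reads the @code{frac_digits} member. The helper name
@code{monetary_frac_digits} and its @var{fallback} parameter are
assumptions made for this example.

```c
#include <limits.h>
#include <locale.h>

/* Return the number of fractional digits the current locale asks for
   in a local monetary amount, or FALLBACK when the locale leaves the
   value unspecified (CHAR_MAX).  Hypothetical helper.  */
int
monetary_frac_digits (int fallback)
{
  const struct lconv *lc = localeconv ();
  if (lc->frac_digits == CHAR_MAX)
    return fallback;
  return lc->frac_digits;
}
```

The structure is read immediately and only a plain @code{int} is
returned, so the caller never holds a pointer that a later
@code{localeconv} or @code{setlocale} call could invalidate.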
@menu
* General Numeric::             Parameters for formatting numbers and
                                 currency amounts.
* Currency Symbol::             How to print the symbol that identifies an
                                 amount of money (e.g. @samp{$}).
* Sign of Money Amount::        How to print the (positive or negative) sign
                                 for a monetary amount, if one exists.
@end menu
@node General Numeric, Currency Symbol, , The Lame Way to Locale Data
@subsubsection Generic Numeric Formatting Parameters
These are the standard members of @code{struct lconv}; there may be
others.
@table @code
@item char *decimal_point
@itemx char *mon_decimal_point
These are the decimal-point separators used in formatting non-monetary
and monetary quantities, respectively. In the @samp{C} locale, the
value of @code{decimal_point} is @code{"."}, and the value of
@code{mon_decimal_point} is @code{""}.
@cindex decimal-point separator
@item char *thousands_sep
@itemx char *mon_thousands_sep
These are the separators used to delimit groups of digits to the left of
the decimal point in formatting non-monetary and monetary quantities,
respectively. In the @samp{C} locale, both members have a value of
@code{""} (the empty string).
@item char *grouping
@itemx char *mon_grouping
These are strings that specify how to group the digits to the left of
the decimal point. @code{grouping} applies to non-monetary quantities
and @code{mon_grouping} applies to monetary quantities. Use either
@code{thousands_sep} or @code{mon_thousands_sep} to separate the digit
groups.
@cindex grouping of digits
Each member of these strings is to be interpreted as an integer value of
type @code{char}. Successive numbers (from left to right) give the
sizes of successive groups (from right to left, starting at the decimal
point). The last member is either @code{0}, in which case the previous
member is used over and over again for all the remaining groups, or
@code{CHAR_MAX}, in which case there is no more grouping---or, put
another way, any remaining digits form one large group without
separators.
For example, if @code{grouping} is @code{"\04\03\02"}, the correct
grouping for the number @code{123456787654321} is @samp{12}, @samp{34},
@samp{56}, @samp{78}, @samp{765}, @samp{4321}. This uses a group of 4
digits at the end, preceded by a group of 3 digits, preceded by groups
of 2 digits (as many as needed). With a separator of @samp{,}, the
number would be printed as @samp{12,34,56,78,765,4321}.
A value of @code{"\03"} indicates repeated groups of three digits, as
normally used in the U.S.
In the standard @samp{C} locale, both @code{grouping} and
@code{mon_grouping} have a value of @code{""}. This value specifies no
grouping at all.
@item char int_frac_digits
@itemx char frac_digits
These are small integers indicating how many fractional digits (to the
right of the decimal point) should be displayed in a monetary value in
international and local formats, respectively. (Most often, both
members have the same value.)
In the standard @samp{C} locale, both of these members have the value
@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
what to do when you find this value; we recommend printing no
fractional digits. (This locale also specifies the empty string for
@code{mon_decimal_point}, so printing any fractional digits would be
confusing!)
@end table
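
The interpretation of the @code{grouping} and @code{mon_grouping}
strings described in the table above can be sketched as a small
function. This is one possible reading of those rules, not the
library's own implementation; the name @code{group_digits}, its
parameters, and the 256-byte working buffer are assumptions made for
the example.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Insert SEP between groups of DIGITS (a string of decimal digits
   with no sign or fraction) as described by GROUPING, which uses the
   encoding of the grouping member of struct lconv.  The result is
   stored in OUT, which has room for OUTSIZE bytes.  Returns 0 on
   success and -1 if a buffer is too small.  Hypothetical helper.  */
int
group_digits (const char *digits, const char *grouping, const char *sep,
              char *out, size_t outsize)
{
  char tmp[256];                /* Filled from the right end.  */
  size_t pos = sizeof tmp;
  size_t remaining = strlen (digits);
  size_t seplen = strlen (sep);
  char size = *grouping;        /* Size of the group being formed.  */

  while (remaining > 0)
    {
      /* Split off a group only if digits remain to its left; a size
         of 0 or CHAR_MAX means no (further) grouping.  */
      int split = size > 0 && size != CHAR_MAX && (size_t) size < remaining;
      size_t take = split ? (size_t) size : remaining;

      if (pos < take + (split ? seplen : 0))
        return -1;
      pos -= take;
      memcpy (tmp + pos, digits + remaining - take, take);
      remaining -= take;

      if (split)
        {
          pos -= seplen;
          memcpy (tmp + pos, sep, seplen);
          /* Advance to the next size; at the end of the string the
             last size keeps repeating.  */
          if (grouping[1] != '\0')
            size = *++grouping;
        }
    }

  size_t len = sizeof tmp - pos;
  if (len >= outsize)
    return -1;
  memcpy (out, tmp + pos, len);
  out[len] = '\0';
  return 0;
}
```

Called with the digits @code{"123456787654321"}, the grouping
@code{"\04\03\02"} and the separator @code{","}, this produces
@samp{12,34,56,78,765,4321}, matching the example in the table. In a
real program the grouping and separator would come from
@code{localeconv}.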
@node Currency Symbol, Sign of Money Amount, General Numeric, The Lame Way to Locale Data
@subsubsection Printing the Currency Symbol
@cindex currency symbols
These members of the @code{struct lconv} structure specify how to print
the symbol to identify a monetary value---the international analog of
@samp{$} for U.S. dollars.
Each country has two standard currency symbols. The @dfn{local currency
symbol} is used commonly within the country, while the
@dfn{international currency symbol} is used internationally to refer to
that country's currency when it is necessary to indicate the country
unambiguously.
For example, many countries use the dollar as their monetary unit, and
when dealing with international currencies it's important to specify
that one is dealing with (say) Canadian dollars instead of U.S. dollars
or Australian dollars. But when the context is known to be Canada,
there is no need to make this explicit---dollar amounts are implicitly
assumed to be in Canadian dollars.
@table @code
@item char *currency_symbol
The local currency symbol for the selected locale.
In the standard @samp{C} locale, this member has a value of @code{""}
(the empty string), meaning ``unspecified''. The ISO standard doesn't
say what to do when you find this value; we recommend you simply print
the empty string as you would print any other string pointed to by this
variable.
@item char *int_curr_symbol
The international currency symbol for the selected locale.
The value of @code{int_curr_symbol} should normally consist of a
three-letter abbreviation determined by the international standard
@cite{ISO 4217 Codes for the Representation of Currency and Funds},
followed by a one-character separator (often a space).
In the standard @samp{C} locale, this member has a value of @code{""}
(the empty string), meaning ``unspecified''. We recommend you simply print
the empty string as you would print any other string pointed to by this
variable.
@item char p_cs_precedes
@itemx char n_cs_precedes
@itemx char int_p_cs_precedes
@itemx char int_n_cs_precedes
These members are @code{1} if the @code{currency_symbol} or
@code{int_curr_symbol} strings should precede the value of a monetary
amount, or @code{0} if the strings should follow the value. The
@code{p_cs_precedes} and @code{int_p_cs_precedes} members apply to
positive amounts (or zero), and the @code{n_cs_precedes} and
@code{int_n_cs_precedes} members apply to negative amounts.
In the standard @samp{C} locale, all of these members have a value of
@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
what to do when you find this value. We recommend printing the
currency symbol before the amount, which is right for most countries.
In other words, treat all nonzero values alike in these members.
The members with the @code{int_} prefix apply to the
@code{int_curr_symbol} while the other two apply to
@code{currency_symbol}.
@item char p_sep_by_space
@itemx char n_sep_by_space
@itemx char int_p_sep_by_space
@itemx char int_n_sep_by_space
These members are @code{1} if a space should appear between the
@code{currency_symbol} or @code{int_curr_symbol} strings and the
amount, or @code{0} if no space should appear. The
@code{p_sep_by_space} and @code{int_p_sep_by_space} members apply to
positive amounts (or zero), and the @code{n_sep_by_space} and
@code{int_n_sep_by_space} members apply to negative amounts.
In the standard @samp{C} locale, all of these members have a value of
@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
what you should do when you find this value; we suggest you treat it as
1 (print a space). In other words, treat all nonzero values alike in
these members.
The members with the @code{int_} prefix apply to the
@code{int_curr_symbol} while the other two apply to
@code{currency_symbol}. The @code{int_curr_symbol} needs special care,
though. Since all legal values contain a space at the end of the
string, one either prints this space (if the currency symbol must
appear in front and must be separated) or one has to avoid printing
this character at all (especially when the symbol appears at the end
of the string).
@end table
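As a rough sketch of how the members described above combine, the
following C function joins an already formatted amount with a currency
symbol. The name @code{format_with_symbol} is hypothetical, and sign
handling is left out. The sketch follows the recommendations given
above and treats every nonzero value, including @code{CHAR_MAX}, like
@code{1}. Dropping the space when the symbol is empty is a choice made
for this sketch, not something the text above prescribes.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: combine AMOUNT with SYMBOL.  CS_PRECEDES and
   SEP_BY_SPACE are interpreted like the `struct lconv' members of the
   same kind; any nonzero value (including CHAR_MAX) counts as 1.
   Returns the value returned by snprintf.  */
static int
format_with_symbol (char *out, size_t outsize, const char *amount,
                    const char *symbol, char cs_precedes,
                    char sep_by_space)
{
  const char *space = sep_by_space != 0 ? " " : "";

  /* Assumption of this sketch: no separating space for an empty
     symbol, as found in the "C" locale.  */
  if (*symbol == '\0')
    space = "";

  if (cs_precedes != 0)
    return snprintf (out, outsize, "%s%s%s", symbol, space, amount);
  else
    return snprintf (out, outsize, "%s%s%s", amount, space, symbol);
}
```

A caller would typically pass the values found in the structure
returned by @code{localeconv}, for example @code{lc->currency_symbol},
@code{lc->p_cs_precedes} and @code{lc->p_sep_by_space} for a
nonnegative amount in the local format.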
@node Sign of Money Amount, , Currency Symbol, The Lame Way to Locale Data
@subsubsection Printing the Sign of a Monetary Amount
These members of the @code{struct lconv} structure specify how to print
the sign (if any) of a monetary value.
@table @code
@item char *positive_sign
@itemx char *negative_sign
These are strings used to indicate positive (or zero) and negative
monetary quantities, respectively.
In the standard @samp{C} locale, both of these members have a value of
@code{""} (the empty string), meaning ``unspecified''.
The ISO standard doesn't say what to do when you find this value; we
recommend printing @code{positive_sign} as you find it, even if it is
empty. For a negative value, print @code{negative_sign} as you find it
unless both it and @code{positive_sign} are empty, in which case print
@samp{-} instead. (Failing to indicate the sign at all seems rather
unreasonable.)
@item char p_sign_posn
@itemx char n_sign_posn
@itemx char int_p_sign_posn
@itemx char int_n_sign_posn
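These members are small integers that indicate how to position the sign
for a monetary quantity. The @code{p_sign_posn} and
@code{int_p_sign_posn} members apply to nonnegative quantities, and the
@code{n_sign_posn} and @code{int_n_sign_posn} members apply to negative
quantities. (The string used for the sign is what was specified with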
@code{positive_sign} or @code{negative_sign}.) The possible values are
as follows:
@table @code
@item 0
The currency symbol and quantity should be surrounded by parentheses.
@item 1
Print the sign string before the quantity and currency symbol.
@item 2
Print the sign string after the quantity and currency symbol.
@item 3
Print the sign string right before the currency symbol.
@item 4
Print the sign string right after the currency symbol.
@item CHAR_MAX
``Unspecified''. All of these members have this value in the standard
@samp{C} locale.
@end table
The ISO standard doesn't say what you should do when the value is
@code{CHAR_MAX}. We recommend you print the sign after the currency
symbol.
The members with the @code{int_} prefix apply to the
@code{int_curr_symbol} while the other two apply to
@code{currency_symbol}.
@end table
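The sign positions listed above can be illustrated with a short C
sketch. The function names @code{place_sign} and
@code{negative_sign_or_default} are hypothetical. To keep the example
short, it assumes that the currency symbol precedes the quantity with
no separating space. A value of @code{CHAR_MAX} (or any other unlisted
value) is handled as recommended above, with the sign after the
currency symbol.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: apply the recommendation for empty sign
   strings.  If both strings are empty, use "-" for negative values.  */
static const char *
negative_sign_or_default (const char *positive_sign,
                          const char *negative_sign)
{
  if (*positive_sign == '\0' && *negative_sign == '\0')
    return "-";
  return negative_sign;
}

/* Hypothetical helper: place SIGN relative to SYMBOL and QUANTITY.
   SIGN_POSN is interpreted like the `..._sign_posn' members.  This
   sketch assumes the symbol precedes the quantity.  Returns the value
   returned by snprintf.  */
static int
place_sign (char *out, size_t outsize, const char *sign,
            const char *symbol, const char *quantity, char sign_posn)
{
  switch (sign_posn)
    {
    case 0:   /* parentheses around symbol and quantity */
      return snprintf (out, outsize, "(%s%s)", symbol, quantity);
    case 1:   /* sign before quantity and symbol */
    case 3:   /* sign right before the symbol */
      return snprintf (out, outsize, "%s%s%s", sign, symbol, quantity);
    case 2:   /* sign after quantity and symbol */
      return snprintf (out, outsize, "%s%s%s", symbol, quantity, sign);
    case 4:   /* sign right after the symbol */
    default:  /* CHAR_MAX and anything else: treated like 4 */
      return snprintf (out, outsize, "%s%s%s", symbol, sign, quantity);
    }
}
```

Values 1 and 3 give the same result here only because the symbol is
assumed to come first; when the symbol follows the quantity they
differ.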
@node The Elegant and Fast Way, , The Lame Way to Locale Data, Locale Information
@subsection Pinpoint Access to Locale Data
While writing the X/Open Portability Guide, the authors realized that
the @code{localeconv} function does not provide adequate access to
locale information. The information that was meant to be available
in the locale (as later specified in the POSIX.1 standard) requires more
ways to access it. Therefore the @code{nl_langinfo} function
was introduced.
@deftypefun {char *} nl_langinfo (nl_item @var{item})
@standards{XOPEN, langinfo.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
@c It calls _nl_langinfo_l with the current locale, which returns a
@c pointer into constant strings defined in locale data structures.
The @code{nl_langinfo} function can be used to access individual
elements of the locale categories. Unlike the @code{localeconv}
function, which returns all the information, @code{nl_langinfo}
lets the caller select what information it requires. Each call is
fast, so it is not a problem to call this function multiple times.
A second advantage is that in addition to the numeric and monetary
formatting information, information from the
@code{LC_TIME} and @code{LC_MESSAGES} categories is available.
@pindex langinfo.h
The type @code{nl_item} is defined in @file{nl_types.h}. The argument
@var{item} is a numeric value defined in the header @file{langinfo.h}.
The X/Open standard defines the following values:
@vtable @code
@item CODESET
@code{nl_langinfo} returns a string with the name of the coded character
set used in the selected locale.
@item ABDAY_1
@itemx ABDAY_2
@itemx ABDAY_3
@itemx ABDAY_4
@itemx ABDAY_5
@itemx ABDAY_6
@itemx ABDAY_7
@code{nl_langinfo} returns the abbreviated weekday name. @code{ABDAY_1}
corresponds to Sunday.
@item DAY_1
@itemx DAY_2
@itemx DAY_3
@itemx DAY_4
@itemx DAY_5
@itemx DAY_6
@itemx DAY_7
Similar to @code{ABDAY_1}, etc.,@: but here the return value is the
unabbreviated weekday name.
@item ABMON_1
@itemx ABMON_2
@itemx ABMON_3
@itemx ABMON_4
@itemx ABMON_5
@itemx ABMON_6
@itemx ABMON_7
@itemx ABMON_8
@itemx ABMON_9
@itemx ABMON_10
@itemx ABMON_11
@itemx ABMON_12
The return value is the abbreviated name of the month, in the
grammatical form used when the month forms part of a complete date.
@code{ABMON_1} corresponds to January.
@item MON_1
@itemx MON_2
@itemx MON_3
@itemx MON_4
@itemx MON_5
@itemx MON_6
@itemx MON_7
@itemx MON_8
@itemx MON_9
@itemx MON_10
@itemx MON_11
@itemx MON_12
Similar to @code{ABMON_1}, etc.,@: but here the month names are not
abbreviated. Here the first value @code{MON_1} also corresponds to
January.
@item ALTMON_1
@itemx ALTMON_2
@itemx ALTMON_3
@itemx ALTMON_4
@itemx ALTMON_5
@itemx ALTMON_6
@itemx ALTMON_7
@itemx ALTMON_8
@itemx ALTMON_9
@itemx ALTMON_10
@itemx ALTMON_11
@itemx ALTMON_12
Similar to @code{MON_1}, etc.,@: but here the month names are in the
grammatical form used when the month is named by itself. The
@code{strftime} functions use these month names for the conversion
specifier @code{%OB} (@pxref{Formatting Calendar Time}).
Note that not all languages need two different forms of the month names,
so the strings returned for @code{MON_@dots{}} and @code{ALTMON_@dots{}}
may or may not be the same, depending on the locale.
@strong{NB:} @code{ABALTMON_@dots{}} constants corresponding to the
@code{%Ob} conversion specifier are not currently provided, but are
expected to be in a future release. In the meantime, it is possible
to use @code{_NL_ABALTMON_@dots{}}.
@item AM_STR
@itemx PM_STR
The return values are strings which can be used in the representation of time
as an hour from 1 to 12 plus an am/pm specifier.
Note that in locales which do not use this time representation
these strings might be empty, in which case the am/pm format
cannot be used at all.
@item D_T_FMT
The return value can be used as a format string for @code{strftime} to
represent time and date in a locale-specific way.
@item D_FMT
The return value can be used as a format string for @code{strftime} to
represent a date in a locale-specific way.
@item T_FMT
The return value can be used as a format string for @code{strftime} to
represent time in a locale-specific way.
@item T_FMT_AMPM