<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html401/loose.dtd">
<!-- This file documents the GNU C library.
This is Edition 0.12, last updated 2007-10-27,
of The GNU C Library Reference Manual, for version
2.8 (Ubuntu EGLIBC 2.12~20100519-0ubuntu1~ppa1) .
Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2001, 2002,
2003, 2007, 2008 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "Free Software Needs Free Documentation"
and "GNU Lesser General Public License", the Front-Cover texts being
"A GNU Manual", and with the Back-Cover Texts as in (a) below. A
copy of the license is included in the section entitled "GNU Free
Documentation License".
(a) The FSF's Back-Cover Text is: "You have the freedom to
copy and modify this GNU manual. Buying copies from the FSF
supports it in developing GNU and promoting software freedom."
-->
<!-- Created on May 20, 2010 by texi2html 1.82
texi2html was written by:
Lionel Cons <Lionel.Cons@cern.ch> (original author)
Karl Berry <karl@freefriends.org>
Olaf Bachmann <obachman@mathematik.uni-kl.de>
Maintained by: Many creative people.
Send bugs and suggestions to <texi2html-bug@nongnu.org>
-->
<title>The GNU C Library: 4. Character Handling</title>
<meta name="description" content="The GNU C Library: 4. Character Handling">
<meta name="keywords" content="The GNU C Library: 4. Character Handling">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="texi2html 1.82">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css">
a.summary-letter {text-decoration: none}
blockquote.smallquotation {font-size: smaller}
pre.display {font-family: serif}
pre.format {font-family: serif}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: serif; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: serif; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.roman {font-family:serif; font-weight:normal;}
span.sansserif {font-family:sans-serif; font-weight:normal;}
ul.toc {list-style: none}
</style>
<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
<a name="Character-Handling"></a>
<a name="Character-Handling-1"></a>
<h1 class="chapter">4. Character Handling</h1>
<p>Programs that work with characters and strings often need to classify a
character (is it alphabetic, is it a digit, is it whitespace, and so
on) and to perform case conversion operations on characters. The
functions in the header file ‘<tt>ctype.h</tt>’ are provided for this
purpose.
</p>
<a name="index-ctype_002eh"></a>
<p>Since the choice of locale and character set can alter the
classifications of particular character codes, all of these functions
are affected by the current locale. (More precisely, they are affected
by the locale currently selected for character classification, that is, the
<code>LC_CTYPE</code> category; see <a href="libc_7.html#Locale-Categories">Categories of Activities that Locales Affect</a>.)
</p>
<p>The ISO C standard specifies two different sets of functions. One
set works on <code>char</code> type characters, the other on
<code>wchar_t</code> wide characters (see section <a href="libc_6.html#Extended-Char-Intro">Introduction to Extended Characters</a>).
</p>
<table class="menu" border="0" cellspacing="0">
<tr><td align="left" valign="top"><a href="#Classification-of-Characters">4.1 Classification of Characters</a></td><td> </td><td align="left" valign="top"> Testing whether characters are
letters, digits, punctuation, etc.
</td></tr>
<tr><td align="left" valign="top"><a href="#Case-Conversion">4.2 Case Conversion</a></td><td> </td><td align="left" valign="top"> Case mapping, and the like.
</td></tr>
<tr><td align="left" valign="top"><a href="#Classification-of-Wide-Characters">4.3 Character class determination for wide characters</a></td><td> </td><td align="left" valign="top"></td></tr>
<tr><td align="left" valign="top"><a href="#Using-Wide-Char-Classes">4.4 Notes on using the wide character classes</a></td><td> </td><td align="left" valign="top"></td></tr>
<tr><td align="left" valign="top"><a href="#Wide-Character-Case-Conversion">4.5 Mapping of wide characters.</a></td><td> </td><td align="left" valign="top"></td></tr>
</table>
<a name="Classification-of-Characters"></a>
<a name="Classification-of-Characters-1"></a>
<h2 class="section">4.1 Classification of Characters</h2>
<a name="index-character-testing"></a>
<a name="index-classification-of-characters"></a>
<a name="index-predicates-on-characters"></a>
<a name="index-character-predicates"></a>
<p>This section explains the library functions for classifying characters.
For example, <code>isalpha</code> is the function to test for an alphabetic
character. It takes one argument, the character to test, and returns a
nonzero integer if the character is alphabetic, and zero otherwise. You
would use it like this:
</p>
<table><tr><td> </td><td><pre class="smallexample">if (isalpha (c))
  printf ("The character `%c' is alphabetic.\n", c);
</pre></td></tr></table>
<p>Each of the functions in this section tests for membership in a
particular class of characters; each has a name starting with ‘<samp>is</samp>’.
Each of them takes one argument, which is a character to test, and
returns an <code>int</code> which is treated as a boolean value. The
character argument is passed as an <code>int</code>; its value must be
representable as an <code>unsigned char</code>, or it may be the
constant value <code>EOF</code> instead of a real character.
</p>
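<p>Because the argument is an <code>int</code> that must hold either
<code>EOF</code> or an <code>unsigned char</code> value, a plain
<code>char</code> taken from a string is usually converted first. The
following is only a minimal sketch of that idiom; the helper name
<code>count_alpha</code> is hypothetical and not part of the library.
</p>
<table><tr><td> </td><td><pre class="smallexample">#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the alphabetic characters in S
   according to the current LC_CTYPE locale.  */
size_t
count_alpha (const char *s)
{
  size_t n = 0;

  for (; *s != '\0'; s++)
    /* Convert to unsigned char first: plain char may be signed, and
       a negative value other than EOF is not a valid argument.  */
    if (isalpha ((unsigned char) *s))
      n++;
  return n;
}
</pre></td></tr></table>
<p>In the standard <code>"C"</code> locale, <code>count_alpha ("ab1 c")</code>
counts the three letters and skips the digit and the space.
</p>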
<p>The attributes of any given character can vary between locales.
See section <a href="libc_7.html#Locales">Locales and Internationalization</a>, for more information on locales.
</p>
<p>These functions are declared in the header file ‘<tt>ctype.h</tt>’.
<a name="index-ctype_002eh-1"></a>
</p>
<a name="index-lower_002dcase-character"></a>
<dl>
<dt><a name="index-islower"></a><u>Function:</u> int <b>islower</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is a lower-case letter. The letter need not be
from the Latin alphabet; any representable alphabet is valid.
</p></dd></dl>
<a name="index-upper_002dcase-character"></a>
<dl>
<dt><a name="index-isupper"></a><u>Function:</u> int <b>isupper</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is an upper-case letter. The letter need not be
from the Latin alphabet; any representable alphabet is valid.
</p></dd></dl>
<a name="index-alphabetic-character"></a>
<dl>
<dt><a name="index-isalpha"></a><u>Function:</u> int <b>isalpha</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is an alphabetic character (a letter). If
<code>islower</code> or <code>isupper</code> is true of a character, then
<code>isalpha</code> is also true.
</p>
<p>In some locales, there may be additional characters for which
<code>isalpha</code> is true: letters which are neither upper case nor lower
case. But in the standard <code>"C"</code> locale, there are no such
additional characters.
</p></dd></dl>
<a name="index-digit-character"></a>
<a name="index-decimal-digit-character"></a>
<dl>
<dt><a name="index-isdigit"></a><u>Function:</u> int <b>isdigit</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is a decimal digit (‘<samp>0</samp>’ through ‘<samp>9</samp>’).
</p></dd></dl>
<a name="index-alphanumeric-character"></a>
<dl>
<dt><a name="index-isalnum"></a><u>Function:</u> int <b>isalnum</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is an alphanumeric character (a letter or
digit); in other words, if either <code>isalpha</code> or <code>isdigit</code> is
true of a character, then <code>isalnum</code> is also true.
</p></dd></dl>
<a name="index-hexadecimal-digit-character"></a>
<dl>
<dt><a name="index-isxdigit"></a><u>Function:</u> int <b>isxdigit</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is a hexadecimal digit.
Hexadecimal digits include the normal decimal digits ‘<samp>0</samp>’ through
‘<samp>9</samp>’ and the letters ‘<samp>A</samp>’ through ‘<samp>F</samp>’ and
‘<samp>a</samp>’ through ‘<samp>f</samp>’.
</p></dd></dl>
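<p>As an illustration of how <code>isxdigit</code> and <code>isdigit</code>
combine, the sketch below maps a hexadecimal digit to its numeric value.
The name <code>hex_value</code> is hypothetical, and the caller is assumed
to pass <code>EOF</code> or an <code>unsigned char</code> value.
</p>
<table><tr><td> </td><td><pre class="smallexample">#include <ctype.h>

/* Hypothetical helper: return the value (0 through 15) of the
   hexadecimal digit C, or -1 if C is not a hexadecimal digit.  */
int
hex_value (int c)
{
  if (!isxdigit (c))
    return -1;
  if (isdigit (c))
    return c - '0';
  /* ISO C guarantees consecutive codes only for the decimal digits,
     so each letter is mapped explicitly.  */
  switch (tolower (c))
    {
    case 'a': return 10;
    case 'b': return 11;
    case 'c': return 12;
    case 'd': return 13;
    case 'e': return 14;
    case 'f': return 15;
    }
  return -1;
}
</pre></td></tr></table>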
<a name="index-punctuation-character"></a>
<dl>
<dt><a name="index-ispunct"></a><u>Function:</u> int <b>ispunct</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is a punctuation character.
This means any printing character that is not alphanumeric or a space
character.
</p></dd></dl>
<a name="index-whitespace-character"></a>
<dl>
<dt><a name="index-isspace"></a><u>Function:</u> int <b>isspace</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is a <em>whitespace</em> character. In the standard
<code>"C"</code> locale, <code>isspace</code> returns true for only the standard
whitespace characters:
</p>
<dl compact="compact">
<dt> <code>' '</code></dt>
<dd><p>space
</p></dd>
<dt> <code>'\f'</code></dt>
<dd><p>formfeed
</p></dd>
<dt> <code>'\n'</code></dt>
<dd><p>newline
</p></dd>
<dt> <code>'\r'</code></dt>
<dd><p>carriage return
</p></dd>
<dt> <code>'\t'</code></dt>
<dd><p>horizontal tab
</p></dd>
<dt> <code>'\v'</code></dt>
<dd><p>vertical tab
</p></dd>
</dl>
</dd></dl>
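<p>A common use of <code>isspace</code> is skipping leading whitespace. The
following is a small sketch under that assumption; the helper name
<code>skip_space</code> is hypothetical.
</p>
<table><tr><td> </td><td><pre class="smallexample">#include <ctype.h>

/* Hypothetical helper: return a pointer to the first character of S
   that is not whitespace in the current locale.  The terminating
   null character is never whitespace, so the loop stops there.  */
const char *
skip_space (const char *s)
{
  while (isspace ((unsigned char) *s))
    s++;
  return s;
}
</pre></td></tr></table>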
<a name="index-blank-character"></a>
<dl>
<dt><a name="index-isblank"></a><u>Function:</u> int <b>isblank</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is a blank character; that is, a space or a tab.
This function was originally a GNU extension, but was added in ISO C99.
</p></dd></dl>
<a name="index-graphic-character"></a>
<dl>
<dt><a name="index-isgraph"></a><u>Function:</u> int <b>isgraph</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is a graphic character; that is, a character
that has a glyph associated with it. The whitespace characters are not
considered graphic.
</p></dd></dl>
<a name="index-printing-character"></a>
<dl>
<dt><a name="index-isprint"></a><u>Function:</u> int <b>isprint</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is a printing character. Printing characters
include all the graphic characters, plus the space (‘<samp> </samp>’) character.
</p></dd></dl>
<a name="index-control-character"></a>
<dl>
<dt><a name="index-iscntrl"></a><u>Function:</u> int <b>iscntrl</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is a control character (that is, a character that
is not a printing character).
</p></dd></dl>
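<p>The printing and control classes are often used to sanitize text before
display. The sketch below shows one way this might look; the helper name
<code>mask_unprintable</code> and the choice of ‘<samp>?</samp>’ as the
replacement are assumptions made for illustration.
</p>
<table><tr><td> </td><td><pre class="smallexample">#include <ctype.h>

/* Hypothetical helper: overwrite every non-printing character in S
   with '?' and return how many characters were replaced.  */
int
mask_unprintable (char *s)
{
  int replaced = 0;

  for (; *s != '\0'; s++)
    if (!isprint ((unsigned char) *s))
      {
        *s = '?';
        replaced++;
      }
  return replaced;
}
</pre></td></tr></table>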
<a name="index-ASCII-character"></a>
<dl>
<dt><a name="index-isascii"></a><u>Function:</u> int <b>isascii</b><i> (int <var>c</var>)</i></dt>
<dd><p>Returns true if <var>c</var> is a 7-bit <code>unsigned char</code> value that fits
into the US/UK ASCII character set. This function is a BSD extension
and is also an SVID extension.
</p></dd></dl>
<a name="Case-Conversion"></a>
<a name="Case-Conversion-1"></a>
<h2 class="section">4.2 Case Conversion</h2>
<a name="index-character-case-conversion"></a>
<a name="index-case-conversion-of-characters"></a>
<a name="index-converting-case-of-characters"></a>
<p>This section explains the library functions for performing conversions
such as case mappings on characters. For example, <code>toupper</code>
converts any character to upper case if possible. If the character
can’t be converted, <code>toupper</code> returns it unchanged.
</p>
<p>These functions take one argument of type <code>int</code>, which is the
character to convert, and return the converted character as an
<code>int</code>. If the conversion is not applicable to the argument given,
the argument is returned unchanged.
</p>
<p><strong>Compatibility Note:</strong> In pre-ISO C dialects, instead of
returning the argument unchanged, these functions may fail when the
argument is not suitable for the conversion. Thus for portability, you
may need to write <code>islower(c) ? toupper(c) : c</code> rather than just
<code>toupper(c)</code>.
</p>
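<p>Applied to a whole string, the portable idiom from the compatibility note
might be written as in the following sketch. The helper name
<code>upcase</code> is hypothetical.
</p>
<table><tr><td> </td><td><pre class="smallexample">#include <ctype.h>

/* Hypothetical helper: convert S to upper case in place.  */
void
upcase (char *s)
{
  for (; *s != '\0'; s++)
    {
      unsigned char c = (unsigned char) *s;

      /* The islower test is redundant in ISO C, where toupper
         returns other characters unchanged; it is kept here for
         pre-ISO dialects.  */
      *s = (char) (islower (c) ? toupper (c) : c);
    }
}
</pre></td></tr></table>
<p>Characters that have no upper-case form, such as digits and punctuation,
are left as they are.
</p>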
<p>These functions are declared in the header file ‘<tt>ctype.h</tt>’.
<a name="index-ctype_002eh-2"></a>
</p>
<dl>
<dt><a name="index-tolower"></a><u>Function:</u> int <b>tolower</b><i> (int <var>c</var>)</i></dt>
<dd><p>If <var>c</var> is an upper-case letter, <code>tolower</code> returns the corresponding
lower-case letter. If <var>c</var> is not an upper-case letter,
<var>c</var> is returned unchanged.
</p></dd></dl>
<dl>
<dt><a name="index-toupper"></a><u>Function:</u> int <b>toupper</b><i> (int <var>c</var>)</i></dt>
<dd><p>If <var>c</var> is a lower-case letter, <code>toupper</code> returns the corresponding
upper-case letter. Otherwise <var>c</var> is returned unchanged.
</p></dd></dl>
<dl>
<dt><a name="index-toascii"></a><u>Function:</u> int <b>toascii</b><i> (int <var>c</var>)</i></dt>
<dd><p>This function converts <var>c</var> to a 7-bit <code>unsigned char</code> value
that fits into the US/UK ASCII character set, by clearing the high-order
bits. This function is a BSD extension and is also an SVID extension.
</p></dd></dl>
<dl>
<dt><a name="index-_005ftolower"></a><u>Function:</u> int <b>_tolower</b><i> (int <var>c</var>)</i></dt>
<dd><p>This is identical to <code>tolower</code>, and is provided for compatibility
with the SVID. See section <a href="libc_1.html#SVID">SVID (The System V Interface Description)</a>.
</p></dd></dl>
<dl>
<dt><a name="index-_005ftoupper"></a><u>Function:</u> int <b>_toupper</b><i> (int <var>c</var>)</i></dt>
<dd><p>This is identical to <code>toupper</code>, and is provided for compatibility
with the SVID.
</p></dd></dl>
<a name="Classification-of-Wide-Characters"></a>
<a name="Character-class-determination-for-wide-characters"></a>
<h2 class="section">4.3 Character class determination for wide characters</h2>
<p>Amendment 1 to ISO C90 defines functions to classify wide
characters. Although the original ISO C90 standard already defined
the type <code>wchar_t</code>, no functions operating on it were defined.
</p>
<p>The design of the classification functions for wide characters
is more general. It allows extensions to the set of available
classifications, beyond those which are always available. The POSIX
standard specifies how extensions can be made, and this is already
implemented in the GNU C library implementation of the <code>localedef</code>
program.
</p>
<p>The character class functions are normally implemented with bitsets,
with a bitset per character. For a given character, the appropriate
bitset is read from a table and a test is performed as to whether a
certain bit is set. Which bit is tested for is determined by the
class.
</p>
<p>For the wide character classification functions this is made visible.
There is a type for classification values, a function to retrieve this
value for a given class, and a function to test whether a given
character is in this class, using the classification value. On top of
this the normal character classification functions as used for
<code>char</code> objects can be defined.
</p>
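<p>The bitset scheme described above can be pictured with a toy table. The
sketch below is purely illustrative: every name in it
(<code>TOY_DIGIT</code>, <code>toy_table</code>, <code>toy_init</code>,
<code>toy_isclass</code>) is hypothetical, and it does not reproduce the
layout the GNU C library actually uses.
</p>
<table><tr><td> </td><td><pre class="smallexample">#include <limits.h>

/* Hypothetical class bits; a real library defines its own layout.  */
enum { TOY_DIGIT = 1 << 0, TOY_UPPER = 1 << 1, TOY_LOWER = 1 << 2 };

/* One bitset per character code, indexed by unsigned char value.  */
static unsigned char toy_table[UCHAR_MAX + 1];

/* Fill in the bitsets for the digits and the basic Latin letters.  */
void
toy_init (void)
{
  const char *p;

  for (p = "0123456789"; *p != '\0'; p++)
    toy_table[(unsigned char) *p] |= TOY_DIGIT;
  for (p = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; *p != '\0'; p++)
    toy_table[(unsigned char) *p] |= TOY_UPPER;
  for (p = "abcdefghijklmnopqrstuvwxyz"; *p != '\0'; p++)
    toy_table[(unsigned char) *p] |= TOY_LOWER;
}

/* Read the bitset for C and test whether any bit in MASK is set.  */
int
toy_isclass (int c, int mask)
{
  return (toy_table[(unsigned char) c] & mask) != 0;
}
</pre></td></tr></table>
<p>In this picture the mask plays the role of the classification value: a
predicate for letters would simply pass
<code>TOY_UPPER | TOY_LOWER</code>.
</p>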
<dl>
<dt><a name="index-wctype_005ft"></a><u>Data type:</u> <b>wctype_t</b></dt>
<dd><p>The <code>wctype_t</code> type can hold a value which represents a character class.
The only defined way to generate such a value is by using the
<code>wctype</code> function.
</p>
<a name="index-wctype_002eh"></a>
<p>This type is defined in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<dl>
<dt><a name="index-wctype"></a><u>Function:</u> wctype_t <b>wctype</b><i> (const char *<var>property</var>)</i></dt>
<dd><p>The <code>wctype</code> function returns a value representing a class of wide
characters which is identified by the string <var>property</var>. Besides
some standard properties, each locale can define its own. If
no property with the given name is known for the locale currently
selected for the <code>LC_CTYPE</code> category, the function returns zero.
</p>
<p>The properties known in every locale are:
</p>
<table>
<tr><td width="25%"><code>"alnum"</code></td><td width="25%"><code>"alpha"</code></td><td width="25%"><code>"cntrl"</code></td><td width="25%"><code>"digit"</code></td></tr>
<tr><td width="25%"><code>"graph"</code></td><td width="25%"><code>"lower"</code></td><td width="25%"><code>"print"</code></td><td width="25%"><code>"punct"</code></td></tr>
<tr><td width="25%"><code>"space"</code></td><td width="25%"><code>"upper"</code></td><td width="25%"><code>"xdigit"</code></td></tr>
</table>
<a name="index-wctype_002eh-1"></a>
<p>This function is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<p>To test whether a character belongs to one of the non-standard classes,
the ISO C standard defines a completely new function.
</p>
<dl>
<dt><a name="index-iswctype"></a><u>Function:</u> int <b>iswctype</b><i> (wint_t <var>wc</var>, wctype_t <var>desc</var>)</i></dt>
<dd><p>This function returns a nonzero value if <var>wc</var> is in the character
class specified by <var>desc</var>. <var>desc</var> must have been returned
by a previous successful call to <code>wctype</code>.
</p>
<a name="index-wctype_002eh-2"></a>
<p>This function is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
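<p>Put together, <code>wctype</code> and <code>iswctype</code> might be used
as in the following sketch. The helper name <code>in_wide_class</code> is
hypothetical; the check for a zero descriptor reflects the requirement
that <var>desc</var> come from a successful <code>wctype</code> call.
</p>
<table><tr><td> </td><td><pre class="smallexample">#include <wctype.h>

/* Hypothetical helper: return nonzero if WC is in the class named
   NAME, and zero if it is not or if the current locale does not
   know a class of that name.  */
int
in_wide_class (wint_t wc, const char *name)
{
  wctype_t desc = wctype (name);

  /* wctype returns zero for an unknown property, and such a value
     must not be handed on to iswctype.  */
  if (desc == (wctype_t) 0)
    return 0;
  return iswctype (wc, desc);
}
</pre></td></tr></table>
<p>Code that tests many characters against the same class would normally
call <code>wctype</code> once and reuse the descriptor rather than look
it up for every character.
</p>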
<p>To make it easier to use the commonly-used classification functions,
they are defined in the C library. There is no need to use
<code>wctype</code> if the property string is one of the known character
classes. In some situations it is desirable to construct the property
strings, and then it is important that <code>wctype</code> can also handle the
standard classes.
</p>
<a name="index-alphanumeric-character-1"></a>
<dl>
<dt><a name="index-iswalnum"></a><u>Function:</u> int <b>iswalnum</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>This function returns a nonzero value if <var>wc</var> is an alphanumeric
character (a letter or digit); in other words, if either <code>iswalpha</code>
or <code>iswdigit</code> is true of a character, then <code>iswalnum</code> is also
true.
</p>
<p>This function can be implemented using
</p>
<table><tr><td> </td><td><pre class="smallexample">iswctype (wc, wctype ("alnum"))
</pre></td></tr></table>
<a name="index-wctype_002eh-3"></a>
<p>It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<a name="index-alphabetic-character-1"></a>
<dl>
<dt><a name="index-iswalpha"></a><u>Function:</u> int <b>iswalpha</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>Returns true if <var>wc</var> is an alphabetic character (a letter). If
<code>iswlower</code> or <code>iswupper</code> is true of a character, then
<code>iswalpha</code> is also true.
</p>
<p>In some locales, there may be additional characters for which
<code>iswalpha</code> is true: letters which are neither upper case nor lower
case. But in the standard <code>"C"</code> locale, there are no such
additional characters.
</p>
<p>This function can be implemented using
</p>
<table><tr><td> </td><td><pre class="smallexample">iswctype (wc, wctype ("alpha"))
</pre></td></tr></table>
<a name="index-wctype_002eh-4"></a>
<p>It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<a name="index-control-character-1"></a>
<dl>
<dt><a name="index-iswcntrl"></a><u>Function:</u> int <b>iswcntrl</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>Returns true if <var>wc</var> is a control character (that is, a character that
is not a printing character).
</p>
<p>This function can be implemented using
</p>
<table><tr><td> </td><td><pre class="smallexample">iswctype (wc, wctype ("cntrl"))
</pre></td></tr></table>
<a name="index-wctype_002eh-5"></a>
<p>It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<a name="index-digit-character-1"></a>
<dl>
<dt><a name="index-iswdigit"></a><u>Function:</u> int <b>iswdigit</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>Returns true if <var>wc</var> is a digit (e.g., ‘<samp>0</samp>’ through ‘<samp>9</samp>’).
Please note that this function returns a nonzero value not only for
<em>decimal</em> digits, but for all kinds of digits. A consequence is
that code like the following will <strong>not</strong> work unconditionally for
wide characters:
</p>
<table><tr><td> </td><td><pre class="smallexample">n = 0;
while (iswdigit (*wc))
  n = n * 10 + (*wc++ - L'0');
</pre></td></tr></table>
<p>This function can be implemented using
</p>
<table><tr><td> </td><td><pre class="smallexample">iswctype (wc, wctype ("digit"))
</pre></td></tr></table>
<a name="index-wctype_002eh-6"></a>
<p>It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
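<p>One way to sidestep the pitfall noted for <code>iswdigit</code> is to
test the range of decimal digits explicitly instead of relying on the
classification function. The following is a sketch only; the helper name
<code>parse_decimal</code> is hypothetical and overflow is not checked.
</p>
<table><tr><td> </td><td><pre class="smallexample">#include <wchar.h>

/* Hypothetical helper: convert the run of decimal digits L'0'
   through L'9' at the start of WC to a number.  Overflow is not
   checked in this sketch.  */
unsigned long
parse_decimal (const wchar_t *wc)
{
  unsigned long n = 0;

  /* The explicit range test accepts only the ten decimal digits,
     whatever other digits the locale may define.  */
  while (*wc >= L'0' && *wc <= L'9')
    n = n * 10 + (unsigned long) (*wc++ - L'0');
  return n;
}
</pre></td></tr></table>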
<a name="index-graphic-character-1"></a>
<dl>
<dt><a name="index-iswgraph"></a><u>Function:</u> int <b>iswgraph</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>Returns true if <var>wc</var> is a graphic character; that is, a character
that has a glyph associated with it. The whitespace characters are not
considered graphic.
</p>
<p>This function can be implemented using
</p>
<table><tr><td> </td><td><pre class="smallexample">iswctype (wc, wctype ("graph"))
</pre></td></tr></table>
<a name="index-wctype_002eh-7"></a>
<p>It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<a name="index-lower_002dcase-character-1"></a>
<dl>
<dt><a name="index-iswlower"></a><u>Function:</u> int <b>iswlower</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>Returns true if <var>wc</var> is a lower-case letter. The letter need not be
from the Latin alphabet; any representable alphabet is valid.
</p>
<p>This function can be implemented using
</p>
<table><tr><td> </td><td><pre class="smallexample">iswctype (wc, wctype ("lower"))
</pre></td></tr></table>
<a name="index-wctype_002eh-8"></a>
<p>It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<a name="index-printing-character-1"></a>
<dl>
<dt><a name="index-iswprint"></a><u>Function:</u> int <b>iswprint</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>Returns true if <var>wc</var> is a printing character. Printing characters
include all the graphic characters, plus the space (‘<samp> </samp>’) character.
</p>
<p>This function can be implemented using
</p>
<table><tr><td> </td><td><pre class="smallexample">iswctype (wc, wctype ("print"))
</pre></td></tr></table>
<a name="index-wctype_002eh-9"></a>
<p>It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<a name="index-punctuation-character-1"></a>
<dl>
<dt><a name="index-iswpunct"></a><u>Function:</u> int <b>iswpunct</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>Returns true if <var>wc</var> is a punctuation character.
This means any printing character that is not alphanumeric or a space
character.
</p>
<p>This function can be implemented using
</p>
<table><tr><td> </td><td><pre class="smallexample">iswctype (wc, wctype ("punct"))
</pre></td></tr></table>
<a name="index-wctype_002eh-10"></a>
<p>It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<a name="index-whitespace-character-1"></a>
<dl>
<dt><a name="index-iswspace"></a><u>Function:</u> int <b>iswspace</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>Returns true if <var>wc</var> is a <em>whitespace</em> character. In the standard
<code>"C"</code> locale, <code>iswspace</code> returns true for only the standard
whitespace characters:
</p>
<dl compact="compact">
<dt> <code>L' '</code></dt>
<dd><p>space
</p></dd>
<dt> <code>L'\f'</code></dt>
<dd><p>formfeed
</p></dd>
<dt> <code>L'\n'</code></dt>
<dd><p>newline
</p></dd>
<dt> <code>L'\r'</code></dt>
<dd><p>carriage return
</p></dd>
<dt> <code>L'\t'</code></dt>
<dd><p>horizontal tab
</p></dd>
<dt> <code>L'\v'</code></dt>
<dd><p>vertical tab
</p></dd>
</dl>
<p>This function can be implemented using
</p>
<table><tr><td> </td><td><pre class="smallexample">iswctype (wc, wctype ("space"))
</pre></td></tr></table>
<a name="index-wctype_002eh-11"></a>
<p>It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<a name="index-upper_002dcase-character-1"></a>
<dl>
<dt><a name="index-iswupper"></a><u>Function:</u> int <b>iswupper</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>Returns true if <var>wc</var> is an upper-case letter. The letter need not be
from the Latin alphabet; any representable alphabet is valid.
</p>
<p>This function can be implemented using
</p>
<table><tr><td> </td><td><pre class="smallexample">iswctype (wc, wctype ("upper"))
</pre></td></tr></table>
<a name="index-wctype_002eh-12"></a>
<p>It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<a name="index-hexadecimal-digit-character-1"></a>
<dl>
<dt><a name="index-iswxdigit"></a><u>Function:</u> int <b>iswxdigit</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>Returns true if <var>wc</var> is a hexadecimal digit.
Hexadecimal digits include the normal decimal digits ‘<samp>0</samp>’ through
‘<samp>9</samp>’ and the letters ‘<samp>A</samp>’ through ‘<samp>F</samp>’ and
‘<samp>a</samp>’ through ‘<samp>f</samp>’.
</p>
<p>This function can be implemented using
</p>
<table><tr><td> </td><td><pre class="smallexample">iswctype (wc, wctype ("xdigit"))
</pre></td></tr></table>
<a name="index-wctype_002eh-13"></a>
<p>It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<p>The GNU C library also provides a function which is not defined in the
ISO C standard but which is available as a version for single byte
characters as well.
</p>
<a name="index-blank-character-1"></a>
<dl>
<dt><a name="index-iswblank"></a><u>Function:</u> int <b>iswblank</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>Returns true if <var>wc</var> is a blank character; that is, a space or a tab.
This function was originally a GNU extension, but was added in ISO C99.
It is declared in ‘<tt>wctype.h</tt>’.
</p></dd></dl>
<a name="Using-Wide-Char-Classes"></a>
<a name="Notes-on-using-the-wide-character-classes"></a>
<h2 class="section">4.4 Notes on using the wide character classes</h2>
<p>The first note is probably not surprising, but it is still an occasional
cause of problems. The <code>isw<var>XXX</var></code> functions can be implemented
as macros and, in fact, the GNU C library does this. They are still
available as real functions, but when the ‘<tt>wctype.h</tt>’ header is
included the macros are used. The same is true of the
<code>char</code> type versions of these functions.
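<p>As a minimal sketch of what this means in practice: standard C lets a
program reach the real function instead of a function-like macro by
enclosing the function name in parentheses, or by taking its address.
The names <code>count_alpha</code> and <code>classifier</code> below are
hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the alphabetic wide characters in S.
   Writing (iswalpha) in parentheses suppresses any function-like macro
   of that name, so the real function is called.  */
size_t
count_alpha (const wchar_t *s)
{
  size_t n = 0;

  for (; *s != L'\0'; ++s)
    if ((iswalpha) ((wint_t) *s))
      ++n;
  return n;
}

/* A function-like macro is not expanded when its name is not followed
   by a parenthesis, so this stores the address of the real function.  */
int (*classifier) (wint_t) = iswdigit;
```

<p>Whether a given <code>isw<var>XXX</var></code> name is actually a macro
depends on the library version and compiler options; the code above
works either way.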
<p>The second note covers something new. It is best illustrated by a
(real-world) example. The first piece of code is an excerpt from the
original code. It is truncated a bit, but the intention should be clear.
<table><tr><td>&nbsp;</td><td><pre class="smallexample">int
is_in_class (int c, const char *class)
{
  if (strcmp (class, "alnum") == 0)
    return isalnum (c);
  if (strcmp (class, "alpha") == 0)
    return isalpha (c);
  if (strcmp (class, "cntrl") == 0)
    return iscntrl (c);
  …
  return 0;
}
</pre></td></tr></table>
<p>Now, with the <code>wctype</code> and <code>iswctype</code> functions you can avoid the
<code>if</code> cascades, but rewriting the code as follows is wrong:
<table><tr><td>&nbsp;</td><td><pre class="smallexample">int
is_in_class (int c, const char *class)
{
  wctype_t desc = wctype (class);
  return desc ? iswctype ((wint_t) c, desc) : 0;
}
</pre></td></tr></table>
<p>The problem is that it is not guaranteed that the wide character
representation of a single-byte character can be found by casting.
In fact, the cast often yields the wrong value, for example when the
locale's single-byte character set assigns codes differently from the
wide character set. The correct solution to this
problem is to write the code as follows:
<table><tr><td>&nbsp;</td><td><pre class="smallexample">int
is_in_class (int c, const char *class)
{
  wctype_t desc = wctype (class);
  return desc ? iswctype (btowc (c), desc) : 0;
}
</pre></td></tr></table>
<p>See section <a href="libc_6.html#Converting-a-Character">Converting Single Characters</a>, for more information on <code>btowc</code>.
Note that this change probably does not improve the performance
of the program much, since the <code>wctype</code> function still has to make
the string comparisons. The real gain comes when the
<code>is_in_class</code> function is called more than once for the
same class name. In this case the variable <var>desc</var> could be computed
once and reused for all the calls. Therefore the above form of the
function is probably not the final one.
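<p>One way to reuse the descriptor, shown here only as a sketch, is to
look it up once and then classify many characters with it. The function
<code>count_in_class</code> and its parameter names are hypothetical; the
sketch assumes a string of single-byte characters and that the
<code>LC_CTYPE</code> locale does not change while the descriptor is in use.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the single-byte characters of S that
   belong to the class named CLASS_NAME.  wctype is called only once;
   the resulting descriptor is reused for every character.  */
size_t
count_in_class (const char *s, const char *class_name)
{
  wctype_t desc = wctype (class_name);
  size_t n = 0;

  if (desc == (wctype_t) 0)
    return 0;                   /* Class unknown in the current locale.  */

  for (; *s != '\0'; ++s)
    {
      /* btowc expects an unsigned char value (or EOF).  */
      wint_t wc = btowc ((unsigned char) *s);

      if (wc != WEOF && iswctype (wc, desc))
        ++n;
    }
  return n;
}
```

<p>For example, <code>count_in_class ("ab12 c", "digit")</code> performs one
class-name lookup and six classifications rather than six lookups.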
<a name="Wide-Character-Case-Conversion"></a>
<a name="Mapping-of-wide-characters_002e"></a>
<h2 class="section">4.5 Mapping of wide characters.</h2>
<p>Like the classification functions, the character mapping functions are
generalized by the ISO C standard. Instead of just allowing the two
standard mappings, a locale can contain others. Again, the
<code>localedef</code> program already supports generating such locale data files.
<dt><a name="index-wctrans_005ft"></a><u>Data Type:</u> <b>wctrans_t</b></dt>
<dd><p>This data type is defined as a scalar type which can hold a value
representing the locale-dependent character mapping. There is no way to
construct such a value apart from using the return value of the
<code>wctrans</code> function.
<a name="index-wctype_002eh-14"></a>
<p>This type is defined in ‘<tt>wctype.h</tt>’.
<dt><a name="index-wctrans"></a><u>Function:</u> wctrans_t <b>wctrans</b><i> (const char *<var>property</var>)</i></dt>
<dd><p>The <code>wctrans</code> function has to be used to find out whether a named
mapping is defined in the current locale selected for the
<code>LC_CTYPE</code> category. If the returned value is non-zero, you can use
it afterwards in calls to <code>towctrans</code>. If the return value is
zero, no such mapping is known in the current locale.
<p>Besides locale-specific mappings there are two mappings which are
guaranteed to be available in every locale:
<table>
<tr><td width="50%"><code>"tolower"</code></td><td width="50%"><code>"toupper"</code></td></tr>
</table>
<a name="index-wctype_002eh-15"></a>
<p>This function is declared in ‘<tt>wctype.h</tt>’.
<dt><a name="index-towctrans"></a><u>Function:</u> wint_t <b>towctrans</b><i> (wint_t <var>wc</var>, wctrans_t <var>desc</var>)</i></dt>
<dd><p><code>towctrans</code> maps the input character <var>wc</var>
according to the rules of the mapping for which <var>desc</var> is a
descriptor, and returns the value it finds. <var>desc</var> must be
obtained by a successful call to <code>wctrans</code>.
<a name="index-wctype_002eh-16"></a>
<p>This function is declared in ‘<tt>wctype.h</tt>’.
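<p>The interplay of <code>wctrans</code> and <code>towctrans</code> can be
sketched as follows. The function <code>map_wide_string</code> is a
hypothetical helper, not part of the library; it checks the descriptor
before use, as the descriptions above require.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: apply the mapping named MAPPING to every element
   of the wide string S in place.  Returns 0 on success, or -1 if the
   mapping is not known in the current locale.  */
int
map_wide_string (wchar_t *s, const char *mapping)
{
  wctrans_t desc = wctrans (mapping);

  if (desc == (wctrans_t) 0)
    return -1;

  for (; *s != L'\0'; ++s)
    *s = (wchar_t) towctrans ((wint_t) *s, desc);
  return 0;
}
```

<p>The descriptor is looked up once per string rather than once per
character, in the same spirit as the note on <code>wctype</code> in the
previous section.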
<p>For the generally available mappings, the ISO C standard defines
convenient shortcuts so that it is not necessary to call <code>wctrans</code>
for them.
<dt><a name="index-towlower"></a><u>Function:</u> wint_t <b>towlower</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>If <var>wc</var> is an upper-case letter, <code>towlower</code> returns the corresponding
lower-case letter. If <var>wc</var> is not an upper-case letter,
<var>wc</var> is returned unchanged.
<p><code>towlower</code> can be implemented using
<table><tr><td>&nbsp;</td><td><pre class="smallexample">towctrans (wc, wctrans ("tolower"))
</pre></td></tr></table>
<a name="index-wctype_002eh-17"></a>
<p>This function is declared in ‘<tt>wctype.h</tt>’.
<dt><a name="index-towupper"></a><u>Function:</u> wint_t <b>towupper</b><i> (wint_t <var>wc</var>)</i></dt>
<dd><p>If <var>wc</var> is a lower-case letter, <code>towupper</code> returns the corresponding
upper-case letter. Otherwise <var>wc</var> is returned unchanged.
<p><code>towupper</code> can be implemented using
<table><tr><td>&nbsp;</td><td><pre class="smallexample">towctrans (wc, wctrans ("toupper"))
</pre></td></tr></table>
<a name="index-wctype_002eh-18"></a>
<p>This function is declared in ‘<tt>wctype.h</tt>’.
<p>The same warnings given in the last section for the use of the wide
character classification functions apply here. It is not possible to
simply cast a <code>char</code> type value to a <code>wint_t</code> and use it as an
argument to <code>towctrans</code> calls; the value has to be converted first,
for example with <code>btowc</code>.
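<p>A minimal sketch of the conversion, assuming a locale with a
single-byte character set: the character is turned into a wide character
with <code>btowc</code>, mapped, and turned back with <code>wctob</code>. The
name <code>byte_toupper</code> is hypothetical.

```c
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: map the single-byte character C (an unsigned
   char value) to upper case by way of its wide character form.
   Returns C unchanged if C has no wide character form or the result
   has no single-byte form.  */
int
byte_toupper (int c)
{
  wint_t wc = btowc (c);
  int result;

  if (wc == WEOF)
    return c;

  /* "toupper" is one of the two mappings available in every locale.  */
  result = wctob (towctrans (wc, wctrans ("toupper")));
  return result == EOF ? c : result;
}
```

<p>For ordinary single-byte text the <code>toupper</code> function from
‘<tt>ctype.h</tt>’ does the same job more directly; the sketch only
illustrates how to enter and leave the wide character domain correctly.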
This document was generated by <em>root</em> on <em>May 20, 2010</em> using <a href="http://www.nongnu.org/texi2html/"><em>texi2html 1.82</em></a>.