.TH GAWK 1 "Nov 24 1994" "Free Software Foundation" "Utility Commands"
gawk \- pattern scanning and processing language
[ POSIX or GNU style options ]
[ POSIX or GNU style options ]
is the GNU Project's implementation of the AWK programming language.
It conforms to the definition of the language in
the \*(PX 1003.2 Command Language And Utilities Standard.
This version in turn is based on the description in
.IR "The AWK Programming Language" ,
by Aho, Kernighan, and Weinberger,
with the additional features defined in the System V Release 4 version
also provides some GNU-specific extensions.
The command line consists of options to
itself, the AWK program text (if not supplied via the
options), and values to be made
pre-defined AWK variables.
options may be either the traditional \*(PX one letter options,
or the GNU style long options. \*(PX style options start with a single ``\-'',
while GNU long options start with ``\-\^\-''.
GNU style long options are provided for both GNU-specific features and
for \*(PX mandated features. Other implementations of the AWK language
are likely to only accept the traditional one letter options.
Following the \*(PX standard,
options are supplied via arguments to the
options may be supplied, or multiple arguments may be supplied together
if they are separated by commas, or enclosed in quotes and separated
Case is ignored in arguments to the
option has a corresponding GNU style long option, as detailed below.
Arguments to GNU style long options are either joined with the option
sign, with no intervening spaces, or they may be provided in the
next command line argument.
accepts the following options.
.BI \-\^\-field-separator= fs
for the input field separator (the value of the
\fB\-v\fI var\fB\^=\^\fIval\fR
\fB\-\^\-assign=\fIvar\fB\^=\^\fIval\fR
before execution of the program begins.
Such variable values are available to the
block of an AWK program.
.BI \-f " program-file"
.BI \-\^\-file= program-file
Read the AWK program source from the file
instead of from the first command line argument.
Set various memory limits to the value
flag sets the maximum number of fields, and the
flag sets the maximum record size. These two flags and the
option are from the AT&T Bell Labs research version of \*(UX
has no pre-defined limits.
.TP \w'\fB\-\^\-copyright\fR'u+1n
mode. In compatibility mode,
behaves identically to \*(UX
none of the GNU-specific extensions are recognized.
.BR "GNU EXTENSIONS" ,
below, for more information.
Print the short version of the GNU copyright information message on
Print a relatively short summary of the available options on
Per the GNU Coding Standards, these options cause an immediate,
Provide warnings about constructs that are
dubious or non-portable to other AWK implementations.
.\" This option is left undocumented, on purpose.
Provide a moment of nostalgia for long time
mode, with the following additional restrictions:
escape sequences are not recognized.
cannot be used in place of
.BI "\-W source=" program-text
.BI \-\^\-source= program-text
as AWK program source code.
This option allows the easy intermixing of library functions (used via the
options) with source code entered on the command line.
It is intended primarily for medium to large size AWK programs used
form of this option uses the rest of the command line argument for
will be recognized in the same argument.
Print version information for this particular copy of
This is useful mainly for knowing if the current copy of
is up to date with respect to whatever the Free Software Foundation
Per the GNU Coding Standards, these options cause an immediate,
Signal the end of options. This is useful to allow further arguments to the
AWK program itself to start with a ``\-''.
This is mainly for consistency with the argument parsing convention used
by most other \*(PX programs.
In compatibility mode,
any other options are flagged as invalid, but are otherwise ignored.
In normal operation, as long as program text has been supplied, unknown
options are passed on to the AWK program in the
array for processing. This is particularly useful for running AWK
programs via the ``#!'' executable interpreter mechanism.
.SH AWK PROGRAM EXECUTION
An AWK program consists of a sequence of pattern-action statements
and optional function definitions.
\fIpattern\fB { \fIaction statements\fB }\fR
\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements\fB }\fR
first reads the program source from the
or from the first non-option argument on the command line.
options may be used multiple times on the command line.
will read the program text as if all the
and command line source texts
had been concatenated together. This is useful for building libraries
of AWK functions, without having to include them in each new AWK
program that uses them. It also provides the ability to mix library
functions with command line programs.
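.PP
As a rough illustration of combining several program sources, the sketch below writes a small function library and a main program to temporary files and runs them with two
\fB\-f\fR
options. The file names lib.awk and main.awk and the function twice() are hypothetical, chosen only for this example; just the portable
\fB\-f\fR
option is used, so other AWK implementations should behave the same way.
.nf
```shell
# Hypothetical file names and function, for illustration only.
dir=$(mktemp -d)
echo 'function twice(x) { return 2 * x }' > "$dir/lib.awk"
echo 'BEGIN { print twice(21) }' > "$dir/main.awk"
# The two source files are read as if they had been concatenated.
out=$(awk -f "$dir/lib.awk" -f "$dir/main.awk")
echo "$out"
rm -r "$dir"
```
.fi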
The environment variable
specifies a search path to use when finding source files named with
option. If this variable does not exist, the default path is
\fB".:/usr/lib/awk:/usr/local/lib/awk"\fR.
If a file name given to the
option contains a ``/'' character, no path search is performed.
executes AWK programs in the following order.
all variable assignments specified via the
option are performed.
compiles the program into an internal form.
executes the code in the
and then proceeds to read
each file named in the
If there are no files named on the command line,
reads the standard input.
If a filename on the command line has the form
it is treated as a variable assignment. The variable
will be assigned the value
(This happens after any
block(s) have been run.)
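.PP
A minimal sketch of command line variable assignment, assuming two hypothetical data files, a.txt and b.txt, created only for the example; the variable name tag is likewise an assumption. Each assignment takes effect when its argument is reached, so the two files are labelled differently.
.nf
```shell
# Hypothetical data files, created only for this example.
dir=$(mktemp -d)
echo "alpha" > "$dir/a.txt"
echo "beta" > "$dir/b.txt"
# tag is assigned when awk reaches that argument, before the next file is read.
out=$(awk '{ print tag ": " $0 }' tag=first "$dir/a.txt" tag=second "$dir/b.txt")
echo "$out"
rm -r "$dir"
```
.fi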
Command line variable assignment
is most useful for dynamically assigning values to the variables
AWK uses to control how input is broken into fields and records. It
is also useful for controlling state if multiple passes are needed over
If the value of a particular element of
For each line in the input,
tests to see if it matches any
For each pattern that the line matches, the associated
The patterns are tested in the order they occur in the program.
Finally, after all the input is exhausted,
executes the code in the
.SH VARIABLES AND FIELDS
AWK variables are dynamic; they come into existence when they are
first used. Their values are either floating-point numbers or strings,
depending upon how they are used. AWK also has one-dimensional
arrays; arrays with multiple dimensions may be simulated.
Several pre-defined variables are set as a program
runs; these will be described as needed and summarized below.
As each input line is read,
using the value of the
variable as the field separator.
is a single character, fields are separated by that character.
is expected to be a full regular expression.
In the special case that
is a single blank, fields are separated
by runs of blanks and/or tabs.
Note that the value of
(see below) will also affect how fields are split when
is a regular expression.
variable is set to a space separated list of numbers, each field is
expected to have fixed width, and
will split up the record using the specified widths. The value of
Assigning a new value to
and restores the default behavior.
Each field in the input line may be referenced by its position,
is the whole line. The value of a field may be assigned to as well.
Fields need not be referenced by constants:
prints the fifth field in the input line.
is set to the total number of fields in the input line.
References to non-existent fields (i.e. fields after
produce the null-string. However, assigning to a non-existent field
will increase the value of
create any intervening fields with the null string as their value, and
to be recomputed, with the fields being separated by the value of
References to negative numbered fields cause a fatal error.
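.PP
The effect of assigning to a non-existent field can be sketched as follows. The input record is made up for the example; the output should show the field count growing to 5 and the record rebuilt with an empty fourth field, joined by the default output field separator.
.nf
```shell
# Assigning to field 5 of a three-field record extends the field count
# and rebuilds the record; the new fourth field is the null string.
out=$(echo "a b c" | awk '{ $5 = "e"; print NF; print }')
echo "$out"
```
.fi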
.SS Built-in Variables
AWK's built-in variables are:
.TP \w'\fBFIELDWIDTHS\fR'u+1n
The number of command line arguments (does not include options to
or the program source).
of the current file being processed.
Array of command line arguments. The array is indexed from
Dynamically changing the contents of
can control the files used for data.
The conversion format for numbers, \fB"%.6g"\fR, by default.
An array containing the values of the current environment.
The array is indexed by the environment variables, each element being
the value of that variable (e.g., \fBENVIRON["HOME"]\fP might be
Changing this array does not affect the environment seen by programs which
spawns via redirection or the
(This may change in a future version of
.\" but don't hold your breath...
If a system error occurs either doing a redirection for
a string describing the error.
A white-space separated list of fieldwidths. When set,
parses the input into fields of fixed width, instead of using the
variable as the field separator.
The fixed field width facility is still experimental; expect the
semantics to change as
The name of the current input file.
If no files are specified on the command line, the value of
is undefined inside the
The input record number in the current input file.
The input field separator, a blank by default.
Controls the case-sensitivity of all regular expression operations. If
has a non-zero value, then pattern matching in rules,
pre-defined functions will all ignore case when doing regular expression
is not equal to zero,
matches all of the strings \fB"ab"\fP, \fB"aB"\fP, \fB"Ab"\fP,
As with all AWK variables, the initial value of
is zero, so all regular expression operations are normally case-sensitive.
The number of fields in the current input record.
The total number of input records seen so far.
The output format for numbers, \fB"%.6g"\fR, by default.
The output field separator, a blank by default.
The output record separator, by default a newline.
The input record separator, by default a newline.
is exceptional in that only the first character of its string
value is used for separating records.
(This will probably change in a future release of
is set to the null string, then records are separated by
is set to the null string, then the newline character always acts as
a field separator, in addition to whatever value
The index of the first character matched by
The length of the string matched by
The character used to separate multiple subscripts in array
elements, by default \fB"\e034"\fR.
Arrays are subscripted with an expression between square brackets
If the expression is an expression list
.RI ( expr ", " expr " ...)"
then the array subscript is a string consisting of the
concatenation of the (string) value of each expression,
separated by the value of the
This facility is used to simulate multiply dimensioned
i = "A" ;\^ j = "B" ;\^ k = "C"
x[i, j, k] = "hello, world\en"
assigns the string \fB"hello, world\en"\fR to the element of the array
which is indexed by the string \fB"A\e034B\e034C"\fR. All arrays in AWK
are associative, i.e. indexed by string values.
statement to see if an array has an index consisting of a particular
If the array has multiple subscripts, use
.BR "(i, j) in array" .
construct may also be used in a
loop to iterate over all the elements of an array.
An element may be deleted from an array using the
statement may also be used to delete the entire contents of an array.
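.PP
The following sketch shows one way simulated multi-dimensional subscripts might be tested and taken apart again. The array name grid and its contents are assumptions made for the example; splitting the combined subscript on
.B SUBSEP
recovers the individual parts.
.nf
```shell
out=$(awk 'BEGIN {
    grid["A", "B"] = "hello"      # stored under the key "A" SUBSEP "B"
    if (("A", "B") in grid)
        print "found"
    for (key in grid) {
        n = split(key, part, SUBSEP)   # recover the individual subscripts
        print n, part[1], part[2]
    }
}')
echo "$out"
```
.fi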
.SS Variable Typing And Conversion
may be (floating point) numbers, or strings, or both. How the
value of a variable is interpreted depends upon its context. If used in
a numeric expression, it will be treated as a number; if used as a string
it will be treated as a string.
To force a variable to be treated as a number, add 0 to it; to force it
to be treated as a string, concatenate it with the null string.
When a string must be converted to a number, the conversion is accomplished
A number is converted to a string by using the value of
as a format string for
with the numeric value of the variable as the argument.
However, even though all numbers in AWK are floating-point,
converted as integers. Thus, given
has a string value of \fB"12"\fR and not \fB"12.00"\fR.
performs comparisons as follows:
If two variables are numeric, they are compared numerically.
If one value is numeric and the other has a string value that is a
``numeric string,'' then comparisons are also done numerically.
Otherwise, the numeric value is converted to a string and a string
comparison is performed.
Two strings are compared, of course, as strings.
According to the \*(PX standard, even if two strings are
numeric strings, a numeric comparison is performed. However, this is
clearly incorrect, and
Uninitialized variables have the numeric value 0 and the string value ""
(the null, or empty, string).
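.PP
The conversion rules above can be tried out with a short sketch such as the one below. The variable names are arbitrary, and the comments describe what a conforming AWK is expected to print rather than output captured from any particular version.
.nf
```shell
out=$(awk 'BEGIN {
    x = "3.0"
    print x + 0        # adding 0 forces a number: 3
    print x ""         # concatenating the null string forces a string: 3.0
    CONVFMT = "%.2f"
    a = 12
    b = a ""           # integral values are converted as integers: 12
    print b
    print (unset + 0) "|" unset "|"   # uninitialized: 0 and the null string
}')
echo "$out"
```
.fi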
.SH PATTERNS AND ACTIONS
AWK is a line-oriented language. The pattern comes first, and then the
action. Action statements are enclosed in
Either the pattern may be missing, or the action may be missing, but,
of course, not both. If the pattern is missing, the action will be
executed for every single line of input.
A missing action is equivalent to
which prints the entire line.
Comments begin with the ``#'' character, and continue until the
Blank lines may be used to separate statements.
Normally, a statement ends with a newline; however, this is not the
case for lines ending in
a ``,'', ``{'', ``?'', ``:'', ``&&'', or ``||''.
also have their statements automatically continued on the following line.
In other cases, a line can be continued by ending it with a ``\e'',
in which case the newline will be ignored.
Multiple statements may
be put on one line by separating them with a ``;''.
This applies to both the statements within the action part of a
pattern-action pair (the usual case),
and to the pattern-action statements themselves.
AWK patterns may be one of the following:
.BI / "regular expression" /
.I "relational expression"
.IB pattern " && " pattern
.IB pattern " || " pattern
.IB pattern " ? " pattern " : " pattern
.IB pattern1 ", " pattern2
are two special kinds of patterns which are not tested against
The action parts of all
patterns are merged as if all the statements had
been written in a single
block. They are executed before any
of the input is read. Similarly, all the
and executed when all the input is exhausted (or when an
statement is executed).
patterns cannot be combined with other patterns in pattern expressions.
patterns cannot have missing action parts.
.BI / "regular expression" /
patterns, the associated statement is executed for each input line that matches
the regular expression.
Regular expressions are the same as those in
and are summarized below.
.I "relational expression"
may use any of the operators defined below in the section on actions.
These generally test whether certain fields match certain regular expressions.
operators are logical AND, logical OR, and logical NOT, respectively, as in C.
They do short-circuit evaluation, also as in C, and are used for combining
more primitive pattern expressions. As in most languages, parentheses
may be used to change the order of evaluation.
operator is like the same operator in C. If the first pattern is true
then the pattern used for testing is the second pattern, otherwise it is
the third. Only one of the second and third patterns is evaluated.
.IB pattern1 ", " pattern2
form of an expression is called a
.IR "range pattern" .
It matches all input records starting with a line that matches
and continuing until a record that matches
inclusive. It does not combine with any other sort of pattern expression.
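.PP
A small sketch of a range pattern, using made-up input lines and the arbitrary marker words start and stop. Since no action is given, each record in the range is printed, including the two records that match the patterns.
.nf
```shell
# Prints the records from the first match of /start/ through the
# next match of /stop/, inclusive; records outside the range are skipped.
out=$( { echo before; echo start; echo inside; echo stop; echo after; } | awk '/start/, /stop/' )
echo "$out"
```
.fi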
.SS Regular Expressions
Regular expressions are the extended kind found in
They are composed of characters as follows:
.TP \w'\fB[^\fIabc...\fB]\fR'u+2n
matches the non-metacharacter
matches the literal character
matches any character except newline.
matches the beginning of a line or a string.
matches the end of a line or a string.
character class, matches any of the characters
negated character class, matches any character except
alternation: matches either
concatenation: matches
The escape sequences that are valid in string constants (see below)
are also legal in regular expressions.
Action statements are enclosed in braces,
Action statements consist of the usual assignment, conditional, and looping
statements found in most languages. The operators, control statements,
and input/output statements
available are patterned after those in C.
The operators in AWK, in order of increasing precedence, are
.TP "\w'\fB*= /= %= ^=\fR'u+1n"
Assignment. Both absolute assignment
.BI ( var " = " value )
and operator-assignment (the other forms) are supported.
The C conditional expression. This has the form
.IB expr1 " ? " expr2 " : " expr3\c
is true, the value of the expression is
Regular expression match, negated match.
Do not use a constant regular expression
on the left-hand side of a
Only use one on the right-hand side. The expression
has the same meaning as \fB(($0 ~ /foo/) ~ \fIexp\fB)\fR.
The regular relational operators.
String concatenation.
Addition and subtraction.
Multiplication, division, and modulus.
Unary plus, unary minus, and logical negation.
Exponentiation (\fB**\fR may also be used, and \fB**=\fR for
the assignment operator).
Increment and decrement, both prefix and postfix.
.SS Control Statements
The control statements are
\fBif (\fIcondition\fB) \fIstatement\fR [ \fBelse\fI statement \fR]
\fBwhile (\fIcondition\fB) \fIstatement \fR
\fBdo \fIstatement \fBwhile (\fIcondition\fB)\fR
\fBfor (\fIexpr1\fB; \fIexpr2\fB; \fIexpr3\fB) \fIstatement\fR
\fBfor (\fIvar \fBin\fI array\fB) \fIstatement\fR
\fBdelete \fIarray\^\fB[\^\fIindex\^\fB]\fR
\fBdelete \fIarray\^\fR
\fBexit\fR [ \fIexpression\fR ]
\fB{ \fIstatements \fB}\fR
.SS "I/O Statements"
The input/output statements are as follows:
.TP "\w'\fBprintf \fIfmt, expr-list\fR'u+1n"
.BI close( filename )
Close file (or pipe, see below).
from next input record; set
.BI "getline <" file
from next input record; set
.BI getline " var" " <" file
Stop processing the current input record. The next input record
is read and processing starts over with the first pattern in the
AWK program. If the end of the input data is reached, the
block(s), if any, are executed.
Stop processing the current input file. The next input record read
comes from the next input file.
is reset to 1, and processing starts over with the first pattern in the
AWK program. If the end of the input data is reached, the
block(s), if any, are executed.
Prints the current record.
.BI print " expr-list"
Each expression is separated by the value of the
variable. The output record is terminated with the value of the
.BI print " expr-list" " >" file
Prints expressions on
Each expression is separated by the value of the
variable. The output record is terminated with the value of the
.BI printf " fmt, expr-list"
.BI printf " fmt, expr-list" " >" file
.BI system( cmd-line )
and return the exit status.
(This may not be available on non-\*(PX systems.)
Other input/output redirections are also allowed. For
appends output to the
In a similar fashion,
.IB command " | getline"
command will return 0 on end of file, and \-1 on an error.
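.PP
Reading the output of a command in a loop might look like the sketch below. The command string and the variable names cmd, line and n are assumptions for the example. The loop test relies on the return values described above, so it stops on end of file and also on an error.
.nf
```shell
out=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    # The loop runs while a record was read; it stops on 0 (end of file)
    # and also on -1 (error).
    while ((cmd | getline line) > 0)
        n++
    close(cmd)
    print n, line      # line still holds the last record read
}')
echo "$out"
```
.fi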
.SS The \fIprintf\fP\^ Statement
The AWK versions of the
accept the following conversion specification formats:
An \s-1ASCII\s+1 character.
If the argument used for
is numeric, it is treated as a character and printed.
Otherwise, the argument is assumed to be a string, and only the first
character of that string is printed.
A decimal number (the integer part).
A floating point number of the form
.BR [\-]d.ddddddE[+\^\-]dd .
A floating point number of the form
.BR [\-]ddd.dddddd .
conversion, whichever is shorter, with nonsignificant zeros suppressed.
An unsigned octal number (again, an integer).
An unsigned hexadecimal number (an integer).
character; no argument is converted.
There are optional, additional parameters that may lie between the
and the control letter:
The expression should be left-justified within its field.
The field should be padded to this width. If the number has a leading
zero, then the field will be padded with zeros.
Otherwise it is padded with blanks.
This applies even to the non-numeric output formats.
A number indicating the maximum width of strings or digits to the right
of the decimal point.
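.PP
The optional parameters can be combined as in the sketch below, which uses sprintf() so that the formatted result can be printed as a single line. The sample values are arbitrary.
.nf
```shell
# %-5s  left-justifies the string in a field of width 5
# %05d  pads the integer to width 5 with zeros
# %.2f  keeps two digits to the right of the decimal point
out=$(awk 'BEGIN { print sprintf("%-5s|%05d|%.2f", "ab", 42, 3.14159) }')
echo "$out"
```
.fi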
capabilities of the \*(AN C
routines are supported.
in place of either the
specifications will cause their values to be taken from
the argument list to
.SS Special File Names
When doing I/O redirection from either
recognizes certain special filenames internally. These filenames
allow access to open file descriptors inherited from
parent process (usually the shell).
Other special filenames provide access to information about the running
.TP \w'\fB/dev/stdout\fR'u+1n
Reading this file returns the process ID of the current process,
in decimal, terminated with a newline.
Reading this file returns the parent process ID of the current process,
in decimal, terminated with a newline.
Reading this file returns the process group ID of the current process,
in decimal, terminated with a newline.
Reading this file returns a single record terminated with a newline.
The fields are separated with blanks.
If there are any additional fields, they are the group IDs returned by
Multiple groups may not be supported on all systems.
The standard output.
The standard error output.
The file associated with the open file descriptor
These are particularly useful for error messages. For example:
print "You blew it!" > "/dev/stderr"
whereas you would otherwise have to use
print "You blew it!" | "cat 1>&2"
These file names may also be used on the command line to name data files.
.SS Numeric Functions
AWK has the following pre-defined arithmetic functions:
.TP \w'\fBsrand(\^\fIexpr\^\fB)\fR'u+1n
.BI atan2( y , " x" )
returns the arctangent of
returns the cosine in radians.
the exponential function.
truncates to integer.
the natural logarithm function.
returns a random number between 0 and 1.
returns the sine in radians.
the square root function.
as a new seed for the random number generator. If no
is provided, the time of day will be used.
The return value is the previous seed for the random
.SS String Functions
AWK has the following pre-defined string functions:
.TP "\w'\fBsprintf(\^\fIfmt\fB\^, \fIexpr-list\^\fB)\fR'u+1n"
\fBgsub(\fIr\fB, \fIs\fB, \fIt\fB)\fR
for each substring matching the regular expression
substitute the string
and return the number of substitutions.
is not supplied, use
.BI index( s , " t" )
returns the index of the string
returns the length of the string
.BI match( s , " r" )
returns the position in
where the regular expression
is not present, and sets the values of
\fBsplit(\fIs\fB, \fIa\fB, \fIr\fB)\fR
on the regular expression
and returns the number of fields. If
.BI sprintf( fmt , " expr-list" )
and returns the resulting string.
\fBsub(\fIr\fB, \fIs\fB, \fIt\fB)\fR
but only the first matching substring is replaced.
\fBsubstr(\fIs\fB, \fIi\fB, \fIn\fB)\fR
is omitted, the rest of
returns a copy of the string
with all the upper-case characters in
translated to their corresponding lower-case counterparts.
Non-alphabetic characters are left unchanged.
returns a copy of the string
with all the lower-case characters in
translated to their corresponding upper-case counterparts.
Non-alphabetic characters are left unchanged.
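.PP
Several of the string functions can be exercised together, as in this sketch. The sample string is arbitrary; the comments state what each call is expected to return for it.
.nf
```shell
out=$(awk 'BEGIN {
    s = "hello world"
    n = gsub(/o/, "0", s)          # replaces both occurrences, returns 2
    print n, s
    # index of "w", length of s, first five characters, upper-cased tail
    print index(s, "w"), length(s), substr(s, 1, 5), toupper(substr(s, 7))
}')
echo "$out"
```
.fi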
Since one of the primary uses of AWK programs is processing log files
that contain time stamp information,
provides the following two functions for obtaining time stamps and
.TP "\w'\fBsystime()\fR'u+1n"
returns the current time of day as the number of seconds since the Epoch
(Midnight UTC, January 1, 1970 on \*(PX systems).
\fBstrftime(\fIformat\fB, \fItimestamp\fB)\fR
according to the specification in
should be of the same form as returned by
is missing, the current time of day is used.
See the specification for the
function in \*(AN C for the format conversions that are
guaranteed to be available.
A public-domain version of
and a man page for it are shipped with
if that version was used to build
then all of the conversions described in that man page are available to
.SS String Constants
String constants in AWK are sequences of characters enclosed
between double quotes (\fB"\fR). Within strings, certain
.I "escape sequences"
are recognized, as in C. These are:
.TP \w'\fB\e\^\fIddd\fR'u+1n
A literal backslash.
The ``alert'' character; usually the \s-1ASCII\s+1 \s-1BEL\s+1 character.
.BI \ex "\^hex digits"
The character represented by the string of hexadecimal digits following
As in \*(AN C, all following hexadecimal digits are considered part of
the escape sequence.
(This feature should tell us something about language design by committee.)
E.g., \fB"\ex1B"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character.
The character represented by the 1-, 2-, or 3-digit sequence of octal
digits. E.g., \fB"\e033"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character.
The literal character
The escape sequences may also be used inside constant regular expressions
.B "/[\ \et\ef\en\er\ev]/"
matches whitespace characters).
Functions in AWK are defined as follows:
\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements \fB}\fR
Functions are executed when called from within the action parts of regular
pattern-action statements. Actual parameters supplied in the function
call are used to instantiate the formal parameters declared in the function.
Arrays are passed by reference; other variables are passed by value.
Since functions were not originally part of the AWK language, the provision
for local variables is rather clumsy: They are declared as extra parameters
in the parameter list. The convention is to separate local variables from
real parameters by extra spaces in the parameter list. For example:
function f(p, q,     a, b) { # a & b are local
/abc/ { ... ; f(1, 2) ; ... }
The left parenthesis in a function call is required
to immediately follow the function name,
without any intervening white space.
This is to avoid a syntactic ambiguity with the concatenation operator.
This restriction does not apply to the built-in functions listed above.
Functions may call each other and may be recursive.
Function parameters used as local variables are initialized
to the null string and the number zero upon function invocation.
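.PP
A complete, if contrived, sketch of a function that uses extra parameters as local variables follows. The function name total and all variable names are hypothetical. The array is passed by reference, and the locals i and sum start out empty on each call and do not leak into the global namespace.
.nf
```shell
out=$(awk 'function total(arr, n,    i, sum) {   # i and sum are local
    for (i = 1; i <= n; i++)
        sum += arr[i]
    return sum
}
BEGIN {
    n = split("1 2 3 4", v, " ")
    # The global i was never set, because the function used its own copy.
    print total(v, n), (i == "" ? "unset" : "leaked")
}')
echo "$out"
```
.fi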
may be used in place of
Print and sort the login names of all users:
BEGIN { FS = ":" }
{ print $1 | "sort" }
Count lines in a file:
{ nlines++ }
END { print nlines }
Precede each line by its number in the file:
Concatenate and line number (a variation on a theme):
.IR "The AWK Programming Language" ,
Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger,
Addison-Wesley, 1988. ISBN 0-201-07981-X.
.IR "The GAWK Manual" ,
Edition 0.15, published by the Free Software Foundation, 1993.
.SH POSIX COMPATIBILITY
is compatibility with the \*(PX standard, as well as with the
latest version of \*(UX
incorporates the following user visible
features which are not described in the AWK book,
in System V Release 4, and are in the \*(PX standard.
option for assigning variables before program execution starts is new.
The book indicates that command line variable assignment happens when
would otherwise open the argument as a file, which is after the
block is executed. However, in earlier implementations, when such an
assignment appeared before any file names, the assignment would happen
block was run. Applications came to depend on this ``feature.''
was changed to match its documentation, this option was added to
accommodate applications that depended upon the old behavior.
(This feature was agreed upon by both the AT&T and GNU developers.)
option for implementation specific features is from the \*(PX standard.
When processing arguments,
uses the special option ``\fB\-\^\-\fP'' to signal the end of
In compatibility mode, it will warn about, but otherwise ignore,
In normal operation, such arguments are passed on to the AWK program for
The AWK book does not define the return value of
The System V Release 4 version of \*(UX
(and the \*(PX standard)
has it return the seed it was using, to allow keeping track
of random number sequences. Therefore
also returns its current seed.
Other new features are:
escape sequences (done originally in
and fed back into AT&T's); the
built-in functions (from AT&T); and the \*(AN C conversion specifications in
(done first in AT&T's version).
has some extensions to \*(PX
They are described in this section. All the extensions described here
The following features of
are not available in
The special file names available for I/O redirection are not recognized.
variables are not special.
variable and its side-effects are not available.
variable and fixed width field splitting.
No path search is performed for files named via the
option. Therefore the
environment variable is not special.
to abandon processing of the current input file.
to delete the entire contents of an array.
The AWK book does not define the return value of the
returns the value from
when closing a file or pipe, respectively.
option is ``t'', then
will be set to the tab character.
Since this is a rather ugly special case, it is not the default behavior.
This behavior also does not occur if
was compiled for debugging, it will
accept the following additional options:
debugging output during program parsing.
This option should only be of interest to the
maintainers, and may not even be compiled into
.SH HISTORICAL FEATURES
There are two features of historical AWK implementations that
First, it is possible to call the
built-in function not only with no argument, but even without parentheses!
is the same as either of
This feature is marked as ``deprecated'' in the \*(PX standard, and
will issue a warning about its use if
is specified on the command line.
The other feature is the use of either the
statements outside the body of a
loop. Traditional AWK implementations have treated such usage as
will support this usage if
.SH ENVIRONMENT VARIABLES
exists in the environment, then
behaves exactly as if
had been specified on the command line.
will issue a warning message to this effect.
option is not necessary given the command line variable assignment feature;
it remains only for backwards compatibility.
If your system actually has support for
files, you may get different output from
than you would get on a system without those files. When
interprets these files internally, it synchronizes output to the standard
output with output to
while on a system with those files, the output is actually to different
.SH VERSION INFORMATION
This man page documents
Starting with the 2.15 version of
options of the 2.11 version are no longer recognized.
This fact will not even be documented in the manual page for the next
The original version of \*(UX
was designed and implemented by Alfred Aho,
Peter Weinberger, and Brian Kernighan of AT&T Bell Labs. Brian Kernighan
continues to maintain and enhance it.
Paul Rubin and Jay Fenlason,
of the Free Software Foundation, wrote
to be compatible with the original version of
distributed in Seventh Edition \*(UX.
John Woods contributed a number of bug fixes.
David Trueman, with contributions
from Arnold Robbins, made
compatible with the new version of \*(UX
Arnold Robbins is the current maintainer.
The initial DOS port was done by Conrad Kwok and Scott Garfinkle.
Scott Deifik is the current DOS maintainer. Pat Rankin did the
port to VMS, and Michal Jaegermann did the port to the Atari ST.
The port to OS/2 was done by Kai Uwe Rommel, with contributions and
help from Darrel Hankerson.
If you find a bug in
please send electronic mail to
.BR bug-gnu-utils@prep.ai.mit.edu ,
.BR arnold@gnu.ai.mit.edu .
Please include your operating system and its revision, the version of
what C compiler you used to compile it, and a test program
and data that are as small as possible for reproducing the problem.
Before sending a bug report, please do two things. First, verify that
you have the latest version of
Many bugs (usually subtle ones) are fixed at each release, and if
yours is out of date, the problem may already have been solved.
Second, please read this man page and the reference manual carefully to
be sure that what you think is a bug really is, instead of just a quirk
.SH ACKNOWLEDGEMENTS
Brian Kernighan of Bell Labs
provided valuable assistance during testing and debugging.