awk - pattern scanning and processing language


     /usr/bin/awk [-f progfile] [-Fc] [ 'prog' ] [parameters]

     /usr/xpg4/bin/awk  [-F ERE]   [-v assignment...]   'program'
     -f progfile... [argument...]


     The /usr/xpg4/bin/awk utility is described  on  the  nawk(1)
     manual page.

     The /usr/bin/awk utility scans each input file for lines
     that match any of a set of patterns specified in prog. The
     prog string must be enclosed in single quotes (') to protect
     it from the shell. For each pattern in prog there may be an
     associated action, which is performed when a line of an
     input file matches the pattern. The set of pattern-action
     statements may appear literally as prog or in a file
     specified with the -f progfile option. Input files are read
     in order; if there are no files, the standard input is read.
     The file name '-' means the standard input.


     The following options are supported:

     -f progfile
           awk uses the set of patterns it reads from progfile.

     -Fc   Uses the character c as the field separator (FS).
           See the discussion of FS below.


  Input Lines
     Each input line is matched against the  pattern  portion  of
     every  pattern-action  statement;  the  associated action is
     performed for each matched pattern. Any filename of the form
     var=value  is  treated as an assignment, not a filename, and
     is executed at the time it would have been opened if it were
     a filename. Variables assigned in this manner are not avail-
     able inside a BEGIN rule, and are assigned after  previously
     specified files have been read.
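
     A minimal sketch of the assignment rule above, assuming a POSIX
     shell and whatever awk is first on the PATH (not necessarily the
     /usr/bin/awk described here). The variable name tag and the two
     sample input lines are made up for illustration. The BEGIN rule
     should see tag as empty, while the main rule sees the assigned
     value, because the operand is processed only when awk reaches it
     in the argument list:

```shell
# tag=hello is an operand, not a file name; "-" then names the standard input.
out=$(printf 'a\nb\n' |
      awk 'BEGIN { print "begin:[" tag "]" }
                 { print tag ":" $0 }' tag=hello -)
printf '%s\n' "$out"
```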

     An input line is normally made up of fields separated by
     white space. (This default can be changed by using the FS
     built-in variable or the -Fc option.) The default is to
     ignore leading blanks and to separate fields by blanks
     and/or tab characters. However, if FS is assigned a value
     that does not include any white-space character, then
     leading blanks are not ignored. The fields are denoted $1,
     $2, ...; $0 refers to the entire line.
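
     The difference between the default separator and an explicit one
     can be sketched as follows. The sample lines are invented, and
     the commands assume a POSIX shell with some awk on the PATH; they
     were not checked against this particular implementation.

```shell
# Default separator: leading blanks are ignored, so $1 is "alpha".
default_out=$(printf '  alpha beta\n' | awk '{ print NF ":[" $1 "]" }')

# With -F: the leading blanks are kept as part of the first field.
colon_out=$(printf '  alpha:beta\n' | awk -F: '{ print NF ":[" $1 "]" }')

printf '%s\n%s\n' "$default_out" "$colon_out"
```

     Both commands report two fields; only the content of $1 differs.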

  Pattern-action Statements
     A pattern-action statement has the form:

     pattern { action }

     Either pattern or action may be  omitted.  If  there  is  no
     action,  the  matching  line is printed. If there is no pat-
     tern, the action is performed on every input line.  Pattern-
     action statements are separated by newlines or semicolons.
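
     As a small illustration of omitting either half of a statement
     (sample input invented, any POSIX shell and awk assumed):

```shell
# Pattern only: every line matching /t/ is printed unchanged.
pattern_only=$(printf 'one\ntwo\nthree\n' | awk '/t/')

# Action only: the action runs for every input line.
action_only=$(printf 'one\ntwo\n' | awk '{ print NR, $0 }')

printf '%s\n--\n%s\n' "$pattern_only" "$action_only"
```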

     Patterns are arbitrary Boolean combinations ( !, ||, &&, and
     parentheses)  of  relational expressions and regular expres-
     sions. A relational expression is one of the following:

     expression relop expression
     expression matchop regular_expression

     where a relop is any of the six relational operators  in  C,
     and  a  matchop  is either ~ (contains) or !~ (does not con-
     tain). An expression is an arithmetic  expression,  a  rela-
     tional expression, the special expression

     var in array

     or a Boolean combination of these.

     Regular expressions are as in  egrep(1).  In  patterns  they
     must  be surrounded by slashes. Isolated regular expressions
     in a pattern apply to the entire line.  Regular  expressions
     may also occur in relational expressions. A pattern may
     consist of two patterns separated by a comma; in this case,
     the action is performed for all lines from an occurrence of
     the first pattern through the next occurrence of the second
     pattern, inclusive.
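
     The following sketch combines a match expression with a
     relational expression, and then uses a two-pattern range. The
     data set is hypothetical and exists only to exercise the
     patterns; a POSIX shell and some awk on the PATH are assumed.

```shell
data='apple 5
avocado 12
banana 30
start
x
stop
y'

# Boolean combination: first field begins with "a" AND second field > 10.
combo=$(printf '%s\n' "$data" | awk '$1 ~ /^a/ && $2 > 10')

# Range pattern: no action given, so the selected lines are printed.
range=$(printf '%s\n' "$data" | awk '/start/, /stop/')

printf '%s\n--\n%s\n' "$combo" "$range"
```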

     The special patterns BEGIN and END may be  used  to  capture
     control  before the first input line has been read and after
     the last input line has been read respectively.  These  key-
     words do not combine with any other patterns.
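
     A short sketch of BEGIN and END wrapped around an ordinary rule;
     the variable name total and the input numbers are illustrative
     only:

```shell
out=$(printf '3\n4\n' |
      awk 'BEGIN { print "start" }          # runs before any input is read
                 { total += $1 }            # runs once per input line
           END   { print "total", total }') # runs after the last line
printf '%s\n' "$out"
```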

  Built-in Variables
     Built-in variables include:

     FILENAME
           name of the current input file

     FS    input  field  separator  regular  expression  (default
           blank and tab)

     NF    number of fields in the current record

     NR    ordinal number of the current record

     OFMT  output format for numbers (default %.6g)

     OFS   output field separator (default blank)

     ORS   output record separator (default new-line)

     RS    input record separator (default new-line)
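
     As an informal example of several of these variables working
     together, the sketch below sets OFS in a BEGIN rule and prints
     NR, NF, and the first field of each line. The input is made up.

```shell
# OFS replaces the default blank between the comma-separated print arguments.
out=$(printf 'a b c\nd e\n' |
      awk 'BEGIN { OFS = "-" } { print NR, NF, $1 }')
printf '%s\n' "$out"
```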

     An action is a sequence of statements. A  statement  may  be
     one of the following:

     if ( expression ) statement [ else statement ]
     while ( expression ) statement
     do statement while ( expression )
     for ( expression ; expression ; expression ) statement
     for ( var in array ) statement
     { [ statement ] ... }
     expression      # commonly variable = expression
     print [ expression-list ] [ >expression ]
     printf format [ ,expression-list ] [ >expression ]
     next            # skip remaining patterns on this input line
     exit [expr]     # skip the rest of the input; exit status is expr

     Statements are terminated by semicolons, newlines, or  right
     braces.  An empty expression-list stands for the whole input
     line. Expressions  take  on  string  or  numeric  values  as
     appropriate,  and  are built using the operators +, -, *, /,
     %, ^ and concatenation (indicated by a blank). The operators
     ++, --, +=, -=, *=, /=, %=, ^=, >, >=, <, <=, ==, !=, and ?:
     are also available in expressions. Variables may be scalars,
     array elements (denoted x[i]), or fields. Variables are ini-
     tialized to the null string or zero. Array subscripts may be
     any  string, not necessarily numeric; this allows for a form
     of associative memory. String  constants  are  quoted  (""),
     with the usual C escapes recognized within.
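
     The associative-array behavior described above might be
     exercised like this. The array name count and the sample words
     are hypothetical. Because this page does not specify the order
     in which for (var in array) visits subscripts, the sketch only
     counts them and then looks up two known keys directly.

```shell
out=$(printf 'red\nblue\nred\nred\n' | awk '
    { count[$1]++ }            # the subscript is the string held in $1
END { n = 0
      for (k in count) n++     # visit every subscript, in no assumed order
      print n, count["red"], count["blue"] }')
printf '%s\n' "$out"
```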

     The print statement prints its arguments on the standard
     output, or on a file if >expression is present, or on a pipe
     if '|cmd' is present. The output of the print statement is
     terminated by the output record separator, with each
     argument separated by the current output field separator.
     The printf statement formats its expression list according
     to the format (see printf(3C)).
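
     A brief sketch of print writing to a pipe and of printf
     formatting, assuming the sort utility is available and using
     invented input:

```shell
# Each line is written to a pipe feeding sort, which emits the sorted
# lines when awk finishes and closes the pipe.
piped=$(printf 'pear\napple\nfig\n' | awk '{ print $1 | "sort" }')

# printf adds no separators or newline on its own; the format supplies them.
formatted=$(printf 'pi 3.14159\n' | awk '{ printf "%s=%.2f\n", $1, $2 }')

printf '%s\n--\n%s\n' "$piped" "$formatted"
```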

  Built-in Functions
     The arithmetic functions are as follows:

     cos(x)
           Return cosine of x, where x is in radians. (In
           /usr/xpg4/bin/awk only. See nawk(1).)

     sin(x)
           Return sine of x, where x is in radians. (In
           /usr/xpg4/bin/awk only. See nawk(1).)

     exp(x)
           Return the exponential function of x.

     log(x)
           Return the natural logarithm of x.

     sqrt(x)
           Return the square root of x.

     int(x)
           Truncate its argument to an integer. It will be
           truncated toward 0 when x > 0.

     The string functions are as follows:

     index(s, t)
           Return the position in string s where string  t  first
           occurs, or 0 if it does not occur at all.

     int(s)
           Truncate s to an integer value. If s is not specified,
           $0 is used.

     length(s)
           Return the length of its argument taken as a string,
           or of the whole line if there is no argument.

     split(s, a, fs)
           Split the string s into array elements a[1], a[2], ...
           a[n], and return n. The separation is done with the
           regular expression fs or with the field separator FS
           if fs is not given.

     sprintf(fmt, expr, expr,...)
           Format the expressions according to the printf(3C)
           format given by fmt and return the resulting string.

     substr(s, m, n)
           Return the n-character substring of s that begins at
           position m.
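
     The string functions can be tried out from a BEGIN rule. This is
     only a sketch: the variable names s, n, and parts are arbitrary,
     and standard input is redirected from /dev/null in case the awk
     in use tries to read input even when only BEGIN is present.

```shell
out=$(awk 'BEGIN {
    s = "hello world"
    n = split("a:b:c", parts, ":")      # parts[1]="a", parts[2]="b", ...
    print index(s, "wor"), length(s), substr(s, 1, 5), n, parts[2]
    print sprintf("%03d", 7)            # zero-padded to three digits
}' </dev/null)
printf '%s\n' "$out"
```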

     The input/output function is as follows:

     getline
           Set $0 to the next input record from the current input
           file.  getline  returns  1 for successful input, 0 for
           end of file, and -1 for an error.
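
     One possible use of getline is to consume input two lines at a
     time. In this sketch the input is assumed to alternate between a
     key line and a value line; that layout and the variable name key
     are invented for the example.

```shell
out=$(printf 'k1\nv1\nk2\nv2\n' | awk '{
    key = $0                    # remember the current line
    if (getline > 0)            # 1 means $0 now holds the next line
        print key "=" $0
}')
printf '%s\n' "$out"
```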

  Large File Behavior
     See largefile(5) for the description of the behavior of  awk
     when encountering files greater than or equal to 2 Gbyte
     (2^31 bytes).


     Example 1: Printing lines longer than 72 characters

     example% length > 72

     Example 2: Printing first two fields in opposite order

     example% { print $2, $1 }

     Example 3: Same, with input fields separated by comma and/or
     blanks and tabs

     example% BEGIN { FS = ",[ \t]*|[ \t]+" }
           { print $2, $1 }

     Example 4: Adding up first column, print sum and average

     example% { s += $1 }
     END  { print "sum is", s, " average is", s/NR }

     Example 5: Printing fields in reverse order

     example% { for (i = NF; i > 0; --i) print $i }

     Example 6: Printing all lines between start/stop pairs

     example% /start/, /stop/

     Example 7: Printing all lines whose first field is different
     from the previous one

     example% $1 != prev { print; prev = $1 }

     Example 8: Printing a file, filling in page numbers starting
     at 5

     example% /Page/     { $2 = n++; }
                  { print }

     Example 9: Printing a file and numbering its pages, starting
     at 5

     Assuming this program is in a file named prog, the following
     command  line  prints  the  file  input  numbering its pages
     starting at 5:

     example% awk -f prog n=5 input
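
     Examples 8 and 9 can be reproduced end to end with a scratch
     directory. This is a sketch under a few assumptions: mktemp -d
     is available, and the three-line input file is invented; only
     the program text and the n=5 operand come from the examples
     above.

```shell
# Hypothetical scratch directory holding the program file and sample input.
dir=$(mktemp -d)
printf '/Page/ { $2 = n++ }\n{ print }\n' > "$dir/prog"
printf 'Page x\nsome text\nPage x\n' > "$dir/input"

# n=5 is assigned before the input file is opened, so numbering starts at 5.
out=$(awk -f "$dir/prog" n=5 "$dir/input")
rm -rf "$dir"
printf '%s\n' "$out"
```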


     See environ(5) for descriptions of the following environment
     variables  that  affect  the execution of awk: LANG, LC_ALL,

     LC_NUMERIC
           Determine the radix character used when interpreting
           numeric input, performing conversions between numeric
           and string values and formatting numeric output.
           Regardless of locale, the period character (the
           decimal-point character of the POSIX locale) is the
           decimal-point character recognized in processing awk
           programs (including assignments in command-line
           arguments).


     See attributes(5) for descriptions of the following
     attributes:

  /usr/bin/awk
    |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
    | Availability                | SUNWesu                     |
    | CSI                         | Enabled                     |

  /usr/xpg4/bin/awk
    |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
    | Availability                | SUNWxcu4                    |
    | CSI                         | Enabled                     |
    | Interface Stability         | Standard                    |


     egrep(1),  grep(1),  nawk(1),  sed(1),  printf(3C),   attri-
     butes(5), environ(5), largefile(5), standards(5)


     Input white space is not preserved on output if fields are
     involved.

     There  are  no  explicit  conversions  between  numbers  and
     strings.  To  force an expression to be treated as a number,
     add 0 to it. To force an  expression  to  be  treated  as  a
     string, concatenate the null string ("") to it.
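
     The two forcing idioms can be compared side by side. In this
     sketch the variable names and the printed tags are arbitrary,
     and standard input is redirected from /dev/null in case the awk
     in use reads input even for a BEGIN-only program.

```shell
out=$(awk 'BEGIN {
    a = "10"; b = "9"                          # string values
    if (a < b)           print "string-lt"     # "10" sorts before "9"
    if (a + 0 > b + 0)   print "number-gt"     # adding 0 forces numbers
    n = 10; m = 9                              # numeric values
    if (n > m)           print "number-gt"
    if ((n "") < (m "")) print "string-lt"     # "" forces strings
}' </dev/null)
printf '%s\n' "$out"
```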
