Awk Tips

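- The simplest general form of an Awk program is a search pattern followed by an action:

   Syntax: awk <search pattern> {<program actions>}

  Print every line that contains "gold" (with no action given, Awk prints the whole line):

   awk '/gold/' coins.txt

  Print only fields 5 through 8 of the lines that contain "gold":

   awk '/gold/ {print $5,$6,$7,$8}' coins.txt

  Print field 3, a gap, and fields 5 through 8 of every line whose third field is less than 1980:

   awk '{if ($3 < 1980) print $3, "    ",$5,$6,$7,$8}' coins.txt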

The next example prints out how many coins are in the collection. NR holds the number of lines read, so in the END block it is the total line count:

   awk 'END {print NR,"coins"}' coins.txt

Suppose the current price of gold is $425, and I want to figure out the approximate total value of the gold pieces in the coin collection. I invoke Awk as follows:

   awk '/gold/ {ounces += $2} END {print "value = $" 425*ounces}' coins.txt
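The two one-liners above can be tried without the real "coins.txt". The following is a minimal sketch, assuming a file layout of metal, weight in ounces, year, country, and description; the sample records and the helper names sample_coins, count_coins, and gold_value are made up for illustration.

```shell
# Hypothetical sample data: metal, weight (oz), year, country, description.
sample_coins() {
  cat <<'EOF'
gold     1    1986  USA      American Eagle
gold     0.5  1908  Austria  Franz Josef 100 Korona
silver  10    1981  USA      ingot
gold     0.25 1979  RSA      Krugerrand
EOF
}

# Count every record; Awk reads standard input when no file is named.
count_coins() { awk 'END {print NR, "coins"}' "$@"; }

# Total the weight of the gold pieces and price them at $425 an ounce.
gold_value() { awk '/gold/ {ounces += $2} END {print "value = $" 425*ounces}' "$@"; }

sample_coins | count_coins
sample_coins | gold_value
```

With this sample the gold weights add up to 1.75 ounces, so the second command reports 425 times that amount.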

AWK PROGRAM EXAMPLE

A longer Awk program can be stored in a file and run with the -f option:

   awk -f <awk program file name> <data file>

Given the ability to write an Awk program in this way, what should a "master" analysis program for "coins.txt" do? Here's one possible output:
   Summary data for coin collection:

     Gold pieces:                   nn
     Weight of gold pieces:         nn.nn
     Value of gold pieces:       n,nnn.nn

     Silver pieces:                 nn
     Weight of silver pieces:       nn.nn
     Value of silver pieces:     n,nnn.nn

     Total number of pieces:        nn
     Value of collection:        n,nnn.nn

The following Awk program generates this information:
   # This is an awk program that summarizes a coin collection.
   #
   /gold/    { num_gold++; wt_gold += $2 }      # Get weight of gold.
   /silver/  { num_silver++; wt_silver += $2 }  # Get weight of silver.
   END { val_gold = 485 * wt_gold;              # Compute value of gold.
         val_silver = 16 * wt_silver;           # Compute value of silver.
         total = val_gold + val_silver;
         print "Summary data for coin collection:";  # Print results.
         printf ("\n");
         printf ("   Gold pieces:                   %2d\n", num_gold);
         printf ("   Weight of gold pieces:         %5.2f\n", wt_gold);
         printf ("   Value of gold pieces:        %7.2f\n",val_gold);
         printf ("\n");
         printf ("   Silver pieces:                 %2d\n", num_silver);
         printf ("   Weight of silver pieces:       %5.2f\n", wt_silver);
         printf ("   Value of silver pieces:      %7.2f\n",val_silver);
         printf ("\n");
         printf ("   Total number of pieces:        %2d\n", NR);
         printf ("   Value of collection:         %7.2f\n", total); }
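This program has a few interesting features:

   * Comments start with "#" and run to the end of the line.
   * Variables such as "num_gold" and "wt_gold" are not declared; Awk creates them on first use with a value of zero.
   * Several statements can share one action block when they are separated by semicolons.
   * "printf" formats output much as it does in C, using format codes such as "%2d" and "%7.2f".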
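To try a program file with the -f option, one approach is to write both the program and some test data into a scratch directory. This is only a sketch: the file names "summary.awk" and "coins.txt", the shortened program, and the three sample records are all assumptions made for illustration.

```shell
# Scratch directory so the sketch does not touch the working directory.
dir=$(mktemp -d)

# A shortened, hypothetical version of the summary program.
cat > "$dir/summary.awk" <<'EOF'
# Tally count and weight per metal, then report in the END block.
/gold/   { num_gold++; wt_gold += $2 }
/silver/ { num_silver++; wt_silver += $2 }
END {
    printf("Gold pieces:   %2d  (%5.2f oz)\n", num_gold, wt_gold)
    printf("Silver pieces: %2d  (%5.2f oz)\n", num_silver, wt_silver)
    printf("Total pieces:  %2d\n", NR)
}
EOF

# Hypothetical sample records: metal, weight, year, country, description.
printf 'gold 1 1986 USA Eagle\nsilver 10 1981 USA ingot\ngold 0.5 1908 Austria Korona\n' > "$dir/coins.txt"

# Run the program file against the data file.
awk -f "$dir/summary.awk" "$dir/coins.txt"
```

Keeping the program in its own file avoids shell quoting problems once it grows beyond a line or two.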

 

Awk Syntax

* Awk is invoked as follows:

   awk [ -F<ch> ] {pgm} | { -f <pgm_file> } [ <vars> ] [ - | <data_file> ]
-- where:

   ch:          Field-separator character.
   pgm:         Awk command-line program, normally enclosed in single quotes.
   pgm_file:    File containing an Awk program.
   vars:        Awk variable initializations, in the form <name>=<value>.
   data_file:   Input data file; "-" means standard input.
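The options can be combined in a single call. The sketch below assumes colon-separated "name:amount" records; the function name over_limit, the variable name limit, and the sample records are hypothetical.

```shell
# -F: sets the field separator, limit=<value> initializes an Awk variable,
# and "-" tells Awk to read standard input.
over_limit() {
  awk -F: '$2 > limit + 0 {print $1}' limit="$1" -
}

# Print the names whose amount is greater than 10.
printf 'Kimmie:12\nMichael:30\nHugh:7\n' | over_limit 10
```

Adding 0 to "limit" forces a numeric comparison, which guards against the value being compared as a string.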
An Awk program has the general form:
   BEGIN              {<initializations>}
   <search pattern 1> {<program actions>}
   <search pattern 2> {<program actions>}
   ...
   END                {<final actions>}
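A small sketch of the general form, showing when each part runs. The function name count_matches and the three input lines are made up for illustration.

```shell
count_matches() {
  awk '
    BEGIN    { print "Scanning..." }   # runs once, before any input is read
    /gold/   { gold++ }                # runs for each line that matches
    /silver/ { silver++ }
    END      { print "gold:", gold + 0, "silver:", silver + 0 }   # runs once, after the last line
  ' "$@"
}

printf 'gold 1\nsilver 10\ngold 0.5\n' | count_matches
```

Adding 0 in the END block makes a counter that was never incremented print as 0 instead of an empty string.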

Search Patterns

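The simplest kind of search pattern is a string enclosed in forward slashes ("/"). The string is treated as a regular expression, so it can also contain special characters:

   /The/                          - matches any line containing "The"
   /^The/                         - matches lines that begin with "The"
   /The$/                         - matches lines that end with "The"
   /\$/                           - matches a literal "$" (the backslash removes its special meaning)
   /[Tt]he/                       - matches "The" or "the"
   /(^Germany)|(^Netherlands)/    - matches lines that begin with "Germany" OR "Netherlands"
   /wh./                          - "." is a wildcard that matches any single character
   $1 ~ /^France$/                - matches lines whose first field is exactly "France"

A search pattern can also be a comparison that uses built-in variables. NR is a count of the lines read by Awk so far, and NF is the number of fields in the current line. For example:

   NF == 0

-- matches all blank lines, or any line whose number of fields is zero.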
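Several of these patterns can be checked against a few lines of text. This is a sketch only: the helper names sample_text and match_lines and the sample lines themselves are hypothetical.

```shell
# Five hypothetical lines, one of them blank.
sample_text() {
  printf 'The end\nOver the hill\n\nFrance 1\nFrance-Nord 2\n'
}

# Run one Awk program, passed as the first argument, over the sample text.
match_lines() { sample_text | awk "$1"; }

match_lines '/^The/'               # lines that begin with "The"
match_lines '/[Tt]he/'             # lines containing "The" or "the"
match_lines '$1 ~ /^France$/'      # first field is exactly "France"
match_lines 'NF == 0 {blank++} END {print blank + 0, "blank lines"}'
```

The third pattern skips "France-Nord" because the "^" and "$" anchors require the whole field to match.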
 
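Variables:

Awk variables do not have to be declared; a variable that has not been set acts as zero or as an empty string. A single "=" assigns a value, while "==" tests for equality:

   var = 0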

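Awk's built-in variables include the field variables -- $1, $2, $3, and so on ($0 is the entire line) -- that give the text or values in the individual text fields in a line, and a number of variables with specific functions:

   NR:          Number of records (lines) read so far.
   NF:          Number of fields in the current record.
   FILENAME:    Name of the current input file.
   FS:          Input field separator (whitespace by default).
   RS:          Input record separator (newline by default).
   OFS:         Output field separator (a single space by default).
   ORS:         Output record separator (newline by default).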
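A few of the built-in variables in action, as a minimal sketch. The function name show_fields and the colon-separated sample input are assumptions for illustration.

```shell
show_fields() {
  # FS is set in BEGIN so that it applies to the first line as well.
  # $NF is the last field, because NF holds the number of fields.
  awk 'BEGIN {FS = ":"} {print NR, NF, $1, $NF}' "$@"
}

printf 'a:b:c\nd:e\n' | show_fields
```

Each output line shows the line number, the field count, the first field, and the last field.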

ARRAYS

* Awk also permits the use of arrays. The naming convention is the same as it is for variables, and, as with variables, the array does not have to be declared. Awk arrays have a single dimension, and numeric indexes conventionally start at 1 (as they do in arrays filled by split()). Array elements are identified by an index, contained in square brackets. For example:

   some_array[1], some_array[2], some_array[3] ...
One interesting feature of Awk arrays is that the indexes can also be strings, which allows them to be used as a sort of "associative memory". For example, an array could be used to tally the money your friends owe you, as follows:
   debts["Kimmie"], debts["Michael"], debts["Hugh"] ...
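The debt tally could be sketched as follows, assuming input lines of the form "name amount". The function name tally_debts and the sample amounts are made up for illustration.

```shell
tally_debts() {
  awk '
    { debts[$1] += $2 }                       # the name in field 1 is the array index
    END {
      for (name in debts) print name, debts[name]
    }
  ' "$@" | sort                               # "for (name in debts)" visits indexes in no fixed order
}

printf 'Kimmie 5\nMichael 20\nKimmie 7\nHugh 3\n' | tally_debts
```

Piping through sort gives a stable listing, since Awk does not promise any particular order when looping over array indexes.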

STANDARD FUNCTIONS

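There are several predefined numeric functions, along with length(), which works on strings:

   length()   Length of a string; with no argument, the length of the current line.
   sqrt()     Square root.
   log()      Base-e log.
   exp()      Power of e.
   int()      Integer part of argument.

For example, this action prints each line preceded by its length:

   {print length, $0}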
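These functions can be tried from a BEGIN block, which needs no input file. A minimal sketch; the function name demo_math is hypothetical.

```shell
demo_math() {
  # int() drops the fractional part, so it truncates toward zero.
  awk 'BEGIN { print sqrt(16), int(3.9), int(-3.9), exp(0), log(1), length("coins") }'
}

demo_math    # prints: 4 3 -3 1 0 5
```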

Awk, not surprisingly, includes a set of string-processing operations:

   substr()   Extracts a substring from a string.
              substr(<string>,<start of substring>,<max length of substring>)
   split()    Splits a string into its elements, stores them in an array, and returns the number of elements.
              split(<string>,<array>,[<field separator>])
   index()    Finds the starting position of a substring within a string; returns 0 if it is not found.
              index(<target string>,<search string>)
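A short sketch of the three string functions applied to one made-up string. The function name demo_strings and the sample value are assumptions for illustration.

```shell
demo_strings() {
  awk 'BEGIN {
    s = "1986-USA-Eagle"
    print substr(s, 1, 4)        # 1986   (4 characters starting at position 1)
    n = split(s, parts, "-")     # fills parts[1], parts[2], parts[3]
    print n, parts[2]            # 3 USA
    print index(s, "USA")        # 6      (positions are counted from 1)
  }'
}

demo_strings
```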

Awk supports control structures similar to those used in C, including:

   if ... else
   while
   for
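The three control structures, sketched in a single BEGIN block. The function name demo_control and the loop bounds are arbitrary choices for illustration.

```shell
demo_control() {
  awk 'BEGIN {
    # for loop with an if ... else inside it
    for (i = 1; i <= 3; i++) {
      if (i % 2 == 0) { kind = "even" } else { kind = "odd" }
      print i, kind
    }
    # while loop counting down
    n = 3
    while (n > 0) { printf "%d ", n; n-- }
    print "liftoff"
  }'
}

demo_control
```

The braces around each branch are optional for single statements, but they keep the if ... else unambiguous.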
 

Full script
