



Linux Unleashed, Third Edition:gawk
























Built-In Variables
The gawk language has a few built-in variables that are used to represent things such as the total number of records processed. These are useful when you want to get totals. Table 25.7 shows the important built-in variables.
Table 25.7. The important built-in variables.



Variable    Description
--------    -----------
NR          The number of records read so far
FNR         The number of records read from the current file
FILENAME    The name of the current input file
FS          Field separator (default is whitespace)
RS          Record separator (default is newline)
OFMT        Output format for numbers (default is %.6g)
OFS         Output field separator (default is a space)
ORS         Output record separator (default is newline)
NF          The number of fields in the current record
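As a small illustration of the output variables, the following sketch sets OFS and ORS in a BEGIN block and prints two fields per record. The sample names and numbers are invented for the example, and awk is used as the command name on the assumption that it behaves like gawk for these features.

```shell
# Print fields separated by a comma (OFS) and records terminated by a
# semicolon (ORS) instead of the default space and newline.
# The input lines are made-up sample data.
out=$(printf 'alice 30\nbob 25\n' |
  awk 'BEGIN { OFS = ","; ORS = ";" } { print $1, $2 }')
echo "$out"   # prints: alice,30;bob,25;
```

Note that OFS is only inserted between the comma-separated arguments of print; writing print $1 $2 would concatenate the fields with nothing between them.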



The NR and FNR values are the same if you are processing only one file. If you are processing more than one file, NR is a running total across all files, while FNR restarts at 1 for each new file and counts the records in the current file only.
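The difference is easiest to see with two input files. This is a minimal sketch: the file names one.txt and two.txt and their contents are hypothetical, created in a temporary directory just for the demonstration.

```shell
# Create two small throwaway files (hypothetical names and contents).
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\nd\ne\n' > "$dir/two.txt"

# Print the file name, the running record count, and the per-file count.
out=$(cd "$dir" && awk '{ print FILENAME, NR, FNR }' one.txt two.txt)
echo "$out"
# prints:
# one.txt 1 1
# one.txt 2 2
# two.txt 3 1
# two.txt 4 2
# two.txt 5 3

rm -r "$dir"
```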
The FS variable is useful because it controls the input file’s field separator. To use the colon for the /etc/passwd file, for example, use the following command in the script, usually as part of the BEGIN pattern:


FS=":"
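Here is a sketch of that assignment inside a BEGIN pattern. To keep the result predictable, it reads one made-up line in /etc/passwd layout (the demo account is an assumption for illustration) instead of the real file.

```shell
# One sample line in /etc/passwd layout; the account is invented.
# Field 1 is the login name and field 6 is the home directory.
out=$(printf 'demo:x:1000:1000:Demo User:/home/demo:/bin/sh\n' |
  awk 'BEGIN { FS = ":" } { print $1, $6 }')
echo "$out"   # prints: demo /home/demo
```

Setting FS in BEGIN matters because the assignment must take effect before the first record is split into fields.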


You can use these built-in variables as you would any other. For example, the following command checks the number of fields in each record and prints an error message for any record that has five or fewer fields:



NF <= 5 {print "Not enough fields in the record"}
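A variation on this check, sketched below, also reports which record failed by using NR and NF in the message. The wording of the message and the two input lines are assumptions made for the example.

```shell
# The first sample line has six fields and passes; the second has three.
out=$(printf '1 2 3 4 5 6\n1 2 3\n' |
  awk 'NF <= 5 { print "Line " NR ": not enough fields (" NF ")" }')
echo "$out"   # prints: Line 2: not enough fields (3)
```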


Control Structures
Enough of the details have been covered to allow us to start doing some real gawk programming. Although we have not covered all of gawk’s pattern and action considerations, we have seen all the important material. Now we can look at writing control structures.
If you have any programming experience at all or have tried some shell script writing, many of these control structures will appear familiar. If you haven’t done any programming, common sense should help, as gawk is cleanly laid out without weird syntax. Follow the examples and try a few test programs of your own.
Incidentally, gawk enables you to place comments anywhere in your scripts, as long as the comment starts with a # sign. You should use comments to indicate what is going on in your scripts if it is not immediately obvious.
The if Statement
The if statement is used to allow gawk to test some condition and, if it is true, execute a set of commands. The general syntax for the if statement is as follows:


if (expression) {commands} else {commands}


The expression is always evaluated to see if it is true or false. No other value is calculated for the if expression. Here’s a simple if script:


# a simple if statement
{
    if ($1 == 0) {
        print "This cell has a value of zero"
    }
    else {
        printf "The value is %d\n", $1
    }
}


Notice that the curly braces were used to lay out the program in a readable manner. Of course, this could all have been entered on one line and gawk would have understood it, but writing in a nicely formatted manner makes it easier to understand what is going on and to debug the program if the need arises.
In this simple script, we test the first column to see if the value is zero. If it is, a message to that effect is printed. If not, the printf statement prints the value of the column.
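To try the if and else branches from the command line, something like the following sketch should work. The two input values, 0 and 42, are sample data chosen so that each branch runs once.

```shell
# Feed two one-column records to the script: one zero, one non-zero.
out=$(printf '0\n42\n' | awk '{
  if ($1 == 0) {
    print "This cell has a value of zero"
  } else {
    printf "The value is %d\n", $1
  }
}')
echo "$out"
# prints:
# This cell has a value of zero
# The value is 42
```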
The flow of the if statement is quite simple to follow. There can be several commands in each part, as long as the curly braces mark the start and end. There is no need to have an else section. It can be left out entirely, if desired. For example, this is a complete and valid gawk script:


{
    if ($1 == 0) {
        print "This cell has a value of zero"
    }
}


When the test is a simple comparison, gawk also lets you write it without the if keyword, as a pattern followed by an action. This quick-and-dirty structure is harder to read for novices, and I don’t recommend it if you are new to the language. For example, here’s the if statement written the proper way:


# a nicely formatted if statement
{
    if ($1 > $2) {
        print "The first column is larger"
    }
    else {
        print "The second column is larger"
    }
}


Here’s the quick-and-dirty method:



# if syntax from hell
$1 > $2 {
    print "The first column is larger"
}
!($1 > $2) {
    print "The second column is larger"
}


You may notice that the keywords if and else are left off. Each rule is a pattern followed by the commands to run when that pattern is true. There is no implicit else: a rule with no pattern runs for every record, so the second rule must test the opposite condition explicitly. This is much less readable if you don’t know that it is standing in for an if statement! You should be using the more verbose method of writing if statements for readability’s sake.
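One way to convince yourself that the two styles agree is to run both on the same data and compare the results. In this sketch the pattern-only version gives its second rule an explicitly negated condition, which is an assumption of the example; the variable names and the two input lines are likewise invented for illustration.

```shell
# Two sample records: first column larger, then second column larger.
input='5 3
2 7'

# Version 1: explicit if/else inside a single action.
with_if=$(printf '%s\n' "$input" | awk '{
  if ($1 > $2) {
    print "The first column is larger"
  } else {
    print "The second column is larger"
  }
}')

# Version 2: two pattern-action rules with opposite conditions.
with_rules=$(printf '%s\n' "$input" | awk '
  $1 > $2    { print "The first column is larger" }
  !($1 > $2) { print "The second column is larger" }
')

if [ "$with_if" = "$with_rules" ]; then
  echo "both versions print the same lines"
fi
```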
The while Loop
The while statement allows a set of commands to be repeated as long as some condition is true. The condition is evaluated each time the program loops. The general format of the gawk while loop is as follows:


while (expression){
commands
}


For example, the while loop can be used in a program that calculates the value of an investment over several years (the formula for the calculation is value=amount(1+interest_rate)^years):


# interest calculation computes compound interest
# inputs from a file are the amount, interest_rate, and years
{
    var = 1
    while (var <= $3) {
        printf("%f\n", $1*(1+$2)^var)
        var++
    }
}


You can see in this script that we initialize the variable var to 1 before entering the while loop. If we don’t do this, gawk treats the uninitialized variable as zero. The amount, interest rate, and number of years are read from the first three fields of each input line. The increment operator (var++) adds one to var on each pass through the loop.
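As a quick check of the loop, the sketch below runs it on a single sample record. The values (an amount of 1000, a rate of 0.10, and 3 years) are assumptions chosen for the example, and the format is narrowed to %.2f so the output reads like currency.

```shell
# Input fields: amount, interest_rate, years (sample values).
out=$(printf '1000 0.10 3\n' | awk '{
  var = 1
  while (var <= $3) {
    # value after var years: amount * (1 + rate)^var
    printf("%.2f\n", $1 * (1 + $2) ^ var)
    var++
  }
}')
echo "$out"
# prints:
# 1100.00
# 1210.00
# 1331.00
```

Because var is reset to 1 at the top of the action, the loop starts over for every input record.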


