4.2 Shell Variables

A major piece of the Korn shell's programming functionality relates to shell variables. We've already seen the basics of variables. To recap briefly: they are named places to store data, usually in the form of character strings, and their values can be obtained by preceding their names with dollar signs ($). Certain variables, called environment variables, are conventionally named in all capital letters, and their values are made known (with the export statement) to subprocesses.
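As a quick reminder of what export does, here is a minimal sketch. The variable names plainvar and EXPORTEDVAR are hypothetical, and printf is used instead of print only so that the sketch also runs in POSIX shells other than ksh.

```shell
# Hypothetical variable names, chosen for illustration only.
plainvar="not exported"
EXPORTEDVAR="exported"
export EXPORTEDVAR

# A child shell sees only the exported variable; the other one
# expands to the empty string there.
child_view=$(sh -c 'printf "%s" "[$plainvar] [$EXPORTEDVAR]"')
printf '%s\n' "$child_view"      # prints: [] [exported]
```

The single quotes keep the parent shell from expanding the variables, so the values shown are the ones the subprocess actually received.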

If you are a programmer, you already know that just about every major programming language uses variables in some way; in fact, an important way of characterizing differences between languages is comparing their facilities for variables.

The chief difference between the Korn shell's variable schema and those of conventional languages is that the Korn shell's schema places heavy emphasis on character strings. (Thus it has more in common with a special-purpose language like SNOBOL than a general-purpose one like Pascal.) This is also true of the Bourne shell and the C shell, but the Korn shell goes beyond them by having additional mechanisms for handling integers (explicitly) and simple arrays.

4.2.1 Positional Parameters

As we have already seen, you can define values for variables with statements of the form varname=value, e.g.:

$ fred=bob
$ print "$fred"
bob

Some environment variables are predefined by the shell when you log in. There are other built-in variables that are vital to shell programming. We will look at a few of them now and save the others for later.

The most important special, built-in variables are called positional parameters. These hold the command-line arguments to scripts when they are invoked. Positional parameters have names 1, 2, 3, etc., meaning that their values are denoted by $1, $2, $3, etc. There is also a positional parameter 0, whose value is the name of the script (i.e., the command typed in to invoke it).

Two special variables contain all of the positional parameters (except positional parameter 0): * and @. The difference between them is subtle but important, and it's apparent only when they are within double quotes.

"$*" is a single string that consists of all of the positional parameters, separated by the first character of the environment variable IFS (internal field separator). By default IFS contains space, TAB, and NEWLINE, so the separator is normally a space. On the other hand, "$@" is equal to "$1" "$2"... "$N", where N is the number of positional parameters. That is, it's equal to N separate double-quoted strings, which are separated by spaces. We'll explore the ramifications of this difference in a little while.

The variable # holds the number of positional parameters (as a character string). All of these variables are "read-only," meaning that you can't assign new values to them within scripts.

For example, assume that you have the following simple shell script:

print "fred: $*"
print "$0: $1 and $2"
print "$# arguments"

Assume further that the script is called fred. Then if you type fred bob dave, you will see the following output:

fred: bob dave
fred: bob and dave
2 arguments

In this case, $3, $4, etc., are all unset, which means that the shell will substitute the empty (or null) string for them. [4]

[4] Unless the option nounset is turned on.
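The following sketch illustrates both the default behavior and the footnote. The function show_third is a hypothetical helper, and printf and the name() form are used only so that the sketch runs in any POSIX shell as well as in ksh.

```shell
# show_third is a hypothetical helper used only for this illustration.
show_third() {
    printf 'third=[%s] count=%s\n' "$3" "$#"
}

show_third bob dave          # prints: third=[] count=2

# With nounset on, the same reference is an error. Running the test in
# a subshell keeps the option and the failure away from the caller.
if (set -o nounset; show_third bob dave) 2>/dev/null; then
    nounset_result="expanded"
else
    nounset_result="error"
fi
printf '%s\n' "$nounset_result"      # prints: error
```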

4.2.1.1 Positional parameters in functions

Shell functions use positional parameters and special variables like * and # in exactly the same way as shell scripts do. If you wanted to define fred as a function, you could put the following in your .profile or environment file:

function fred {
    print "fred: $*"
    print "$0: $1 and $2"
    print "$# arguments"
}

You will get the same result if you type fred bob dave.

Typically, several shell functions are defined within a single shell script. Therefore each function will need to handle its own arguments, which in turn means that each function needs to keep track of positional parameters separately. Sure enough, each function has its own copies of these variables (even though functions don't run in their own subshells, as scripts do); we say that such variables are local to the function.

However, other variables defined within functions are not local [5] (they are global), meaning that their values are known throughout the entire shell script. For example, assume that you have a shell script called ascript that contains this:

[5] However, see the section on typeset in Chapter 6 for a way of making variables local to functions.

function afunc {
    print in function $0: $1 $2
    var1="in function"
}
var1="outside of function"
print var1: $var1
print $0: $1 $2
afunc funcarg1 funcarg2
print var1: $var1
print $0: $1 $2

If you invoke this script by typing ascript arg1 arg2, you will see this output:

var1: outside of function
ascript: arg1 arg2
in function afunc: funcarg1 funcarg2
var1: in function
ascript: arg1 arg2

In other words, the function afunc changes the value of the variable var1 from "outside of function" to "in function," and that change is known outside the function, while $0, $1, and $2 have different values in the function and the main script. Figure 4.2 shows this graphically.

Figure 4.2: Functions have their own positional parameters

It is possible to make other variables local to functions by using the typeset command, which we'll see in Chapter 6. Now that we have this background, let's take a closer look at "$@" and "$*". These variables are two of the shell's greatest idiosyncrasies, so we'll discuss some of the most common sources of confusion.

Why are the elements of "$*" separated by the first character of IFS instead of just spaces? To give you output flexibility. As a simple example, let's say you want to print a list of positional parameters separated by commas. This script would do it:
```
IFS=,
print "$*"
```
Changing IFS in a script is fairly risky, but it's probably OK as long as nothing else in the script depends on it. If this script were called arglist, then the command arglist bob dave ed would produce the output bob,dave,ed. Chapter 10 contains another example of changing IFS.
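One way to reduce that risk is to save IFS and put it back once the joined string has been built. The sketch below is only one possible approach: join_args, old_ifs, and joined are hypothetical names, it assumes IFS is set (as it is by default), and it uses printf and the name() form so that it runs in any POSIX shell as well as in ksh.

```shell
# join_args is a hypothetical helper: it joins its arguments with commas
# and restores IFS before returning. Assumes IFS is set, as by default.
join_args() {
    old_ifs=$IFS
    IFS=,
    joined="$*"          # "$*" uses the first character of IFS
    IFS=$old_ifs
    printf '%s\n' "$joined"
}

ifs_before=$IFS
join_args bob dave ed                # prints: bob,dave,ed
[ "$IFS" = "$ifs_before" ] && printf '%s\n' "IFS unchanged"
```

Because the comma is in effect only while "$*" is expanded, later commands in the script still split words on the original separators.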
Why does "$@" act like N separate double-quoted strings? To allow you to use them again as separate values. For example, say you want to call a function within your script with the same list of positional parameters, like this:
```
function countargs {
    print "$# args."
}
```
Assume your script is called with the same arguments as arglist above. Then if it contains the command countargs "$*", the function will print 1 args. But if the command is countargs "$@", the function will print 3 args.
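The difference can be tried out directly with a small sketch. Here the hypothetical function caller stands in for the script, and printf and the name() form are used only so that the sketch runs in any POSIX shell as well as in ksh.

```shell
countargs() {
    printf '%s args.\n' "$#"
}

# caller is a hypothetical wrapper standing in for the script; it passes
# its own positional parameters on in two different ways.
caller() {
    countargs "$*"     # one argument: all parameters joined together
    countargs "$@"     # one argument per original parameter
}

caller bob dave ed           # prints: 1 args.  then  3 args.
caller "bob dave" ed         # prints: 1 args.  then  2 args.
```

The second call shows why "$@" is usually the right choice when passing arguments along: an argument that contains a space is still handed over as a single value.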

4.2.2 More on Variable Syntax

Before we show the many things you can do with shell variables, we have to make a confession: the syntax of $varname for taking the value of a variable is not quite accurate. Actually, it's the simple form of the more general syntax, which is ${varname}.

Why two syntaxes? For one thing, the more general syntax is necessary if your code refers to more than nine positional parameters: you must use ${10} for the tenth instead of $10. Aside from that, consider the example, from Chapter 3, of setting your primary prompt variable (PS1) to your login name:

PS1="($LOGNAME)-> "

This happens to work because the right parenthesis immediately following LOGNAME is "special" (in the sense of the special characters introduced in Chapter 1) so that the shell doesn't mistake it for part of the variable name. Now suppose that, for some reason, you want your prompt to be your login name followed by an underscore. If you type:

PS1="$LOGNAME_ "

then the shell will try to use "LOGNAME_" as the name of the variable, i.e., to take the value of $LOGNAME_. Since there is no such variable, the value defaults to null (the empty string, ""), and PS1 is set to just a single space.

For this reason, the full syntax for taking the value of a variable is ${varname}. So if we used

PS1="${LOGNAME}_ "

we would get the desired prompt: your login name followed by an underscore. It is safe to omit the curly brackets ({}) if the variable name is followed by a character that isn't a letter, digit, or underscore.
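Both cases from this section can be checked with a short sketch. The function tenth and the variable name are hypothetical (name stands in for LOGNAME), the sketch assumes no variable called name_ exists, and printf and the name() form are used only so that it runs in any POSIX shell as well as in ksh.

```shell
# tenth is a hypothetical helper that reports its tenth argument two ways.
tenth() {
    # "$10" is read as "${1}0"; only "${10}" is the tenth parameter.
    printf '%s %s\n' "$10" "${10}"
}
tenth a b c d e f g h i j     # prints: a0 j

# "name" is a hypothetical variable standing in for LOGNAME.
name=fred
without_braces="$name_ "      # looks up the unset variable name_
with_braces="${name}_ "
printf '[%s] [%s]\n' "$without_braces" "$with_braces"   # prints: [ ] [fred_ ]
```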

