[Chapter 3] 3.4 Shell Variables

3.4 Shell Variables

There are several characteristics of your environment that you may want to customize but that cannot be expressed as an on/off choice. Characteristics of this type are specified in shell variables. Shell variables can specify everything from your prompt string to how often the shell checks for new mail.

Like an alias, a shell variable is a name that has a value associated with it. The Korn shell keeps track of several built-in shell variables; shell programmers can add their own. By convention, built-in variables have names in all capital letters. The syntax for defining variables is somewhat similar to the syntax for aliases:

varname=value

There must be no space on either side of the equal sign, and if the value is more than one word, it must be surrounded by quotes. To use the value of a variable in a command, precede its name by a dollar sign ($).

You can delete a variable with the command unset varname. Normally this isn't useful, since all variables that don't exist are assumed to be null, i.e., equal to the empty string "". But if you use the option nounset (see Table 3.1), which causes the shell to indicate an error when it encounters an undefined variable, then you may be interested in unset.
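The difference that nounset makes can be seen in a minimal sketch like the following. It uses printf rather than print so that it also runs in POSIX shells other than the Korn shell, and the names without_nounset and with_nounset are ours, chosen only for illustration.

```shell
fred=bob
unset fred

# Default behavior: an undefined variable expands to the empty string.
without_nounset="[$fred]"
printf 'without nounset: %s\n' "$without_nounset"

# With nounset, the same reference is an error. It is tried in a subshell
# (the parentheses) so that the error does not end the current shell.
if (set -o nounset; : "$fred") 2>/dev/null; then
    with_nounset="no error"
else
    with_nounset="error"
fi
printf 'with nounset: %s\n' "$with_nounset"
```

The first line of output shows empty brackets; the second reports that the reference was treated as an error.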

The easiest way to check a variable's value is to use the print built-in command. [7] All print does is print its arguments, but not until the shell has evaluated them. This includes (among other things that will be discussed later) taking the values of variables and expanding filename wildcards. So, if the variable fred has the value bob, typing:

[7] The Korn shell supports the old command echo, which does much the same thing, for backward compatibility reasons. However, we strongly recommend print because its options are the same on all UNIX systems, whereas echo's options differ between BSD-derived and System V-derived UNIX versions.

$ print "$fred"

will cause the shell to simply print bob. If the variable is undefined, the shell will print a blank line. A more verbose way to do this is:

$ print "The value of \$varname is \"$varname\"."

The first dollar sign and the inner double quotes are backslash-escaped (i.e., preceded with \ so the shell doesn't try to interpret them; see Chapter 1) so that they appear literally in the output, which for the above example would be:

The value of $fred is "bob".
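Here is the same idea as a small sketch you can run, with fred substituted for varname. It builds the string in a variable (message, a name of our own choosing) and uses printf in place of print so that it works in any POSIX shell.

```shell
fred=bob

# \$ and \" are backslash-escaped, so they reach the output literally;
# the unescaped $fred is replaced by the variable's value.
message="The value of \$fred is \"$fred\"."
printf '%s\n' "$message"
```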

3.4.1 Variables and Quoting

Notice that we used double quotes around variables (and strings containing them) in these print examples. In Chapter 1 we said that some special characters inside double quotes are still interpreted (while none are interpreted inside single quotes). We've seen one of these special characters already: the tilde (~), which is expanded to your (or another user's) home directory.

Another special character that "survives" double quotes is the dollar sign, meaning that variables are evaluated. It's possible to do without the double quotes in some cases; for example, we could have written the above print command this way:

$ print The value of \$varname is \"$varname\".

But double quotes are more generally correct.

Here's why. Suppose we did this:

$ fred='Four spaces between these    words.'

Then if we entered the command print $fred, the result would be:

Four spaces between these words.

What happened to the extra spaces? Without the double quotes, the shell split the string into words after substituting the variable's value, as it normally does when it processes command lines. The double quotes circumvent this part of the process (by making the shell think that the whole quoted string is a single word).

Therefore the command print "$fred" prints this:

Four spaces between these    words.

This becomes particularly important when we start dealing with variables that contain user or file input later on.
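One way to watch the splitting happen is to count the arguments a command actually receives. The sketch below assumes a helper function, count_args, which is hypothetical and exists only for this illustration.

```shell
fred='Four spaces between these    words.'

# Hypothetical helper: report how many arguments it was given.
count_args() {
    printf '%s\n' "$#"
}

unquoted_count=$(count_args $fred)     # value is split into separate words
quoted_count=$(count_args "$fred")     # value stays one word

printf 'unquoted: %s argument(s)\n' "$unquoted_count"
printf 'quoted:   %s argument(s)\n' "$quoted_count"
```

With the shell's default word-splitting settings, the unquoted form arrives as five arguments and the quoted form as one, which is why the extra spaces survive only in the quoted case.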

Double quotes also allow other special characters to work, as we'll see in Chapters 4, 6, and 7. But for now, we'll revise the "When in doubt, use single quotes" rule in Chapter 1 by adding, "...unless a string contains a variable, in which case you should use double quotes."

3.4.2 Built-in Variables

As with options, some built-in shell variables are meaningful to general UNIX users, while others are arcana for hackers. We'll look at the more generally useful ones here, and we'll save some of the more obscure ones for later chapters. Again, Appendix B contains a complete list.

3.4.2.1 Editing mode variables

Several shell variables relate to the command-line editing modes that we saw in the previous chapter. These are listed in Table 3.2.

The first two of these are sometimes used by text editors and other screen-oriented programs, which rely on the variables being set correctly. Although the Korn shell and most windowing systems should know how to set them correctly, you should look at the values of COLUMNS and LINES if you are having display trouble with a screen-oriented program.

Table 3.2: Editing Mode Variables
Variable	Meaning
COLUMNS	Width, in character columns, of your terminal. The standard value is 80 (sometimes 132), though if you are using a windowing system like X, you could give a terminal window any size you wish.
LINES	Length of your terminal in text lines. The standard value for terminals is 24, but for IBM PC-compatible monitors it's 25; once again, if you are using a windowing system, you can usually resize to any amount.
HISTFILE	Name of history file, on which the editing modes operate.
EDITOR	Pathname of your favorite text editor; the suffix (macs or vi) determines which editing mode to use.
VISUAL	Similar to EDITOR; used if EDITOR is not set or vice versa.
FCEDIT	Pathname of editor to use with the fc command.

3.4.2.2 Mail Variables

Since the mail program is not running all the time, there is no way for it to inform you when you get new mail; therefore the shell does this instead. [8] The shell can't actually check for incoming mail, but it can look at your mail file periodically and determine whether the file has been modified since the last check. The variables listed in Table 3.3 let you control how this works.

[8] BSD UNIX users should note that the biff command on those systems does a better job of this; while the Korn shell only prints "you have mail" messages right before it prints command prompts, biff can do so at any time.

Table 3.3: Mail Variables
Variable	Meaning
MAIL	Name of file to check for incoming mail (i.e., your mail file)
MAILCHECK	How often, in seconds, to check for new mail (default 600 seconds, or 10 minutes)
MAILPATH	List of filenames, separated by colons (:), to check for incoming mail

Under the simplest scenario, you use the standard UNIX mail program, and your mail file is /usr/mail/yourname or something similar. In this case, you would just set the variable MAIL to this filename if you want your mail checked:

MAIL=/usr/mail/yourname

If your system administrator hasn't already done it for you, put a line like this in your .profile.

However, some people use nonstandard mailers that use multiple mail files; MAILPATH was designed to accommodate this. The Korn shell will use the value of MAIL as the name of the file to check, unless MAILPATH is set, in which case the shell will check each file in the MAILPATH list for new mail. You can use this mechanism to have the shell print a different message for each mail file: for each mail filename in MAILPATH, append a question mark followed by the message you want printed.

For example, let's say you have a mail system that automatically sorts your mail into files according to the username of the sender. You have mail files called /usr/mail/you/fritchie, /usr/mail/you/droberts, /usr/mail/you/jphelps, etc. You define your MAILPATH as follows:

MAILPATH=/usr/mail/you/fritchie:/usr/mail/you/droberts:\
/usr/mail/you/jphelps

If you get mail from Jennifer Phelps, then the file /usr/mail/you/jphelps will change. The Korn shell will notice the change within 10 minutes and print the message:

you have mail in /usr/mail/you/jphelps.

If you are in the middle of running a command, the shell will wait until the command finishes (or is suspended) to print the message. To customize this further, you could define MAILPATH to be:

MAILPATH="\
/usr/mail/you/fritchie?You have mail from Fiona.:\
/usr/mail/you/droberts?Mail from Dave has arrived.:\
/usr/mail/you/jphelps?There is new mail from Jennifer."

The backslashes at the end of each line allow you to continue your command on the next line. But be careful: you can't indent subsequent lines. Now, if you get mail from Jennifer, the shell will print:

There is new mail from Jennifer.
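To make the MAILPATH format concrete, here is a rough sketch that takes such a value apart: entries are separated by colons, and within an entry the optional message follows a question mark. The function describe_mailpath is hypothetical, and the lowercase variable mailpath is used so that the real MAILPATH is left alone. This illustrates the format only; it is not how the shell itself does the job.

```shell
# Illustration of the MAILPATH format only; describe_mailpath is a
# hypothetical helper, not part of the shell.
describe_mailpath() (
    IFS=:        # entries are separated by colons
    set -f       # keep the ? in each entry from being treated as a wildcard
    for entry in $1; do
        file=${entry%%\?*}                 # text before the first ?
        case $entry in
            *\?*) message=${entry#*\?} ;;  # text after the first ?
            *)    message="you have mail in $file." ;;
        esac
        printf '%s: %s\n' "$file" "$message"
    done
)

mailpath="/usr/mail/you/droberts:\
/usr/mail/you/jphelps?There is new mail from Jennifer."
describe_mailpath "$mailpath"
```

The function body is enclosed in parentheses so that the changes to IFS and the wildcard setting stay inside it and do not affect the rest of your session.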

3.4.2.3 Prompting Variables

If you have seen enough experienced UNIX users at work, you may already have realized that the shell's prompt is not engraved in stone. It seems as though one of the favorite pastimes of UNIX hackers is thinking of cute or innovative prompt strings. We'll give you some of the information you need to do your own here; the rest will come in the next chapter.

Actually, the Korn shell uses four prompt strings. They are stored in the variables PS1, PS2, PS3, and PS4. The first of these is called the primary prompt string; it is your usual shell prompt, and its default value is "$ " (a dollar sign followed by a space). Many people like to set their primary prompt string to something containing their login name. Here is one way to do this:

PS1="($LOGNAME)-> "

LOGNAME is another built-in shell variable, which is set to your login name when you log in. So, PS1 becomes a left parenthesis, followed by your login name, followed by ")-> ". If your login name is fred, your prompt string will be "(fred)-> ". If you are a C shell user and, like many such people, are used to having a command number in your prompt string, the Korn shell can do this similarly to the C shell: if there is an exclamation point in the prompt string, it will substitute the command number. Thus, if you define your prompt string to be:

PS1="($LOGNAME !)->"

then your prompts will be like (fred 1)->, (fred 2)->, and so on.

But perhaps the most useful way to set up your prompt string is so that it always contains your current directory. This way, you needn't type pwd to remember where you are. Putting your directory in the prompt is more complicated than the above examples, because your current directory changes during your login session, whereas your login name and the name of your machine don't. But we can accommodate this by taking advantage of some of the shell's arcane quoting rules. Here's how:

PS1='($PWD)-> '

The difference is the single quotes, instead of double quotes, surrounding the string on the right side of the assignment. Notice that this string is evaluated twice: once when the assignment to PS1 is done (in your .profile or environment file) and then again after every command you enter. Here's what each of these evaluations does:

The first evaluation just observes the single quotes and returns what is inside them without further processing. As a result, PS1 contains the string ($PWD)-> .
After every command, the shell evaluates ($PWD)->. PWD is a built-in variable that is always equal to the current directory, so the result is a primary prompt that always contains the current directory.
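The two evaluations can be imitated outside of a prompt with a short sketch. Here eval stands in for the re-evaluation that the shell performs on PS1; the variable names early, late, and late_now are ours and purely illustrative.

```shell
start_dir=$PWD

early="($PWD)-> "    # double quotes: $PWD is replaced now, once and for all
late='($PWD)-> '     # single quotes: the text $PWD itself is stored

cd /

# eval stands in here for the second evaluation the shell performs each
# time it displays the prompt.
eval "late_now=\"$late\""

printf 'double-quoted: %s\n' "$early"
printf 'single-quoted: %s\n' "$late_now"
```

After the cd, the double-quoted version still shows the directory you started in, while the single-quoted version follows you to the new one.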

We'll add to this example in Chapter 7, Input/Output and Command-line Processing.

PS2 is called the secondary prompt string; its default value is >. It is used when you type an incomplete line and hit RETURN, as an indication that you must finish your command. For example, assume that you start a quoted string but don't close the quote. Then if you hit RETURN, the shell will print > and wait for you to finish the string:

$ print "This is a long line,		# PS1 for the command
> which is terminated down here"	# PS2 for the continuation
$					# PS1 for the next command

PS3 and PS4 relate to shell programming and debugging, respectively; they will be explained in Chapter 5, Flow Control and Chapter 9, Debugging Shell Programs.

3.4.2.4 Terminal Types

The shell variable TERM is vitally important for any program that uses your entire screen or window, like a text editor. Such programs include all screen editors (such as vi and emacs), more, and countless third-party applications.

Because users are spending more and more time within programs, and less and less using the shell itself, it is extremely important that your TERM is set correctly. It's really your system administrator's job to help you do this (or to do it for you), but in case you need to do it yourself, here are a few guidelines.

The value of TERM must be a short character string with lowercase letters that appears as a filename in the terminfo database. [9] This database is a two-tiered directory of files under the root directory /usr/lib/terminfo. This directory contains subdirectories with single-character names; these in turn contain files of terminal information for all terminals whose names begin with that character. Each file describes how to tell the terminal in question to do certain common things like position the cursor on the screen, go into reverse video, scroll, insert text, and so on. The descriptions are in binary form (i.e., not readable by humans).

[9] Versions of UNIX not derived from System V use termcap, an older-style database of terminal capabilities that uses the single file /etc/termcap for all terminal descriptions.

The name of a terminal description file is the same as that of the terminal being described; sometimes an abbreviation is used. For example, the DEC VT100 has a description in the file /usr/lib/terminfo/v/vt100; a monitor for a 386-based PC/AT has a description in the file /usr/lib/terminfo/A/AT-386M. An xterm terminal window under the X Window System has a description in /usr/lib/terminfo/x/xterm.

Sometimes your UNIX software will set up TERM correctly; this usually happens for X terminals and PC-based UNIX systems. Therefore, you should check the value of TERM by typing print $TERM before going any further. If you find that your UNIX system isn't setting the right value for you (especially likely if your terminal is of a different make than your computer), you need to find the appropriate value of TERM yourself.

The best way to find the TERM value, if you can't find a local guru to do it for you, is to guess the terminfo name and search for a file of that name under /usr/lib/terminfo by using ls. For example, if your terminal is a Blivitz BL-35A, you could try:

$ cd /usr/lib/terminfo
$ ls b/bl*

If you are successful, you will see something like this:

bl35a           blivitz35a

In this case, the two names are likely to be synonyms for (links to) the same terminal description, so you could use either one as a value of TERM. In other words, you could put either of these two lines in your .profile:

TERM=bl35a
TERM=blivitz35a

If you aren't successful, ls won't list any files (it will complain that no such file exists instead), and you will have to make another guess and try again. If you find that terminfo contains nothing that resembles your terminal, all is not lost. Consult your terminal's manual to see if the terminal can emulate a more popular model; nowadays the odds of this are excellent.
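If you expect to make several guesses, the search can be wrapped in a small function. The following is only a sketch: find_term is a hypothetical helper of our own, and its optional second argument is there because the database may live somewhere other than /usr/lib/terminfo on your system.

```shell
# Hypothetical helper: list terminfo entries whose names begin with a
# given prefix. The second argument is the database directory; it
# defaults to the location named in the text.
find_term() {
    prefix=$1
    dir=${2:-/usr/lib/terminfo}
    first=$(printf '%s' "$prefix" | cut -c1)    # subdirectory name
    for entry in "$dir/$first/$prefix"*; do
        if [ -e "$entry" ]; then
            basename "$entry"
        fi
    done
}

find_term bl
```

The function prints one matching name per line, and prints nothing at all when there is no match.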

Conversely, terminfo may have several entries that relate to your terminal, for submodels, special modes, etc. If you have a choice of which entry to use as your value of TERM, we suggest you test each one out with your text editor or any other screen-oriented programs you use and see which one works best.

The process is much simpler if you are using a windowing system, in which your "terminals" are logical portions of the screen rather than physical devices. In this case, operating system-dependent software was written to control your terminal window(s), so the odds are very good that if it knows how to handle window resizing and complex cursor motion, then it is capable of dealing with simple things like TERM. The X Window System, for example, automatically sets "xterm" as its value for TERM in an xterm terminal window.

3.4.2.5 Command Search Path

Another important variable is PATH, which helps the shell find the commands you enter.

As you probably know, every command you use is actually a file that contains code for your machine to run. [10] These files are called executable files or just executables for short. They are stored in various different directories. Some directories, like /bin or /usr/bin, are standard on all UNIX systems; some depend on the particular version of UNIX you are using; some are unique to your machine; if you are a programmer, some may even be your own. In any case, there is no reason why you should have to know where a command's executable file is in order to run it.

[10] Unless it's a built-in command (one of those shown in boldface, like cd and print), in which case the code is simply part of the executable file for the entire shell.

That is where PATH comes in. Its value is a list of directories that the shell searches every time you enter a command; [11] the directory names are separated by colons (:), just like the files in MAILPATH. For example, if you type print $PATH, you will see something like this:

[11] Unless the command name contains a slash (/), in which case the search does not take place.

/sbin:/usr/sbin:/usr/bin:/etc:/usr/ucb:/local/bin

Why should you care about your path? There are two main reasons. First, once you have read the later chapters of this book and you try writing your own shell programs, you will want to test them and eventually set aside a directory for them. Second, your system may be set up so that certain "restricted" commands' executable files are kept in directories that are not listed in PATH. For example, there may be a directory /usr/games in which there are executables that are verboten during regular working hours.

Therefore you may want to add directories to your PATH. Let's say you have created a bin directory under your login directory, which is /home/you, for your own shell scripts and programs. To add this directory to your PATH so that it is there every time you log in, put this line in your .profile:

PATH=$PATH":/home/you/bin"

This sets PATH to whatever it was before, followed immediately by a colon and /home/you/bin.

This is the "safe" way of doing it. When you enter a command, the shell searches directories in the order they appear in PATH until it finds an executable file. Therefore, if you have a shell script or program whose name is the same as an existing command, the shell will use the existing command, unless you type in the command's full pathname to disambiguate. For example, if you have created your own version of the more command in the above directory and your PATH is set up as in the last example, you will need to type /home/you/bin/more (or just ~/bin/more) to get your version.
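The search order matters enough that it is worth seeing spelled out. The sketch below imitates it with a hypothetical function, find_in_path; it is an illustration of the rule "first match wins," not the shell's own code. The optional second argument, a colon-separated directory list, is our addition so that you can experiment without changing your real PATH.

```shell
# Hypothetical helper that imitates the search order described above;
# it is an illustration, not the shell's own code. The optional second
# argument is a colon-separated directory list (default: $PATH).
find_in_path() (
    command_name=$1
    IFS=:
    set -f
    for dir in ${2:-$PATH}; do
        # An empty entry in the list means the current directory.
        candidate=${dir:-.}/$command_name
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            exit 0          # first match wins; later directories are ignored
        fi
    done
    exit 1
)

find_in_path more || printf 'more: not found in PATH\n'
```

If two directories in the list both contain a program of the same name, only the one in the earlier directory is ever reported, which is exactly why appending your own directory to the end of PATH is the cautious choice.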

The more reckless way of resetting your path is to tell the shell to look in your directory first by putting it before the other directories in your PATH:

PATH="/home/you/bin:"$PATH

This is less safe because you are trusting that your own version of the more command works properly. But it is also risky for a more important reason: system security. If your PATH is set up in this way, you leave open a "hole" that is well known to computer crackers and mischief makers: they can install "Trojan horses" and do other things to steal files or do damage. (See Chapter 10 for more details.) Therefore, unless you have complete control of (and confidence in) everyone who uses your system, use the first of the two methods of adding your own command directory.

If you need to know which directory a command comes from, you need not look at directories in your PATH until you find it. The shell built-in command whence prints the full pathname of the command you give it as argument, or just the command's name if it's a built-in command itself (like cd), an alias, or a function (as we'll see in Chapter 4, Basic Shell Programming).

3.4.2.6 PATH and Tracked Aliases

It is worth noting that a search through the directories in your PATH can take time. You won't exactly die if you hold your breath for the length of time it takes for most computers to search your PATH, but the large number of disk I/O operations involved in some PATH searches can take longer than the command you invoked takes to run!

The Korn shell provides a way to circumvent PATH searches: the tracked alias mechanism we saw earlier in this chapter. First, notice that if you specify a command by giving its full pathname, the shell won't even use your PATH; instead, it will just go directly to the executable file.

Tracked aliases do this for you automatically. If you have alias tracking turned on, then the first time you invoke an alias, the shell looks for the executable in the normal way (through PATH). Then it stores the full pathname as if it were the alias, so that the next time you invoke the command, the shell will use the full pathname and not bother with PATH at all. If you ever change your PATH, the shell marks tracked aliases as "undefined," so that it will search for the full pathnames again when you invoke the corresponding commands.

In fact, you can add tracked aliases for the sole purpose of avoiding PATH lookup of commands that you use particularly often. Just put a "trivial alias" of the form alias -t command=command in your .profile or environment file; the shell will substitute the full pathname itself. [12]

[12] Actually, the shell predefines tracked aliases for most widely-used UNIX utilities.

3.4.3 Directory Search Path

CDPATH is a variable whose value, like that of PATH, is a list of directories separated by colons. Its purpose is to augment the functionality of the cd built-in command.

By default, CDPATH isn't set (meaning that it is null), and when you type cd dirname, the shell will look in the current directory for a subdirectory called dirname. [13] If you set CDPATH, you give the shell a list of places to look for dirname; the list may or may not include the current directory.

[13] As with PATH, this search is disabled when dirname starts with a slash.

Here is an example. Consider the alias for the long cd command from earlier in this chapter:

alias cdcm="cd work/projects/devtools/windows/confman"

Now suppose there were a few directories under this directory to which you need to go often; they are called src, bin, and doc. You define your CDPATH like this:

CDPATH=:~/work/projects/devtools/windows/confman

In other words, you define your CDPATH to be the empty string (meaning the current directory, wherever you happen to be) followed by ~/work/projects/devtools/windows/confman.

With this setup, if you type cd doc, then the shell will look in the current directory for a (sub)directory called doc. Assuming that it doesn't find one, it looks in the directory ~/work/projects/devtools/windows/confman. The shell finds the doc directory there, so you go directly there.
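You can try this out safely with a sketch like the one below, in which scratch directories created by mktemp stand in for the real project tree. The directory names confman, doc, and elsewhere under the scratch area are assumptions made for the demonstration.

```shell
# Scratch directories stand in for the real project tree.
base=$(mktemp -d)
mkdir -p "$base/confman/doc" "$base/elsewhere"

CDPATH=:$base/confman      # current directory first, then confman
cd "$base/elsewhere"       # there is no doc subdirectory here

# cd prints the directory name when it finds it through CDPATH;
# that output is discarded here.
cd doc >/dev/null
printf 'now in %s\n' "$PWD"

unset CDPATH
```

Because elsewhere has no doc subdirectory, the shell falls through to the second CDPATH entry and lands in the doc directory under confman.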

This feature gives you yet another way to save typing when you need to cd often to directories that are buried deep in your file hierarchy. You may find yourself going to a specific group of directories often as you work on a particular project, and then changing to another set of directories when you switch to another project. This implies that the CDPATH feature is only useful if you update it whenever your work habits change; if you don't, you may occasionally find yourself where you don't want to be.

3.4.3.1 Miscellaneous Variables

We have covered the shell variables that are important from the standpoint of customization. There are also several that serve as status indicators and for various other miscellaneous purposes. Their meanings are relatively straightforward; the more basic ones are summarized in Table 3.4.

The shell sets the values of these variables (the first three at login time, the last two whenever you change directories). Although you can also set their values, just like any other variables, it is difficult to imagine any situation where you would want to.

Table 3.4: Status Variables
Variable	Meaning
HOME	Name of your home (login) directory
SECONDS	Number of seconds since the shell was invoked
SHELL	Pathname of the shell you are running
PWD	Current directory
OLDPWD	Previous directory before the last cd command
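A quick way to see the last two of these change is to move somewhere and look at them, as in this minimal sketch. It uses printf so that it runs in any POSIX shell, and start_dir is simply a name we chose to remember the starting point.

```shell
start_dir=$PWD
cd /

# The shell updates PWD and OLDPWD on every cd.
printf 'HOME=%s\n' "$HOME"
printf 'PWD=%s\n' "$PWD"
printf 'OLDPWD=%s\n' "$OLDPWD"
```

After the cd, PWD names the root directory and OLDPWD names the directory you came from.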

