28.9 ex Scripts Built by diff

The -e option of diff produces an editing script usable with either ex (33.4) or ed, instead of the usual output. This script consists of a sequence of a (add), c (change), and d (delete) commands necessary to re-create file2 from file1 (the first and second files specified on the diff command line).

Obviously there is no need to re-create the second file from the first with an editing script, because you could do that easily with cp. However, by editing the script produced by diff, you can come up with some desired combination of the two versions.

It might take you a moment to think of a case in which you might have use for this feature. Consider this one: two people have unknowingly made edits to different copies of a file, and you need the two versions merged. (This can happen especially easily in a networked environment, in which people copy files between machines. Poor coordination can easily result in this kind of problem.)

To make this situation concrete, let's take a look at two versions of the same paragraph that we want to combine:

Version 1:
The Book of Kells, now one of the treasures of the Trinity
College Library in Dublin, was found in the ancient
monastery at Ceannanus Mor, now called Kells. It is a
beautifully illustrated manuscript of the Latin Gospels,
and also contains notes on local history. 
It was written in the eighth century. 
The manuscript is generally regarded as the finest example
of Celtic illumination.


Version 2:
The Book of Kells was found in the ancient
monastery at Ceannanus Mor, now called Kells. It is a
beautifully illustrated manuscript of the Latin Gospels,
and also contains notes on local history. 
It is believed to have been written in the eighth century. 
The manuscript is generally regarded as the finest example
of Celtic illumination.

As you can see, there is one additional phrase in each of the two files. We can merge them into one file that incorporates both edits. Typing:

$ diff -e version1 version2 > exscript

will yield the following output in the file exscript:

6c
It is believed to have been written in the eighth century. 
.
1,2c
The Book of Kells was found in the ancient
.
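
To reproduce this output yourself, something like the following sketch should work. It assumes you run it in an empty scratch directory, since it creates files named version1, version2, and exscript there. The trailing spaces found in some copies of the sample text are left out.

```shell
# Run in an empty scratch directory; this creates version1, version2, exscript.
cat > version1 <<'EOF'
The Book of Kells, now one of the treasures of the Trinity
College Library in Dublin, was found in the ancient
monastery at Ceannanus Mor, now called Kells. It is a
beautifully illustrated manuscript of the Latin Gospels,
and also contains notes on local history.
It was written in the eighth century.
The manuscript is generally regarded as the finest example
of Celtic illumination.
EOF

cat > version2 <<'EOF'
The Book of Kells was found in the ancient
monastery at Ceannanus Mor, now called Kells. It is a
beautifully illustrated manuscript of the Latin Gospels,
and also contains notes on local history.
It is believed to have been written in the eighth century.
The manuscript is generally regarded as the finest example
of Celtic illumination.
EOF

# diff exits with status 1 when the files differ, which is expected here,
# so the "|| true" keeps that from looking like a failure.
diff -e version1 version2 > exscript || true
cat exscript
```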

You'll notice that the script appears in reverse order, with the changes later in the file appearing first. This is essential whenever you're making changes based on line numbers; otherwise, changes made earlier in the file may change the numbering, rendering the later parts of the script ineffective. You'll also notice that, as mentioned, this script will simply re-create version2, which is not what we want. We want the change to line 6, but not the change to lines 1 and 2. We want to edit the script so that it looks like this:

6c
It is believed to have been written in the eighth century. 
.
w

(Notice that we had to add the w command to write the results of the edit back into the file.) Now we can type:

$ ex - version1 < exscript

to get the resulting merged file:

The Book of Kells, now one of the treasures of the Trinity
College Library in Dublin, was found in the ancient
monastery at Ceannanus Mor, now called Kells. It is a
beautifully illustrated manuscript of the Latin Gospels,
and also contains notes on local history. 
It is believed to have been written in the eighth century. 
The manuscript is generally regarded as the finest example
of Celtic illumination.

Using diff like this can get confusing, especially when there are many changes. It is easy to get the direction of changes confused or to make the wrong edits. Just remember to do the following:

Specify the file that is closest in content to your eventual target as the first file on the diff command line. This will minimize the size of the editing script that is produced.
After you have corrected the editing script so that it makes only the changes that you want, apply it to that same file (the first file).

Nonetheless, because there is so much room for error, it is better not to have your script write the changes back directly into one of your source files. Instead of adding a w command at the end of the script, add the command %p (or 1,$p) to write the results to standard output (13.1). This is almost always preferable when you are using a complex editing script.

If we use this command in the editing script, the command line to actually make the edits would look like this:

$ ex - version1 < exscript > version3
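
Here is a minimal end-to-end sketch of this safer approach. The three-line files are hypothetical stand-ins for the two versions, and the sed command is just one assumed way to trim the script without opening an editor; for anything complicated, edit the script by hand as described above. It also assumes an ex that accepts -s, the POSIX spelling of the older - option.

```shell
# Run in an empty scratch directory; the file contents are stand-ins.
printf '%s\n' 'alpha, with an aside' 'beta' 'gamma' > version1
printf '%s\n' 'alpha' 'beta' 'gamma, revised' > version2

# diff exits with status 1 when the files differ; that is expected here.
diff -e version1 version2 > exscript || true

# Keep only the change to line 3: drop everything from the "1c" command on,
# then append %p so the edited buffer is printed instead of written back.
{ sed '/^1c$/,$d' exscript; echo '%p'; } > merge.ex

# version1 is only read; the merged text lands in version3.
ex -s version1 < merge.ex > version3
cat version3
```

Because the script ends in %p rather than w, version1 is left untouched, and a bad edit costs you nothing but a scratch file.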

Writers often find themselves making extensive changes and then wishing they could go back and recover some part of an earlier version. Obviously, frequent backups will help. However, if backup storage space is at a premium, it is possible to save only some older version of a file and then keep incremental diff -e scripts to mark the differences between each successive version. (As it turns out, this is what version control systems like SCCS and RCS (20.12) do.)
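
As a rough sketch of that idea, the following keeps one base draft plus one small script per revision. The names draft1 through draft3 and their contents are made up for illustration.

```shell
# Run in an empty scratch directory; draft names and contents are hypothetical.
printf '%s\n' one two three > draft1
printf '%s\n' one 'two, revised' three > draft2
printf '%s\n' one 'two, revised' three four > draft3

# One script per step; each records how to get from a draft to its successor.
# diff exits with status 1 when the files differ, hence the "|| true".
diff -e draft1 draft2 > script1 || true
diff -e draft2 draft3 > script2 || true

# Only draft1 and the two scripts need to be kept to rebuild the later drafts.
cat script1 script2
```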

To apply multiple scripts to a single file, you can simply pipe them to ex rather than redirecting input:

$ cat script1 script2 script3 | ex - oldfile

But wait! How do you get your w (or %p) command into the pipeline? You could edit the last script to include one of these commands. But there's another trick that we ought to look at because it illustrates another useful feature of the shell that many people are unaware of. If you enclose a semicolon-separated list of commands in parentheses (13.7), the standard output of all of the commands is combined and can be redirected together. The immediate application is that, if you type:

$ cat script1 script2 script3; echo '%p' | ex - oldfile

the results of the cat command will be sent, as usual, to standard output, and only the results of echo will be piped to ex. But if you type:

$ (cat script1 script2 script3; echo '%p') | ex - oldfile

the output of the entire sequence will make it into the pipeline, which is what we want.
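
You can see the difference without touching a real file by putting wc -l at the end of the pipeline in place of ex. In this sketch, script1 is a three-line stand-in for a diff -e script.

```shell
# Run in an empty scratch directory; script1 is a stand-in for a diff -e script.
printf '%s\n' '2c' 'new text' '.' > script1

# Ungrouped: the pipe belongs to echo alone, so wc counts 1 line
# and the three lines from cat go straight to standard output.
cat script1; echo '%p' | wc -l

# Grouped: the subshell's combined output enters the pipe, so wc counts 4 lines.
(cat script1; echo '%p') | wc -l
```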

- TOR from UNIX Text Processing, Hayden Books, 1987, Chapter 12

