UNIX Power Tools

Chapter 26: Regular Expressions (Pattern Matching)


26.7 Limiting the Extent of a Match

A regular expression tries to match the longest string possible, and that can cause unexpected problems. For instance, look at the following regular expression, which matches any number of characters inside quotation marks:

".*"

Let's look at a troff macro that has two quoted arguments, as shown below:

.Se "Appendix" "Full Program Listings"

To match the first argument, a novice might describe the pattern with the following regular expression:

\.Se ".*"

However, the pattern ends up matching the whole line: .* takes in as many characters as it can, so the second quotation mark in the pattern matches the last quotation mark on the line. If you know how many arguments there are, you can specify each of them:

\.Se ".*" ".*"
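
The behavior of both patterns can be checked with a short script. This is only a sketch: it uses Python's re module rather than sed or grep, on the assumption that greedy matching gives the same result for these particular patterns, and the capture groups (the parentheses) are an addition made here to pull out each argument.

```python
import re

# The troff macro line from the text.
line = '.Se "Appendix" "Full Program Listings"'

# The novice pattern: .* is greedy, so the closing quote in the
# pattern lands on the LAST quotation mark on the line.
greedy = re.search(r'\.Se ".*"', line)
print(greedy.group(0))    # .Se "Appendix" "Full Program Listings"

# One ".*" per argument. The parentheses are not in the original
# pattern; they are added here only to show what each .* matched.
two_args = re.search(r'\.Se "(.*)" "(.*)"', line)
print(two_args.groups())  # ('Appendix', 'Full Program Listings')
```

The second pattern separates the arguments only because the quote-space-quote sequence between them forces the first .* to give characters back.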

Although this works as you'd expect, not every line has the same number of arguments, so the pattern misses lines that it should hit when all you want is the first argument. Here's a different regular expression that matches the shortest possible extent between two quotation marks:

"[^"]*"

It matches "a quote, followed by any number of characters that do not match a quote, followed by a quote." The use of what we might call "negated character classes" like this is one of the things that distinguishes the journeyman regular expression user from the novice. [ Perl 5 (37.5) has added a new "non-greedy" regular expression operator that matches the shortest string possible. -JP ]
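
A quick way to try the negated character class is sketched below, again in Python's re module as a stand-in for sed or grep. The one-argument sample line and the variable names are hypothetical, made up for illustration. The last lines show the Perl-style non-greedy operator mentioned in the note above, which Python's re also happens to support as *?.

```python
import re

two = '.Se "Appendix" "Full Program Listings"'
one = '.Se "Appendix"'   # hypothetical line with a single argument

# The negated character class cannot run past a quotation mark,
# so the match stops at the first closing quote.
shortest = re.compile(r'"[^"]*"')
print(shortest.search(two).group(0))   # "Appendix"
print(shortest.search(one).group(0))   # "Appendix"

# Each quoted argument is matched separately, however many there are.
print(shortest.findall(two))           # ['"Appendix"', '"Full Program Listings"']

# The fixed two-argument pattern misses the one-argument line.
print(re.search(r'\.Se ".*" ".*"', one))   # None

# Perl-style non-greedy quantifier: .*? takes as few characters as it can.
lazy = re.search(r'\.Se ".*?"', two)
print(lazy.group(0))                   # .Se "Appendix"
```

The negated class works in any regular expression tool, while the non-greedy form depends on the tool supporting it.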

- DD from O'Reilly & Associates' sed & awk

