Regular Expressions
From NetBSD Wiki
Regular expressions (regex for short) form a small domain-specific language for matching text. They are an absolute must in every Unix user's toolbox: common Unix tools like grep and sed use them, and most text editors have powerful search functions that use them as well.
An example:
$ grep -E '^[tT]he results? of your search(es)? (is|are):' file
the result of your search is:
The result of your searches are: this foo bar
the results of your search are:
The example matches only at the beginning of a line (the ^ takes care of that), where it finds a t or a T followed by "he". Then it wants the word "result", optionally suffixed with the character "s" (the ? says the preceding character may be there, but is not required). Then we see the words "of" and "your", followed by "search", optionally suffixed by the letters "es". The ( and ) group a sub-expression (we could have used matchers inside the parens as well). Because we created a group, the question mark (?) works on the entire group instead of just one character. Then we expect another group, which is formed by either "is" or "are"; the pipe symbol | means "or". Finally, the pattern expects a literal colon. The -E option selects extended regular expressions; without it, grep treats ?, ( ) and | as ordinary characters.
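The individual operators can be tried in isolation. The following is a minimal sketch, assuming a POSIX shell and grep; the helper name matches is hypothetical. It uses grep -x so that the whole input string has to match the pattern.

```shell
# Hypothetical helper: succeed when the whole string ($2) matches
# the extended regular expression ($1).
matches() {
    printf '%s\n' "$2" | grep -E -q -x -e "$1"
}

matches 'results?' 'result'    && echo 'result: match'     # ? makes the "s" optional
matches 'results?' 'results'   && echo 'results: match'
matches 'search(es)?' 'searches' && echo 'searches: match' # ? applies to the group (es)
matches 'search(es)?' 'searche'  || echo 'searche: no match'
matches '(is|are)' 'are'       && echo 'are: match'        # | picks one alternative
```

The fourth call fails because the group (es) is optional only as a whole: a lone trailing "e" is left over, so the string does not match.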
The second match extends beyond the regex because we didn't explicitly tell grep that the line had to end after the matching part. We could have easily excluded that match by suffixing the entire expression with a $:
$ grep -E '^[tT]he results? of your search(es)? (is|are):$' file
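To see what the trailing $ changes, the sketch below builds a small sample file and counts the matching lines with and without the anchor. The file contents and the variable names are assumptions made for illustration; -E asks grep for extended regular expression syntax and -c prints a count of matching lines.

```shell
# Build a hypothetical sample file in a scratch directory.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' \
    'the result of your search is:' \
    'The result of your searches are: this foo bar' \
    'the results of your search are:' \
    'no match on this line' > "$sample"

# Without the trailing $, text after the colon is tolerated.
unanchored=$(grep -E -c '^[tT]he results? of your search(es)? (is|are):' "$sample")

# With the trailing $, the line has to end right after the colon.
anchored=$(grep -E -c '^[tT]he results? of your search(es)? (is|are):$' "$sample")

echo "unanchored: $unanchored, anchored: $anchored"
rm -r "$tmpdir"
```

With this sample file the unanchored pattern matches three lines and the anchored one matches two, because the line ending in "this foo bar" no longer qualifies.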
Note: there is often some overlap between the special characters of regular expressions and the special characters the shell interprets. It is important to properly quote your expressions if you construct a regular expression in the shell. See the next section for more information.
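The effect of quoting can be observed with a character that is special to both the shell and regular expressions, such as *. This is a minimal sketch, assuming a POSIX shell; the file name abc.txt and the variable names are made up for the demonstration.

```shell
# Work in a scratch directory holding one hypothetical file.
tmpdir=$(mktemp -d)
touch "$tmpdir/abc.txt"

# Unquoted: the shell expands a*c.txt to the matching file name
# before the command ever sees it.
unquoted=$(cd "$tmpdir" && echo a*c.txt)

# Quoted: the pattern reaches the command unchanged.
quoted=$(cd "$tmpdir" && echo 'a*c.txt')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
rm -r "$tmpdir"
```

Had the command been grep instead of echo, the unquoted form would have searched for the pattern abc.txt rather than a*c.txt. Single quotes avoid this kind of surprise.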
Shell globs vs. regular expressions
Shell globs are a more basic form of pattern matching, where * and ? correspond to .* and . in regexes (a glob ? matches exactly one character). Because they are more basic, they are less powerful. Windows-inspired tools and tools aimed at less experienced users often opt for shell globbing wildcards in search dialog boxes, giving searches just the little bit of extra power most users need. This prevents the more advanced user from performing mind-bending regular expression conversions on their text, but the common case does not require such spectacular use of regular expressions. Whether this is a good thing is open to debate.
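The correspondence between the two notations can be checked side by side. The sketch below assumes a POSIX shell; the helper names glob_match and regex_match are hypothetical. Globs are tested with the shell's own case statement and regexes with grep -E -x, so both compare against the whole string. Note that a glob ? stands for exactly one character, like . in a regex, whereas the regex .? also accepts no character at all.

```shell
# Hypothetical helper: glob matching through the case statement.
# $1 is left unquoted on purpose so it is treated as a pattern.
glob_match() {
    case $2 in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

# Hypothetical helper: whole-line regex matching through grep.
regex_match() {
    printf '%s\n' "$2" | grep -E -q -x -e "$1"
}

glob_match  '*.txt'       'notes.txt' && echo 'glob  *.txt       matches notes.txt'
regex_match '.*\.txt'     'notes.txt' && echo 'regex .*\.txt     matches notes.txt'
glob_match  'file?.txt'   'file1.txt' && echo 'glob  file?.txt   matches file1.txt'
regex_match 'file.\.txt'  'file1.txt' && echo 'regex file.\.txt  matches file1.txt'

# The glob ? requires a character; the regex .? does not.
glob_match  'file?.txt'   'file.txt'  || echo 'glob  file?.txt   rejects file.txt'
regex_match 'file.?\.txt' 'file.txt'  && echo 'regex file.?\.txt matches file.txt'
```

The dot before "txt" is escaped in the regexes because an unescaped . would match any character, while in a glob the dot is always literal.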
References
- Regular expressions case study in The Art of Unix Programming
- Google search for regular expressions
