Learning AWK Programming

Closure

The closure, also called the asterisk or star (*), matches the item immediately preceding it zero or more times. For example, the following expression matches a lowercase letter followed by zero or more lowercase letters or digits. The first character class matches one lowercase letter, the second character class matches a lowercase letter or digit, and the star repeats the second character class. The pattern is not anchored, so a line is printed whenever such a sequence occurs anywhere in it, which is why c; appears in the output:

$ echo -e "ca\nc\nc1\n1\n;\nc;\ncc" | awk '/[a-z][a-z0-9]*/'

The output on execution of this code is as follows:

ca
c
c1
c;
cc
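
The line c; is printed above only because it contains c. A minimal sketch of an anchored variant follows, assuming the goal is to accept only lines that consist entirely of a lowercase letter followed by lowercase letters or digits. The ^ and $ anchors, the out variable, and the use of printf in place of echo -e (for portability across shells) are choices made for this illustration, not part of the original example:

```shell
# Anchor the pattern so the whole line must match, not just a part of it.
out=$(printf 'ca\nc\nc1\n1\n;\nc;\ncc\n' | awk '/^[a-z][a-z0-9]*$/')
printf '%s\n' "$out"
# Prints:
# ca
# c
# c1
# cc
```

With the anchors in place, c; is rejected because the trailing ; is neither a lowercase letter nor a digit.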

Here is another example. It prints every line that contains the string ca followed by zero or more occurrences of t:

$ echo -e "ca\ncat\ncatt\nc\ncatterpillar" | awk '/cat*/'

The output on execution of this code is as follows:

ca
cat
catt
catterpillar
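
To see which part of each line the pattern actually consumes, the following sketch uses awk's built-in match() function, which sets RSTART and RLENGTH to the position and length of the matched text. The out variable, the " -> " output format, and the use of printf instead of echo -e are assumptions made for this illustration:

```shell
# Print each matching line together with the text matched by /cat*/.
out=$(printf 'ca\ncat\ncatt\nc\ncatterpillar\n' |
  awk 'match($0, /cat*/) { print $0 " -> " substr($0, RSTART, RLENGTH) }')
printf '%s\n' "$out"
# Prints:
# ca -> ca
# cat -> cat
# catt -> catt
# catterpillar -> catt
```

The star takes as many t characters as are available, so catterpillar yields catt, while ca matches with zero occurrences of t.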

To match as long a string as possible between ( and ), we can use closure as follows:

$ awk '/\(.*\)/' dot_regex.txt

The output on execution of the preceding program is as follows:

(that is cool)
(this)
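
The "as long as possible" behavior shows up most clearly on a line with more than one parenthesized group. The sketch below uses a made-up input line (not taken from dot_regex.txt) and match() to extract the matched text. For contrast, it also tries the bracket expression [^)]*, which stops at the first closing parenthesis; that alternative is an addition for comparison here, not something the original example uses:

```shell
line='say (this) and (that) now'   # hypothetical sample input

# .* is greedy: the match runs from the first ( to the last ).
greedy=$(printf '%s\n' "$line" |
  awk 'match($0, /\(.*\)/) { print substr($0, RSTART, RLENGTH) }')

# [^)]* cannot cross a ), so the match ends at the first closing parenthesis.
short=$(printf '%s\n' "$line" |
  awk 'match($0, /\([^)]*\)/) { print substr($0, RSTART, RLENGTH) }')

printf '%s\n%s\n' "$greedy" "$short"
# Prints:
# (this) and (that)
# (this)
```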

A summary of closure operations is as follows:

Pattern        Matches
A*             The null string, A, AA, and so on
AB*C           AC, ABC, ABBC, and so on
AB.*C          AB followed by zero or more other characters followed by C, such as ABC, ABBC, or XAB78478XC
[0-9]*         Zero or more digits
[0-9][0-9]*    One or more digits
^A*            Any line
^A\*           Any line starting with the literal text A*
^AA*           Any line starting with one or more As, such as A, AA, or AAA
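
The three anchored patterns in the summary are easy to confuse, so the following sketch runs them side by side. The four sample lines (AAA, A*B, BAA, and an empty line), the r and out variables, and the bracketed output format are all chosen for this illustration:

```shell
# For each input line, list which of the anchored patterns match it.
out=$(printf 'AAA\nA*B\nBAA\n\n' | awk '
  { r = "" }                        # reset the list for every line
  /^A*/  { r = r " ^A*" }           # zero or more As at the start: always true
  /^A\*/ { r = r " ^A\\*" }         # literal A followed by a literal *
  /^AA*/ { r = r " ^AA*" }          # at least one A at the start
  { printf "[%s]:%s\n", $0, r }')
printf '%s\n' "$out"
# Prints:
# [AAA]: ^A* ^AA*
# [A*B]: ^A* ^A\* ^AA*
# [BAA]: ^A*
# []: ^A*
```

Because ^A* can match zero characters, it succeeds even on BAA and on the empty line, which is why the summary lists it as matching any line.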