This lecture covers the fundamentals of using AWK in Unix programming for text processing. It reviews patterns, regular expressions, and various Unix commands that support them, such as grep and sed. Key concepts include searching files for specific lines, performing actions on those lines, and managing fields and records. Examples illustrate how to change delimiters, conditionally print fields, and employ logical operations. The session highlights AWK's history and practical applications, making it essential for anyone looking to enhance their Unix text manipulation skills.
CSE4251 The Unix Programming Environment Lecture 10 • awk
Recap
• Regular expressions (lec-5)
  • symbols & rules to describe text patterns/filters
• Unix commands/utilities that support regular expressions
  • grep (fgrep, egrep) - search a file for a string or regular expression
  • sed - stream editor
  • awk (nawk) - pattern scanning and processing language
note: there are some minor differences between the regular expressions supported by these programs
awk history
• The name AWK
  • Initials of its designers: Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan.
• First appeared in 1977; stable release 1985
• In BSD, OS X: bawk or nawk.
• GNU/Linux: gawk
    $ which awk
    /bin/awk
    $ ls -l /bin/*awk
    lrwxrwxrwx. 1 root root      4 Jul 2 2013 /bin/awk -> gawk
    -rwxr-xr-x. 1 root root 382456 Jul 4 2012 /bin/gawk
awk basics
• basic function:
  • search files for lines that contain certain patterns,
  • perform actions on those lines
• basic command format:
    $ awk '{action}' file.txt
    $ awk '/pattern/{action}' file.txt
    $ awk '/pattern1/{action1} /pattern2/{action2}' file.txt
gamefile
    northwest NW Charles Main 3.0 .98 3 34
    western WE Sharon Gray 5.3 .97 5 23
    southwest SW Lewis Dalsass 2.7 .8 2 18
    southern SO Suan Chin 5.1 .95 4 15
    southeast SE Patricia Heme 4.0 .7 4 17
    eastern EA TB Savage 4.4 .84 5 20
    northeast NE AM Main Jr. 5.1 .94 3 13
    north NO Margot Webber 4.5 .89 5 9
    central CT Ann Stephens 5.7 .94 5 13
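To follow along with the slides, the sample file can be recreated with a here-document. This is only a sketch: it assumes the fields are separated by single spaces, since the original column spacing did not survive in the slide text.

```shell
# Recreate the sample data used on the following slides.
# Assumption: fields are separated by single spaces.
cat > gamefile <<'EOF'
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
southern SO Suan Chin 5.1 .95 4 15
southeast SE Patricia Heme 4.0 .7 4 17
eastern EA TB Savage 4.4 .84 5 20
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9
central CT Ann Stephens 5.7 .94 5 13
EOF
```

Note that the northeast line has one more field than the others because of "Jr.", which shifts its numeric columns one position to the right.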
awk basics
• basic command examples:
    $ awk '{print}' gamefile    #print all lines (no pattern constraints)
    northwest NW Charles Main 3.0 .98 3 34
    western WE Sharon Gray 5.3 .97 5 23
    southwest SW Lewis Dalsass 2.7 .8 2 18
    southern SO Suan Chin 5.1 .95 4 15
    southeast SE Patricia Heme 4.0 .7 4 17
    eastern EA TB Savage 4.4 .84 5 20
    northeast NE AM Main Jr. 5.1 .94 3 13
    north NO Margot Webber 4.5 .89 5 9
    central CT Ann Stephens 5.7 .94 5 13
awk basics
• basic command examples:
    $ awk '{print $1}' gamefile    #print 1st field in all lines
    northwest
    western
    southwest
    southern
    southeast
    eastern
    northeast
    north
    central
awk basics
• basic command examples:
    $ awk '/north/{print $1}' gamefile    #print 1st field in lines containing north
    northwest
    northeast
    north
awk basics
• basic command examples:
    $ awk '/north/{print $1} /west/{print}' gamefile    #1st field of lines containing north; whole line for lines containing west
    northwest
    northwest NW Charles Main 3.0 .98 3 34
    western WE Sharon Gray 5.3 .97 5 23
    southwest SW Lewis Dalsass 2.7 .8 2 18
    northeast
    north
• a line that matches both patterns (northwest) triggers both actions, in order
More concepts
• A line is called a record
• Text separated by a delimiter is called a field
• FS: input field separator (delimiter)
  • default is whitespace (runs of spaces and tabs)
• two ways to change the default delimiter
  • via the -F option
  • by setting FS
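More concepts
• Change delimiter to ":" via -F
    $ awk -F: '/north/{print $1}' gamefile
    northwest NW Charles Main 3.0 .98 3 34
    northeast NE AM Main Jr. 5.1 .94 3 13
    north NO Margot Webber 4.5 .89 5 9
• Change delimiter to ":" via setting FS
    $ awk '{FS=":"} /north/{print $1}' gamefile
    northwest
    northeast NE AM Main Jr. 5.1 .94 3 13
    north NO Margot Webber 4.5 .89 5 9
• gamefile contains no ":", so with ":" as the delimiter $1 is the whole line
• in the second command, FS is assigned only after the first line has already been split on whitespace, so line 1 still prints just "northwest"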
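Because gamefile contains no colons, the effect of the delimiter is easier to see on colon-delimited input. The sketch below uses a made-up file, users.txt, that is not part of the lecture; the names and values in it are purely illustrative.

```shell
# Hypothetical colon-delimited sample (not from the lecture).
cat > users.txt <<'EOF'
alice:1001:/home/alice
bob:1002:/home/bob
EOF

# -F: sets the input field separator before any line is read.
with_flag=$(awk -F: '{print $1, $3}' users.txt)

# Setting FS inside a per-line action takes effect from the NEXT record,
# so the first line is still split on whitespace and $1 is the whole line.
late_fs=$(awk '{FS=":"} {print $1}' users.txt)

echo "$with_flag"
echo "$late_fs"
```

The first command prints the first and third colon-separated fields of each line; the second prints the whole first line and then only "bob".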
More concepts
• $0, $1, ... etc
  • $0 : the whole line
  • $1 : the first field in a line
• NR : number of records read so far
  • also the line number
• NF : number of fields in the current line
More concepts
• E.g., print line number, first field, and number of fields in the line; connect each output field with "-"
    $ awk '{print NR "-" $1 "-" NF}' gamefile
    1-northwest-8
    2-western-8
    3-southwest-8
    4-southern-8
    5-southeast-8
    6-eastern-8
    7-northeast-9
    8-north-8
    9-central-8
Another example
• E.g., print line number, employee's first name, and number of fields in the line; connect each output field with "---"
    $ cat employees.txt
    Tom Jones 4424 5/12/66 543354
    Mary Adams 5346 11/4/63 28765
    Sally Chang 1654 7/22/54 650000
    Billy Black 1683 9/23/44 336500
    $ awk '{print NR "---" $1 "---" NF}' employees.txt
    1---Tom---5
    2---Mary---5
    3---Sally---5
    4---Billy---5
awk comparison expressions
• Relational operators
  • <, <=, ==, !=, >=, >, ~ (matches), !~ (does not match)
• Conditional expression
  • condition ? exp1 : exp2
• Logical operations
  • &&, ||, !
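The operators listed above can be tried on employees.txt. This is a minimal sketch: the thresholds and the "high"/"low" labels are arbitrary choices made for illustration, not values from the lecture.

```shell
# Recreate employees.txt from the earlier slide.
cat > employees.txt <<'EOF'
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
EOF

# || : third field below 2000 OR fifth field below 100000.
either=$(awk '$3 < 2000 || $5 < 100000 {print $1}' employees.txt)

# ! negates a condition: lines whose first field is not Tom.
not_tom=$(awk '!($1 == "Tom") {print $1}' employees.txt)

# ?: picks one of two values for each line.
label=$(awk '{print $1, ($5 > 500000 ? "high" : "low")}' employees.txt)

echo "$either"
echo "$not_tom"
echo "$label"
```

The parentheses around the conditional expression matter: inside a print statement an unparenthesized > would be read as output redirection.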
More examples
• awk '$3<3000' employees.txt
  • print lines where $3 is less than 3000
• awk '/Tom/{print "Hello, " $1}' employees.txt
  • find the line containing Tom, then print "Hello, Tom"
    $ awk '$3<3000' employees.txt
    Sally Chang 1654 7/22/54 650000
    Billy Black 1683 9/23/44 336500
    $ awk '/Tom/{print "Hello, " $1}' employees.txt
    Hello, Tom
More examples
• awk '/ly/{print $1}' employees.txt
  • print the first name on lines that contain ly
• awk '$1 !~ /ly$/{print $1}' employees.txt
  • print the first names that do not end in ly
    $ awk '/ly/{print $1}' employees.txt
    Sally
    Billy
    $ awk '$1 !~ /ly$/{print $1}' employees.txt
    Tom
    Mary
More examples
• Conditional expression
    $ cat needmax.txt
    1 2
    3 5
    6 3
    7 2
    $ awk '{max=($1>$2)? $1 : $2; print max}' needmax.txt
    2
    5
    6
    7
More examples
• awk '$8 > 10 && $8 < 17' gamefile
• awk '$7==5{print $7+5}' gamefile
    $ awk '$8 > 10 && $8 < 17' gamefile
    southern SO Suan Chin 5.1 .95 4 15
    central CT Ann Stephens 5.7 .94 5 13
    $ awk '$7==5{print $7+5}' gamefile
    10
    10
    10
    10
Math operators
• awk '/southern/{print $5 + 10.56}' gamefile
• awk '/southern/{print $8 - 10}' gamefile
• awk '/southern/{print $8 / 2}' gamefile
• awk '/southern/{print $8 * 2}' gamefile
• awk '/northeast/{print $8 % 3}' gamefile
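The slide lists these commands without their output. The sketch below runs them on a two-line excerpt of gamefile; the file name gamefile_excerpt is made up here so that a full gamefile is not overwritten.

```shell
# Two-line excerpt of gamefile (hypothetical file name).
cat > gamefile_excerpt <<'EOF'
southern SO Suan Chin 5.1 .95 4 15
northeast NE AM Main Jr. 5.1 .94 3 13
EOF

sum=$(awk '/southern/{print $5 + 10.56}' gamefile_excerpt)   # 5.1 + 10.56
diff=$(awk '/southern/{print $8 - 10}' gamefile_excerpt)     # 15 - 10
quot=$(awk '/southern/{print $8 / 2}' gamefile_excerpt)      # 15 / 2
prod=$(awk '/southern/{print $8 * 2}' gamefile_excerpt)      # 15 * 2
# The northeast line has 9 fields because of "Jr.", so $8 is 3 there.
rem=$(awk '/northeast/{print $8 % 3}' gamefile_excerpt)      # 3 % 3

echo "$sum $diff $quot $prod $rem"
```

With these inputs the five results are 15.66, 5, 7.5, 30 and 0. Division is floating point, so 15 / 2 gives 7.5 rather than 7.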
More operators
• assignment operators: =, +=, -=, *=, /=, %=, ^=
• increment and decrement: ++, --
• awk '$3 == "Chris"{ $3 = "Christian"; print}' gamefile
  • if a line's 3rd field is "Chris", change it to "Christian" and print the line (gamefile as shown has no such line, so nothing is printed)
• awk '{$7^=2; print $7}' gamefile
  • square the 7th field and print the 7th field
More operators
• awk '{x=1; y=x++; print x, y}' gamefile
  • x++ is post-increment: y receives the old value of x (1), then x becomes 2
    $ awk '{x=1;y=x++; print x,y}' gamefile
    2 1
    2 1
    2 1
    2 1
    2 1
    2 1
    2 1
    2 1
    2 1
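As a sketch of the assignment operators in use, the example below keeps a running total with += and squares a field with ^=. It works on a three-line excerpt of gamefile stored under the made-up name gamefile_top3.

```shell
# First three lines of gamefile (hypothetical file name).
cat > gamefile_top3 <<'EOF'
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
EOF

# += accumulates across records: variables keep their value between lines.
running=$(awk '{total += $8; print $1, total}' gamefile_top3)

# Assigning to a field rebuilds $0, so a plain print shows the changed line.
squared=$(awk '{$7 ^= 2; print}' gamefile_top3)

echo "$running"
echo "$squared"
```

The running totals are 34, 57 and 75, and the seventh fields become 9, 25 and 4 in the printed lines.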
BEGIN pattern
• BEGIN pattern is followed by an action block that is executed before processing any lines from the input file.
• can run an awk command without file
    $ awk 'BEGIN{print "Hello" }'
    Hello
BEGIN pattern
• Change delimiter to ":" via setting FS
    $ awk '{FS=":"} /north/{print $1}' gamefile
    northwest
    northeast NE AM Main Jr. 5.1 .94 3 13
    north NO Margot Webber 4.5 .89 5 9
    $ awk 'BEGIN{FS=":"} /north/{print $1}' gamefile
    northwest NW Charles Main 3.0 .98 3 34
    northeast NE AM Main Jr. 5.1 .94 3 13
    north NO Margot Webber 4.5 .89 5 9
• without BEGIN, FS changes only after the first line has been split on whitespace; with BEGIN, FS is set before any line is read, so every line is split on ":"
END pattern
• END pattern allows actions to be executed after processing all lines in the input file
• unlike BEGIN, a program with an END block still reads its input (a file or standard input) before the END action runs
    $ awk 'END{ print "The number of records is " NR}' employees.txt
    The number of records is 4
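BEGIN, a per-line action and END are often combined in one program. The following is a minimal sketch on employees.txt; it treats the fifth field as a salary (as the lecture's script example does), and the header text is an arbitrary choice.

```shell
# Recreate employees.txt from the earlier slide.
cat > employees.txt <<'EOF'
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
EOF

report=$(awk '
BEGIN { print "name salary" }                      # once, before any input
      { total += $5; print $1, $5 }                # once per record
END   { print "records:", NR; print "total:", total }  # once, after all input
' employees.txt)

echo "$report"
```

The report starts with the header line, lists one name and salary per record, and ends with "records: 4" and "total: 1558619".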
Can use redirection & pipe in actions
• awk '$8 > 10 && $8 < 17 {print}' gamefile
• awk '$8 > 10 && $8 < 17 {print > "tmp.out"}' gamefile
    $ awk '$8 > 10 && $8 < 17 {print}' gamefile
    southern SO Suan Chin 5.1 .95 4 15
    central CT Ann Stephens 5.7 .94 5 13
    $ awk '$8 > 10 && $8 < 17 {print > "tmp.out"}' gamefile
    $ cat tmp.out
    southern SO Suan Chin 5.1 .95 4 15
    central CT Ann Stephens 5.7 .94 5 13
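The slide shows redirection to a file but not a pipe. As a small sketch of the pipe form, the action below sends each first name to the standard sort utility, which prints the names in sorted order once awk finishes.

```shell
# Recreate employees.txt from the earlier slide.
cat > employees.txt <<'EOF'
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
EOF

# print | "command" writes the output lines to the command's standard input.
sorted=$(awk '{print $1 | "sort"}' employees.txt)

echo "$sorted"
```

The command string is quoted inside the awk program, just like the file name in the redirection example.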
awk in a script
• awk -f awk.script somefile
    $ cat awk.script
    #file: awk_first
    /Tom/{print "Tom's birthday is " $4}
    /Mary/{print NR, $0} #print line number
    /^Sally/{print "Hi, Sally. " $1 " has salary of $" $5 "."}
    $ awk -f awk.script employees.txt
    Tom's birthday is 5/12/66
    2 Mary Adams 5346 11/4/63 28765
    Hi, Sally. Sally has salary of $650000.
More to explore
• conditional statement
• loop
• arrays
• user-defined functions
Chapter 6 of Unix Shells by Example [Online Version], 4th Edition, by Ellie Quigley
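As a brief taste of the topics listed above, the sketch below combines an if/else statement, an associative array, a for-in loop and a user-defined function on employees.txt. The function name max, the 500000 threshold and the "high"/"low" labels are illustrative assumptions, not material from the lecture.

```shell
# Recreate employees.txt from the earlier slide.
cat > employees.txt <<'EOF'
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
EOF

summary=$(awk '
# user-defined function: return the larger of two numbers
function max(a, b) { return (a > b) ? a : b }

{
    # conditional statement inside an action
    if ($5 >= 500000) band = "high"; else band = "low"
    count[band]++               # associative array keyed by a string
    top = max(top, $5 + 0)      # adding 0 forces a numeric comparison
}

END {
    # for-in loop; iteration order is unspecified, so sort the output
    for (b in count) print b, count[b] | "sort"
    close("sort")               # wait for sort before printing the last line
    print "max", top
}
' employees.txt)

echo "$summary"
```

On this data the summary reports two records in each band and a maximum of 650000.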