CS 330 Lecture 3 – Regular expressions++
Agenda
- more wildcard characters
- developing on clark.cs.uwec.edu
- substitution
- slurping a file
- expressions as replacements
- Perl subroutines
- zero-width assertions
Code
emails.pl
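#!/usr/bin/perl

open($in, '<', 'getlist.txt') or die "Cannot open getlist.txt: $!";
while ($line = <$in>) {
  # Keep the first run of non-whitespace characters and append the domain.
  $line =~ s/^(\S+).*$/$1\@uwec.edu/;
  print($line);
}
close($in);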
page.html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title></title>
</head>
<body>
<h2>Foo</h2>
<h3>Blech</h3>
<h4>Scrumpt</h4>
</body>
</html>
demote.pl
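#!/usr/bin/perl

open($in, '<', 'page.html') or die "Cannot open page.html: $!";
{
  # Undefine the input record separator so <$in> slurps the whole file.
  local $/ = undef;
  $html = <$in>;
  # /e evaluates the replacement as Perl code, /g repeats it for every tag,
  # and /i ignores case. Note that $2 - 1 moves each heading up a level
  # (h2 becomes h1); $2 + 1 would push headings down instead.
  $html =~ s#(<\s*/?h)(\d)(\s*)#$1 . ($2 - 1) . $3#gie;
}
print $html;
close($in);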
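To see the /e flag on its own, here is a minimal sketch that applies the same substitution as demote.pl to an inline string instead of a slurped file. The sample string and the variable names $html and $shifted are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input; not one of the lecture files.
my $html = "<h2>Foo</h2><H3>Blech</H3>";

# Copy $html into $shifted, then substitute on the copy.
# The replacement is Perl code (because of /e), so the captured
# digit can take part in arithmetic before being glued back in.
(my $shifted = $html) =~ s#(<\s*/?h)(\d)(\s*)#$1 . ($2 - 1) . $3#gie;

print "$shifted\n";
```

This prints <h1>Foo</h1><H2>Blech</H2>. The /i flag lets the pattern match the uppercase H, and the captured text keeps its original case.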
text.txt
asdfasjkdf askjdf asdkjfhadsjkf hasdkjfsad'fasdf sdf asfdsafasdfa sdfas dfsakdjfsakldfj asdf asdf asdfasdjf asdfkljasdf kasdhfsakjdhf sadf sadfjhsdajkfhsadjfhsdf asdfj sdfkjhasfd kjasdf asdfhasdjkhf asdflj asdfl sadf asdlfkja sdf alsdkjfasdflj asdfasd fasd falsdjfadsf lasdkjfsaldf asdfhsdfjk foo
insert_paras.pl
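#!/usr/bin/perl

open($in, '<', 'text.txt') or die "Cannot open text.txt: $!";
{
  local $/ = undef;
  $html = <$in>;
  # Match the empty position between two consecutive newlines and insert
  # a paragraph break there. The lookbehind and lookahead are zero-width,
  # so neither newline is consumed.
  $html =~ s#(?<=\n)(?=\n)#</p>\n\n<p>#g;
}
print "<p>$html</p>";
close($in);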
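The zero-width behavior is easier to see on a tiny input. The sketch below runs the same substitution as insert_paras.pl on a two-paragraph string; the sample text and the names $text and $marked are assumptions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample text: two paragraphs separated by one blank line.
my $text = "first\n\nsecond";

# (?<=\n) looks behind for a newline and (?=\n) looks ahead for one.
# Neither assertion consumes a character, so the match is the empty
# position between the two newlines and both newlines are kept.
(my $marked = $text) =~ s#(?<=\n)(?=\n)#</p>\n\n<p>#g;

print "<p>$marked</p>\n";
```

Both original newlines are still present in $marked, one on each side of the inserted tags. A pattern of \n\n would have matched and replaced the newlines themselves.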
stuff.txt
asdf asdfjhasd kfjhasdf sadfjasdf [1, 5, 6] asdfkjlasd [10] [6, 0 , 3]
summer.pl
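#!/usr/bin/perl

# Add up every integer that appears in the given string.
sub sum($) {
  my($sequence) = @_;
  my $sum = 0;
  # In scalar context, m//g resumes where the previous match ended.
  while ($sequence =~ m/(\d+)/g) {
    $sum += $1;
  }
  return $sum;
}

open($in, '<', 'stuff.txt') or die "Cannot open stuff.txt: $!";
{
  local $/ = undef;
  $html = <$in>;
  # Replace each bracketed, comma-separated list with the sum of its numbers.
  $html =~ s#\[(\s*\d+(\s*,\s*\d+)*\s*)\]#sum($1)#ge;
}
print $html;
close($in);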
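As a quick check of how the subroutine and /e work together, here is a small sketch that runs the summer.pl substitution on an inline string. The subroutine is renamed sum_digits and the sample line is invented; both are assumptions for illustration, not part of the lecture files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Add up every run of digits found in the string.
sub sum_digits {
  my ($sequence) = @_;   # copy the argument before running another match
  my $total = 0;
  while ($sequence =~ m/(\d+)/g) {
    $total += $1;
  }
  return $total;
}

# Hypothetical sample line in the style of stuff.txt.
my $line = "a [1, 5, 6] b [10]";

# /e calls sum_digits on the text captured between the brackets.
(my $summed = $line) =~ s#\[(\s*\d+(\s*,\s*\d+)*\s*)\]#sum_digits($1)#ge;

print "$summed\n";
```

This prints a 12 b 10. Each bracketed list, brackets included, is replaced by the value the subroutine returns.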
Haiku
interpretation A:
spouse criteria
match caret dot star dollar
I am desperate
interpretation B:
spouse criteria
match caret dot star dollar
all wild, always