CS 330 Lecture 3 – Regular expressions++
January 27, 2012 by Chris Johnson. Filed under cs330, lectures, spring 2012.
Agenda
- more wildcard characters
- developing on clark.cs.uwec.edu
- substitution
- slurping a file
- expressions as replacements
- Perl subroutines
- zero-width assertions
Code
emails.pl
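#!/usr/bin/perl
# Turn the first whitespace-delimited token on each line into a uwec.edu address.
open(my $in, '<', 'getlist.txt') or die "Cannot open getlist.txt: $!";
while (my $line = <$in>) {
  # Keep the leading run of non-space characters and drop the rest of the line.
  # The trailing newline survives because . does not match it.
  $line =~ s/^(\S+).*$/$1\@uwec.edu/;
  print($line);
}
close($in);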
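A minimal sketch for trying the emails.pl substitution without getlist.txt. The sample line format (a username followed by a full name) and the helper name to_email are assumptions for illustration, not taken from the lecture's data file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keep the first whitespace-delimited token of a line
# and append the domain. Lines that start with whitespace are left alone
# because ^(\S+) cannot match them.
sub to_email {
  my ($line) = @_;
  $line =~ s/^(\S+).*$/$1\@uwec.edu/;
  return $line;
}

print to_email("jdoe John Doe\n");   # prints jdoe@uwec.edu
print to_email("asmith\n");          # prints asmith@uwec.edu
```

The at sign is escaped in the replacement because Perl would otherwise try to interpolate an array named @uwec.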
page.html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title></title>
</head>
<body>
<h2>Foo</h2>
<h3>Blech</h3>
<h4>Scrumpt</h4>
</body>
</html>
demote.pl
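#!/usr/bin/perl
# Subtract 1 from the level of every heading tag in page.html (h2 becomes h1, and so on).
open(my $in, '<', 'page.html') or die "Cannot open page.html: $!";
my $html;
{
  # Undefine the input record separator so <$in> slurps the whole file at once.
  local $/ = undef;
  $html = <$in>;
  # /e evaluates the replacement as Perl code, /g hits every tag, /i ignores case.
  $html =~ s#(<\s*/?h)(\d)(\s*)#$1 . ($2 - 1) . $3#gie;
}
print $html;
close($in);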
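The expression-as-replacement idea in demote.pl can be sketched on an in-memory string, with the direction of the shift made a parameter. The helper name shift_headings and its $delta argument are hypothetical, introduced only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: add $delta to the level of every opening and closing
# heading tag. With /e the replacement is evaluated as Perl code, so it can
# do arithmetic on the captured digit.
sub shift_headings {
  my ($html, $delta) = @_;
  $html =~ s#(<\s*/?h)(\d)#$1 . ($2 + $delta)#gie;
  return $html;
}

print shift_headings("<h2>Foo</h2>\n", -1);   # prints <h1>Foo</h1>
print shift_headings("<h2>Foo</h2>\n", 1);    # prints <h3>Foo</h3>
```

Tags such as `<html>` and `<head>` are untouched because the pattern requires a digit right after the h.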
text.txt
asdfasjkdf askjdf asdkjfhadsjkf hasdkjfsad'fasdf
sdf
asfdsafasdfa
sdfas dfsakdjfsakldfj asdf
asdf
asdfasdjf asdfkljasdf kasdhfsakjdhf sadf sadfjhsdajkfhsadjfhsdf asdfj sdfkjhasfd kjasdf
asdfhasdjkhf
asdflj asdfl sadf asdlfkja sdf
alsdkjfasdflj asdfasd fasd falsdjfadsf
lasdkjfsaldf asdfhsdfjk
foo
insert_paras.pl
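#!/usr/bin/perl
# Wrap the blank-line-separated chunks of text.txt in <p> elements.
open(my $in, '<', 'text.txt') or die "Cannot open text.txt: $!";
my $html;
{
  # Undefine the input record separator so <$in> slurps the whole file at once.
  local $/ = undef;
  $html = <$in>;
  # Zero-width assertions: match the empty spot between two consecutive
  # newlines without consuming either one.
  $html =~ s#(?<=\n)(?=\n)#</p>\n\n<p>#g;
}
print "<p>$html</p>";
close($in);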
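To see exactly where the zero-width assertions in insert_paras.pl match, here is a small sketch that runs the same substitution on an in-memory string. The helper name insert_paras and the sample text are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the lookbehind (?<=\n) and lookahead (?=\n) together
# match the empty position between two adjacent newlines, so the replacement
# is inserted there and both newlines are kept.
sub insert_paras {
  my ($text) = @_;
  $text =~ s#(?<=\n)(?=\n)#</p>\n\n<p>#g;
  return "<p>$text</p>";
}

print insert_paras("one\n\ntwo\n");
```

For this input the first paragraph closes right after the first newline and the second opens just before the second newline. A single newline between two lines is never matched, so consecutive lines stay in the same paragraph.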
stuff.txt
asdf asdfjhasd kfjhasdf
sadfjasdf
[1, 5, 6]
asdfkjlasd
[10]
[6, 0 , 3]
summer.pl
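#!/usr/bin/perl
# Replace each bracketed list of integers in stuff.txt with its sum.
# sum adds up every run of digits found in its argument.
sub sum($) {
  my($sequence) = @_;
  my $sum = 0;
  while ($sequence =~ m/(\d+)/g) {
    $sum += $1;
  }
  return $sum;
}
open(my $in, '<', 'stuff.txt') or die "Cannot open stuff.txt: $!";
my $html;
{
  # Undefine the input record separator so <$in> slurps the whole file at once.
  local $/ = undef;
  $html = <$in>;
  # /e calls sum() on the captured list body and substitutes its return value.
  $html =~ s#\[(\s*\d+(\s*,\s*\d+)*\s*)\]#sum($1)#ge;
}
print $html;
close($in);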
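A self-contained sketch of the same subroutine-in-the-replacement technique, run on a string that mirrors stuff.txt so that no file is needed. The names sum_list and sum_lists are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: add up every run of digits in the string.
sub sum_list {
  my ($sequence) = @_;   # copy the argument before running another match
  my $sum = 0;
  while ($sequence =~ m/(\d+)/g) {
    $sum += $1;
  }
  return $sum;
}

# Hypothetical helper: replace each [a, b, ...] list with its sum.
sub sum_lists {
  my ($text) = @_;
  $text =~ s#\[(\s*\d+(\s*,\s*\d+)*\s*)\]#sum_list($1)#ge;
  return $text;
}

print sum_lists("[1, 5, 6]\nasdfkjlasd\n[10]\n[6, 0 , 3]\n");
# prints 12, asdfkjlasd, 10, and 9 on separate lines
```

Lines that do not contain a bracketed list of integers pass through unchanged.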
Haiku
interpretation A:
spouse criteria
match caret dot star dollar
I am desperate
interpretation B:
spouse criteria
match caret dot star dollar
all wild, always