CS 330 Lecture 4 – State Machines and Regular Expressions
Agenda
- State machines: states, transitions, accepting states
- A machine to recognize parenthesized words
- A machine to recognize numbers
- A possible implementation
- Regular expressions as state machine encoding
- Not much Ruby
- Composing a string of email addresses
- Removing all HTML
- Instantiating states (geographic ones)
TODO
- If we want more on regex, read pages 257-269 of Read Ruby. (There’s a better site for this book, but it appears to be down.)
- If we want to start assembly, read chapters 1-3 of Programming From the Ground Up, stopping before the Finding a Maximum Value section (page 21 in the letter-size edition.)
Program This
Draw a state machine that accepts numbers that
- May be negative or positive.
- Always have a number in the ones-place.
- May have a decimal and numbers following.
- May have a scientific notation suffix of “e” followed by more numbers. This suffix follows any decimal portion.
Code
email_list.rb
#!/usr/bin/env ruby
text = IO.read(ARGV[0])
list = ""
text.scan(/[A-Z]{2,}/) do |match|
list += $& + "@uwec.edu,"
end
puts list
strip_html.rb
#!/usr/bin/env ruby
text = IO.read(ARGV[0])
text = text.gsub(/<.*?>/, '')
puts text
Haiku
It’s all legal there.
In .*, anything goes.
One accepting state.
In .*, anything goes.
One accepting state.