Re: Teaching Regular Expression Pattern Matching

...colleagues who tend to do a lot of elaborate string manipulation using only built-in string functions—a common anti-pattern I've observed

Curiously, I more frequently see the converse anti-pattern, namely over-using regexes. For example, I've often seen Perl beginners write:

if ($nodename =~ /$mynode/)

when they should have been using:

if ($nodename eq $mynode)

Apart from the obvious problem of embedded names (e.g. "freddy" matches /fred/), I've lost count of the number of times I've asked a Perl rookie to consider what happens if $mynode contains regex metacharacters. Update: So I suggest you mention \Q and \E and quotemeta in your course.
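To make the two pitfalls concrete, here is a minimal sketch. The sub names and the sample node names are hypothetical, invented purely for illustration; only eq, \Q...\E and quotemeta are the real Perl features under discussion.

```perl
use strict;
use warnings;

# The anti-pattern: $mynode is interpolated as an unanchored regex.
sub regex_match {
    my ($nodename, $mynode) = @_;
    return $nodename =~ /$mynode/ ? 1 : 0;
}

# Plain string equality: no regex involved at all.
sub exact_match {
    my ($nodename, $mynode) = @_;
    return $nodename eq $mynode ? 1 : 0;
}

# If a regex really is wanted, \Q...\E escapes metacharacters in the
# interpolated value, and \A / \z anchor both ends of the string.
sub quoted_match {
    my ($nodename, $mynode) = @_;
    return $nodename =~ /\A\Q$mynode\E\z/ ? 1 : 0;
}

# Embedded name: "freddy" contains "fred".
print regex_match('freddy', 'fred'), "\n";      # 1 (false positive)
print exact_match('freddy', 'fred'), "\n";      # 0

# Metacharacter: "." matches any character.
print regex_match('nodeX1', 'node.1'), "\n";    # 1 (false positive)
print quoted_match('nodeX1', 'node.1'), "\n";   # 0
print quoted_match('node.1', 'node.1'), "\n";   # 1

# quotemeta does the same escaping as \Q...\E, as a function.
print quotemeta('node.1'), "\n";                # node\.1
```

Worse than a false positive, a value such as "node(1" interpolated without \Q makes the regex fail to compile at run time.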

I faced a similar Perl training problem a few years back and sent out an email with seven specific word puzzle problems against the Unix "words" file (e.g. /usr/dict/words). I offered some sort of prize for the winner IIRC. All could be solved as one-liners using: perl -ne 'your-program-here' words or via a longer program, if you prefer.

For example: find all palindromes; find the longest word in the dictionary; find all words that contain a particular letter four or more times; find all words that start with "e", have "n" as their second-to-last letter, and are more than seven characters long; find all words that are of even length and contain an even number of each and every distinct vowel in the word. There is no shortage of interesting word puzzles.

You can further ask them to produce both regex and non-regex solutions, to compare and contrast which approach is more appropriate for each problem. It is not hard to invent problems where the regex solution is vastly superior to the non-regex one, which may help convince them of the power of regexes.
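As a rough sketch of the regex versus non-regex comparison, here are three of the puzzles written as small subs. The sub names and test words are hypothetical and not taken from the original email; each sub expects a word that has already been chomped.

```perl
use strict;
use warnings;

# Palindromes, no regex needed: compare the word with its reversal.
sub is_palindrome {
    my ($word) = @_;
    return $word eq scalar(reverse($word)) ? 1 : 0;
}

# Starts with "e", has "n" second-to-last, more than seven characters.
# Regex version: e + at least five characters + n + one final character.
sub e_n_regex {
    my ($word) = @_;
    return $word =~ /\Ae.{5,}n.\z/ ? 1 : 0;
}

# The same puzzle using only built-in string functions.
sub e_n_plain {
    my ($word) = @_;
    return ( length($word) > 7
          && substr($word, 0, 1) eq 'e'
          && substr($word, -2, 1) eq 'n' ) ? 1 : 0;
}

# Count occurrences of one letter: a global match in list context,
# assigned through the empty list to get the number of matches.
sub count_letter {
    my ($word, $letter) = @_;
    my $count = () = $word =~ /\Q$letter\E/g;
    return $count;
}

print is_palindrome('level'), "\n";             # 1
print e_n_regex('elephant'), "\n";              # 1
print e_n_plain('evening'), "\n";               # 0 (only seven characters)
print count_letter('possesses', 's'), "\n";     # 5
```

As a one-liner against the words file, the palindrome check could be run as something like perl -lne 'print if $_ eq reverse $_' words, where -l strips the trailing newline before the comparison.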

Update: See also:
