Map Tutorial: The Basics
The map built-in function lets you build a new list from an existing list, transforming the elements as you go. Map takes a list and applies either a block of code or an expression to each element of that list to produce a new list. I will limit the scope of this tutorial to the code block form.
When applied correctly, map can produce fast, compact transforms. When abused, it can produce some extremely obfuscated code, sacrificing readability and maintainability (and giving legacy coders unnecessary headaches).
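As a minimal sketch of the block form (the sample numbers and variable names are illustrative assumptions, not part of the original tutorial), map runs the block once per element with $_ set to that element:

```perl
use strict;
use warnings;

# Hypothetical sample data; any list works.
my @numbers = (1, 2, 3, 4);

# The block runs once per element, with $_ aliased to that element.
# Whatever the block returns is added to the new list.
my @doubled = map { $_ * 2 } @numbers;

print "@doubled\n";   # prints "2 4 6 8"
```

The original list is left untouched here because the block only reads $_ and never assigns to it.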
Vroom has an excellent tutorial on Complex Sorting - in it he has some more complex but extremely useful explanations of map. The purpose of this tutorial is to talk about the easy stuff: to allow a programmer new to the concept to stick one toe in the water at a time, so to speak.
Map: what is it good for?
Reading a file into an array gives you one line per element. But say that you wanted each element to contain a single word instead. As long as you don't care about punctuation, you can use the map function like so:
Remember that split uses whitespace as its default delimiter, and the special variable $_ as its default string to split up. The map expression can be written with explicit arguments as:
The choice to use default arguments is a trade-off between understandability and laziness/elegance. Also, remember that a filehandle can be read in list context, in which case it returns all of its remaining lines at once.
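A minimal sketch of the idea, assuming an in-memory filehandle in place of a real file (the sample text and the names $text, $fh and @words are illustrative):

```perl
use strict;
use warnings;

# Assumption: an in-memory filehandle stands in for a real text file.
my $text = "Hello World!\nJust another Perl hacker,\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

# <$fh> in list context yields every line; split with no arguments
# breaks $_ on whitespace, so each line contributes its words.
my @words = map { split } <$fh>;
close $fh;

print "$_\n" for @words;
```

Note that punctuation stays attached to the words: "World!" and "hacker," come through unchanged.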
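A sketch of the explicit form, again assuming an in-memory filehandle and illustrative sample text. The pattern ' ' is the special whitespace mode that split uses by default:

```perl
use strict;
use warnings;

# Assumption: an in-memory filehandle stands in for a real text file.
my $text = "Hello World!\nJust another Perl hacker,\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

# Same behavior as a bare split, with both defaults spelled out:
# ' ' splits on runs of whitespace, and $_ is the current line.
my @words = map { split ' ', $_ } <$fh>;
close $fh;

print "$_\n" for @words;
```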
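To illustrate the list-context point, here is a small sketch (in-memory filehandle and sample lines are assumptions) contrasting a scalar read with a list read:

```perl
use strict;
use warnings;

# Assumption: an in-memory filehandle stands in for a real text file.
my $text = "line one\nline two\nline three\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my $first = <$fh>;   # scalar context: reads a single line
my @rest  = <$fh>;   # list context: reads every remaining line

close $fh;

print "first: $first";
print "remaining lines: ", scalar(@rest), "\n";
```

Because map expects a list, a filehandle placed after the block is read in list context automatically.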
Now suppose you do care about punctuation and want to strip it out with a regex substitution. Sorry, but the details of this regex are beyond the scope of the tutorial; be sure to check out root's tutorial on String matching and Regular Expressions. I will tell you what it does, though: it turns Hello World! into Hello World and it does so without removing punctuation from acronyms like J.A.P.H. - okay, okay, half-truth: J.A.P.H. becomes J.A.P.H - good enough for this example (can anyone say "exercise for reader").
Moving on . . . now we can add this regex. The inner block of a map statement may contain a number of statements separated by semicolons. The statements are executed in order, left to right:
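For concreteness, here is one substitution that behaves as described. It is an assumption for illustration and not necessarily the regex the tutorial originally showed; strip_punct is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: removes . , ! or ? when the character is NOT
# immediately followed by a word character (\B = "not a word boundary").
sub strip_punct {
    my ($string) = @_;
    $string =~ s/[.,!?]\B//g;
    return $string;
}

for my $sample ('Hello World!', 'J.A.P.H.') {
    print "$sample -> ", strip_punct($sample), "\n";
}
# prints:
#   Hello World! -> Hello World
#   J.A.P.H. -> J.A.P.H
```

The dots inside J.A.P.H. survive because each is followed by a letter; only the trailing dot is removed.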
Example 3: (know what a function returns!)
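A sketch of the pitfall, assuming an illustrative punctuation-stripping regex and an in-memory sample file (neither is taken from the original post). The substitution is the last statement in the block, so its return value is what map collects:

```perl
use strict;
use warnings;

# Assumption: an in-memory filehandle stands in for a real text file.
my $text = "Hello World!\nno punctuation here\nOne, two, three.\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

# The last (and only) statement is s///, so map collects ITS return
# value: the number of substitutions, or an empty string if none.
my @result = map { s/[.,!?]\B//g } <$fh>;
close $fh;

print "[$_]\n" for @result;
# prints:
#   [1]
#   []
#   [3]
```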
Uh-oh. What happened? If you try this, you will not receive the output you might have expected. Instead, you will see numbers and/or blank lines. If the substitution operator finds no punctuation in a line, it returns false (an empty string); otherwise it returns the number of substitutions made on that line. It does NOT return the line itself. In cases like this, the function or operator modifies its argument in place. Split does not work in this manner - it returns what was split off. Look at example 2 again - the last thing that gets passed out of the map block is the return value of split. So, if we want to return the line altered by a substitution, we will have to tell Perl so - like this:
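A sketch of the fix, assuming an illustrative punctuation-stripping regex and an in-memory sample file (neither is from the original post). Ending the block with $_ returns the altered line; ending it with split returns the altered line's words:

```perl
use strict;
use warnings;

# Assumption: sample text held in a string instead of a real file.
my $text = "Hello World!\nJ.A.P.H.\n";

# Return the modified line by making $_ the last statement.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @lines = map { s/[.,!?]\B//g; $_ } <$fh>;
close $fh;

# Or clean each line first, then return its words via split.
open $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @words = map { s/[.,!?]\B//g; split } <$fh>;
close $fh;

print @lines;          # each line still ends with its newline
print "@words\n";      # prints "Hello World J.A.P.H"
```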
Map: what is it NOT good for?
Some built-in functions, such as chomp and reverse, can be applied to a whole list AT ONCE, so to speak. For example, if you wanted to slurp the contents of a text file into a list without the newlines, you might be tempted to use your new knowledge like so:
(Remember what we learned from example 3 - chomp returns the number of characters it removed, not the chomped string, so we have to explicitly tell Perl we want the remaining value.) However, it turns out that chomp can do a much better job by itself:
The second example actually runs faster. Why? Because Perl reads the entire file into the array in one operation, and chomp then strips every element in a single call, with the looping done inside the interpreter. By using a map statement, however, you are forcing your block to be run once per line.
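A sketch of the tempting map version, assuming an in-memory filehandle and illustrative sample lines:

```perl
use strict;
use warnings;

# Assumption: an in-memory filehandle stands in for a real text file.
my $text = "alpha\nbeta\ngamma\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

# chomp modifies $_ in place; the trailing $_ is what map collects.
my @lines = map { chomp; $_ } <$fh>;
close $fh;

print join(',', @lines), "\n";   # prints "alpha,beta,gamma"
```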
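A sketch of the chomp-only version under the same assumptions (in-memory filehandle, illustrative sample lines). The assignment in parentheses gives chomp the whole array to work on:

```perl
use strict;
use warnings;

# Assumption: an in-memory filehandle stands in for a real text file.
my $text = "alpha\nbeta\ngamma\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

# Slurp every line into @lines, then chomp the whole array in one call.
chomp(my @lines = <$fh>);
close $fh;

print join(',', @lines), "\n";   # prints "alpha,beta,gamma"
```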
I used the Benchmark module to time these two examples over 100 iterations, using '/usr/dict/words' as the input file.
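A comparison along these lines can be rerun with the core Benchmark module. The sketch below is an assumption about how such a test might be set up, not the original script: it uses a generated in-memory word list rather than '/usr/dict/words', the sub names with_map and with_chomp are hypothetical, and the timings it prints will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumption: a generated word list stands in for /usr/dict/words.
my $data = '';
$data .= "word$_\n" for 1 .. 5000;

# Version 1: force a Perl-level block to run for every line.
sub with_map {
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    my @lines = map { chomp; $_ } <$fh>;
    close $fh;
    return \@lines;
}

# Version 2: slurp the lines, then chomp the whole array in one call.
sub with_chomp {
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    chomp(my @lines = <$fh>);
    close $fh;
    return \@lines;
}

# Run each version 100 times and print a comparison table.
cmpthese(100, {
    map_chomp  => \&with_map,
    bare_chomp => \&with_chomp,
});
```

Reading from memory removes disk I/O from the picture, so this isolates the cost of the map block itself; results against a real file may differ.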
Something else to consider is readability and maintainability. If you want your code to be either, dense map statements might not be a good solution - let's face it, a one-liner of death is hard going for anyone who doesn't read Perl every day, and unless you like watching ears bleed, keep it simple! (Personally, I like watching ears bleed!)
Of course, there aren't too many obfuscated Perl scripts out there that don't use map. Keep up the higher learning!