I want to meditate on some of the most useful perl-isms that, while easy, are often misunderstood by beginners. I say that, and am possibly projecting, since when I was a beginner I had not grokked them and misused them. I also want to give kudos to the comments below, as they have greatly helped refine this posting.
Map, Grep and Sort are not the same thing, but are often used together. Flowing from right to left, they act like shell pipelining in reverse. They build something new, like a tiny little factory. While the original data structure is untouched, it should feel like a list is being transformed every step of the way.
Grep
I will start with grep as it is easier to understand. It is a filter in the purest sense. From the right, grep receives a list of things, one thing at a time, and either passes each thing to the left or does not, based on its logic. Its usage is
grep { filtering-block } @input_list
What is the filtering block? It is called for each item in the input list. If it returns a true value, that list element is passed to the left. If it returns a false value, the list element is not passed to the left. Inside the filtering block, the list element being examined is aliased to the variable $_. On the right, I see the input list. To the left is passed the filtered list.
my @caps = grep { $_ =~ /^[A-Z]/ } qw( This is for Edna );
Since the magic variable is implied, the above may be shortened to
my @caps = grep { /^[A-Z]/ } qw( This is for Edna );
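As a minimal sketch of the filter in action (the variable names @words and $cap_count are mine, picked for illustration), the same block can be used in list context to get the matching elements and in scalar context to get a count of them:

```perl
use strict;
use warnings;

my @words = qw( This is for Edna );

# List context: grep returns the elements for which the block is true.
my @caps = grep { /^[A-Z]/ } @words;

# Scalar context: grep returns how many elements matched.
my $cap_count = grep { /^[A-Z]/ } @words;

print "@caps\n";        # prints "This Edna"
print "$cap_count\n";   # prints "2"
```

The input list @words is untouched after both calls.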
Pipelining
It is common for grep and map to be chained together in a pipeline. Each part of the pipeline gets a list from the right and hands off a list to the left.
The following snippet finds all words that start with a capitalized vowel. The first filter (the rightmost grep) finds all words that start with a vowel and passes is and Edna to the left. The second filter then passes only Edna to the left.
my @vowel_caps = grep { /^[A-Z]/ } grep { /^[aAeEiIoOuU]/ }
qw( This is for Edna );
The pipeline can consist of mixed and matched grep, map and sort pieces.
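Here is a small sketch of such a mixed pipeline, read from the bottom up. The word list is made up for the example:

```perl
use strict;
use warnings;

my @words = qw( This is for Edna and Otto );

my @result =
    sort { $a cmp $b }       # 3. sort the survivors alphabetically
    map  { uc $_ }           # 2. upper-case each survivor
    grep { /^[aeiou]/i }     # 1. keep words that start with a vowel
    @words;

print "@result\n";   # prints "AND EDNA IS OTTO"
```

Putting each stage on its own line, with the input list last, keeps the right-to-left flow visible.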
Sort
As it sounds, sort takes a list from the right and passes a list of equal size to the left, ordered by the sort block. The sort block is a function that compares the variables $a and $b and returns a negative number, zero or a positive number (cmp and <=> return exactly -1, 0 or 1). The variables $a and $b are two items from the list that is being sorted. If the return value is negative, $a is sorted before $b. If it is positive, $b is sorted before $a. For a 0 return value, the two items compare as equal and the order does not matter. For strings, the cmp operator is used and for numbers the spaceship operator <=> is used. You can do two or more levels of sorting by chaining comparisons with the || operator: when the first comparison returns 0, the next one decides.
my @numbers = sort { $b <=> $a } ( 23,5,23,64,2);
my @words = sort { $b cmp $a }
( "this", "Is", "sortingly", "something" );
my @owords = sort ( "default", "operator", "is", "text", "sort" );
my @things = sort { $a->{order} <=> $b->{order} }
( { name => "foo", order => 4 },
{ name => "bar", order => 88 },
{ name => "baf", order => -12 } );
my @things = sort { $a->{order} <=> $b->{order} ||
$b->{name} cmp $a->{name} }
( { name => "foo", order => 4 },
{ name => "bar", order => 88 },
{ name => "baf", order => -12 },
{ name => "zoo", order=>88 } );
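To make the two-level sort concrete, here is a sketch that runs the same comparison and prints the resulting order of names. The variable names @records, @sorted and @names are mine:

```perl
use strict;
use warnings;

my @records = (
    { name => "foo", order => 4   },
    { name => "bar", order => 88  },
    { name => "baf", order => -12 },
    { name => "zoo", order => 88  },
);

my @sorted = sort {
    $a->{order} <=> $b->{order}    # primary key: order, ascending
        ||
    $b->{name} cmp $a->{name}      # tie-breaker: name, descending
} @records;

my @names = map { $_->{name} } @sorted;
print "@names\n";   # prints "baf foo zoo bar"
```

zoo lands before bar because both have order 88 and the tie-breaker swaps $a and $b to sort names in descending order.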
Map
Map takes each item from the list on the right and passes the return value from its block to the left. As a matter of practice, it should not be used to transform the items of the input list, it should not be used for side effects, and it should not be used merely as a way to iterate through a list. Each item is aliased in the block as the magic variable $_.
my @squares = map { $_ * $_ } ( 1,2,3,4,5,6 );
my @even_squares = map { $_ * $_ } grep { $_/2 == int($_/2) }
( 1,2,3,4,5,6 );
my @alpha_sorted_even_digits = sort { $a cmp $b }
map { [ "zero", "one", "two", "three", "four",
"five", "six", "seven", "eight", "nine" ]->[$_] }
grep { $_/2 == int( $_/2 ) } ( 1,2,3,4,5,8,9 );
One important thing to note is that the map block is evaluated in list context, meaning that it can return zero, one or more things for each item. This feature can be used to generate a list that is longer than the original list, or even to generate a hash.
my @doubled = map { $_, $_ } ( 1,3,4,5,6,6 );
my @flattened = map { @$_ } ( [ 1,2,3], [ "A","B","C" ], [ $var1, $var2, $var3 ] );
my %squares_hash = map { $_ => $_ * $_ } ( 1,2,3,4,5,6 );
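Because the block may return any number of values, map can also drop items by returning an empty list, or build a lookup hash by returning pairs. A small sketch, with names of my own choosing:

```perl
use strict;
use warnings;

my @numbers = ( 1, 2, 3, 4, 5, 6 );

# Returning () for an item contributes nothing to the output list,
# so this filters and transforms in one step.
my @odd_squares = map { $_ % 2 ? $_ * $_ : () } @numbers;

# Returning a key/value pair per item fills a hash.
my %is_listed = map { $_ => 1 } @numbers;

print "@odd_squares\n";                              # prints "1 9 25"
print( ( exists $is_listed{4} ? "yes" : "no" ), "\n" );   # prints "yes"
```

When filtering and transforming are separate concerns, a grep followed by a map usually reads better than the empty-list trick.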
Beyond
This being perl, there are many ways to do anything. There are a number of fantastic libraries set up for handling lists, namely
Re: map grep and sort
by eyepopslikeamosquito (Archbishop) on May 24, 2013 at 20:55 UTC
"Inside the filtering block, the list element being examined is set to the default variable $_"
To be more precise, $_ is an alias to the list element.
"Map takes each item from the list on the right and passes the return value from its block to the left. It does not transform the items on the list even if it feels like that."
That is not correct.
From map:
Note that $_ is an alias to the list value, so it can be used to modify the
elements of the LIST.
While this is useful and supported, it can cause bizarre results if the
elements of LIST are not variables.
Note however that Effective Perl Programming, item 20, cautions against this:
For efficiency, $_ is actually an alias for the current element in the iteration. If you modify $_ within the transform expression of a map, you modify the input data. This is generally considered to be bad style, and -- who knows? -- you may even wind up confusing yourself this way. If you want to modify the contents of a list, use foreach.
BTW, in the same item, "Use foreach, map and grep as appropriate", Effective Perl Programming provides a nice summary of when to use foreach, map and grep:
- Use foreach to iterate read-only over each element of a list
- Use map to create a list based on the contents of another list
- Use foreach to modify elements of a list
- Use grep to select elements in a list
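A short sketch of the aliasing behaviour and of the guidelines above. The data and variable names are invented for the illustration:

```perl
use strict;
use warnings;

my @prices = ( 10, 20, 30 );

# map used as intended: builds a new list, @prices is left alone.
my @doubled = map { $_ * 2 } @prices;

# Assigning to $_ inside map writes through the alias and changes
# the input list as well (supported, but considered bad style).
my @copy        = @prices;
my @side_effect = map { $_ *= 2 } @copy;

# foreach is the recommended way to modify elements in place.
my @in_place = @prices;
$_ *= 2 foreach @in_place;

print "@prices\n";     # prints "10 20 30"
print "@doubled\n";    # prints "20 40 60"
print "@copy\n";       # prints "20 40 60"
print "@in_place\n";   # prints "20 40 60"
```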
I agree. I should change the text to note $_ being an alias and change the 'does not transform' into 'Should not be used to transform'
Re: map grep and sort
by moritz (Cardinal) on May 25, 2013 at 08:08 UTC
When I was learning map and grep, it helped me to see the equivalent of them in normal perl code.
my @a = map { transformation($_) } @b;
# same as
my @a;
push @a, transformation($_) for @b;
@a
And
my @a = grep { condition($_) } @b;
# same as
my @a;
for (@b) {
    push @a, $_ if condition($_);
}
@a
If you see yourself using one of those patterns, you can likely use map or grep instead.
And as a final note, you can use both without a block if you want to use only one expression:
my @even = grep $_ % 2 == 0, @numbers;
my @odd = map $_ * 2 +1, @numbers;
In particular that allows you to write
my @numbers = grep /^[0-9]+$/, @words;
Re: map grep and sort
by Tux (Canon) on May 25, 2013 at 12:39 UTC
my @img = map  { $_->[0] }
          sort { $tsort{$Option{thumbsorting}}->() }
          map  { my $seq = m/(\d+)/ ? $1 : 0;
                 [ $_, $seq, (stat "$idir/$_")[7,9], lc $_, rand 1 ] }
          grep { my $s = -s "$idir/$_"; $s and $s > 100 } # Sanity check. Minimal image size 100
          # convert can't deal with .ico files (yet)
          # Tk can deal with Tiff/NEF as of 804.027_501 with Tk::TIFF
          grep m/\.(jpe?g|gif|x[pb]m|png|bmp|tiff?|nef)$/i => @file_names;
and the Guttman Rosler Transform
@out = map substr ($_, 1 + rindex $_, "\0") =>
sort
map "\L$_\E\0$_" =>
@in;
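For readers new to the Guttman Rosler Transform, here is a block-style sketch of the same case-insensitive sort with sample data of my own. Each string gets its sort key glued on the front, the plain default sort does the work, and the key is stripped off again:

```perl
use strict;
use warnings;

my @in = qw( banana Apple cherry apple );

my @out =
    map  { substr( $_, 1 + rindex( $_, "\0" ) ) }   # 3. strip the key, keep the original
    sort                                             # 2. default string sort on "key\0original"
    map  { lc($_) . "\0" . $_ }                      # 1. prefix each string with its lower-cased key
    @in;

print "@out\n";   # prints "Apple apple banana cherry"
```

This assumes the input strings contain no "\0" themselves. The benefit is that sort runs without a comparison block, which is usually the fastest way to call it.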
or combinations
# sort file names by the (first) sequence of digits, then by size
my @sorted = map { $_->[0] }
sort { $a->[1] cmp $b->[1] }
map { m/(\d+)/; # sort by digits in file name
[ $_, pack "l>l>", $1 || 0, -s $_ ] }
@filelist;
Enjoy, Have FUN! H.Merijn
Re: map grep and sort
by davido (Cardinal) on May 25, 2013 at 03:36 UTC
Probably beyond the scope of "the basics", but List::Util::reduce sure is handy too sometimes. And although you have to load a module to use it, that module is part of the core Perl distribution.
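A quick sketch of what reduce looks like, reusing the number list from the sort examples. The block sees the running result in $a and the next item in $b:

```perl
use strict;
use warnings;
use List::Util qw( reduce );

my @numbers = ( 23, 5, 23, 64, 2 );

# Fold the list into a single value, one pair at a time.
my $sum = reduce { $a + $b } @numbers;
my $max = reduce { $a > $b ? $a : $b } @numbers;

print "$sum\n";   # prints "117"
print "$max\n";   # prints "64"
```

List::Util also ships ready-made sum and max functions; reduce is the general tool for when no ready-made one fits.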
That is a very good and important point. Map can generate hashes because of this. I will add that fact to the above.
Re: map grep and sort
by sundialsvc4 (Abbot) on May 28, 2013 at 14:08 UTC
It is also worth mentioning, perhaps as a counterpoint to this most-excellent thread (that I just dropped a great many up-votes onto) ... whatever you do, be clear, and always code for the future. Chains of map/sort/grep can indeed be quite powerful, but they can also be ... or, over time, can become ... difficult to maintain. If what you are doing can alternatively be expressed using, say, one of the List::Util packages, and even when a peek under the covers of those modules reveals heavy use of map/sort/grep, consider carefully what approach is best to take. (I do not mean to suggest the answer.)
Probably for many years in the future, someone (else ... sucks that that bread-truck was in same place you were at the same time) is going to encounter that code, is going to have to accurately understand it, and is going to have to make what may well be a significant change, thanks to the ever-present Marketing Department. It will be critical for them to be able to make the right changes to your “elegant” logic, which can be very difficult. Perhaps the change that they must do will prove to be quite disruptive to your original design choice ... whereas, given a different choice on your part, it would not have been so. Carefully weigh these concerns against elegance and efficiency. You are creating logic that is “very tightly coupled.” Every step in the chain must be considered together and is co-dependent upon every other part of it. Aye, this is both a fee-chur and a curse.
Make sure that you subject this sort of logic to extensive testing, e.g. with the ubiquitous Test::More, and that you ... well ... that you do actually test it thoroughly. Precisely because these techniques do pack a lot of semantic meaning into a very small space, it becomes exceptionally easy to make an unintended error. CPAN modules already include extensive self-tests that are run at installation time. Your logic, initially, has none. Many times, a subtle application problem can be traced to an edge-case that encountered elegance. Now, the solution was obliged to be a “big and tricky” rewrite of a tightly-interconnected (but elegant!) section of code.
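One possible way to do that, sketched with Test::More: wrap the chain in a small sub so it can be exercised on its own, including the edge cases. The sub name even_squares is hypothetical, made up for this example:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical helper: wraps a grep/map chain so it can be tested in isolation.
sub even_squares {
    return map { $_ * $_ } grep { $_ % 2 == 0 } @_;
}

is_deeply( [ even_squares( 1 .. 6 ) ], [ 4, 16, 36 ], 'typical input' );
is_deeply( [ even_squares() ],         [],            'empty list' );
is_deeply( [ even_squares( 0, -2 ) ],  [ 0, 4 ],      'zero and negative numbers' );
```

Once the chain has a name and a test, a later maintainer can change its insides with some confidence.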
Re: map grep and sort
by perl514 (Pilgrim) on Jul 04, 2013 at 09:02 UTC