PerlMonks  

map grep and sort

by coyocanid (Acolyte)
on May 24, 2013 at 17:49 UTC ( #1035181=perlmeditation )

I want to meditate on some of the most useful Perl-isms that, while easy, are often misunderstood by beginners. I say that, possibly projecting, because when I was a beginner I had not grokked them and misused them. I also want to give kudos to the comments below, as they have greatly helped refine this posting.

Map, Grep and Sort are not the same thing, but are often used together. Flowing from right to left, they act like shell pipelining in reverse. They build something new, like a tiny little factory. While the original data structure is untouched, it should feel like a list is being transformed every step of the way.

Grep

I will start with grep as it is easier to understand. It is a filter in the purest sense. From the right, grep receives a list of things, one thing at a time, and either passes each thing to the left or does not, based on its logic. Its usage is

grep { filtering-block } @input_list

What is the filtering block? It is called for each item in the input list. If it returns a true value, that list element is passed to the left. If it returns a false value, the list element is not passed to the left. Inside the filtering block, the list element being examined is aliased to the variable $_. The input list sits on the right; the filtered list is passed to the left.

my @caps = grep { $_ =~ /^[A-Z]/ } qw( This is for Edna );

Since the magic variable is implied, the above may be shortened to

my @caps = grep { /^[A-Z]/ } qw( This is for Edna );

Pipelining

It is common for grep and map to be chained together in a pipeline. Each part of the pipeline gets a list from the right and hands off a list to the left. The following snippet finds all words that start with a capitalized vowel. The first filter finds all words that start with a vowel and passes them to the left. It passes is and Edna to the left. The second filter then only passes Edna to the left.

my @vowel_caps = grep { /^[A-Z]/ } grep { /^[aAeEiIoOuU]/ } qw( This is for Edna );

The pipeline can consist of mixed and matched grep, map and sort pieces.

Sort

As it sounds, sorting takes a list from the right and passes a list of equal size to the left, ordered by the sort block. The sort block is a function that compares the variables $a and $b, which are two items from the list being sorted, and returns a negative number, zero or a positive number. A negative return value means $a should be sorted before $b. A positive value means $b should be sorted before $a. For a 0 return value, the order does not matter. For strings, the cmp operator can be used, and for numbers the spaceship operator <=> is used; both return -1, 0 or 1. Without a block, sort compares the items as strings. You can do two or more levels of sorting using the || operator.

my @numbers = sort { $b <=> $a } ( 23, 5, 23, 64, 2 );
my @words   = sort { $b cmp $a } ( "this", "Is", "sortingly", "something" );
my @owords  = sort ( "default", "operator", "is", "text", "sort" );

my @things = sort { $a->{order} <=> $b->{order} } (
    { name => "foo", order => 4 },
    { name => "bar", order => 88 },
    { name => "baf", order => -12 },
);

# two levels: by order ascending, then by name descending
my @things_two_level = sort {
    $a->{order} <=> $b->{order} || $b->{name} cmp $a->{name}
} (
    { name => "foo", order => 4 },
    { name => "bar", order => 88 },
    { name => "baf", order => -12 },
    { name => "zoo", order => 88 },
);
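
To see what the two-level comparison actually does, here is a small sketch that runs it and prints the resulting names. The records are the ones from the example above; the variable names @records, @sorted and @names are mine.

```perl
use strict;
use warnings;

my @records = (
    { name => "foo", order => 4 },
    { name => "bar", order => 88 },
    { name => "baf", order => -12 },
    { name => "zoo", order => 88 },
);

# Primary key: order, ascending. On a tie <=> returns 0, which is false,
# so || falls through to the secondary key: name, descending.
my @sorted = sort {
    $a->{order} <=> $b->{order}
        || $b->{name} cmp $a->{name}
} @records;

my @names = map { $_->{name} } @sorted;
print "@names\n";    # baf foo zoo bar
```

The two records with order 88 tie on the first comparison, so only they are affected by the name comparison.
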

Map

Map takes each item from the list on the right and passes the return value from its block to the left. As a matter of practice, it should not be used to modify the items of the input list, for its side effects, or as a general way to iterate through a list. Each item is aliased in the block as the magic variable $_.

my @squares = map { $_ * $_ } ( 1,2,3,4,5,6 );

my @even_squares = map { $_ * $_ } grep { $_/2 == int($_/2) } ( 1,2,3,4,5,6 );

my @alpha_sorted_even_digits =
    sort { $a cmp $b }
    map  { [ "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine" ]->[$_] }
    grep { $_/2 == int( $_/2 ) }
    ( 1,2,3,4,5,8,9 );

One important thing to note is that the map block is evaluated in list context, meaning that it can return zero, one or more things for each item. This feature can be used to generate a list that is larger than the original list, or even to generate a hash.

my @doubled = map { $_, $_ } ( 1,3,4,5,6,6 );

my @flattened = map { @$_ } ( [ 1,2,3 ], [ "A","B","C" ], [ $var1, $var2, $var3 ] );

my %squares_hash = map { $_ => $_ * $_ } ( 1,2,3,4,5,6 );
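
Because the block may also return an empty list, a single map can filter and transform in one pass. The following is only a sketch of that idea, not a recommendation over a separate grep; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Returning () from the block contributes nothing to the output list,
# so odd numbers are dropped and even numbers are squared.
my @even_squares = map { $_ % 2 == 0 ? $_ * $_ : () } ( 1, 2, 3, 4, 5, 6 );
print "@even_squares\n";    # 4 16 36

# Returning a key/value pair per item builds a hash usable as a lookup set.
my %seen = map { $_ => 1 } qw( This is for Edna );
print "Edna is in the list\n" if $seen{Edna};
```
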

Beyond

This being Perl, there are many ways to do anything. There are a number of fantastic libraries set up for handling lists, namely

Re: map grep and sort
by eyepopslikeamosquito (Canon) on May 24, 2013 at 20:55 UTC

    Inside the filtering block, the list element being examined is set to the default variable $_
    To be more precise, $_ is an alias to the list element.

    Map takes each item from the list on the right and passes the return value from its block to the left. It does not transform the items on the list even if it feels like that.
    That is not correct. From map:
    Note that $_ is an alias to the list value, so it can be used to modify the elements of the LIST. While this is useful and supported, it can cause bizarre results if the elements of LIST are not variables.

    Note however that Effective Perl Programming, item 20, cautions against this:

    For efficiency, $_ is actually an alias for the current element in the iteration. If you modify $_ within the transform expression of a map, you modify the input data. This is generally considered to be bad style, and -- who knows? -- you may even wind up confusing yourself this way. If you want to modify the contents of a list, use foreach.

    BTW, in the same item, "Use foreach, map and grep as appropriate", Effective Perl Programming provides a nice summary of when to use foreach, map and grep:

    • Use foreach to iterate read-only over each element of a list
    • Use map to create a list based on the contents of another list
    • Use foreach to modify elements of a list
    • Use grep to select elements in a list
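
A small sketch of the summary above, including the cautionary case where assigning to $_ inside map changes the input list. The data and variable names are invented for illustration.

```perl
use strict;
use warnings;

my @words = qw( this is for edna );

# map creates a new list; the block only reads $_, so @words is untouched.
my @upper = map { uc $_ } @words;

# grep selects elements without changing them.
my @short = grep { length($_) <= 3 } @words;

# foreach modifies in place: $_ is an alias to each element of @copy.
my @copy = @words;
$_ = ucfirst $_ for @copy;

# Cautionary case: assigning to $_ inside map modifies the input as well.
my @input  = qw( a b );
my @output = map { $_ = uc $_ } @input;

print "@words\n";    # this is for edna
print "@upper\n";    # THIS IS FOR EDNA
print "@short\n";    # is for
print "@copy\n";     # This Is For Edna
print "@input\n";    # A B
```
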

      I agree. I should change the text to note $_ being an alias and change the 'does not transform' into 'Should not be used to transform'
Re: map grep and sort
by davido (Archbishop) on May 25, 2013 at 03:36 UTC

    Probably beyond the scope of "the basics", but List::Util::reduce sure is handy too sometimes. And although you have to load a module to use it, that module is part of the core Perl distribution.
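
For readers who have not met it, a minimal sketch of List::Util::reduce follows. The sample data is borrowed from the examples in the post; the variable names are mine.

```perl
use strict;
use warnings;
use List::Util qw( reduce );

my @numbers = ( 23, 5, 23, 64, 2 );

# reduce folds a list into a single value: $a holds the result so far,
# $b holds the next element.
my $sum     = reduce { $a + $b } @numbers;
my $max     = reduce { $a > $b ? $a : $b } @numbers;
my $longest = reduce { length($a) >= length($b) ? $a : $b } qw( This is for Edna );

print "$sum $max $longest\n";    # 117 64 This
```
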


    Dave

        That is a very good and important point. Map can generate hashes because of this. I will add that fact to the above.
Re: map grep and sort
by moritz (Cardinal) on May 25, 2013 at 08:08 UTC

    When I was learning map and grep, it helped me to see the equivalent of them in normal perl code.

    my @a = map { transformation($_) } @b;
    # same as
    my @a;
    push @a, transformation($_) for @b;
    # @a now holds the result

    And

    my @a = grep { condition($_) } @b;
    # same as
    my @a;
    for (@b) {
        push @a, $_ if condition($_);
    }
    # @a now holds the result

    If you see yourself using one of those patterns, you can likely use map or grep instead.
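
The equivalence can be checked with a runnable sketch. Here transformation() and condition() are hypothetical stand-ins that I made up so the code has something to call; any function of one argument would do.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the transformation and condition.
sub transformation { return $_[0] * 10 }
sub condition      { return $_[0] % 2 == 0 }

my @source = ( 1 .. 6 );

# map and its loop equivalent
my @mapped = map { transformation($_) } @source;
my @mapped_loop;
push @mapped_loop, transformation($_) for @source;

# grep and its loop equivalent
my @grepped = grep { condition($_) } @source;
my @grepped_loop;
for (@source) {
    push @grepped_loop, $_ if condition($_);
}

print "@mapped\n";     # 10 20 30 40 50 60
print "@grepped\n";    # 2 4 6
```
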

    And as a final note, you can use both without a block if you want to use only one expression:

    my @even = grep $_ % 2 == 0, @numbers;
    my @odd  = map  $_ * 2 + 1,  @numbers;

    In particular that allows you to write

    my @numbers = grep /^[0-9]+$/, @words;
Re: map grep and sort
by Tux (Monsignor) on May 25, 2013 at 12:39 UTC

    Why not go the whole stretch and mention (and explain) the Schwartzian Transform

    my @img =
        map  { $_->[0] }
        sort { $tsort{$Option{thumbsorting}}->() }
        map  { my $seq = m/(\d+)/ ? $1 : 0;
               [ $_, $seq, (stat "$idir/$_")[7,9], lc $_, rand 1 ]
               }
        grep { my $s = -s "$idir/$_"; $s and $s > 100 } # Sanity check. Minimal image size 100
        # convert can't deal with .ico files (yet)
        # Tk can deal with Tiff/NEF as of 804.027_501 with Tk::TIFF
        grep m/\.(jpe?g|gif|x[pb]m|png|bmp|tiff?|nef)$/i => @file_names;
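
Since the snippet above depends on variables defined elsewhere, here is a stripped-down sketch of the same wrap, sort, unwrap pattern that runs on its own. It sorts words by length, a key I chose purely for illustration; read the chain from the bottom up.

```perl
use strict;
use warnings;

my @words = qw( sortingly Is this something );

my @by_length =
    map  { $_->[0] }                       # 3. unwrap: keep only the original item
    sort { $a->[1] <=> $b->[1]             # 2. sort on the precomputed key,
               || $a->[0] cmp $b->[0] }    #    ties broken by the string itself
    map  { [ $_, length($_) ] }            # 1. wrap: pair each item with its key
    @words;

print "@by_length\n";    # Is this something sortingly
```

The point of the pattern is that the key is computed once per item rather than once per comparison, which matters when the key is expensive (a stat call, for example).
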

    and the Guttman Rosler Transform

    @out =
        map  substr ($_, 1 + rindex $_, "\0") =>
        sort
        map  "\L$_\E\0$_" => @in;
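
A self-contained sketch of the same idea, written with blocks and explicit parentheses, may make the three steps easier to follow. The sample words are mine.

```perl
use strict;
use warnings;

my @in = qw( pear Apple banana apple );

my @out =
    map  { substr( $_, 1 + rindex( $_, "\0" ) ) }    # 3. strip the key and the separator
    sort                                             # 2. plain string sort, no sort block
    map  { lc($_) . "\0" . $_ }                      # 1. prefix each item with its sort key
    @in;

print "@out\n";    # Apple apple banana pear
```

The key is packed into the front of each string so that the default string sort can be used, which avoids calling a Perl-level comparison block. This sketch assumes the items themselves contain no "\0" bytes.
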

    or combinations

    # sort file names by the (first) sequence of digits, then by size
    my @sorted =
        map  { $_->[0] }
        sort { $a->[1] cmp $b->[1] }
        map  { m/(\d+)/;    # sort by digits in file name
               [ $_, pack "l>l>", $1 || 0, -s $_ ] }
        @filelist;

    Enjoy, Have FUN! H.Merijn
Re: map grep and sort
by sundialsvc4 (Abbot) on May 28, 2013 at 14:08 UTC

    It is also worth mentioning, perhaps as a counterpoint to this most-excellent thread (that I just dropped a great many up-votes onto) ... whatever you do, be clear, and always code for the future.   Chains of map/sort/grep can indeed be quite powerful, but they can also be ... or, over time, can become ... difficult to maintain.   If what you are doing can alternatively be expressed using, say, one of the List::Util packages, and even when a peek under the covers of those modules reveals heavy use of map/sort/grep, consider carefully what approach is best to take.   (I do not mean to suggest the answer.)

    Probably for many years in the future, someone (else ... sucks that that bread-truck was in same place you were at the same time) is going to encounter that code, is going to have to accurately understand it, and is going to have to make what may well be a significant change, thanks to the ever-present Marketing Department.   It will be critical for them to be able to make the right changes to your “elegant” logic, which can be very difficult.   Perhaps the change that they must do will prove to be quite disruptive to your original design choice ... whereas, given a different choice on your part, it would not have been so.   Carefully weigh these concerns against elegance and efficiency.   You are creating logic that is “very tightly coupled.”   Every step in the chain must be considered together and is co-dependent upon every other part of it.   Aye, this is both a fee-chur and a curse.

    Make sure that you subject this sort of logic to extensive testing, e.g. with the ubiquitous Test::More, and that you ... well ... that you do actually test it thoroughly.   Precisely because these techniques do pack a lot of semantic meaning into a very small space, it becomes exceptionally easy to make an unintended error.   CPAN modules already include extensive self-tests that are run at installation time.   Your logic, initially, has none.   Many times, a subtle application problem can be traced to an edge-case that encountered elegance.   Now, the solution was obliged to be a “big and tricky” rewrite of a tightly-interconnected (but elegant!) section of code.
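
As one possible shape for such a test, here is a sketch using Test::More. The helper even_squares() is hypothetical; it simply wraps a map/grep chain like the one in the original post so that it has a name and a single place to be tested.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical helper: the chain lives in one named sub, so a later
# change to it is covered by the tests below.
sub even_squares {
    return map { $_ * $_ } grep { $_ % 2 == 0 } @_;
}

is_deeply( [ even_squares( 1 .. 6 ) ],  [ 4, 16, 36 ], 'squares of the even numbers' );
is_deeply( [ even_squares() ],          [],            'empty input gives empty output' );
is_deeply( [ even_squares( 1, 3, 5 ) ], [],            'no even numbers gives empty output' );
```
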

Re: map grep and sort
by perl514 (Pilgrim) on Jul 04, 2013 at 09:02 UTC

    Hi coyocanid,

    Thank you. I was looking for something on grep and map. This is awesome.

    Perlpetually Indebted To PerlMonks

    use Learning::Perl; use Beginning::Perl::Ovid; print "Awesome Books";
    http://dwimperl.com/windows.html is a boon for Windows.
