Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Re: one line to check for common list values

by haukex (Archbishop)
on Feb 25, 2017 at 19:24 UTC ( [id://1182835]=note: print w/replies, xml ) Need Help??


in reply to one line to check for common list values

Welcome to Perl and the Monastery, dollar_sign! Just a small tip on posting: Thank you for using <code> tags for the code, but please don't add line numbers.

Perl has excellent documentation, see perldoc.perl.org, and the command line tool perldoc. For example, typing perldoc -q intersection at the command line brings up this FAQ: How do I compute the difference of two arrays? How do I compute the intersection of two arrays?

One of the pieces of Perl philosophy is TIMTOWTDI, There Is More Than One Way To Do It, so if you want to use a regex to find matches, that's ok, but see the performance comparison below. Note that if your input wasn't limited to integers, your regex has one potential problem, see what happens when you set my @input = (1.23) and my @valuesToMatch = (23). I wrote about the generation of regexes here (still a draft at the moment), and I showed what I would consider a "safer" version of your code below.

I would also like to know if using regexps for this is less or more efficient (performance-wise) than just using a for loop.

The go-to module for this is Benchmark. Here is a comparison of the performance of using regular expressions, hashes, and grep. Note how the hash approach clearly wins!

#!/usr/bin/env perl use strict; use warnings; use Data::Dumper; # use fixed values for repeatable tests my @input = 1..100; my @valuesToMatch = map {$_*14} 1..10; my @expectedOutput = (14,28,42,56,70,84,98); print Dumper(\@valuesToMatch); # precompute what can be precomputed # for the regex approach my $regex_str = join '|', map {quotemeta} sort { length $b <=> length $a } @valuesToMatch; my $regex = qr/(?:\A|\0)($regex_str)(?:\z|\0)/; # for the hash approach my %matchtbl = map {$_=>1} @valuesToMatch; # the different implementations, in a dispatch table my %subs = ( hash => sub { my @matches = grep {$matchtbl{$_}} @input; return @matches; }, regex => sub { my @matches = join("\0", @input) =~ /$regex/g; return @matches; }, grep => sub { my @matches; for my $in (@input) { push @matches, $in if grep {$in==$_} @valuesToMatch; } return @matches; }, ); # check to make sure our implementations are all working use Test::More; for my $sub (sort keys %subs) { my @results = $subs{$sub}->(); is_deeply \@results, \@expectedOutput, $sub or diag explain \@results; } done_testing; # now benchmark the implementations use Benchmark qw/cmpthese/; cmpthese(-3, \%subs);

Update: Example output (whitespace edited for brevity):

$VAR1 = [ 14, 28, 42, 56, 70, 84, 98, 112, 126, 140 ]; ok 1 - grep ok 2 - hash ok 3 - regex 1..3 Rate grep regex hash grep 33107/s -- -49% -85% regex 65160/s 97% -- -70% hash 218506/s 560% 235% --
Would it be possible to reduce the amount of code necessary to find the common values between the two arrays to only one line of Perl?

Sure, it's possible to jam it all into one line, the AM showed one way to do it for the "regex" approach (Update: tybalt89 showed a few more, but benchmark them and you'll see all three of them are less efficient than all of the above). It's also possible in the above "grep" solution using two nested greps, but for the "hash" solution, you'll need multiple statements, even if you jam them all into one line :-)

I had no idea where to look for up to date Perl-specific exercises

One might be HackerRank, or perhaps Rosetta Code.

Replies are listed 'Best First'.
Re^2: one line to check for common list values
by Anonymous Monk on Feb 26, 2017 at 00:39 UTC
    HackerRank is fun, but many of the problems are rigged to favor C solutions. Also check out Project Euler (projecteuler.net).

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1182835]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others lurking in the Monastery: (9)
As of 2024-04-18 17:50 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found