Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical

Re^2: Too much SQL not enough perl

by EvanCarroll (Chaplain)
on Oct 10, 2005 at 04:29 UTC ( #498709=note: print w/replies, xml ) Need Help??

in reply to Re: Too much SQL not enough perl
in thread Too much SQL not enough perl

I do not think it is fair you load up your hash prior to the benchmarking, that skews the results.

Nor, is it fair that in the 'slow' one, you use a temporary array
my @items = split/,/,$_; foreach my $item (@items){

foreach my $item (split/,/,$_){

Also, if ($choices{$item}) should probably be if (exists $choices{$item}) or you will choke on 0, empty strings, and undefs.


Nor, is a grep a good idea, quoting a passage I remember reading in perldoc perlfaq: perldoc -q unique:
These are slow (grep) (checks every element even if the first matches), inefficient (same reason), and potentially buggy
But, granted it still says:
Hearing the word "in" is an indication that you probably should have used a hash, not a list or array, to store your data. Hashes are designed to answer this question quickly and efficiently. Arrays arenít.

Evan Carroll

Replies are listed 'Best First'.
Re^3: Too much SQL not enough perl
by Roy Johnson (Monsignor) on Oct 10, 2005 at 13:33 UTC
    Note that with a little skullduggery, you can make grep short-circuit. Granted, it's still much more clear to use List::Util 'first', and for searching the same candidate list many times, it's more efficient to use a hash.
    my @candidates = qw(z y a b c a d a e a f); foreach my $question ('a', 'm') { if (do{{;grep {$_ eq $question ? do {print "Match\n"; last} : 0 } @candidates}}) { print "Found $question\n"; } }

    Caution: Contents may have been coded under pressure.

      Of course that really just uses grep for its side-effect, and you could just a little less twisted-brainedly say something like

      foreach my $question ('a', 'm') { if ( sub { $_ eq $question and return 1 for @candidates; 0; }->() +) { print "Found $question\n"; } }

      which is basically an inlined implementation of List::Util’s first.

      Makeshifts last the longest.

        Of course that really just uses grep for its side-effect
        No, it uses the return value as well. The interesting thing is that last is interpreted as true.

        Caution: Contents may have been coded under pressure.
Re^3: Too much SQL not enough perl
by InfiniteSilence (Curate) on Oct 10, 2005 at 15:10 UTC
    Nor, is it fair that in the 'slow' one, you use a temporary array...

    EvanCarroll is right about the slight differences in the routines, so I commented out some things in code and made them both use the same array (note, I increased the number of elements to look for as well as the number of iterations for more meaningful results):

    #!/usr/bin/perl -w use strict; use Benchmark qw(:all); my @choices = qw|a b c z m p t l c f g|; my %choices = map{$_=>1}@choices; my @cData = <DATA>; timethese(50_000,{'Damn Slow'=>\&parseData1, 'Much Better'=>\&parseData2}); sub parseData1 { foreach (@cData){ chomp; my @items = split/,/,$_; #slow way, foreach my $item (@items){ if(grep {$item eq $_} @choices){ #print qq|FOUND $item|; } } } } sub parseData2 { foreach(@cData){ chomp; # foreach my $item (split/,/,$_){ my @items = split/,/,$_; #slow way, foreach my $item (@items){ if ($choices{$item}){ #print qq|FOUND $item\n|; } } } } __DATA__ z,t,m,u,a,b,c s,t,l,m,z,a,s c,b,a,m,u,t,n k,l,t,s,z,r,t

    Which still produces:

    perl Benchmark: timing 50000 iterations of Damn Slow, Much Better... Damn Slow: 6 wallclock secs ( 6.12 usr + 0.00 sys = 6.12 CPU) @ 81 +72.61/s (n=50000) Much Better: 3 wallclock secs ( 2.96 usr + 0.00 sys = 2.96 CPU) @ 1 +6874.79/s (n=50000)

    I think the above changes were 'fair' since the performance of a function like SQL in() in an actual database should not decrease noticeably with the number of elements.

    Celebrate Intellectual Diversity

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://498709]
[1nickt]: my $x = $aohoaoh->[0]->{' foo'}->[0]->{'bar' }; should work
[thepkd]: not indexing
[thepkd]: getting data out of the ds
[choroba]: it depends. Give more details, please
[thepkd]: i used a series of {}'s but it dont work
[choroba]: You need square brackets for arrays
[GotToBTru]: you can certainly construct a single expression to access any part of the data structure, without using temp variables. but you might make it easier on yourself and any other poor soul who has to understand your code if you do
[thepkd]: to dereference i mean
LanX .oO( oh tempz, oh moretz)
[thepkd]: @GotToBTru Sure. But.

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (13)
As of 2016-12-06 13:23 GMT
Find Nodes?
    Voting Booth?
    On a regular basis, I'm most likely to spy upon:

    Results (104 votes). Check out past polls.