Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

Re^2: Too much SQL not enough perl

by EvanCarroll (Chaplain)
on Oct 10, 2005 at 04:29 UTC ( #498709=note: print w/replies, xml ) Need Help??


in reply to Re: Too much SQL not enough perl
in thread Too much SQL not enough perl

I do not think it is fair you load up your hash prior to the benchmarking, that skews the results.

Nor, is it fair that in the 'slow' one, you use a temporary array
my @items = split/,/,$_; foreach my $item (@items){

vs
foreach my $item (split/,/,$_){


Also, if ($choices{$item}) should probably be if (exists $choices{$item}) or you will choke on 0, empty strings, and undefs.

UPDATE:

Nor, is a grep a good idea, quoting a passage I remember reading in perldoc perlfaq: perldoc -q unique:
These are slow (grep) (checks every element even if the first matches), inefficient (same reason), and potentially buggy
But, granted it still says:
Hearing the word "in" is an indication that you probably should have used a hash, not a list or array, to store your data. Hashes are designed to answer this question quickly and efficiently. Arrays arenít.


Evan Carroll
www.EvanCarroll.com

Replies are listed 'Best First'.
Re^3: Too much SQL not enough perl
by Roy Johnson (Monsignor) on Oct 10, 2005 at 13:33 UTC
    Note that with a little skullduggery, you can make grep short-circuit. Granted, it's still much more clear to use List::Util 'first', and for searching the same candidate list many times, it's more efficient to use a hash.
    my @candidates = qw(z y a b c a d a e a f); foreach my $question ('a', 'm') { if (do{{;grep {$_ eq $question ? do {print "Match\n"; last} : 0 } @candidates}}) { print "Found $question\n"; } }

    Caution: Contents may have been coded under pressure.

      Of course that really just uses grep for its side-effect, and you could just a little less twisted-brainedly say something like

      foreach my $question ('a', 'm') { if ( sub { $_ eq $question and return 1 for @candidates; 0; }->() +) { print "Found $question\n"; } }

      which is basically an inlined implementation of List::Util’s first.

      Makeshifts last the longest.

        Of course that really just uses grep for its side-effect
        No, it uses the return value as well. The interesting thing is that last is interpreted as true.

        Caution: Contents may have been coded under pressure.
Re^3: Too much SQL not enough perl
by InfiniteSilence (Curate) on Oct 10, 2005 at 15:10 UTC
    Nor, is it fair that in the 'slow' one, you use a temporary array...

    EvanCarroll is right about the slight differences in the routines, so I commented out some things in code and made them both use the same array (note, I increased the number of elements to look for as well as the number of iterations for more meaningful results):

    #!/usr/bin/perl -w use strict; use Benchmark qw(:all); my @choices = qw|a b c z m p t l c f g|; my %choices = map{$_=>1}@choices; my @cData = <DATA>; timethese(50_000,{'Damn Slow'=>\&parseData1, 'Much Better'=>\&parseData2}); sub parseData1 { foreach (@cData){ chomp; my @items = split/,/,$_; #slow way, foreach my $item (@items){ if(grep {$item eq $_} @choices){ #print qq|FOUND $item|; } } } } sub parseData2 { foreach(@cData){ chomp; # foreach my $item (split/,/,$_){ my @items = split/,/,$_; #slow way, foreach my $item (@items){ if ($choices{$item}){ #print qq|FOUND $item\n|; } } } } __DATA__ z,t,m,u,a,b,c s,t,l,m,z,a,s c,b,a,m,u,t,n k,l,t,s,z,r,t

    Which still produces:

    perl seediff.pl Benchmark: timing 50000 iterations of Damn Slow, Much Better... Damn Slow: 6 wallclock secs ( 6.12 usr + 0.00 sys = 6.12 CPU) @ 81 +72.61/s (n=50000) Much Better: 3 wallclock secs ( 2.96 usr + 0.00 sys = 2.96 CPU) @ 1 +6874.79/s (n=50000)

    I think the above changes were 'fair' since the performance of a function like SQL in() in an actual database should not decrease noticeably with the number of elements.

    Celebrate Intellectual Diversity

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://498709]
help
Chatterbox?
[marto]: the ticketing system does not accept calls via email, nor has it a working API. It's tied into Active Directory for authentication and the Solaris boxes aren't on that domain
[Corion]: The one thing I haven't figured out a solution to is how to get an edge-trigger instead of sending an email every 5 minutes if the usage is above 90%. I want one mail when it goes over 90% but no more emails as long as it stays between 90% and 95%.
[Corion]: marto: Clever! ;)
[Corion]: You can only reach me by pager
[Corion]: Maybe the solution would be to launch a cron job every minute that takes two measurements a minute apart and sends a mail if the usage is below on the first and above threshold on the last measurement
[marto]: that's essentially it :)
[marto]: I think the long term solution would be to have sysadmins that do their job, so I don't have to do everything :P
[marto]: they already have an entire BMC patrol system, which they disabled, because it was sending out spurious messages. So rather than fix the issue, or even find out what it was, they turned it off. No messages, can't be any problems, right?
[Corion]: marto: But having open tickets / incidents increases the pressure on them ;) Of course, likely your contract / SLA specifies an upper limit for the number of incidents :-D
[Corion]: marto: Ow ...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (7)
As of 2017-01-24 10:12 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Do you watch meteor showers?




    Results (203 votes). Check out past polls.