http://www.perlmonks.org?node_id=78949

scottstef has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a little script to maintain some log files. In the interest of readability I have set up several if statements to make it easier for the folks that follow. Since it will be run without much need for efficiency, I went for readability.

#see if it hasn't been touched in 3 days
if (-M "$logdir/$f" > 3.0) {
    #see if file begins with ab or def case insensitive
    if ($f=~/^(ab|def)/i) {
        #see if hasn't been zipped already
        if (!($f=~/gz\b/)) {
            system ("/usr/local/bin/gzip", "$logdir/$f");
        }
    }
}

My question: would it be more efficient to have a larger regex that does it in one pass?
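
For reference, here is one way the two name tests could be folded into a single pattern. This is only a sketch: the pattern and the helper name `wanted_name` are illustrative and do not come from the thread, and the -M age test still has to be done separately. Whether it is faster than two simple matches would need measuring.

```perl
use strict;
use warnings;

# One pattern for both name tests (illustrative construction):
#   (?!.*gz\b)    negative lookahead: no "gz" followed by a word boundary anywhere
#   (?i:ab|def)   prefix test, case-insensitive as in the original
my $wanted = qr/^(?!.*gz\b)(?i:ab|def)/s;

# Hypothetical helper: true when the name passes both tests.
sub wanted_name {
    my ($name) = @_;
    return $name =~ $wanted ? 1 : 0;
}

for my $name ( "abc.log", "DEF.log", "abc.log.gz", "other.log" ) {
    printf "%-12s %s\n", $name, wanted_name($name) ? "compress" : "skip";
}
```

The lookahead makes the pattern harder to read than the two separate tests, which works against the stated goal of readability.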

Re: Efficiency of multiple if statements
by Masem (Monsignor) on May 09, 2001 at 00:58 UTC
    Even if you don't combine the regexes (one is an affirmative test, the other a negative one), I would at least combine everything into one if statement:

    if (   ( -M "$logdir/$f" > 3.0 )
        && ( $f =~ /^(ab|def)/i )
        && ( $f !~ /gz\b/ ) )
    {
        system ( "/usr/local/bin/gzip", "$logdir/$f" );
    }

    Perl's && operator short-circuits: as soon as one condition is false, the remaining ones are not evaluated. That is, if the -M test fails, the regexes are never run. Not only that, this is, IMO, much easier to read and understand ("Oh, this system call only happens when these 3 conditions are met..."), and much more maintainable.

    Mind you, I'm assuming here that you want to take no alternate actions if those if statements fail. If you do, you need separate else clauses, which your nested code above is already set up for.
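
The short-circuit behaviour described above can be observed with a small sketch. The helper subs and the `%calls` counters are hypothetical, added only to show which tests actually ran; a plain number stands in for the -M result so no real files are needed.

```perl
use strict;
use warnings;

# Hypothetical counters, only here to show which tests were evaluated.
my %calls = ( age => 0, prefix => 0, unzipped => 0 );

# Stand-ins for the three conditions; each one records that it was called.
sub old_enough  { $calls{age}++;      return $_[0] > 3.0 }
sub wanted_name { $calls{prefix}++;   return $_[0] =~ /^(ab|def)/i }
sub not_zipped  { $calls{unzipped}++; return $_[0] !~ /gz\b/ }

sub should_gzip {
    my ($age_in_days, $name) = @_;
    return ( old_enough($age_in_days) && wanted_name($name) && not_zipped($name) ) ? 1 : 0;
}

# The file is too recent, so the first test fails and the regexes never run.
my $result = should_gzip( 1.5, "abc.log" );
print "result=$result age=$calls{age} prefix=$calls{prefix} unzipped=$calls{unzipped}\n";
# prints: result=0 age=1 prefix=0 unzipped=0
```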


    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
Re: Efficiency of multiple if statements (code)
by deprecated (Priest) on May 09, 2001 at 01:18 UTC
    I am surprised nobody has mentioned this yet, but just a few days ago I was given the same decision. A big hunka hunka burning regex or a series of nested conditionals. The result is at Optimization for readability and speed (code).

    The bottom line: The regex was 10 times faster than the if statements. Running it once, it wouldn't have made a difference. Running it 200 times, it did make a difference. And merlyn was right. Looking back on it, it really seems silly to have done that from a perl programmer point of view.

    Something else you may be interested in (and one of the ways I did simplify and readability-ize the code) is here, at Using arrays of qr!! to simplify larger RE's for readability (code). You might find that using a sequence of qr!!'s in your code may not only increase its readability but also its speed.

    Looking back at it, I think I wound up updating everyone in the CB with regards to speed. I did benchmark everything and it turned out to be 10x faster. The reason that particular node didn't report any speed increase is I was doing something rather stupid elsewhere in the program. :)

    brother dep.

    psssst, you probably could have used Super Search to find this answer! :)

    --
    Laziness, Impatience, Hubris, and Generosity.
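
The qr!! idea mentioned in deprecated's reply might look roughly like this when applied to the log-file question. This is a sketch, not the code from the linked nodes: the two lists `@must_match` and `@must_not_match` and the sub `name_qualifies` are names invented here, and no speed claim is made for it.

```perl
use strict;
use warnings;

# Precompiled patterns, grouped by role. Adding a rule means adding
# one qr// to a list instead of growing a single large regex.
my @must_match     = ( qr/^(?:ab|def)/i );   # begins with ab or def
my @must_not_match = ( qr/gz\b/ );           # already zipped

# Hypothetical helper: true when the name passes every rule.
sub name_qualifies {
    my ($name) = @_;
    for my $re (@must_match)     { return 0 unless $name =~ $re }
    for my $re (@must_not_match) { return 0 if     $name =~ $re }
    return 1;
}

for my $name ( "abc.log", "DEFAULT.log", "abc.log.gz", "other.log" ) {
    printf "%-12s %s\n", $name, name_qualifies($name) ? "compress" : "skip";
}
```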

(Ovid) Re: Efficiency of multiple if statements
by Ovid (Cardinal) on May 09, 2001 at 01:03 UTC

    Well, if you don't have anything else in the if blocks, why not put everything in the one if test? The following is a cleaner example (IMHO) and I've used character classes because they tend to be more efficient than the /i modifier (they work here only because you had simple regexes). Also, I broke the regexes out into separate tests rather than use the inefficient alternations.

    #see if it hasn't been touched in 3 days
    if (-M "$logdir/$f" > 3.0 and
        #see if file begins with ab or def
        ( $f =~ /^[Aa][Bb]/ or $f =~ /^[Dd][Ee][Ff]/ ) and
        #see if hasn't been zipped already
        ! ($f =~ /gz\b/ ) ) {
        system ("/usr/local/bin/gzip", "$logdir/$f");
    }

    Cheers,
    Ovid

    Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Re: Efficiency of multiple if statements
by Anonymous Monk on May 09, 2001 at 12:02 UTC
    It may be more efficient to do the regex tests first, and short circuit the more expensive file test
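
A sketch of that reordering, with the cheap string tests ahead of the -M file test. The sub name `files_to_gzip` and the temporary directory with made-up file names are assumptions for the demonstration, and it only reports what it would compress rather than calling gzip. Whether the reordering is measurably faster has not been tested here.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return the names in $logdir that qualify for compression.
# The string tests run first; -M, which has to stat the file, runs last.
sub files_to_gzip {
    my ($logdir, @names) = @_;
    return grep {
           /^(?:ab|def)/i          # begins with ab or def, case-insensitive
        && !/gz\b/                 # not zipped already
        && -M "$logdir/$_" > 3.0   # untouched for more than 3 days
    } @names;
}

# Demonstration with throwaway files in a temporary directory.
my $logdir = tempdir( CLEANUP => 1 );
my $old    = time - 5 * 24 * 60 * 60;    # five days ago
my @names  = ( "abc.log", "def.log.gz", "xyz.log", "Abnew.log" );
for my $name (@names) {
    open my $fh, '>', "$logdir/$name" or die "cannot create $name: $!";
    close $fh;
    # Backdate every file except Abnew.log, which stays freshly modified.
    utime $old, $old, "$logdir/$name" unless $name eq "Abnew.log";
}

my @todo = files_to_gzip( $logdir, @names );
print "would gzip: @todo\n";    # prints: would gzip: abc.log
```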
Re: Efficiency of multiple if statements
by petdance (Parson) on May 09, 2001 at 09:23 UTC
    Does it matter which is "more efficient"? Are you having a speed problem? Have you run anything thru Benchmark?

    Knuth says "Premature optimization is the root of all evil". I would add the corollary that "'Unnecessary' is, by definition, premature."

    That aside, you can pretty easily figure out what makes most sense as your potential bottleneck. What I see is this:

    1. Check something out on disk
    2. Do a string check
    3. Do another string check
      And then
    4. Open another process that does a gzip

    Where do you think the bottleneck would be? It's sure not gonna be items 2 or 3.
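
If you do want numbers, the core Benchmark module can compare the two styles of name test that came up in this thread. This is a minimal sketch: the file names are made up, the sub names are invented here, and the results will vary by machine and perl version, so no figures are quoted.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up file names; substitute real ones from your log directory.
my @names = ( "abc.log", "DEFAULT.log", "abc.log.gz", "other.log" );

# Alternation with /i, as in the original question.
sub with_alternation {
    return scalar grep { /^(ab|def)/i && !/gz\b/ } @names;
}

# Character classes and separate tests, as in Ovid's reply.
sub with_char_classes {
    return scalar grep { ( /^[Aa][Bb]/ || /^[Dd][Ee][Ff]/ ) && !/gz\b/ } @names;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    alternation  => \&with_alternation,
    char_classes => \&with_char_classes,
} );
```

Neither variant touches the disk or starts gzip, so any difference it shows is small next to items 1 and 4 in the list above.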

    xoxo,
    Andy

    %_=split/;/,".;;n;u;e;ot;t;her;c; ".   #   Andy Lester
    'Perl ;@; a;a;j;m;er;y;t;p;n;d;s;o;'.  #   http://petdance.com
    "hack";print map delete$_{$_},split//,q<   andy@petdance.com   >