PerlMonks  

Re: Optimizing existing Perl code (in practise)

by semio (Friar)
on Aug 19, 2002 at 06:44 UTC ( #191093=note )


in reply to Optimizing existing Perl code (in practise)

I found myself asking this question too, based on some feedback I received on a recent question I posted - converting hex to char. In that thread, unpack and printf were presented as options for converting the data. To test the performance of each, I did the following:
    #!c:/perl/bin/perl -w
    use strict;
    use POSIX qw(strftime);

    my $x;
    my $maxint = 200000;
    my $start = strftime "%H:%M:%S", localtime;
    for ($x = 0; $x < $maxint; $x++) {
        print unpack "H*", "abc";
    }
    my $finish = strftime "%H:%M:%S", localtime;
    print "$start $finish";
Results: 01:32:57 01:33:48 (51 seconds)
    #!c:/perl/bin/perl -w
    use strict;
    use POSIX qw(strftime);

    my $x;
    my $maxint = 200000;
    my $start = strftime "%H:%M:%S", localtime;
    for ($x = 0; $x < $maxint; $x++) {
        printf "%x%x%x", ord('a'), ord('b'), ord('c');
    }
    my $finish = strftime "%H:%M:%S", localtime;
    print "$start $finish";
Results: 01:31:56 01:32:50 (54 seconds)

In this case, unpack is the winner, although the difference doesn't become apparent until after 100,000 iterations or so. So, in my opinion, since TIMTOWTDI, I would look for a performance differential between the candidate methods and opt for the one that requires the least execution time.

The second thing I would check is whether any shelling out can be replaced by an available Perl function. I recently wrote a program that required the date/time stamps in a log file to be updated. For this, I made the mistake of relying on shelling out:

my $time1 = `date '+%H:%M:%S'`;
when I should have used

my $time1 = strftime "%H:%M:%S", localtime;
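One practical difference worth noting: the backtick version spawns a child process and comes back with a trailing newline that has to be chomped, while strftime does neither. A minimal sketch (assuming a POSIX date(1) is on the box):

```perl
use strict;
use warnings;
use POSIX qw(strftime);

my $from_shell = `date '+%H:%M:%S'`;   # child process; result ends in "\n"
chomp $from_shell;

my $from_perl = strftime "%H:%M:%S", localtime;   # no process, no newline
```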
Hope this helps.

cheers, -semio

Replies are listed 'Best First'.
Re: Re: Optimizing existing Perl code (in practise)
by grep (Monsignor) on Aug 19, 2002 at 07:11 UTC

    You should definitely look into Benchmark. I was able to reduce your test to this, and I get the CPU usage as well:

        use strict;
        use Benchmark;

        timethese(1500000, {
            'unpack'  => 'unpack "H*", "abc"',
            'sprintf' => 'sprintf "%x%x%x", ord("a"), ord("b"), ord("c")',
        });

    The Results:
        Benchmark: timing 1500000 iterations of sprintf, unpack...
         sprintf:  0 wallclock secs ( 0.17 usr +  0.00 sys =  0.17 CPU) @ 8823529.41/s (n=1500000)
                   (warning: too few iterations for a reliable count)
          unpack: 10 wallclock secs ( 9.87 usr +  0.01 sys =  9.88 CPU) @ 151821.86/s (n=1500000)

    ACCCK!!! Abigail-II caught me in a late-night brain seizure. I shoulda been tipped off by sprintf winning. :( ++Abigail-II



    grep
    Mynd you, mønk bites Kan be pretti nasti...
      You should always be very suspicious if your benchmark shows results of 8823529.41 runs/second. Especially when it comes to a non-trivial task like sprintf() - after all, that requires perl to parse a format.

      Another thing that should ring loud bells is that you are doing sprintf() in void context. That's not a natural operation. Perhaps Perl optimizes that away for you - totally screwing up your benchmark. It's a simple test:

      $ perl -MO=Deparse -wce 'sprintf "%x%x%x", ord ("a"), ord ("b"), ord ("c")'
      Useless use of a constant in void context at -e line 1.
      BEGIN { $^W = 1; }
      '???';
      -e syntax OK
      $
      Indeed, you just benchmarked how fast perl can do an empty loop. Not very useful. Your benchmark should include assigning the result to a variable. So, you might want to do:
      #!/usr/bin/perl

      use strict;
      use warnings 'all';
      use Benchmark;

      timethese -10 => {
          unpack  => '$_ = unpack "H*" => "abc"',
          sprintf => '$_ = sprintf "%x%x%x", ord ("a"), ord ("b"), ord ("c")',
      };

      __END__
      Benchmark: running sprintf, unpack for at least 10 CPU seconds...
         sprintf: 11 wallclock secs (10.25 usr +  0.00 sys = 10.25 CPU) @ 775053.56/s (n=7944299)
          unpack: 11 wallclock secs (10.48 usr +  0.01 sys = 10.49 CPU) @ 331145.09/s (n=3473712)
      It looks like sprintf is still a winner. But is it? Let's check the deparser again:
      $ perl -MO=Deparse -wce '$_ = sprintf "%x%x%x", ord "a", ord "b", ord "c"'
      BEGIN { $^W = 1; }
      $_ = '616263';
      -e syntax OK
      $
      Oops. Perl is so smart, it figured out at compile time the result of the sprintf. We'd have to make the arguments of sprintf variable to make Perl actually do work at run time:
      $ perl -MO=Deparse -wce '($a, $b, $c) = split // => "abc"; $_ = sprintf "%x%x%x", ord $a, ord $b, ord $c'
      BEGIN { $^W = 1; }
      ($a, $b, $c) = split(//, 'abc', 4);
      $_ = sprintf('%x%x%x', ord $a, ord $b, ord $c);
      -e syntax OK
      $
      And only now we can run a fair benchmark:
      #!/usr/bin/perl

      use strict;
      use warnings 'all';
      use Benchmark;
      use vars qw /$a $b $c $abc/;

      $abc = "abc";
      ($a, $b, $c) = split // => $abc;

      timethese -10 => {
          unpack  => '$_ = unpack "H*" => $::abc',
          sprintf => '$_ = sprintf "%x%x%x", ord $::a, ord $::b, ord $::c',
      };

      __END__
      Benchmark: running sprintf, unpack for at least 10 CPU seconds...
         sprintf: 11 wallclock secs (10.51 usr +  0.01 sys = 10.52 CPU) @ 208379.75/s (n=2192155)
          unpack: 10 wallclock secs (10.10 usr +  0.00 sys = 10.10 CPU) @ 323836.04/s (n=3270744)
      And guess what? unpack is the winner!

      The moral: no benchmark is better than a bad benchmark.

      Abigail

Re: Re: Optimizing existing Perl code (in practise)
by snafu (Chaplain) on Aug 19, 2002 at 15:03 UTC
    The second thing I would check is whether any shelling out can be replaced by an available Perl function. I recently wrote a program that required the date/time stamps in a log file to be updated. For this, I made the mistake of relying on shelling out

    This one particular piece of advice is very good. A peeve of mine is when I see people who write Perl scripts and all the work in them is done by using system() calls. What is the point in writing a Perl script if you're not going to use the Perl functions? You might as well write the thing in shell.

    Spawning system calls does take more resources, and thus it behooves the Perl programmer to try to code the functionality they want using Perl built-ins and modules.
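    As a concrete illustration of the point (a sketch, assuming a Unix-ish filesystem): listing a directory with built-ins instead of shelling out to ls(1) avoids the child process entirely and hands back a clean list:

```perl
use strict;
use warnings;

# built-in directory read: no child process, and the names arrive
# as a list instead of one newline-separated string
opendir my $dh, '.' or die "opendir: $!";
my @files = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
closedir $dh;

# the shelling-out version this replaces would be something like:
# my @files = split /\n/, `ls -a`;
```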

    gj! ++ on this one.

    _ _ _ _ _ _ _ _ _ _
    - Jim
    Insert clever comment here...

      A peave of mine is when I see people who write Perl scripts and all the work in them is done by using system() calls. What is the point in writing a Perl script if you're not going to use the Perl functions? You might as well write the thing in shell.
      And a "peave" of me is people who see everything black-and-white. I've written Perl programs where the majority of the work was done doing "system". What's the point of using a glue language, and not glueing? You might as well write the thing in C.

      Your point of view is quite opposite of the viewpoint of "code reuse". Unix comes with a handy toolkit. There's nothing wrong with using it.

      You might as well write the thing in shell.
      Not always. Perl gives you more control flow syntax than a shell.
      Spawning system calls does take more resources and thus it behooves the Perl programmer to try and code the functionality they want using Perl built-ins and modules.
      Bull. Programming means making trade-offs between developer time and run time. The fact that you have chosen Perl instead of, say, C, means that you strongly favour developer time over run time. Your arguments make sense if you are a C coder - but for a Perl coder they are just silly.

      Really, what's the point of writing:

      my $text = do {
          open my $fh => $file or die "open: $!\n";
          local $/;
          <$fh>;
      };
      If you can just write:
      my $text = `cat $file`;
      Most programs won't read gazillions of files, so the extra overhead is minute. Far less than the sacrifice you already made by using Perl instead of C. I also prefer
      system mkdir => -p => $dir;
      over the Perl equivalent. It takes too long to figure out which module implements it, and to download and install it.

      Of course, making use of external programs makes you less portable, but so does making use of modules not coming with the core. And many programs dealing with file names aren't portable anyway. Do you always use File::Spec when dealing with file names? I certainly don't.

      I'm not claiming everything should be done with system. Not at all. But I don't think that everything that can be done in Perl should be, and that system should therefore be avoided.

      Abigail

        This could quickly turn into a flame war... :) With that, I will reply once, and then I suppose we can see where it goes from there. But this is probably all going to be summed up as "to each his own", and we will have to agree to disagree.

        And a "[peeve]" of me is people who see everything black-and-white. I've written Perl programs where the majority of the work was done doing "system". What's the point of using a glue language, and not glueing? You might as well write the thing in C.

        Perl is most certainly a glue language, but it most certainly does not have to be considered or used strictly as a glue language. I believe the fact that Perl has evolved so much makes it more than just a glue language. I prefer to use it as a full-blown programming language, able to do natively whatever any other language can do. C has the ubiquitous system() function, too. That doesn't mean you should use it just because it's there, unless it's absolutely necessary.

        I've seen too many programs, written in both C and Perl, with nonsensical uses of system() where in all reality the only reason the function was used was [sheer] laziness. What ends up happening is that the command being called by system() gets removed from the system, moved, changed, manipulated, whatever, and now the program that used the system() call is broken because it relied on an outside source to complete its task. If the coder had taken the time to write the extra three lines that would have let the program do the same job, s/he could have kept from making somebody else maintain his/her code *and* saved a few CPU cycles at the same time. Thus, portability is a big issue to me when it comes to this matter.

        I don't buy it. The use of system() should be kept to a minimum. I'm not saying never use it. Sometimes you just have to. I usually use Shell so I can at least keep my code more Perl-esque when I need to make OS-level system calls.

        In fact, I would consider using system() for your example of the mkdir -p command. But I would still use Shell, which is a core Perl module, so you shouldn't have to worry about installing it.

        Of course, making use of external programs makes you less portable, but so does making use of modules not coming with the core. And many programs dealing with file names aren't portable anyway. Do you always use File::Spec when dealing with file names? I certainly don't.

        I agree that the use of non-standard modules can be troublesome. In this case, you have to pick your battles. I'd rather install a Perl module than try to install a program on the box that my script needs to call to get a job done. Let's face it, that whole ordeal is pretty much six of one, half a dozen of the other.

        Not always. Perl gives you more control flow syntax than a shell.

        Yes, while I will agree that you have more flow-control syntax in Perl, you definitely have enough flow control in shell to do the task.

        Bull. Programming means making trade-offs between developer time and run-time.

        Don't forget maintainability. In my opinion, this should always be high on any coder's list of concerns, as I am sure you agree. In a professional IT environment, you cannot afford to have your program break on a production box because a system-level command it needs comes up missing on the host ($PATH?, OS upgrades, etc.). In my opinion, this alone makes an excellent reason NOT to use system() often.

        The fact that you have choosen Perl instead of say, C, means that you strongly favour developer time over run time. Your arguments make sense if you are a C coder - but for a Perl coder they are just silly.

        The fact that you favor writing something in ksh vs. Perl is the same kind of choice as favoring Perl vs. C. You have to choose your poison. My point is that if you create a Perl script that is nothing but system() calls, you might as well write it in ksh. You're basically doing the same thing: gluing shell commands together with Perl instead of sh. I mean, really, what's the difference? I know this is more of a case-by-case situation, because it could go both ways, and it is ultimately up to the developer to figure out what s/he wants to develop in. But if it were me, I'd not write a Perl script that was nothing but system() calls. I am using extreme examples, though, which is how I took the original poster's statement about system().

        I think we *might* be on the same page here. You seem to be more liberal toward your use of calling system(). I, on the other hand, am more conservative. I'm not sure arguing over this is worth the time. :)

        update: Fixed misspellings pointed out by dmmiller2k (shear -> sheer and peave -> peeve).

        _ _ _ _ _ _ _ _ _ _
        - Jim
        Insert clever comment here...

        I agree entirely with the message you were trying to get across. Still, this bothered me:

        I also prefer
        system mkdir => -p => $dir;
        over the Perl equivalent. It takes too long to figure out which module implements it, and to download and install it.

        That would be File::Path and it comes with perl. I just thought your statement might be misleading to some.
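        For what it's worth, a minimal sketch of that route - File::Path's mkpath creates intermediate directories the way mkdir -p does (the nested path here is just an example, built under a scratch directory so the sketch cleans up after itself):

```perl
use strict;
use warnings;
use File::Path;              # core module; exports mkpath by default
use File::Temp qw(tempdir);  # core as well, used here for a scratch area

my $base = tempdir(CLEANUP => 1);
my $dir  = "$base/foo/bar/baz";   # hypothetical nested path

mkpath($dir) unless -d $dir;      # equivalent of: system mkdir => -p => $dir
```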

        -sauoq
        "My two cents aren't worth a dime.";
        

        I think we've gone far off track and read much too much into the initial point here: one should avoid the kind of pretzel logic that results in superfluous system calls - a class of stupid coding practices, one of which merlyn dubbed a useless use of cat.

        For example, I recently saw this in a script:

            my @data = `some_tool foo bar params 2> /dev/null | sort`;

        Where is the point in piping over to sort(1) here? You can just do it in Perl:

            my @data = sort `some_tool foo bar params 2> /dev/null`;
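        Doing the sort on the Perl side also buys you control over the ordering, which sort(1) only gives you through flags. A small sketch with hypothetical data, sorting numerically on the first field:

```perl
use strict;
use warnings;

# hypothetical lines, as they might come back from backticks
my @data = ("10 foo\n", "2 bar\n", "33 baz\n");

# numeric sort on the first whitespace-separated field -
# the shell version would need sort -n to get this right
my @sorted = sort { ($a =~ /^(\d+)/)[0] <=> ($b =~ /^(\d+)/)[0] } @data;
```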

        I have seen quite a bunch of cases where people shell out to awk or grep(1) from inside a Perl script, when what they were intending to do was perfectly doable in Perl and would have taken no more effort (if not less).

        I cringe every time, and I do call that a peeve of mine - I think you would too.

        Really, what's the point of writing:
        my $text = do {
            open my $fh => $file or die "open: $!\n";
            local $/;
            <$fh>;
        };
        If you can just write:

            my $text = `cat $file`;
        This example is just silly, IMHO. If I need `cat $file` more than once, I'll write a
        sub slurp_file ($) {
            my $file = shift;
            open my $fh, $file or die "open $file: $!\n";
            local $/;
            <$fh>;
        }
        and next time it's slurp_file $file - no extra process, no extra typing, no loss of clarity - assuming I do need to slurp a file more than once. If not, of course, the effort is silly. But then it's very unlikely that I will need to, given that Perl offers the lovely @ARGV / diamond operator combo. In the remaining 0.5% of cases, sure, I concede that cat(1) would be the tool of choice.

        Makeshifts last the longest.

        my $text = `cat $file`;
        If you're gonna do that, you might as well do
        use FileHandle;
        my $fh = FileHandle->new($file);
        my $text = join '', <$fh>;

        and still do it in perl. (The readline has to go through a plain scalar filehandle; putting the constructor call directly inside the angle brackets doesn't parse as a readline.) Frankly, I don't know which is faster, but I would choose the second way because I think it would be.
Re^2: Optimizing existing Perl code (in practise)
by Aristotle (Chancellor) on Aug 19, 2002 at 10:18 UTC
    my $time1 = strftime "%H:%M:%S", localtime;
    You mean s/localtime/time/ of course.

    Makeshifts last the longest.

      You mean s/localtime/time/ of course.

      I sure hope he doesn't. From the POSIX perldoc page:

      Synopsis:

          strftime(fmt, sec, min, hour, mday, mon, year, wday = -1, yday = -1, isdst = -1)
      Those are the same values as returned by localtime().
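      A quick way to see why passing localtime is right: in list context it returns exactly the (sec, min, hour, ...) values the synopsis asks for. A minimal sketch:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# localtime in list context returns
# (sec, min, hour, mday, mon, year, wday, yday, isdst)
my @tm    = localtime;
my $stamp = strftime "%H:%M:%S", @tm;

# identical in effect to passing localtime inline:
my $inline = strftime "%H:%M:%S", localtime;
```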
