 
PerlMonks  

Optimizing Output

by Dogma (Pilgrim)
on Apr 15, 2002 at 03:58 UTC ( id://159088 )

Dogma has asked for the wisdom of the Perl Monks concerning the following question:

It's often hard to find a balance between performance and writing readable code. We all know that it's more efficient to combine multiple "prints" into one statement. Further, according to Mastering Algorithms with Perl, it is "1.5%" faster to use a "," instead of a "." in combined print statements. But beyond that, what else is possible? It is often necessary to break up data output into different subs or to have logic mixed between statements. So is it more efficient (correct?) to store output in a scalar and then "print" it when you're done generating output, or to make several prints?

Of course this would be on a much larger scale...

$string .= "someoutput";
$string .= "someotheroutput";
# logic here
$string .= "moreoutput";
print $string;
vs.
print "someoutput", "someotheroutput";
# logic here
print "moreoutput";
Also how much will performance vary on different platforms and how large a difference will perl's buffering have?

Cheers,
-Dogma

Replies are listed 'Best First'.
Re: Optimizing Output
by dws (Chancellor) on Apr 15, 2002 at 04:57 UTC
    So is it more efficient (correct?) to store output in a scalar and then "print" it when you're done generating output, or to make several prints?

    Do the simplest thing that works. Then, if there proves to be a performance problem, profile (measure) before deciding what to do. Optimizing before you have data is almost always a waste of time. And once you have the data, you'll often find that the issue is algorithmic.

    I often gather up strings into a scalar for later printing, and though I've often daydreamed up super-efficient ways of doing this, the applications always seem to be either fast enough, or the performance issues are solved by doing a more effective query against the database.

Re: Optimizing Output
by Juerd (Abbot) on Apr 15, 2002 at 07:56 UTC

    To optimize output...

    • never believe people who say sys* functions are always faster because they are more direct. syswrite is not faster than print, because print benefits from Perl's internal optimization. (syswrite vs print: print wins (buffered: 580%, unbuffered: 250%))
    • do not turn off buffering (do not turn on autoflush). A lot of people seem to have a habit of writing $|++; in every single script. Most scripts do not need it. Use $| wisely. (print '' unbuffered vs buffered: buffered wins (90%))
    • write large chunks if you are on a slow medium. If, for some reason, you have to write to a file on an operating system that does no buffering, buffer yourself, and write large chunks. For normal scripts, this is not much of a problem and writing directly is probably more efficient than building a chunk.
    • do not try to interpolate function calls as described in How do I expand function calls in a string? (perlfaq4), but if you do, use ${\ ... } instead of @{[ ... ]} - if you need list context, join it yourself (and remember that a constant literal is faster than $").
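
    The two interpolation idioms above can be sketched side by side; `greeting` and `items` are made-up subs, purely for illustration:

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    sub greeting { return 'Hello' }          # hypothetical subs,
    sub items    { return ('a', 'b', 'c') }  # just for illustration

    # Scalar context: ${\ ... } takes a reference to the result
    # and dereferences it in place
    my $s1 = "Greeting: ${\ greeting() }\n";

    # List context: @{[ ... ]} builds an anonymous array and
    # interpolates it, elements separated by $" (a space by default)
    my $s2 = "Items: @{[ items() ]}\n";

    # Joining yourself with a constant literal, as suggested above
    my $s3 = 'Items: ' . join(' ', items()) . "\n";

    print $s1, $s2, $s3;
    ```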

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

      "If, for some reason, you have to write to a file on an operating system that does no buffering, buffer yourself, and write large chunks."

      This is odd. For this level of output, you're not looking so much at operating system buffering, but at the buffering your run-time environment provides. For Perl, this is given by the "normal" output functions (not sysread and syswrite). For C, this is e.g. the stdio.h functions. See K&R for details on how to implement putc and getc. It all happens in user space, not kernel space.

      The vast majority of benefit from buffering comes from this application library level. In fact, the cited benchmarks say just that: switching off Perl's buffering (with $|=0) turns off this buffering; it doesn't do any hacking on obscure OS parameters. And doing it clobbers performance.


      Juerd below is, of course, correct. You switch buffering off by making your output "piping hot": $|=1.

        switching off Perl's buffering (with $|=0)

        The other way around. $| controls autoflush, the opposite of buffering.

        $| = 1; # Autoflush on, buffering off.
        $| = 0; # Autoflush off, buffering on.
        $|++;   # $| = 1
        $|--;   # $| = !$| (flip setting 0/1)

        - Yes, I reinvent wheels.
        - Spam: Visit eurotraQ.
        

Re: Optimizing Output
by stephen (Priest) on Apr 15, 2002 at 05:49 UTC

    I threw together some bizarre benchmarking code to see how much the two methods differ on my machine. The difference was more than I expected... but I pretty much don't believe my results. Code below, once I finish explaining why one should ignore it completely. :)

    The old adage is that "Premature optimization is the root of all evil." Yet another true cliche is that 80% of processing is performed in 20% of all code. Optimizing before profiling the code to find out where it's spending time can waste your life away, and lead you to sacrifice readability and maintainability where it isn't necessary to do so.

    I second what dws said. I would go a step further, perhaps: write the code in the most readable and maintainable way you can. If one is going to be interpolating enough variables and subroutines that this question becomes worth thinking about, then it's time to consider using a templating system like The Template Toolkit or Text::Template. Many templating systems precompile themselves, so that they are nearly the same speed as either method.

    Just for laughs, here's a code snippet that performs a rough-and-ready (read: probably meaningless) comparison of building up a string for appending versus printing a long list:
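
    The snippet itself didn't survive in this copy of the thread; a minimal reconstruction, assuming nothing beyond the `build_print`/`list_print` labels that appear in the results posted downthread, might look like:

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark qw(cmpthese);
    use File::Spec;

    # Write to the null device so terminal I/O doesn't dominate the timing
    open my $out, '>', File::Spec->devnull or die "Can't open null device: $!";

    my @chunks = ('x' x 80) x 100;  # made-up sample data

    # Build everything into one scalar, then print once
    sub build_print {
        my $string = '';
        $string .= $_ for @chunks;
        print {$out} $string;
    }

    # Hand print() the whole list in one call
    sub list_print {
        print {$out} @chunks;
    }

    cmpthese(1000, {
        build_print => \&build_print,
        list_print  => \&list_print,
    });
    ```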

      This was intended to be a question about optimizing the outputting of strings, not about when and what to optimize, as that's really a different discussion.

      Anyways here are the results of your benchmark on my laptop with linux-2.4.18/perl 5.6.1...

      build_print: 24 wallclock secs (20.87 usr + 0.08 sys = 20.95 CPU) @ 47.73/s (n=1000)
      list_print:  20 wallclock secs (17.51 usr + 0.01 sys = 17.52 CPU) @ 57.08/s (n=1000)
                    Rate build_print list_print
      build_print 47.7/s          --       -16%
      list_print  57.1/s         20%         --
      I suspect there are serious buffering differences from platform to platform.

      Adding "$|++" to the top of the script seems to widen the difference between methods by 1-2%.

Miscellaneous thoughts
by Fletch (Bishop) on Apr 15, 2002 at 04:07 UTC

    • Get it working first, optimize (or refactor if you want to sound hep :) afterwards.
    • Keep in mind that if you're interpolating variables you've implicitly used the . operator
    • Heredocs (<<EOT) are much more readable (IMHO) than fifteen gazillion .= lines (I mean come on, this is perl not C or Java)
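
    For instance, a single interpolated heredoc collapses a run of `.=` appends into one readable block (the report fields here are made up):

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical report values, purely for illustration
    my ($item, $count, $price) = ('widgets', 42, '9.99');

    # One interpolated heredoc instead of a pile of .= statements;
    # \$ gives a literal dollar sign before the interpolated price
    my $report = <<"EOT";
    Item:  $item
    Count: $count
    Price: \$$price
    EOT

    print $report;
    ```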

    Update: As was pointed out to me in a msg, strictly speaking refactoring isn't optimization; however, refactoring may improve performance by replacing an inefficient implementation with a better-designed one. Pardon me for attempting humor. :)

Debug code (was Re: Optimizing Output)
by fuzzycow (Sexton) on Apr 15, 2002 at 14:19 UTC
    The other side of the coin is the usability of your output. If you are writing debugging code for your program, it's (in my opinion) better to write fewer lines with proper debug/trace information than to do 'print "X=$a"' every other line. For that I would recommend using something like the 'Log::Agent' module (which, btw, is great).
Re: Optimizing Output
by riffraff (Pilgrim) on Apr 15, 2002 at 17:29 UTC
    Doesn't it say somewhere that interpolating is faster than multiple .'s? If I remember correctly, each '.' forces a copy, whereas interpolation only does it once.

    $string .= $a . $b . $c . $d;

    does like 5 copies, whereas

    $string = "$string$a$b$c$d";

    only does one.

    Because of this, I always do interpolation if I can, but I won't go out of my way to do so.

      No.
      perl -Dt -e '"$a$b".$c.$d'
      EXECUTING...
      (-e:0) enter
      (-e:0) nextstate
      (-e:1) gvsv(main::a)
      (-e:1) gvsv(main::b)
      (-e:1) concat
      (-e:1) gvsv(main::c)
      (-e:1) concat
      (-e:1) gvsv(main::d)
      (-e:1) concat
      (-e:1) leave
      They both call the concat opcode.

      At one time, I remember the tokenizer actually rewrote "x$y" to be "x" . $y and "recursed" on it. But it doesn't seem to anymore. B::Deparse can even tell the difference:

      perl -MO=Deparse -e '"$a$b".$c.$d'
      "$a$b" . $c . $d;
Re: Optimizing Output
by BUU (Prior) on Apr 15, 2002 at 13:07 UTC
    Is there any nice module or script that would run through another script and determine what is taking the most time/CPU power?
      Kind of... It's called Benchmark.pm. It won't just read in a script and tell you what's fast and what's slow, but it is very helpful nonetheless.

      Do a search here for 'use Benchmark;' and you'll find lots of examples.
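
      A minimal `use Benchmark;` sketch, here re-testing the original question's ","-versus-"." claim (the null-device handle is just to keep the printed data off the screen):

      ```perl
      #!/usr/bin/perl
      use strict;
      use warnings;
      use Benchmark qw(cmpthese);
      use File::Spec;

      open my $out, '>', File::Spec->devnull or die "Can't open null device: $!";

      my ($x, $y, $z) = ('foo', 'bar', 'baz');

      # Compare concatenating before printing vs passing print a list
      cmpthese(100_000, {
          dot   => sub { print {$out} $x . $y . $z },
          comma => sub { print {$out} $x, $y, $z },
      });
      ```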

      s!!password!;y?sordid?binger?;y.paw.mrk.; print chr 0x5b;print;print chr(0x5b+0x2);

Node Type: perlquestion [id://159088]
Approved by belg4mit
Front-paged by grinder