PerlMonks  

Re^2: Never, never, never

by doom (Deacon)
on Apr 25, 2005 at 14:04 UTC (#451200)


in reply to Re: Never, never, never
in thread Never, never, never

"Never disable buffering without a good reason", eh? I'm actually of the opposite opinion, and I turn it off out of habit. I think this is a case where the default is wrong: turning buffering on should be regarded as a performance hack.

The trouble is that a lot of beginning Perl scripts are mixtures of back-ticks and Perl output. An early use for me was an attempt at adding some readable column descriptions to the output of a unix command-line utility. What happens with buffering turned on is that you get the output from these two sources intermixed in an almost random fashion. The reason this happens is not at all obvious; in fact, I would argue it's nearly impossible to figure out. Even if it occurs to you to read the docs for the special variables, there's nothing about the writeup for "$|" that would leap out at you (do I want my "pipes to be piping hot?").

So this is a hard one... if the default were different, Perl might have an undeserved reputation for slow output, but as it is there's a nasty little gotcha in there. I tend to set $| (that is, shut off buffering) for all my command-line scripts... though of course you probably *shouldn't* do that in something like a CGI script.
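
For anyone who hasn't met the habit, here is a minimal sketch of it, under a few assumptions: STDOUT is the currently selected handle, and the child process is just the running perl (via $^X) standing in for whatever unix utility you are wrapping. The printed strings are made up for illustration. Note that since 5.6.0 perl tries to flush open output handles before anything that forks, on platforms that support it, so whether you actually see reordering without this line depends on your perl and platform.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A true value in $| makes the currently selected handle (STDOUT unless
# select() has changed it) flush after every print or write.
$| = 1;

# Text written by this script.
print "Report header (from the script)\n";

# The child inherits our STDOUT and writes to it directly, bypassing
# perl's buffer. $^X is the path of the running perl; it is only a
# stand-in here for an external command.
system($^X, '-e', 'print "line written by the child process\n"') == 0
    or die "child failed: $?";

print "Report footer (from the script)\n";
```

Try it with the output redirected to a file or piped through a pager, since that is where STDOUT stops being line-buffered.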


Re^3: Never, never, never
by chromatic (Archbishop) on Apr 25, 2005 at 18:27 UTC

    I see a lot of programs where people disable buffering when they only print to standard output with newlines (or standard error) and never call external programs. (I've also seen a lot of programs disable buffering when they never printed to that filehandle!) Out of the last few hundred pieces of code I've seen, perhaps 5% needed to disable buffering.

      I agree that people should not blindly turn off buffering, but to be honest it will lead to fewer problems for most people than blindly leaving it on. The main issue I have with what you say is that you need to know how the program will be used in the future before you can safely leave it on, and this is not easy to do. For example, buffering will keep a script from effectively being used in pipelines when the data arrives in a time-sensitive manner...

      tail -f slow_growing_log | grep fiz | my_script

      With buffering on, it may be a long time before I see a log message I want (Update: after my_script processes it in some way) even though it's at the end of the log file. If I kill the pipeline I may never see the output I want. Perhaps a lot of people don't do this kind of thing, but I find that I now use scripts in pipelines where I wouldn't have used them in the past. FWIW, I could turn your argument around and say that you should always turn buffering off unless you know you need the performance (i.e. you profiled it), since leaving it on could be viewed as premature optimization at the expense of compatibility, but I think that argument is a little strained.
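
To make the pipeline case concrete, here is a rough sketch of what a well-behaved my_script might look like. The relay() helper and the "[seen] " prefix are hypothetical, invented only for this example; the one line that matters is the autoflush call, which does for a given handle what setting $| does for the selected one.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;   # provides the autoflush method on filehandles

# Hypothetical helper: copy lines from $in to $out, tagging each one.
sub relay {
    my ($in, $out) = @_;

    # Flush after every print so each line leaves the script as soon
    # as it has been processed, instead of waiting for a full buffer.
    $out->autoflush(1);

    while (my $line = <$in>) {
        print {$out} "[seen] $line";
    }
    return;
}

# Run as a filter when invoked as a script.
relay(\*STDIN, \*STDOUT) unless caller;
```

It would sit at the end of the pipeline above in place of my_script. Without the autoflush call, lines sit in the script's output buffer whenever STDOUT is not a terminal.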
