PerlMonks
RE: RE (tilly) 5: File reading efficiency and other surly remarks

by lhoward (Vicar)
on Aug 26, 2000 at 21:17 UTC ( [id://29814] )


in reply to RE (tilly) 5: File reading efficiency and other surly remarks
in thread File reading efficiency and other surly remarks

I never said that the method I posted was easier to maintain; I only stated that it was significantly more efficient. If fast reading of large files (that you can't fit into memory all at once) is your concern, then the block/hand-split method is better. Also, the code I used for the "block and manual split" approach is not my own, but lifted from an earlier PerlMonks discussion.
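For reference, here is a minimal sketch of the sort of block-and-manual-split reader being discussed. This is not the code from that earlier thread, just an illustration; the 64 KB block size and the `count_lines_blockwise` name are my own choices:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read the file in large blocks and split records out by hand,
# instead of letting Perl read line-by-line with <FH>.
sub count_lines_blockwise {
    my ($file) = @_;
    open my $fh, '<', $file or die "open $file: $!";
    my ($buf, $tail, $count) = ('', '', 0);
    while (read($fh, $buf, 65536)) {       # 64 KB blocks; size is tunable
        $buf = $tail . $buf;               # prepend leftover partial line
        my @lines = split /\n/, $buf, -1;  # LIMIT -1 keeps trailing empty field
        $tail = pop @lines;                # last field may be a partial line
        $count += @lines;
    }
    $count++ if length $tail;              # file may not end with a newline
    close $fh;
    return $count;
}
```

The carried-over `$tail` is the essential trick: a block boundary can land in the middle of a line, so the incomplete fragment must be glued onto the front of the next block.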

Replies are listed 'Best First'.
RE (tilly) 7 (See other comments): File reading efficiency and other surly remarks
by tilly (Archbishop) on Aug 26, 2000 at 21:45 UTC
    Specifically see RE (tilly) 6 (bench): File reading efficiency and other surly remarks. Your speed claim can only be made for the specific setup you tested. If your code will need to run on multiple machines then the optimization is almost certainly wasted effort. If performance does not turn out to be a problem, it is likewise counterproductive to have sacrificed maintainability for this.

    In short, the fact that this might be faster is very good to know for the times that you need to squeeze performance out on one specific platform. But don't apply such optimizations until you know that you need to, and don't apply this one until you have benchmarked it against your target setup.
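    A quick harness along those lines can be built with the standard Benchmark module. The subs and the throwaway temp file below are my own illustration, not code from this thread; run something like this against your real data on your target machine before committing to either style:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Stand-in for a real log file: 10,000 short lines in a temp file.
my ($fh, $file) = tempfile();
print {$fh} "field1 field2 field3\n" for 1 .. 10_000;
close $fh;

# Style 1: the usual readable line-at-a-time loop.
sub line_by_line {
    open my $in, '<', $file or die $!;
    my $n = 0;
    $n++ while <$in>;
    return $n;
}

# Style 2: block reads with a manual split (see the sketch above thread-wise).
sub block_at_a_time {
    open my $in, '<', $file or die $!;
    my ($buf, $tail, $n) = ('', '', 0);
    while (read($in, $buf, 65536)) {
        $buf  = $tail . $buf;
        my @f = split /\n/, $buf, -1;
        $tail = pop @f;
        $n += @f;
    }
    return $n;
}

# Run each 50 times and print a comparison table.
cmpthese(50, {
    'line-by-line' => \&line_by_line,
    'block'        => \&block_at_a_time,
});
```

    Whatever the numbers say on one box says little about another; the point is to measure on the setup you actually deploy to.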

    A few general notes on optimization. Given the overhead of an interpreted language, shorter code is likely to be faster. With well modularized code you retain the ability to recognize algorithm improvements later - which is almost always a better win. Worrying about debuggability up front speeds development and gives more time to worry after the fact about performance. And readable code is easier to understand and optimize.

    Which all boils down to, don't prematurely optimize. Aim for good solid code that you are able to modify after you have enough of your project up and running that you can identify where the bottlenecks really turned out to be.

      I agree 100%. If you get to the point where you absolutely, positively need to squeeze more performance out of your file reads, you can try the block-at-a-time approach and see if it helps. If you don't absolutely need the performance boost, stay with something that is easier to read and less dependent on platform tweaking. On our web farm it has cut the run time of our log-file analysis jobs nearly in half.
