PerlMonks  

Re: Alternate for "open"

by Preceptor (Deacon)
on Nov 16, 2015 at 17:24 UTC


in reply to Alternate for "open"

open doesn't consume a significant amount of memory, so the cause will be something else. The usual culprits in this scenario are:

  • foreach ( <$file_handle> ) {, which reads the whole file into a list before iterating. Use while instead, which reads one line at a time.
  • Too wide a scope on something that gets updated as part of your processing, so that it grows steadily as each file is processed.
  • use strict; and use warnings; are generally a good idea - the snippet you quote doesn't suggest it's using them.
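A minimal sketch of the first two points, not the original poster's code: the helper name tally_first_field and the sample data are invented for illustration. It reads line by line with while and keeps the hash lexical to the sub, so nothing accumulates from one file to the next.

```perl
use strict;
use warnings;

# Hypothetical helper: tally the first comma-separated field of each line.
sub tally_first_field {
    my ($fh) = @_;
    my %count;    # lexical to this call, so it is released once the caller drops the result

    # while reads one line per iteration; foreach ( <$fh> ) would build
    # a list of every line before the first iteration runs.
    while ( my $line = <$fh> ) {
        chomp $line;
        my ($key) = split /,/, $line;
        next unless defined $key && length $key;
        $count{$key}++;
    }
    return \%count;
}

# In-memory filehandle standing in for a real input file (invented sample data).
my $sample = "3g,100\n2g,50\n3g,75\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $totals = tally_first_field($fh);
close $fh;

print "$_: $totals->{$_}\n" for sort keys %$totals;
```

If the hash were declared outside the per-file loop instead, it would keep every file's data alive until the program exits, which is the "too wide a scope" case.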

Of course - without some SAMPLE CODE - we can only speculate as to what's causing your problem. It isn't open though.

Replies are listed 'Best First'.
Re^2: Alternate for "open"
by ravi45722 (Pilgrim) on Nov 17, 2015 at 05:02 UTC

    Yes, here is the clue: it's hashes. I am building a hash to store all my result values. The output is two Excel books (3g data, 2g data), each containing 23 sheets. Each sheet contains about 11,222 cells (31 columns, 362 rows) on average (not exactly). Does that need 64 GB of RAM? If it needs that much, how can we reduce it?

      You've now been given several suggestions:

      • use strict;
      • use warnings;
      • Test by simplifying the script so that it only opens the files
      • Make sure to use while not foreach to read your filehandles
      • Post verbatim snippets of the code here so it can be reviewed
      Which of these have you done?

      In particular, does your code contain use strict; and use warnings;?

      It's not a completely unreasonable idea to try to measure the memory footprint of your hash, but most monks here would not do that to find the problem. Better to simplify your code so you can identify the problem.

      Just remove everything until it runs properly, then start adding stuff back in. If it is a really large and ugly codebase, take the opportunity to refactor and move code out into modules. This is better practice for many reasons, and it helps with this kind of debugging by making it easy to switch parts of the code on and off.

      You could also:

      • If you suspect the hash is getting too big, comment out the code that populates it, run the program, and see if there's a difference.
      • Try running the program on only one file and see if there's a difference.
      • Try running the program on lots of very small files and see if there's a difference.
      • Consider loading your file data into a real database such as SQLite and working from there.
      • Look for memory leaks with Test::LeakTrace

      There, now you have a bunch more suggestions. It will be nice to hear back from you when you've tried some of them and you are still stuck.

      The way forward always starts with a minimal test.

        When I posted my first program here, the Monks suggested I use strict. Since then I have never left those two lines out of my code. And I am very sure that I am using while to read the file (already posted).

        As per your suggestion, I am now trying to run the code piece by piece. My first piece reads the CDR and builds the values in the hash, so I want to be sure that my hash is not taking much memory.

        Thanks for suggestions

      Depends how inefficiently you're storing the data - you can _certainly_ incur overheads - for example, XML parsed into memory takes around 10x the footprint of the file at rest.

      But at least now we've moved on from blaming open - check what you're inserting into the hash. How many key/value pairs? Are you creating nested data structures (hash of arrays, etc.)? All these things add up - you _can_ expect the memory required to be larger than the raw input.
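One way to answer those questions is to walk the structure and count what it holds. The following is a rough sketch under assumptions: count_leaves is a hypothetical helper, and the sample data is invented to resemble a book -> sheet -> rows layout. For scale, the figures quoted earlier in the thread come to about 2 x 23 x 11,222 = 516,212 cells, so a count far above that would suggest the hash holds more than the output needs.

```perl
use strict;
use warnings;

# Hypothetical helper: count the leaf values in a nested structure
# of hashes and arrays.
sub count_leaves {
    my ($data) = @_;
    my $type = ref $data;
    if ( $type eq 'HASH' ) {
        my $n = 0;
        $n += count_leaves($_) for values %$data;
        return $n;
    }
    if ( $type eq 'ARRAY' ) {
        my $n = 0;
        $n += count_leaves($_) for @$data;
        return $n;
    }
    return 1;    # a plain scalar (or any other kind of reference) counts as one leaf
}

# Invented sample shaped like "book -> sheet -> rows of cells".
my %result = (
    '3g' => { sheet1 => [ [ 1, 2, 3 ], [ 4, 5, 6 ] ] },
    '2g' => { sheet1 => [ [ 7, 8 ] ] },
);

printf "top-level keys: %d\n", scalar keys %result;
printf "leaf values:    %d\n", count_leaves( \%result );

# Optional: Devel::Size is a CPAN module (not core); if it happens to be
# installed, its total_size function reports an approximate byte count.
if ( eval { require Devel::Size; 1 } ) {
    printf "approx bytes:   %d\n", Devel::Size::total_size( \%result );
}
```

Running this after each input file and watching how the counts grow is a cheap check on whether the hash itself is the part that keeps expanding.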
