PerlMonks  


Prime,

Can you provide a little more information about the contents of the Bench file, and what print_report() is doing?

From what I gather, the bench file is simply a list of absolute file paths on the filesystem (since you're using a find call to populate %today). What exactly are you trying to track?

Another question: have you confirmed that find command on your machine? On my box (Red Hat 9), that call to find (assuming $search_files is a scalar holding a text match of some kind) would return every file on the filesystem. Are you sure you're getting the correct results?
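One way to double-check the file list without depending on how the find arguments get parsed is to do the walk from Perl. This is only a rough sketch, not your code: find_matching(), $root and $pattern are names I made up, and I'm assuming you want regular files whose basename matches a regex.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return every regular file under $root whose
# basename matches $pattern (a compiled regex).
sub find_matching {
    my ($root, $pattern) = @_;
    my @hits;
    find(
        sub {
            # Inside the callback $_ is the basename and
            # $File::Find::name is the full path.
            push @hits, $File::Find::name if -f $_ && /$pattern/;
        },
        $root,
    );
    my @sorted = sort @hits;
    return @sorted;
}
```

Calling find_matching('/etc', qr/\.conf\z/) and comparing a handful of results against what your find call prints should tell you quickly whether the pattern is being applied at all.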

Now that I think about it, I've got an idea for a general approach, assuming you've got access to the standard Unix utils: use sort, uniq, and diff, then parse the diff output. For example:

    `cat benchmark_files|sort|uniq -c > benchmark_counted`;
    `find / $search_files -print |sort | uniq -c > todays_find`;
    open IN, "diff benchmark_counted todays_find|" or die "$!";
    while (<IN>) {
        ## parse diff output into %yesterday and %today
        ## an exercise for the reader
    }
    close IN;

By using the Unix tools, you now have the same information you had after the call to _scan_system(). Note: diff will also flag identical lines whose counts differ (the counts come from the -c option to uniq), so you'd have to account for that when parsing the diff output.
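For what it's worth, here is roughly how the parsing step could look. It's a sketch under assumptions, not tested against your data: parse_counted_diff() and %recounted are names I invented, and I'm assuming diff's default output format (lines from the first file start with "<", lines from the second with ">") wrapped around the usual "uniq -c" layout of padding, count, one space, then the path.

```perl
use strict;
use warnings;

# Hypothetical helper: read default-format diff output from a filehandle
# and sort the flagged paths into three buckets.
sub parse_counted_diff {
    my ($fh) = @_;
    my (%yesterday, %today, %recounted);
    while (my $line = <$fh>) {
        chomp $line;
        # Skip diff's own markers such as "3c2" and "---".
        next unless $line =~ /^([<>])\s+(\d+)\s(.*)$/;
        my ($side, $count, $path) = ($1, $2, $3);
        if ($side eq '<') { $yesterday{$path} = $count }
        else              { $today{$path}     = $count }
    }
    # A path flagged on both sides differs only in its count, so move it
    # out of the "only yesterday" / "only today" buckets.
    for my $path (keys %yesterday) {
        next unless exists $today{$path};
        $recounted{$path} = [ delete $yesterday{$path}, delete $today{$path} ];
    }
    return (\%yesterday, \%today, \%recounted);
}
```

You'd feed it the pipe handle, e.g. open my $in, '-|', 'diff', 'benchmark_counted', 'todays_find'. Keep in mind that diff exits with status 1 when the files differ, so a false return from close on that handle isn't necessarily an error.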

This assumes, of course, that the real memory hog is %yesterday, before a pile of keys are deleted in building %today. If I'm wrong, and at the end of processing %yesterday and %today are both too big for print_report() to handle, you may well need to look at some kind of BerkeleyDB-type solution, but realize it's going to slow things down a lot.
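If it does come to that, the usual trick is to tie the hash to a file on disk so the rest of the code can keep treating it as a hash. A minimal sketch, with the caveat that I've swapped in SDBM_File (it ships with Perl) as a stand-in for a real BerkeleyDB module, and that open_disk_hash() and the file name are made up:

```perl
use strict;
use warnings;
use Fcntl;        # O_RDWR, O_CREAT
use SDBM_File;    # bundled with Perl; stand-in for a BerkeleyDB-type store

# Hypothetical helper: tie a hash to an on-disk DBM file so its keys
# are stored on disk rather than in memory.
sub open_disk_hash {
    my ($file) = @_;
    my %h;
    tie %h, 'SDBM_File', $file, O_RDWR | O_CREAT, 0666
        or die "Cannot tie $file: $!";
    return \%h;
}
```

Something like my $yesterday = open_disk_hash('/tmp/yesterday'); followed by $yesterday->{$path} = 1; then works like an ordinary hash, with every access going to disk (hence the slowdown). SDBM limits each key/value pair to roughly 1 KB, so for anything serious DB_File with $DB_HASH as an extra tie argument is probably the better fit, if it's installed on your box.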

I hope this helps. sort/diff/uniq can be a great way to reduce the load on Perl when processing large files.


In reply to Re: Memory Management Problem by swngnmonk
in thread Memory Management Problem by PrimeLord
