<?xml version="1.0" encoding="windows-1252"?>
<node id="308711" title="Memory Management Problem" created="2003-11-20 16:27:58" updated="2005-07-14 23:35:26">
<type id="115">
perlquestion</type>
<author id="134492">
PrimeLord</author>
<data>
<field name="doctext">
Monks, I come to you seeking your wisdom once again. I am having a memory management issue that I hope you can help me with. I have written a script that produces a daily report from a system scan. It first reads the previous day's benchmark file into a hash. It then runs the daily scan, writing what it finds to the new benchmark file and comparing each entry against the previous day's data in the hash.
&lt;BR&gt;&lt;BR&gt;
The problem I am running into is that the data being read into the hash can be several hundred megabytes in size. I need to find a more efficient way to handle this data. Here is an example of the code I have written.
&lt;BR&gt;
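&lt;CODE&gt;use strict;
use warnings;

# Extra find(1) expressions; placeholder value for this example.
my $search_files = '';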

sub _read_benchmark {
       my %yesterday;
       open BENCH, "benchmark_file" or die "$!";
       while (&lt;BENCH&gt;) {
              chomp;
              $yesterday{$_}++;
       }
       close BENCH or warn "$!";
       return \%yesterday;
}

sub _scan_system {
       my $yesterday = shift;
       my %today;
       open BENCH, "&gt; benchmark_file" or die "$!";
       open IN, "find / $search_files -print |" or die "$!";
       while (&lt;IN&gt;) {
              chomp;
              print BENCH "$_\n";
              if (exists $yesterday-&gt;{$_}) {
                     delete $yesterday-&gt;{$_};
              } else {
                     $today{$_}++;
              }
       }
       close IN or warn "$!";
       close BENCH or warn "$!";
       return \%today, $yesterday;
}

sub _print_report {
       ...
}

my $benchmark = _read_benchmark();
my ($today, $yesterday) = _scan_system($benchmark);
_print_report($today, $yesterday);&lt;/CODE&gt;

I believe I could tie a hash to the benchmark file, which would probably be a huge improvement, but I am not sure how to do that. Any suggestions on making this less of a memory hog would be much appreciated. Thanks!
&lt;BR&gt;&lt;BR&gt;
-Prime</field>
</data>
</node>
