
Re: Memory issue with large array comparison

by dave_the_m (Prior)
on May 24, 2012 at 21:29 UTC

in reply to Memory issue with large array comparison

The following code, scaled up to 500,000 filenames and 100,000 IDs, takes 2 seconds to run and uses approximately 90 MB.
    use warnings;
    use strict;

    # create some sample data
    my (@pathnames, @safe_list);
    push @pathnames, sprintf "C:/abc/abc1/GS%06d", $_ for 1..500_000;
    push @safe_list, sprintf "GS%06d", $_ for 1..100_000;

    my %safe_hash;
    $safe_hash{$_} = 1 for @safe_list;

    my @list;
    for (@pathnames) {
        # (adjust regex to match whatever the filename/ID format is)
        /\/(\w+)$/ or die "bad pathname: $_";
        push @list, $_ unless $safe_hash{$1};
    }
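For anyone who wants to try the idea on a small scale first, here is a minimal sketch of the same hash-lookup technique, written with a hash slice and grep instead of explicit loops. The sample data is made up and deliberately tiny (10 pathnames, 4 safe IDs), and the regex assumes the same trailing-ID pathname format as above; adjust both to your real data.

```perl
use strict;
use warnings;

# Hypothetical sample data, far smaller than the real lists.
my @pathnames = map { sprintf "C:/abc/abc1/GS%06d", $_ } 1 .. 10;
my @safe_list = map { sprintf "GS%06d", $_ } 1 .. 4;

# Build the lookup table in one step with a hash slice.
# Only the keys matter, so the values are left undefined.
my %safe_hash;
@safe_hash{@safe_list} = ();

# Keep each pathname whose trailing ID is NOT in the safe table.
# (The regex is an assumption about the filename/ID format.)
my @list = grep {
    my ($id) = m{/(\w+)$} or die "bad pathname: $_";
    !exists $safe_hash{$id};
} @pathnames;

print scalar(@list), " pathnames are not on the safe list\n";
```

Each pathname costs a single hash lookup, so the work grows with the number of pathnames plus the number of IDs rather than with their product.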


Replies are listed 'Best First'.
Re^2: Memory issue with large array comparison
by bholcomb86 (Novice) on May 25, 2012 at 00:00 UTC

    I decided to try Dave's solution, and it turns out it works perfectly. I just wanted to solve my memory problem, which this does, but as a side benefit it is WAY faster than my previous solution. Consider this problem solved! I appreciate all the input given on this matter.
