PerlMonks
Another "out of memory!" problem
by slugger415 (Monk)
on Jun 22, 2010 at 23:05 UTC ( [id://845980]=perlquestion )
slugger415 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, this is my first post. I'm getting an "out of memory" message. I've looked at some of the previous posts on this subject (2007, 2005 and 2001) but am not sure whether I can resolve my problem.

My script "crawls" a large website and builds a list of all the pages it can find, via a-hrefs, using HTML::TreeBuilder and a few other modules. The key part is that it saves each URL to a %ListOfURLs hash, which it checks against so it doesn't hit the same page twice. I'm finding that when the hash grows to more than 27,000 entries, I get the out-of-memory error. Am I just hitting some kind of memory or hash size limit?

There are lots of other hashes and arrays created along the way, such as arrays of all the hrefs on each page, e.g.:

my @aList = $tree->find_by_tag_name('a');

I've tried undef'ing those when they're no longer needed, but it doesn't seem to make any difference.

I'm happy to provide some code here, but it's a pretty busy script. Any suggestions about how I might otherwise build a list to check against that would use less memory would be appreciated.

BTW, on one post I saw a suggestion to use 'tie', but the documentation for tie speaks thusly:

"This function binds a variable to a package class that will provide the implementation for the variable. VARIABLE is the name of the variable to be enchanted."

To me that might as well say "Tie a shoelace around a shoebox and wave a magic wand over it." :-) I don't understand a word of it.

Thanks for any help you can provide.

Scott
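For readers puzzled by the same 'tie' documentation, here is a minimal sketch of what the suggestion could look like for a "seen URLs" hash. It assumes the core SDBM_File module as the on-disk store; the file name seen_urls, the helper mark_if_new, and the example.com URLs are hypothetical and not taken from the original script. After the tie call, %ListOfURLs is used like any other hash, but its contents live in files on disk rather than in memory.

```perl
use strict;
use warnings;
use Fcntl;                      # supplies O_RDWR and O_CREAT
use SDBM_File;                  # simple on-disk hash, ships with core Perl
use File::Temp qw(tempdir);

# Hypothetical location for the on-disk hash; removed when the script exits.
my $dir = tempdir(CLEANUP => 1);

# "Binding the variable to a package": from here on, reads and writes to
# %ListOfURLs are handled by SDBM_File and stored in files under $dir.
my %ListOfURLs;
tie(%ListOfURLs, 'SDBM_File', "$dir/seen_urls", O_RDWR | O_CREAT, 0666)
    or die "Cannot tie seen_urls: $!";

# Hypothetical helper: true the first time a URL is offered, false afterwards.
sub mark_if_new {
    my ($url) = @_;
    return 0 if exists $ListOfURLs{$url};
    $ListOfURLs{$url} = 1;
    return 1;
}

# Stand-in for the hrefs collected from a page; the first URL appears twice.
my @queue = (
    'http://example.com/',
    'http://example.com/a',
    'http://example.com/',
);

# Keep only the URLs that have not been seen before.
my @to_fetch = grep { mark_if_new($_) } @queue;
print "$_\n" for @to_fetch;

untie %ListOfURLs;
```

Two caveats on this sketch. SDBM_File limits the combined size of a key and its value to roughly 1 KB, so very long URLs would need a different backend such as DB_File, if it is installed. Also, a plain in-memory hash of some 27,000 URL keys is normally quite small, so the hash may not be what exhausts memory; the HTML::TreeBuilder documentation advises calling $tree->delete when finished with each parsed tree, because undef'ing variables alone may not release the tree's memory.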