Name matching -memory leakage

by deepuceg (Initiate)
on Apr 02, 2008 at 13:45 UTC ( #677956=perlquestion )
deepuceg has asked for the wisdom of the Perl Monks concerning the following question:

Hi people,

While matching two files, each containing a list of people's names, I am running into what looks like a memory leak. I am using the module Lingua::EN::MatchNames, which works well for my needs, but each file has 20,000 records, and as the names are checked the memory usage grows until the machine cannot cope. My idea is to free the memory after checking each name. Any suggestions on memory handling would be appreciated. Thanks.

Re: Name matching -memory leakage
by Anonymous Monk on Apr 02, 2008 at 13:50 UTC
Re: Name matching -memory leakage
by grizzley (Chaplain) on Apr 02, 2008 at 14:12 UTC

    Can't you load the first file into an array (20,000 names x 30 bytes each? = 600 kB), then read the second file one name at a time (30 bytes?) and compare each name against all the names from the first file?

    Update: Sorry, just read what the module does and it looks much more clever than just 'eq' :)
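
The load-one-file, stream-the-other idea still applies when the comparison is smarter than `eq`. Here is a minimal sketch, assuming one name per line in each file. The helper name `match_files` and both callbacks are hypothetical; the `$match` callback is where a wrapper around whatever Lingua::EN::MatchNames call you already use would go.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the first list in memory and read the
# second file one line at a time.
#   $match    - code ref taking two names, returning true on a match
#   $on_match - code ref called with both names for every match
sub match_files {
    my ($list_file, $stream_file, $match, $on_match) = @_;

    # Load the first file completely; this is the only list held in memory.
    open my $list_fh, '<', $list_file or die "Cannot open $list_file: $!";
    chomp(my @names = <$list_fh>);
    close $list_fh;

    # Stream the second file, so only one candidate line is held at a time.
    open my $in, '<', $stream_file or die "Cannot open $stream_file: $!";
    while (my $candidate = <$in>) {
        chomp $candidate;
        for my $name (@names) {
            $on_match->($candidate, $name) if $match->($candidate, $name);
        }
    }
    close $in;
    return;
}
```

With a case-insensitive comparison standing in for the real matcher, a call could look like `match_files('a.txt', 'b.txt', sub { lc($_[0]) eq lc($_[1]) }, sub { print "$_[0] ~ $_[1]\n" })`, where `a.txt` and `b.txt` are placeholder file names. Handing each match to a callback rather than collecting them in an array keeps the output from growing in memory as well.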

      I have another idea: I would run your program on 100 names and see how long it takes and how much memory it consumes. Then do the same for 200 names, 500, 1000, etc. That way I could estimate how much it will take for 20,000 names and... look for a better algorithm :)
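
The scaling experiment can be sketched roughly as follows. This is only an illustration: `time_sample` is a hypothetical helper, `$work` stands for whatever routine does your matching, and only elapsed time is measured here (memory would have to be watched separately, for example with `top` or `ps`).

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: run $work on the first $n names of each list and
# return the elapsed wall-clock time in seconds.
#   $names_a, $names_b - array refs holding the two name lists
#   $work              - code ref called with two array refs (the samples)
sub time_sample {
    my ($n, $names_a, $names_b, $work) = @_;

    # Clamp the sample size to the length of each list.
    my $last_a = $n < @$names_a ? $n - 1 : $#$names_a;
    my $last_b = $n < @$names_b ? $n - 1 : $#$names_b;

    my $start = time;
    $work->([ @{$names_a}[0 .. $last_a] ], [ @{$names_b}[0 .. $last_b] ]);
    return time - $start;
}
```

Calling `time_sample($n, \@a, \@b, \&compare_all)` for `$n` in 100, 200, 500, 1000 (with `@a`, `@b` and `compare_all` being your own data and routine) gives the data points. Comparing every name against every other name is quadratic, so doubling the sample size should roughly quadruple the time, and the full run is 20,000 x 20,000 = 400 million comparisons.
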
Re: Name matching -memory leakage
by apl (Monsignor) on Apr 02, 2008 at 14:16 UTC
    How do you know you have a memory leak?

    Can you post the minimal code that produces the problem, with a description of what the problem is?

Re: Name matching -memory leakage
by radiantmatrix (Parson) on Apr 02, 2008 at 14:38 UTC

    Well, you don't provide any example code, so it's hard to give good advice. Given your description, though, I can make a couple of guesses.

    My first guess is that it isn't a memory leak, but simply that you're running out of available memory (a memory leak is a very specific kind of bug).

    Given that, my second guess is that you're loading the entire files into memory before operating on them. If your records are large enough, the combination of doing that and building your output data structure might put you over your available RAM.

    One simple way to avoid that is to put your data -- including your output structure -- into a database. That way, the database engine takes care of memory allocation, etc. DBD::SQLite is very nice for this, as it's a simple RDBMS entirely contained in a Perl module.
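
As a rough sketch of the database approach, assuming DBI and DBD::SQLite are installed from CPAN and each file holds one name per line: the helper names `open_db` and `load_names`, and any table or file names passed to them, are made up for illustration.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical helper: open (or create) an SQLite database file.
sub open_db {
    my ($path) = @_;
    return DBI->connect("dbi:SQLite:dbname=$path", '', '',
        { RaiseError => 1, AutoCommit => 1 });
}

# Hypothetical helper: load one name per line from $file into $table.
# $table is interpolated into the SQL, so it must be a trusted identifier.
sub load_names {
    my ($dbh, $table, $file) = @_;

    $dbh->do("CREATE TABLE IF NOT EXISTS $table (name TEXT NOT NULL)");
    my $sth = $dbh->prepare("INSERT INTO $table (name) VALUES (?)");

    open my $fh, '<', $file or die "Cannot open $file: $!";
    $dbh->begin_work;    # a single transaction keeps bulk inserts fast
    while (my $line = <$fh>) {
        chomp $line;
        $sth->execute($line);
    }
    $dbh->commit;
    close $fh;
    return;
}
```

After `my $dbh = open_db('names.db')` and one `load_names` call per input file, the names can be fetched back a row at a time with an ordinary `prepare`/`execute`/`fetchrow_array` loop, and matches can be written to a results table in the same way, so neither the inputs nor the output has to sit in Perl's memory all at once.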

    Beyond that advice, memory bugs are difficult to track down without seeing code. Share some code, and you'll likely get more help. You may also want to check out How (Not) To Ask A Question for tips on asking questions on PerlMonks in a way that's optimized for getting good answers.

    Ramblings and references
    The Code that can be seen is not the true Code
    I haven't found a problem yet that can't be solved by a well-placed trebuchet

Node Type: perlquestion [id://677956]
Approved by Corion