http://www.perlmonks.org?node_id=677956

deepuceg has asked for the wisdom of the Perl Monks concerning the following question:

Hi people. While trying to match two files, each containing a list of people's names, I am getting a memory leak. I am using the module Lingua::EN::MatchNames, and it works well for my needs. The problem is that I have 20,000 records in each file, and while checking these names the program runs out of memory. So my idea is to clear the memory after checking each name. Any suggestions regarding memory handling would be appreciated. Thanks.

Re: Name matching -memory leakage
by radiantmatrix (Parson) on Apr 02, 2008 at 14:38 UTC

    Well, you don't provide any example code, so it's hard to give good advice. Given your description, though, I can make a couple of guesses.

    My first guess is that it isn't a memory leak, but simply that you're running out of available memory (a memory leak is a very specific kind of bug).

    Given that, my second guess is that you're loading the entire files into memory before operating on them. If your records are large enough, the combination of doing that and building your output data structure might put you over your available RAM.

    One simple way to avoid that is to put your data -- including your output structure -- into a database. That way, the database engine takes care of memory allocation, etc. DBD::SQLite is very nice for this, as it's a simple RDBMS entirely contained in a Perl module.

    Beyond that advice, memory bugs are difficult to track down without seeing code. Share some code, and you'll likely get more help. You may also want to check out How (Not) To Ask A Question for tips about asking questions on PerlMonks in a way that's optimized for getting good answers.

    <radiant.matrix>
    Ramblings and references
    The Code that can be seen is not the true Code
    I haven't found a problem yet that can't be solved by a well-placed trebuchet
Re: Name matching -memory leakage
by apl (Monsignor) on Apr 02, 2008 at 14:16 UTC
    How do you know you have a memory leak?

    Can you post the minimal code that produces the problem, with a description of what the problem is?

Re: Name matching -memory leakage
by grizzley (Chaplain) on Apr 02, 2008 at 14:12 UTC

    Can't you load the first file into an array (20,000 names x 30 bytes each? = 600 kB) and then read the second file one name at a time (30 bytes?), comparing each name to all the names from the first file?
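
A minimal sketch of that approach, using only core Perl. The names `compare_name_files` and `names_match` are hypothetical, and `names_match` is only a case-insensitive stand-in: in the real program the comparison would be whatever Lingua::EN::MatchNames call the original code makes, which was not posted. Matches are handed to a callback as they are found, so the result set never has to sit in memory.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real matcher (assumption: the original
# program would call into Lingua::EN::MatchNames here instead).
sub names_match {
    my ($x, $y) = @_;
    return lc($x) eq lc($y);
}

# Load the first file into an array, then stream the second file one
# line at a time. Each match is passed to $on_match immediately rather
# than being collected. Returns the number of matches found.
sub compare_name_files {
    my ($file_a, $file_b, $matcher, $on_match) = @_;

    open my $fh_a, '<', $file_a or die "Cannot open $file_a: $!";
    chomp(my @names_a = <$fh_a>);
    close $fh_a;

    my $count = 0;
    open my $fh_b, '<', $file_b or die "Cannot open $file_b: $!";
    while (my $name_b = <$fh_b>) {
        chomp $name_b;
        for my $name_a (@names_a) {
            if ($matcher->($name_a, $name_b)) {
                $on_match->($name_a, $name_b);
                $count++;
            }
        }
    }
    close $fh_b;

    return $count;
}
```

A caller might pass `sub { print join("\t", @_), "\n" }` as the callback to write each match out as soon as it is found. Only the first file's names and one line of the second file are held at any time.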

    Update: Sorry, just read what the module does and it looks much more clever than just 'eq' :)

      I have another idea: I would try running your program on 100 names and see how long it takes and how much memory it consumes. Next, the same for 200 names, then 500, 1000, etc. This way I could estimate how much it will take for 20,000 names and... look for a better algorithm :)
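
A rough sketch of that measurement, assuming every name in one list is compared against every name in the other, so the work grows with the square of the list size (20,000 x 20,000 is 400 million comparisons). The helper names `time_sample` and `estimate_seconds` are hypothetical, and only elapsed time is measured here; memory would have to be watched with an external tool.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Time an all-pairs comparison over the first $n names of each list.
# $matcher is whatever comparison routine the real program uses.
sub time_sample {
    my ($names_a, $names_b, $n, $matcher) = @_;

    # Clamp the sample size to the length of each list.
    my $last_a = ($n < @$names_a ? $n : scalar @$names_a) - 1;
    my $last_b = ($n < @$names_b ? $n : scalar @$names_b) - 1;

    my $start = Time::HiRes::time();
    for my $x (@{$names_a}[0 .. $last_a]) {
        for my $y (@{$names_b}[0 .. $last_b]) {
            $matcher->($x, $y);
        }
    }
    return Time::HiRes::time() - $start;
}

# Extrapolate from a timed sample to a larger run, assuming the cost is
# proportional to n squared (an assumption, not a measured fact).
sub estimate_seconds {
    my ($sample_n, $sample_secs, $target_n) = @_;
    return $sample_secs * ($target_n / $sample_n) ** 2;
}
```

Timing several sample sizes and checking that doubling the size roughly quadruples the time is a quick way to confirm the quadratic assumption before trusting the extrapolation.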
Re: Name matching -memory leakage
by Anonymous Monk on Apr 02, 2008 at 13:50 UTC
    Example?