Re: Memory Leak when using Lingua::EN::MatchNames

by allolex (Curate)
on May 04, 2004 at 21:41 UTC (#350587=note)


in reply to Memory Leak when using Lingua::EN::MatchNames

I've tried to improve your code, and I have tested what I am posting here.

You have a couple of problems. The first is that your extra foreach loops create a number of unnecessary arrays. I've tried to optimize that by keeping the names in hashes and looping over their keys instead of building temporary lookup arrays, and by moving the file reading into a subroutine. When you're done with a file, it is a good idea to close it so you don't leave dangling filehandles.
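To illustrate the file-reading idea above, here is a minimal sketch (not the exact code from the script). The helper name read_list and the in-memory sample data are assumptions of mine, used only so the example runs without real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# read_list is a hypothetical helper: it reads one name per line from an
# already-open filehandle and returns a hash keyed by line number.
sub read_list {
    my $fh = shift;
    my %results;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless $line =~ m/[A-Za-z]/;    # skip blank or non-name lines
        $results{$.} = $line;                # $. is the current line number
    }
    return %results;
}

# Sample data held in a string (an assumption for this sketch).
my $sample = "John Smith\n\nJane Doe\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my %users = read_list($fh);
close $fh;    # done with the handle, so close it

# Iterate over the hash keys directly; no temporary lookup array is needed.
print "$_: $users{$_}\n" for sort { $a <=> $b } keys %users;
```

The lexical filehandle would also be closed automatically when it goes out of scope, but an explicit close makes the intent obvious.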

The other thing has already been pointed out. If you look at the Lingua::EN::MatchNames documentation, you can see it expects fn1, ln1, fn2, ln2. I fixed that, kind of (see my comment). The name_eq() function will return undef if there is no possible match, so I added handling for that as well. That accounts for the "uninitialized value" warnings.
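The undef handling can be sketched on its own, as below. The helper describe_score is hypothetical, and the scores are stand-in values rather than real name_eq() results, so the example runs without the module installed. It tests with defined() so that a score of 0 is not confused with "no possible match"; whether name_eq() ever returns 0 is something I have not checked.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# describe_score is a hypothetical helper that turns a score (or undef)
# into a message. Comparing undef numerically is what triggers the
# "uninitialized value" warning, so undef is handled first.
sub describe_score {
    my ( $score, $threshold ) = @_;
    return "no possible match" unless defined $score;
    return $score >= $threshold
        ? "match ($score)"
        : "below threshold ($score)";
}

# Stand-in values; in the real script these would come from name_eq().
for my $score ( 95, 40, 0, undef ) {
    print describe_score( $score, 80 ), "\n";
}
```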

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Lingua::EN::MatchNames;

    my $termfile = shift;
    my $userfile = shift || die "Usage: $0 TERMFILE USERFILE\n";

    my %curlookup  = getlist($userfile);
    my %termlookup = getlist($termfile);

    open my $dfh, ">", "dup.$userfile"
        or die "Cannot open dup.$userfile: $!";

    foreach my $termusername (keys %termlookup) {
        NameComp( $termlookup{$termusername} );
    }
    close $dfh;

    # getlist takes a filename as an argument
    sub getlist {
        my $filename = shift;
        my $counter;
        my %results;
        open my $fh, "<", $filename
            or die "Cannot open $filename: $!";
        while ( <$fh> ) {
            chomp;
            ++$counter;
            next unless m/[A-Za-z]/;
            $results{$counter} = $_;
            print "Adding user $counter $_ from file '$filename' to hash\n";
        }
        close $fh;
        return %results;
    }

    sub NameComp {    # no parens, this is not a function prototype
        my $compname = shift;
        foreach my $curusername (keys %curlookup) {
            print "Comparing '$compname' to '$curlookup{$curusername}'\n";
            # This method is not good because it assumes a ' FN -SPACE- LN ' format
            my @compname = split /\s+/, $compname;
            my @curname  = split /\s+/, $curlookup{$curusername};
            my $name_score = name_eq( $compname[0], $compname[1], $curname[0], $curname[1] );
            if ( $name_score ) {
                if ( $name_score >= 80 ) {
                    print "Found Match $curlookup{$curusername} with a score of $name_score.\n\n";
                }
            }
            else {
                print "\t\tNo possible match.\n\n";
            }
        }
    }
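The comment in NameComp notes that splitting on whitespace assumes a plain "FN LN" format. One possible way to loosen that assumption is sketched below. The helper split_name and its rules (first token as first name, last token as last name, plus "Last, First") are my own assumptions, not something the module or the original script provides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# split_name is a hypothetical helper. It returns (first, last), or an
# empty list when it cannot find two name parts.
sub split_name {
    my $name = shift;
    $name =~ s/^\s+|\s+$//g;                      # trim surrounding whitespace
    if ( $name =~ m/^([^,]+?)\s*,\s*(.+)$/ ) {    # "Last, First [Middle]"
        my ( $last, $rest ) = ( $1, $2 );
        my ($first) = split /\s+/, $rest;
        return ( $first, $last );
    }
    my @parts = split /\s+/, $name;
    return unless @parts >= 2;                    # a single token is not enough
    return ( $parts[0], $parts[-1] );             # ignore any middle names
}

for my $raw ( "John Smith", "Smith, John Q.", "Mary Ann Jones", "Cher" ) {
    my @parts = split_name($raw);
    print "$raw => ", ( @parts ? "@parts" : "(cannot split)" ), "\n";
}
```

The two pairs returned for each name could then be passed to name_eq() in the fn1, ln1, fn2, ln2 order, skipping the comparison when either name fails to split.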

Best of luck to you.

--
Damon Allen Davison
http://www.allolex.net

