PerlMonks  

Text::Fuzzy recursive fuzzy_index

by Anonymous Monk
on Sep 22, 2018 at 11:06 UTC ( [id://1222833]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks

I am playing with Text::Fuzzy in order to match and replace strings in a text file using fuzzy matching (my matching needs to take inflected words into account, such as singular/plural forms, etc.). The function that seems most promising is:

my ($offset, $edits, $distance) = fuzzy_index ($needle, $haystack);

Once I have the offset of a fuzzy match I can apply:

my $needleLength = length($needle);
substr($haystack, $offset, $needleLength) = "my new value";

However, the module's documentation states that the value returned is the offset of the closest match found, whereas I need to find ALL offsets of my $needle in $haystack. Do you know of any module that achieves this? Or any trick to get it with this module?
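For the replacement step, here is a minimal sketch of applying substr at several offsets, assuming the offsets have already been found by some fuzzy search. The helper name replace_at_offsets and the hard-coded offsets are made up for illustration and are not part of Text::Fuzzy. Working from the highest offset down means the earlier offsets stay valid even when the replacement is longer or shorter than the text it overwrites.

```perl
use strict;
use warnings;

# Replace a span of $span_length characters at each of @offsets.
# Offsets are processed from highest to lowest so that a replacement of a
# different length never shifts a span that is still waiting to be replaced.
sub replace_at_offsets {
    my ($text, $span_length, $replacement, @offsets) = @_;
    for my $offset (sort { $b <=> $a } @offsets) {
        # 4-argument substr replaces the span in place
        substr($text, $offset, $span_length, $replacement);
    }
    return $text;
}

my $haystack = 'one cat, two cats, three kats';

# Hypothetical offsets, hard-coded here; in practice they would come
# from whatever fuzzy search is used.
my $result = replace_at_offsets($haystack, 3, 'bird', 4, 13, 25);
print "$result\n";
```

Note that this assumes every match has the same length as the needle, which is also what the substr call in the question assumes.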

Replies are listed 'Best First'.
Re: Text::Fuzzy recursive fuzzy_index
by Anonymous Monk on Sep 22, 2018 at 11:42 UTC

    Sorry monks, now I understand what is meant by the "closest" match found (I had interpreted it as "first"). So it can only ever find one match, and the function is no use to me. I need to get the offsets of the fuzzy matches, but I also need to be able to set the maximum distance (similarity)...

      Text::Fuzzy has a maximum distance setting: see set_max_distance in the Text::Fuzzy documentation.

      It seems not to have a function that will return matched strings with their distances, but you could iterate over the data yourself using the distance method. See code below for an example.

      I think the nearest method in list context will return only the list of strings with the shortest distance, so not all those within the maximum distance. Both of those would be useful to have; maybe submit a feature request?

      # code not tested
      use strict;
      use warnings;
      use Text::Fuzzy;

      my $max_distance = 4;
      my $some_label   = 'blortbertblart';
      my %targets_hash = (...);

      # trans => 1 also counts a swap of adjacent characters as an edit
      my $distance_finder = Text::Fuzzy->new($some_label, trans => 1);
      $distance_finder->set_max_distance($max_distance);

      # group the candidate targets by their distance from $some_label
      my %poss_matches;
      foreach my $target (keys %targets_hash) {
          my $distance = $distance_finder->distance($target);
          # skip targets with no usable distance or beyond the limit
          next if !defined $distance || $distance > $max_distance;
          push @{ $poss_matches{$distance} }, $target;
      }
      # do stuff with %poss_matches
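To get every offset within a maximum distance, as the original question asks, one option is to slide a window over the haystack and measure the distance at each position. The following is only a sketch under stated assumptions: it uses a plain pure-Perl Levenshtein routine rather than Text::Fuzzy, the names levenshtein and fuzzy_offsets are invented here, and the window is fixed at the length of the needle. Overlapping windows around the same match are resolved by keeping the closest one.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Plain Levenshtein distance, computed with two rows of the DP table.
sub levenshtein {
    my ($s, $t) = @_;
    my @prev = (0 .. length $t);
    for my $i (1 .. length $s) {
        my @cur = ($i);
        for my $j (1 .. length $t) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j] + 1, $cur[$j - 1] + 1, $prev[$j - 1] + $cost);
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Return a list of [offset, distance] pairs, in offset order, for every
# non-overlapping window of length($needle) within $max_distance of $needle.
sub fuzzy_offsets {
    my ($needle, $haystack, $max_distance) = @_;
    my $len = length $needle;

    # Score every window of the needle's length.
    my @hits;
    for my $offset (0 .. length($haystack) - $len) {
        my $d = levenshtein($needle, substr($haystack, $offset, $len));
        push @hits, [$offset, $d] if $d <= $max_distance;
    }

    # Prefer the closest windows; drop any window overlapping one already kept.
    my @kept;
    for my $hit (sort { $a->[1] <=> $b->[1] || $a->[0] <=> $b->[0] } @hits) {
        next if grep { abs($hit->[0] - $_->[0]) < $len } @kept;
        push @kept, $hit;
    }
    my @ordered = sort { $a->[0] <=> $b->[0] } @kept;
    return @ordered;
}

my $text = 'The colour and the kolor of a collar';
for my $hit (fuzzy_offsets('color', $text, 1)) {
    my ($offset, $distance) = @$hit;
    printf "offset %d, distance %d: %s\n", $offset, $distance, substr($text, $offset, 5);
}
```

Because the window has a fixed length, a match that is longer or shorter than the needle (such as "colour") is only caught through its best same-length window. The levenshtein routine could presumably be swapped for the distance method of a Text::Fuzzy object, which should be faster on large inputs.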

Approved by Corion