Re: Levenstein distance transcription

by Eily (Monsignor)
on Dec 05, 2014 at 13:39 UTC ( [id://1109350]=note )


in reply to Levenstein distance transcription

It doesn't look like there is. If you have few unique words (taking smaller portions of a bigger string if necessary), you could always replace each word with a single character and run the character version of the algorithm on the result:

use v5.14;
use Data::Dump qw/pp/;

my @chars = ('0'..'9', 'a'..'z', 'A'..'Z');
# Up to scalar(@chars) different words (actually @chars+1 because of undef, but that wouldn't help readability)

$_ = <<STR;
Jack and Jill went up the hill to fetch a pail of water
Jack fell down and broke his crown and Jill came tumbling after
STR

my @words = /\w+/g;
my %replace;
my $asChars = join '', map { $replace{$_}//=shift(@chars) } @words; # 'defined-orcish manoeuver' :D
# say pp \%replace;
say $asChars;
my %reverse = reverse %replace;
say join ' ', map $reverse{$_}, split //, $asChars;

__DATA__
0123456789abc0de1fgh12ijk
Jack and Jill went up the hill to fetch a pail of water Jack fell down and broke his crown and Jill came tumbling after

Edit: for the comparison to work, you have to use the same %replace hash for all strings. The $h{$_} //= NewVal() idiom (Orcish Maneuver) means that any word that's already known is replaced by its existing substitute, while an unknown word adds a new entry to the hash. Here I use // instead of || because otherwise '0' would be an invalid (false) character: the word mapped to '0' would be given a new substitute every time it appeared.
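
To make the point about sharing %replace concrete, here is a minimal sketch that encodes two sentences with the same hash and then runs a plain character-level edit distance on the results. The helper names encode_words and levenshtein are made up for this example, the distance routine is a hand-rolled textbook dynamic-programming version rather than any particular CPAN module, and the two sample sentences are only illustrative.

```perl
use v5.14;
use warnings;

my @chars = ('0'..'9', 'a'..'z', 'A'..'Z');
my %replace;    # shared by every string that will be compared

# Hypothetical helper: map each word to one character, reusing the
# substitute of any word that has already been seen.
sub encode_words {
    my ($text) = @_;
    my $encoded = '';
    for my $word ($text =~ /\w+/g) {
        $replace{$word} //= shift(@chars) // die "ran out of substitute characters\n";
        $encoded .= $replace{$word};
    }
    return $encoded;
}

# Hypothetical helper: character-level Levenshtein distance, keeping
# only the previous row of the dynamic-programming table.
sub levenshtein {
    my ($s, $t) = @_;
    my @prev = (0 .. length($t));
    for my $i (1 .. length($s)) {
        my @cur = ($i);
        for my $j (1 .. length($t)) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            my $best = $prev[$j - 1] + $cost;                        # substitute or match
            $best = $prev[$j] + 1    if $prev[$j] + 1    < $best;    # delete
            $best = $cur[$j - 1] + 1 if $cur[$j - 1] + 1 < $best;    # insert
            push @cur, $best;
        }
        @prev = @cur;
    }
    return $prev[-1];
}

my $first  = encode_words('Jack and Jill went up the hill');
my $second = encode_words('Jack and Jane went down the hill');
say "$first $second";               # 0123456 0173856
say levenshtein($first, $second);   # 2 (Jill -> Jane, up -> down)
```

Because both calls go through the same %replace, "Jack", "and", "went", "the" and "hill" get the same character in both strings, so the character distance equals the number of word edits. With a fresh hash per string the characters would not line up and the result would be meaningless.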
