PerlMonks  

Re^2: longest common substring (with needed tweaks)

by R56 (Sexton)
on Oct 28, 2013 at 16:37 UTC


in reply to Re: longest common substring (with needed tweaks)
in thread longest common substring (with needed tweaks)

Great piece of code Lennotoecom :)

Struggling a bit to understand it tho, as it's greatly simplified!

Can you explain this first line to me in detail? I've never seen $` before...

$_ = <DATA>; $_ = $` if /$/; @a = split //, $_;

Thank you!

Re^3: longest common substring (with needed tweaks)
by Lennotoecom (Pilgrim) on Oct 28, 2013 at 17:29 UTC
    For example:
    $a = 'aa ab c c'; $a =~ m/b/;
    Now $` contains 'aa a', $& contains 'b', and $' contains ' c c'.
    In other words: everything in the line before the match,
    the match itself,
    and everything in the line after the match.
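    To make the three variables concrete, here is a minimal sketch that reruns the example above and then the line from the question. The variable names ($pre, $hit, $post, $stripped) are only illustrative and do not come from the original script.

```perl
use strict;
use warnings;

# The example from the reply: match 'b' inside 'aa ab c c'.
my $s = 'aa ab c c';
my ($pre, $hit, $post);
if ($s =~ /b/) {
    # Copy the match variables right away; the next successful
    # match overwrites them.
    ($pre, $hit, $post) = ($`, $&, $');
}
print "prematch='$pre' match='$hit' postmatch='$post'\n";

# The line asked about: /$/ matches just before a trailing newline
# (or at the very end of the string), so $` is the line without it.
my $line     = "strrringggg\n";
my $stripped = $line;
$stripped = $` if $stripped =~ /$/;
print "stripped='$stripped'\n";
```

    So `$_ = $` if /$/;` has the same effect as chomp on a newline-terminated line. Since Perl 5.10 the /p modifier together with ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} gives the same three pieces under more readable names.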
      # take the first line from <DATA> and split it on whitespace
      # into $lines and $matches
      ($lines, $matches) = split /\s/, <DATA>;

      # take the next line from <DATA>, strip the trailing \n and split
      # the resulting string into single characters in @a
      $_ = <DATA>; $_ = $` if /$/;
      @a = split //, $_;

      # loop (1): build every possible substring of @a (that is, of the
      # first line) and set its hash value to 1
      for $i (0 .. $#a){
          $e = $a[$i];
          $hash{$e} = 1;
          for $y ($i+1 .. $#a){
              $e .= $a[$y];
              $hash{$e} = 1;
          }
      }

      # loop (2): read the rest of the file line by line and do the same
      # for every line, but into a temporary hash; then, in the foreach
      # loop (3), increment the keys of the first hash that also occur
      # in the current line
      while(<DATA>){
          $_ = $` if /$/;
          @a = split //, $_;
          %thash = ();
          for $i (0 .. $#a){
              $e = $a[$i];
              $thash{$e} = 1 if defined $hash{$e};
              for $y ($i+1 .. $#a){
                  $e .= $a[$y];
                  $thash{$e} = 1 if defined $hash{$e};
              }
          }
          foreach $key (keys %hash){
              $hash{$key}++ if defined $thash{$key};
          }
      }

      # finally, go through the hash and print only those keys
      # whose value == $matches
      $max = '';
      foreach $key (keys %hash){
          if($hash{$key} == $matches){
              print "$key\n";
              # $max = $key if length($max) < length($key);
          }
      }
      print "$max\n";
      __DATA__
      3 2
      strrringggg
      ssttrrringggg
      stttrrringgg
      This whole script has a flaw:
      the resulting hash is built from the first text line only.
      To fix it, in loop (3), when the hash value is undefined you
      should create it instead of skipping it as this example does.
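      The flaw can be shown on a tiny input. The following is only a sketch under assumptions: the helper `substrings` and the variable names are hypothetical, and the three lines 'ac', 'bc', 'b' with a required count of 2 are chosen for illustration. Seeding the hash from the first line misses 'b', which occurs in two of the later lines.

```perl
use strict;
use warnings;

# Hypothetical helper: every distinct substring of $s as hash keys.
sub substrings {
    my ($s) = @_;
    my %seen;
    for my $i (0 .. length($s) - 1) {
        for my $len (1 .. length($s) - $i) {
            $seen{ substr($s, $i, $len) } = 1;
        }
    }
    return \%seen;
}

my @lines = ('ac', 'bc', 'b');

# Seeded from the first line only: later lines can only bump
# keys that already exist.
my %seeded = %{ substrings($lines[0]) };
for my $line (@lines[1 .. $#lines]) {
    my $subs = substrings($line);
    $seeded{$_}++ for grep { exists $seeded{$_} } keys %$subs;
}

# Counting every line: missing keys are created on the fly.
my %all;
for my $line (@lines) {
    $all{$_}++ for keys %{ substrings($line) };
}

my @seeded_hits = sort grep { $seeded{$_} == 2 } keys %seeded;
my @all_hits    = sort grep { $all{$_} == 2 } keys %all;
print "seeded from first line: @seeded_hits\n";
print "counting every line:    @all_hits\n";
```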
        sub f {
            @a = split //, shift;
            $ih = shift;
            for $i (0 .. $#a){
                $e = $a[$i];
                ${$ih}{$e} = 1;
                for $y ($i+1 .. $#a){
                    $e .= $a[$y];
                    ${$ih}{$e} = 1;
                }
            }
        }
        ($l, $m) = split /\s/, <DATA>;
        $_ = <DATA>; chomp;
        %h = ();
        f($_, \%h);
        while(<DATA>){
            chomp;
            %th = ();
            f($_, \%th);
            $h{$_}++ foreach (keys %th);
        }
        foreach $key (keys %h){
            if($h{$key} == $m){
                $r[length($key)] = [] if ! exists $r[length($key)];
                # dereference explicitly; push on a bare array reference
                # is not accepted by Perl 5.24 and later
                push @{ $r[length($key)] }, $key;
            }
        }
        print "@{$r[$#r]}\n";
        __DATA__
        3 2
        ac
        bc
        b
        1: creates a sub named f, which takes two parameters: a string and a reference to a hash;
        the sub puts every substring of the given string into that hash
        2: splits the first line of the file into two variables, $l and $m
        3: takes the next line from the file and passes it to sub f with a reference to the empty hash %h
        4: at this point the first text line has been broken into all its substrings, which are stored in
        hash %h with a value of 1
        5: then we read the rest of the file line by line and pass each line to sub f along with a reference
        to an empty hash %th; right after that, every key of %th is incremented in %h (and created there if it is missing)
        6: runs through the %h hash, and if a key's value equals the number of overlaps we need ($m), puts the key into @r, an array of arrays indexed by key length
        7: the last line prints all the longest overlaps, which share the same length
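        The seven steps above can also be sketched as one function that runs under strict and warnings. This is an assumption-laden rewrite, not the code from the reply: `longest_shared` and its argument order are hypothetical, the input comes from an argument list instead of <DATA>, and the result is sorted so the output order is stable. Like the original, it keeps substrings that occur in exactly the requested number of lines (==, not >=).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the longest substrings that occur in
# exactly $want of the given lines.
sub longest_shared {
    my ($want, @lines) = @_;
    my %count;
    for my $line (@lines) {
        my %in_line;    # each substring counts once per line
        for my $i (0 .. length($line) - 1) {
            for my $len (1 .. length($line) - $i) {
                $in_line{ substr($line, $i, $len) } = 1;
            }
        }
        $count{$_}++ for keys %in_line;
    }

    my @by_len;         # index = substring length
    for my $key (keys %count) {
        push @{ $by_len[length $key] }, $key if $count{$key} == $want;
    }
    return () unless @by_len;

    # The last slot holds the longest qualifying substrings.
    my @longest = sort @{ $by_len[-1] };
    return @longest;
}

my @small = longest_shared(2, 'ac', 'bc', 'b');
my @found = longest_shared(2, 'strrringggg', 'ssttrrringggg', 'stttrrringgg');
print "@small\n";
print "@found\n";
```

        Building every substring of every line costs memory roughly quadratic in the line length, so this approach suits short lines such as the sample data.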
