Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Re: longest common substring... almost?

by haoess (Curate)
on Apr 18, 2004 at 19:26 UTC ( #346145=note: print w/ replies, xml ) Need Help??


in reply to longest common substring... almost?

There's a difference between

I'm trying to write a script which generates the longest common substring + 1 character for each line in a file.

and

I'm trying to get the script to return the smallest number of characters to uniquely identify each line.

Assume the following lines:

AA AB ABC ABCDE

In the first case, the output should be

AA AB AB AB

In the second case, output should be

AA AB ABC ABCD

For the first case, this should work:

#!/usr/bin/perl -lw use strict; chomp( my @lines = <DATA> ); # find shortest line my @foo = sort { length $a <=> length $b } @lines; my $diff = $foo[0]; LINES: for (@lines) { # the longest common substring can only be as long # as the shortest line my $line = substr $_, 0, length $diff; while( $line ) { if ( $line eq $diff ) { $diff = $line; next LINES; } else { # reduce current line and recent common string by one $line = substr $line, 0, -1; $diff = substr $diff, 0, -1; } } } for( @lines ) { print substr $_, 0, length( $diff ) + 1; } __DATA__ AA AB ABC ABCDE

-- Frank ('s first post at perlmonks)


Comment on Re: longest common substring... almost?
Select or Download Code
Re: Re: longest common substring... almost?
by ambrus (Abbot) on Apr 19, 2004 at 09:03 UTC
    Later in the original question:

    I'm trying to get the script to return the smallest number of characters to uniquely identify each line.

    This makes clear that the poster wants the second case.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://346145]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others studying the Monastery: (9)
As of 2014-12-20 17:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (97 votes), past polls