PerlMonks  


Hello good Monks. First, let me start with the problem: given an arbitrary list of strings, find the longest substring common to all of them. My approach is to take one of the strings from the list, scan through it with successively decreasing substring lengths, and check every string in the list for a match. This is simple and seems quite effective, but I'm wondering whether there are any problems with this approach. Could it be done more simply and more efficiently? Is there a well-known algorithm for this that I am too daft to find?

Here is the working code I came up with. I do not have any particular problem with this code; I just wanted to run it by the Monastery to see if anyone could offer suggestions or spot any lurking problems:

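#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

for ([ qw(fooabc123 fooabc321 foobca232) ],
     [ qw(abcfoo123 bcafoo321 foo123abc) ],
     [ qw(foo bor boz bzo) ]) {
    print Dumper($_);
    print findlcs(@{ $_ }), "\n";
    print "---\n";
}

sub findlcs {
    # Candidates are substrings of the first string, longest first.
    my $substr = $_[0];
    my $len    = length $_[0];
    my $off    = 0;

    # Test length rather than truth, so a candidate of "0" is still checked.
    while (length $substr) {
        my @matches = grep /\Q$substr/, @_;

        #printf "%s%-".(length($_[0])-$off)."s matches %d\n",
        #    " " x $off, $substr, scalar @matches;

        # Done as soon as every string in the list contains the candidate.
        last if @matches == @_;

        # Slide the window right; once it runs off the end, shrink it
        # by one character and start again from the left.
        $off++;
        $len-- and $off=0 if $off+$len > length $_[0];
        $substr = substr $_[0], $off, $len;
    }

    return $substr;
}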
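For comparison, here is a minimal sketch of the same brute-force idea with two small changes. It is only one possible variation, not a well-known algorithm. It assumes that taking candidates from the shortest string (the common substring can never be longer than that) and testing with index() instead of a regex is acceptable. The name findlcs_shortest is hypothetical and does not appear in the code above.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical variant of findlcs(): same brute-force search, but the
# candidates come from the shortest string and are tested with index().
sub findlcs_shortest {
    my @strings = @_;
    return '' unless @strings;

    # The common substring can never be longer than the shortest input.
    my ($ref) = sort { length($a) <=> length($b) } @strings;

    # Try the longest candidates first, so the first hit is an answer.
    for (my $len = length $ref; $len > 0; $len--) {
        for my $off (0 .. length($ref) - $len) {
            my $cand = substr $ref, $off, $len;

            # index() returns -1 when $cand does not occur in the string.
            my $hits = grep { index($_, $cand) >= 0 } @strings;
            return $cand if $hits == @strings;
        }
    }

    return '';    # no common substring at all
}

print findlcs_shortest(qw(fooabc123 fooabc321 foobca232)), "\n";  # foo
print findlcs_shortest(qw(abcfoo123 bcafoo321 foo123abc)), "\n";  # foo
print findlcs_shortest(qw(foo bor boz bzo)), "\n";                # o
```

When several common substrings share the maximum length, this sketch returns whichever one it meets first in the reference string, so the result can depend on which of the shortest strings is picked.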

In reply to finding longest common substring by revdiablo
