http://www.perlmonks.org?node_id=305753


in reply to Re: Find common prefix from a list of strings
in thread Find common prefix from a list of strings

I liked this one, but why would you use a grep there..? Also, metacharacters in $p will break it.
sub maxprefix { my $p = shift; for (@_) { chop $p until /^\Q$p/ } return $p; }
Depending on the nature of your data this might be a win:
sub maxprefix { my $s = reverse shift; my $p = ''; for (@_) { $p .= quotemeta chop $s while /^$p/ } chop $p; return $p; }

Makeshifts last the longest.

Replies are listed 'Best First'.
Re^3: Find common prefix from a list of strings
by ysth (Canon) on Nov 10, 2003 at 03:00 UTC
    That approach is quite nice. Here's a unsightly twist on it:
    #!/usr/bin/perl -wl use strict; use warnings; sub Prefix { my $p = shift; ($p)=$_=~join'','\A',map("(\Q$_",split//,$p),join'?',(')')x length( +$p)or return'' for @_; $p } while (<DATA>) { print "$_=> ", Prefix(split ' ',$_); } __DATA__ model4run1 model2run1 model4run2 model1run1 model4run1 model2run1 model4run2 abbot model4run1 model2run1 abbot model1run1 model4run1 model2run1 model4run2 monk model4run1 model2run1 monk model1run1 testing terse testing time
    but I prefer the chop.

      I really liked the one-liner, just that it doesn't work - it fails for 000 720

      Here's what I came up with. Is doesn't look that efficient, but it has the advantage of using no regexen for the actual comparison (only for split).
      #!/usr/bin/perl -wl use strict; use warnings; sub Prefix { { my $e = scalar(@_); return '' if ($e == 0); return $_[0] if ($e == 1); } my $minlen; my @d = map { my @split = split(//, $_); my $l = scalar(@split); if (defined($minlen)) { $minlen = $l if ($l < $minlen); } else { $minlen = $l; } \@split; } @_; my $p = ''; my $i = 0; while($i < $minlen) { my $c = $d[0]->[$i]; return $p unless ($c); for (my $j = 1; $j <= $#d; $j++) { return $p unless ($c eq $d[$j]->[$i]); } $p .= $c; $i++; } $p } while (<DATA>) { print "$_=> ", Prefix(split ' ',$_); } __DATA__ model4run1 model2run1 model4run2 model1run1 model4run1 model2run1 model4run2 abbot model4run1 model2run1 abbot model1run1 model4run1 model2run1 model4run2 monk model4run1 model2run1 monk model1run1 testing terse testing time monk model4run1 model2run1 model1run1 000 720 a aa abc a aa a aa aaa ab ba bb
        You can also just add '^' as the second parameter to join in the previous answer.
        لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
        Thanks, fixed it (adding \A)
        --
        A math joke: r = | |csc(θ)|+|sec(θ)| |-| |csc(θ)|-|sec(θ)| |
Re^3: Find common prefix from a list of strings (trailing \)
by tye (Sage) on Nov 10, 2003 at 00:09 UTC
    sub maxprefix { my $p = quotemeta shift; for (@_) { chop $p until /^$p/ } return $p; }

    This will still often fail on metacharacters because you'll get a "trailing \ in regexp" error.

    sub maxprefix { my $p = shift(@_); for(@_) { chop $p until /^\Q$p/ } return $p; }

    Update: The one you added has problems as well:

    sub maxprefix { my $s = reverse shift; my $p = ''; for (@_) { $p .= quotemeta chop $s while /^$p/ } chop $p; return $p; }

    It will find the maximum prefix between the first string and the string of the others that has the longest common prefix with the first string:

    print maxprefix("testing","terse","tester","time"),$/;
    prints test not t.

                    - tye
      Fixed before I saw your note. :) quotemeta shift was a duh moment of course. Update: I must have left my brain in the shop today.

      Makeshifts last the longest.

Re^3: Find common prefix from a list of strings
by simul (Novice) on Feb 28, 2012 at 21:14 UTC
    sub common_suffix { my $comm = shift @_; while ($_ = shift @_) { $_ = substr($_,-length($comm)) if (length($_) > length($comm)); $comm = substr($comm,-length($_)) if (length($_) < length($comm)); if (( $_ ^ $comm ) =~ /(\0*)$/) { $comm = substr($comm, -length($1)); } else { return undef; } } return $comm; } sub common_prefix { my $comm = shift @_; while ($_ = shift @_) { $_ = substr($_,0,length($comm)) if (length($_) > length($comm)); $comm = substr($comm,0,length($_)) if (length($_) < length($comm)); if (( $_ ^ $comm ) =~ /^(\0*)/) { $comm = substr($comm,0,length($1)); } else { return undef; } } return $comm; }