http://www.perlmonks.org?node_id=515688


in reply to Re^4: regex at word boundary
in thread regex at word boundary

Does this find all non-trivial palindrome phrases on each input line?

Does it fail on some interesting input?

#!/your/perl/here

# find all palindrome phrases on each line
# only phrases of two or more alpha
# a phrase starts and ends on a word boundary
#
# nested palindrome phrases will also be found
# (including single words)

use strict;
use warnings;

our $N;

my $re = qr/(                 # start $1
    \b                        # left word edge
    ([a-z].*[a-z])            # at least 2 alpha
    \b                        # right word edge
    (??{                      # start code
        local $N = lc $^N;    # save capture
        $N =~ tr!a-zA-Z!!dc;  # remove non-alpha
        # fail if not pal
        '(?!)' if (lc($N) ne reverse lc($N));
    })                        # end code
)                             # end $1
/ix;

while (<DATA>)
{
    my $found;
    while ( /$re/g )
    {
        print "line $.:\n" unless $found;
        pos = pos() - length($1);   # find nested pals
        print "(",pos,") \"$1\"\n";
        $found = 1;                 # \n between groups
        pos = pos() + 1;            # advance one
    }
    print "\n" if $found;
}

exit;

__DATA__
god dog
alpha beta gamma stop pots
wonka wonka wonka bookkeeper raisinhead
what is a bookkeeper pop repeekkoob tomorrow?
boob tube A man, a plan, a canal, Panama! kook peep aha
aba abba abbba aabbbaa abbba abba aba
produces
__OUTPUT__
line 1:
(0) "god dog"

line 2:
(17) "stop pots"

line 4:
(10) "bookkeeper pop repeekkoob"
(21) "pop"

line 5:
(0) "boob"
(10) "A man, a plan, a canal, Panama"
(42) "kook"
(47) "peep"
(52) "aha"

line 6:
(0) "aba abba abbba aabbbaa abbba abba aba"
(4) "abba abbba aabbbaa abbba abba"
(10) "abbba aabbbaa abbba"
(17) "aabbbaa"
(25) "abbba"
(31) "abba"
(36) "aba"
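For comparison, here is a rough sketch of the same search done without the experimental (??{ }) construct. It is not the code from this post: is_pal and pal_phrases are hypothetical helper names, and it assumes a "word" is simply a run of ASCII letters, which is a little narrower than what \b recognizes. Like the regex version, it reports only the longest palindrome phrase that begins at each word start.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the letters of $s (ignoring case, spaces
# and punctuation) read the same in both directions, with at least 2 letters.
sub is_pal {
    my ($s) = @_;
    (my $letters = lc $s) =~ tr/a-z//cd;    # keep letters only
    return 0 if length($letters) < 2;
    return $letters eq scalar reverse($letters) ? 1 : 0;
}

# Hypothetical helper: for each word start, find the longest span ending
# at a word end that is a palindrome. Returns a list of [offset, phrase].
sub pal_phrases {
    my ($line) = @_;
    my (@starts, @ends);
    while ($line =~ /([a-z]+)/gi) {
        push @starts, $-[1];    # offset of the word's first letter
        push @ends,   $+[1];    # offset just past the word's last letter
    }
    my @found;
    for my $s (@starts) {
        for my $e (reverse @ends) {         # try the longest span first
            last if $e <= $s;
            my $phrase = substr $line, $s, $e - $s;
            if (is_pal($phrase)) {
                push @found, [$s, $phrase];
                last;
            }
        }
    }
    return @found;
}

printf qq{(%d) "%s"\n}, @$_
    for pal_phrases('what is a bookkeeper pop repeekkoob tomorrow?');
# prints:
# (10) "bookkeeper pop repeekkoob"
# (21) "pop"
```

The nested loops make this quadratic in the number of words per line, which should be acceptable for short input lines like the ones in DATA.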
BTW, if '(?!)' is changed to (?!) (no single quotes), I get a "panic: top_env" from the compiler. Is this a known bug?
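For anyone wondering why the quotes matter: the value returned by a (??{ }) block is itself used as a pattern at that point in the match, so the quoted string '(?!)' hands back an empty negative lookahead, which can never match. The following is a minimal sketch of that mechanism only. The names try_match and $want are made up for illustration, and it does not try to reproduce the panic.

```perl
use strict;
use warnings;

our $want;    # hypothetical flag; a package variable, as with $N above

# Returns 1 if the match succeeds, 0 otherwise.
sub try_match {
    my ($str, $flag) = @_;
    local $want = $flag;
    # The block returns a string, which is then matched as a sub-pattern:
    # 'b' when $want is true, the always-failing '(?!)' otherwise.
    return $str =~ /a(??{ $want ? 'b' : '(?!)' })c/ ? 1 : 0;
}

print try_match('abc', 1), "\n";    # prints 1: behaves like /abc/
print try_match('abc', 0), "\n";    # prints 0: (?!) forces a failure
```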

-QM
--
Quantum Mechanics: The dreams stuff is made of

Re^6: regex at word boundary
by davido (Cardinal) on Dec 10, 2005 at 05:56 UTC

    From perldiag:
    panic: top_env
    The compiler attempted to do a goto, or something weird like that.

    "Panic" messages fall into that category of errors that "you should never see."

    I've seen that one before, and it seems to me I was mucking around with (??{code}) constructs at the time. They don't call them "highly experimental" for nothing. ;)


    Dave

Re^6: regex at word boundary
by mikeraz (Friar) on Dec 12, 2005 at 16:08 UTC

    This breaks on overlapping palindromes:

    Adding these two lines to DATA:
    nested testest detsen nested
    i prefer pi ip referp
    
    produces this output:
    line 7:
    (0) "nested testest detsen nested"
    (7) "testest detsen nested"
    (15) "detsen nested"
    (22) "nested"
    
    line 8:
    (0) "i prefer pi ip referp"
    (2) "prefer pi ip referp"
    (9) "pi ip referp"
    (12) "ip referp"
    (15) "referp"
    

    Be Appropriate && Follow Your Curiosity
      Hrmm....this is what I get:
      line 7:
      (15) "detsen nested"

      line 8:
      (0) "i prefer pi"
      (2) "prefer pi ip referp"
      (9) "pi ip"

      That seems to be correct. Nothing else shows up in line 7, because "testest" isn't a palindrome (but "testset" is).

      I guess it's helpful if you indicate what you think the output should be, so I can short circuit this back and forth :)

      Changing line 7 to this:

      nested testset detsen nested
      I get this:
      line 7:
      (0) "nested testset detsen"
      (7) "testset"
      (15) "detsen nested"

      What did you expect on the 2 lines you added?

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

        I'll get what you get, I strongly suspect, when I'm not on a v5.6.1 Perl implementation. So your stuff does work with sufficiently current Perl.

        I'll rerun the speed comparison when I'm home and have access to v5.8.

        Be Appropriate && Follow Your Curiosity