Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

Re: Split a sentence into words

by ikegami (Patriarch)
on May 30, 2009 at 07:05 UTC ( [id://767014]=note: print w/replies, xml ) Need Help??


in reply to Split a sentence into words

Don't use my variables declared outside the regex pattern from within (?{}).

The problem you are having is that one of the patterns matches, then gets added to @list1, then gets unmatched by backtracking. But you never remove it from @list1 on backtracking. A simple example of this:

>perl -le"'abc1def2' =~ /(?:([a-z])(?{ print $^N }))+2/" a b c b c c d e f

The solution is to use $^R.

use strict; use warnings; my @vocabulary = qw( a abc abcd abd bc ); my $sentence = 'abdaabc'; my ($pattern) = map qr/$_/, join '|', map quotemeta, sort { length($b) <=> length($a) } # optional @vocabulary; use re 'eval'; local our @list; $sentence =~ / (?{ [] }) ^ (?: ($pattern) (?{ [ @{$^R}, $^N ] }) )+ $ (?{ @list = @{$^R} }) /x or die("No solution\n"); print( join('-', @list), "\n" ); # abd-a-abc

Without the sort, you'd get abd-a-a-bc. If you want all possible solutions:

... use re 'eval'; local our @list; $sentence =~ / (?{ [] }) ^ (?: ($pattern) (?{ [ @{$^R}, $^N ] }) )+ $ (?{ push @list, join('-', @{$^R}) }) (?!) /x; die("No solution\n") if !@list; print("$_\n") for @list;
abd-a-a-bc abd-a-abc

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://767014]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others perusing the Monastery: (3)
As of 2025-06-18 02:43 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?
    erzuuliAnonymous Monks are no longer allowed to use Super Search, due to an excessive use of this resource by robots.