PerlMonks  

Re^3: match sequences of words based on number of characters

by AnomalousMonk (Abbot)
on Feb 18, 2013 at 03:25 UTC (#1019267)


in reply to Re^2: match sequences of words based on number of characters
in thread match sequences of words based on number of characters

Based on the examples, I don't believe that nicemank is requiring captured words to be adjacent.

Hmmm... After taking another look at the OP, I think you may be right. In which case:

    >perl -wMstrict -le
    "my $s = 'xxxx yy zzzzz xxxx qqq xxxx yy zzzzz xxxx qqq';
     ;;
     for my $ar ([2, 4, 3], [5, 3]) {
       my $rx = rxg(@$ar);
       print $rx;
       my @groups = $s =~ m{ $rx }xmsg;
       print qq{'$_'} for @groups;
       }
     ;;
     sub rxg {
       my ($rx) =
         map qr{ \b $_ \b }xms,
         join ' \b .+? \b ',
         map qq{\\w{$_}},
         @_
         ;
     ;;
       return $rx;
       }
    "
    (?^msx: \b \w{2} \b .+? \b \w{4} \b .+? \b \w{3} \b )
    'yy zzzzz xxxx qqq'
    'yy zzzzz xxxx qqq'
    (?^msx: \b \w{5} \b .+? \b \w{3} \b )
    'zzzzz xxxx qqq'
    'zzzzz xxxx qqq'

Update: No, darn it, that's still not right! nicemank seems to want 'yy xxxx qqq' from 'yy zzzzz xxxx qqq'. Oh, well...
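
For what it's worth, here is a minimal sketch of one way to get the 'yy xxxx qqq' form, assuming that reading of the OP is the right one. Each fixed-length word goes in its own capture group, and only the captures are joined, so whatever lies between the words is skipped. The names rx_captures and matched_words are made up for this illustration; they are not from the thread, and this is not frozenwithjoy's sub.

```perl
use strict;
use warnings;

# Hypothetical helper: like rxg(), but each \w{N} gets its own capture group.
sub rx_captures {
    my @lengths = @_;
    my $pattern = join ' \b .+? \b ', map { "( \\w{$_} )" } @lengths;
    return qr{ \b $pattern \b }xms;
}

# Hypothetical helper: return one string per match, built from the
# captured words only (the text between them is dropped).
sub matched_words {
    my ($s, @lengths) = @_;
    my $rx = rx_captures(@lengths);
    my @found;
    while ($s =~ /$rx/g) {
        # @- and @+ hold the start/end offsets of each capture group,
        # so this works for any number of requested lengths.
        push @found, join ' ',
            map { substr($s, $-[$_], $+[$_] - $-[$_]) } 1 .. $#+;
    }
    return @found;
}

my $s = 'xxxx yy zzzzz xxxx qqq xxxx yy zzzzz xxxx qqq';
for my $ar ([2, 4, 3], [5, 3]) {
    print "'$_'\n" for matched_words($s, @$ar);
}
```

With the test string above, this should print 'yy xxxx qqq' twice and then 'zzzzz qqq' twice. Like the original, it relies on \b and \w, so it has only been thought through for simple space-separated words.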


Re^4: match sequences of words based on number of characters
by frozenwithjoy (Curate) on Feb 18, 2013 at 04:00 UTC
    I think that the sub I wrote below does the trick, but I only did limited testing on it.
