Here's a solution that exactly matches the phrases specified in AnonyMonk's Re: Using Look-ahead and Look-behind post (which the code of Re^2: Using Look-ahead and Look-behind does not quite do). It also shows how to use the newfangled backtracking control verbs of 5.10, here (*SKIP)(*FAIL), to emulate variable-width negative look-behind. Variable-width positive look-behind can be emulated with 5.10's \K assertion, which this solution does not need.
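As an aside on the \K remark: a minimal sketch of \K standing in for a variable-width positive look-behind. The input string and the upper-cased replacement are made up for illustration and are not part of the original problem.

```perl
use strict;
use warnings;
use 5.010;    # \K was introduced in Perl 5.10

# Hypothetical input, chosen only to illustrate \K.
my $text = 'private   equity, equity';

# Everything matched before \K is required but excluded from the match,
# so only the 'equity' after 'private' + whitespace is replaced.
(my $changed = $text) =~ s{ private \s+ \K equity }{EQUITY}xms;

print "$changed\n";
```

A true look-behind such as (?<= private \s+ ) is rejected as variable-length by 5.10, which is why \K is the usual workaround.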
Explanation of the regex in the code below:
- Any 'equity' that is preceded by
  - either a character that is not a comma or whitespace, or
  - the 'private' phrase
  FAILS and is skipped over (this test has first precedence);
- Otherwise, any 'equity' that is not followed by a comma that is then followed by any non-whitespace SUCCEEDS.
>perl -wMstrict -le
"use Test::More 'no_plan';
 ;;
 for my $ar_vector (
     [ YES => 'equity, private equity', ],
     [ YES => 'equity',                 ],
     [ no  => 'private equity',         ],
     [ YES => 'private equity,equity',  ],
     [ YES => 'private equity, equity', ],
     [ no  => 'equity,private equity',  ],
     [ no  => 'private equity',         ],
     [ no  => 'mutual funds',           ],
     [ no  => 'cds'                     ],
     ) {
     my ($expected, $string) = @$ar_vector;
     is match($string), $expected, qq{'$string'};
     }
 ;;
 sub match {
     my ($string) = @_;
 ;;
     my $char_not_comma_or_space = qr{ [^,\s] }xms;
     my $private                 = qr{ private \s+ }xms;
     return 'YES' if $string =~
         m{ # 'equity' with a forbidden prefix: reject it and skip past it
            (?: $char_not_comma_or_space | $private) equity (*SKIP)(*FAIL)
            |
            # any other 'equity' not followed by a comma plus non-whitespace
            equity (?! , \S)
          }xms;
     return 'no';
     }
"
ok 1 - 'equity, private equity'
ok 2 - 'equity'
ok 3 - 'private equity'
ok 4 - 'private equity,equity'
ok 5 - 'private equity, equity'
ok 6 - 'equity,private equity'
ok 7 - 'private equity'
ok 8 - 'mutual funds'
ok 9 - 'cds'
1..9
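To see why (*SKIP) is needed and (*FAIL) alone is not enough, here is a small sketch that reports the offset at which the pattern matches, with and without (*SKIP). The helper equity_offset and the two pattern variables are hypothetical names introduced for this illustration; the patterns inline the same sub-expressions used in match() above.

```perl
use strict;
use warnings;
use 5.010;    # backtracking control verbs need Perl 5.10 or later

# Hypothetical helper: offset of the first qualifying 'equity', or -1 if none.
sub equity_offset {
    my ($string, $regex) = @_;
    return $string =~ $regex ? $-[0] : -1;
}

# The pattern from the post: a rejected 'equity' is skipped over entirely.
my $with_skip = qr{ (?: [^,\s] | private \s+ ) equity (*SKIP)(*FAIL)
                    |
                    equity (?! , \S)
                  }xms;

# Same pattern minus (*SKIP): after (*FAIL) the engine simply retries at the
# next start position, so the rejected 'equity' is found by the second branch.
my $without_skip = qr{ (?: [^,\s] | private \s+ ) equity (*FAIL)
                       |
                       equity (?! , \S)
                     }xms;

for my $string ('private equity', 'private equity, equity') {
    printf "%-26s with (*SKIP): %2d   without: %2d\n",
        qq{'$string'},
        equity_offset($string, $with_skip),
        equity_offset($string, $without_skip);
}
```

For 'private equity' the version with (*SKIP) finds nothing (-1), while the version without it wrongly matches the 'equity' at offset 8. For 'private equity, equity' the (*SKIP) version matches the standalone 'equity' at offset 16 rather than the one inside the 'private' phrase.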