Re^3: POD style regex for inline HTML elements

Hi Aleena,

The extract_* functions are meant to operate on the start of a string, not from an arbitrary point. As mentioned in the Text::Balanced description, you may skip a prefix before the start of the balanced text, but by default this will only skip whitespace.

So if you were to change $text to:

my $text = '  <bold>, I<italic>, and B<I<bold and italic>> text.';

Your output would be:

$VAR1 = [
          '<bold>',
          ', I<italic>, and B<I<bold and italic>> text.',
          '  '
        ];

The return value is a triple: the bracketed text, the remaining string, and the prefix that was skipped before the bracketed text was found.
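To make the difference concrete, here is a minimal sketch that runs both inputs through the default prefix handling. The variable names ($ok, $bad and so on) are mine, and I'm assuming your original call passed only the text and the '<>' delimiters:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Only whitespace precedes the first <, so the default prefix (\s*) can skip it.
my $ok = '  <bold>, I<italic>, and B<I<bold and italic>> text.';
my ($extracted, $remainder, $prefix) = extract_bracketed($ok, '<>');
print "extracted=[$extracted] prefix=[$prefix]\n";

# Ordinary text precedes the first <, so the default prefix cannot reach it
# and the extraction fails: the first element of the returned list is undef.
my $bad = 'A line with B<bold>, I<italic>, and B<I<bold and italic>> text.';
my ($none, $untouched) = extract_bracketed($bad, '<>');
print defined $none ? "matched\n" : "no match\n";
```

In list context the input string is left alone; the scalar-context form is the one that removes the extracted text from it.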

If you leave your $text input as it was in your example but change the function call to consider everything preceding a < as a prefix:

my @line = extract_bracketed($text, '<>', qr(.*?(?=<)));

You'll get:

$VAR1 = [
          '<bold>',
          ', I<italic>, and B<I<bold and italic>> text.',
          'A line with B'
        ];

Here the prefix is again everything before the <, but it now ends with the B (bold) code, which you'd have to deal with appropriately.
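One way to deal with it, as a rough sketch rather than a finished parser: call extract_bracketed in a loop, peel the trailing code letter off each prefix, and recurse into the bracketed text. parse_inline is a hypothetical helper of my own, not something Text::Balanced provides, and I'm assuming the codes are single capital letters as in POD. I've also used [^<]* as the prefix pattern in place of the lazy match with a lookahead; both stop at the first <.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: returns a list of plain strings and
# [ code, children... ] array refs, recursing into nested codes.
sub parse_inline {
    my ($rest) = @_;
    my @nodes;
    while (length $rest) {
        # List context: (extracted, remainder, prefix); $rest is not modified.
        my ($extracted, $remainder, $prefix) =
            extract_bracketed($rest, '<>', qr/[^<]*/);

        # No balanced <...> left: whatever remains is plain text.
        unless (defined $extracted && length $extracted) {
            push @nodes, $rest;
            last;
        }

        # Peel the formatting letter (B, I, ...) off the end of the prefix.
        my $code = $prefix =~ s/([A-Z])\z// ? $1 : undef;

        if (defined $code) {
            push @nodes, $prefix if length $prefix;
            my $inner = substr($extracted, 1, -1);    # drop the outer < >
            push @nodes, [ $code, parse_inline($inner) ];
        }
        else {
            # A bare <...> with no code letter in front: keep it as literal text.
            push @nodes, $prefix . $extracted;
        }
        $rest = $remainder;
    }
    return @nodes;
}

my @tree = parse_inline(
    'A line with B<bold>, I<italic>, and B<I<bold and italic>> text.');
```

Dumping @tree with Data::Dumper shows the nesting. The sketch does not handle unbalanced brackets gracefully: anything from the first unmatched < onward is returned as one plain-text chunk.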

HTH
