Hello ww,
The $errmsg regex is working fine, as the following demonstrates:
#! perl -w
use strict;
use 5.018;
my $start = qr{<table align="left" border="0" cellspacing="0" cellpadding="1"};
my $end = qr{</table>};
my $errmsg = qr{Result</td><td bgcolor=".{7}">Error:.*?(?=</td>)};
while (<DATA>)
{
/$errmsg/ && say if /$start/ .. /$end/;
}
__DATA__
<html><body><small>
<table align="left" border="0" cellspacing="0" cellpadding="1">
<tr><td bgcolor="#db4930">Result</td><td bgcolor="#db4930">Error: 404 Not Found</td></tr>
</table>
<table align="left" border="0" cellspacing="0" cellpadding="1">
<tr><td bgcolor="#db4930">Result</td><td bgcolor="#db4930">Error: SSLError: [Errno 1] _ssl.c:504: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed</td></tr>
</table>
<table align="left" border="0" cellspacing="0" cellpadding="1">
<tr><td foo bar baz> abcde </td></tr>
</table>
<tr><td bgcolor="INVALID">Result</td><td bgcolor="#db4930">Error: 404 Not Found</td></tr>
</small></body></html>
Output:
13:55 >perl 1458_SoPW.pl
<tr><td bgcolor="#db4930">Result</td><td bgcolor="#db4930">Error: 404 Not Found</td></tr>
<tr><td bgcolor="#db4930">Result</td><td bgcolor="#db4930">Error: SSLError: [Errno 1] _ssl.c:504: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed</td></tr>
13:55 >
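The main limitation of the above approach is that it fails to handle nested tables: the range operator turns off at the first </table> it sees, so any rows of an outer table that follow an inner table are skipped.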
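If nesting ever matters, one possible workaround is to count table depth instead of relying on the range operator. This is only a rough sketch, not what the script above does: errors_in_tables is a made-up name, it matches any <table tag rather than the exact $start pattern, and it assumes no tag is split across lines.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines that match $errmsg while we are
# inside at least one <table>...</table>, tracking nesting depth.
sub errors_in_tables {
    my ($errmsg, @lines) = @_;
    my $depth = 0;
    my @hits;
    for my $line (@lines) {
        $depth++ while $line =~ /<table\b/g;     # one step down per opening tag
        push @hits, $line if $depth > 0 && $line =~ $errmsg;
        $depth-- while $line =~ m{</table>}g;    # one step up per closing tag
        $depth = 0 if $depth < 0;                # tolerate stray closing tags
    }
    return @hits;
}

my $errmsg = qr{Result</td><td bgcolor=".{7}">Error:.*?(?=</td>)};

# Made-up sample data: an outer table containing an inner table.
my @lines = (
    '<table>',
    '<tr><td bgcolor="#db4930">Result</td><td bgcolor="#db4930">Error: outer first</td></tr>',
    '<tr><td><table>',
    '<tr><td bgcolor="#db4930">Result</td><td bgcolor="#db4930">Error: inner</td></tr>',
    '</table></td></tr>',
    '<tr><td bgcolor="#db4930">Result</td><td bgcolor="#db4930">Error: outer second</td></tr>',
    '</table>',
    '<tr><td bgcolor="#db4930">Result</td><td bgcolor="#db4930">Error: outside</td></tr>',
);

print "$_\n" for errors_in_tables($errmsg, @lines);
```

For anything beyond simple, line-oriented input like this, a real HTML parser is the safer choice.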
As AnomalousMonk and tye have indicated, the problem almost certainly lies in the logic used to split the input into “paragraphs.” If I were debugging this, I’d begin by printing out the value of $item immediately before the line if ( $item =~ /$errmsg/ ) {.
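Something along these lines would do for that debugging step. It is only a sketch: I don't know how your script builds $item, so @items here is made-up stand-in data and show_item is a hypothetical helper. The delimiters make stray newlines and whitespace visible, which matters because . does not match a newline unless the /s modifier is used.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap an item in visible delimiters for debug output.
sub show_item {
    my ($n, $item) = @_;
    return "item $n: [[$item]]\n";
}

my $errmsg = qr{Result</td><td bgcolor=".{7}">Error:.*?(?=</td>)};

# Stand-in data; in the real script these come from the paragraph-splitting code.
my @items = (
    '<tr><td bgcolor="#db4930">Result</td><td bgcolor="#db4930">Error: 404 Not Found</td></tr>',
    "<tr><td bgcolor=\"#db4930\">Result</td><td bgcolor=\"#db4930\">Error: 404\nNot Found</td></tr>",
);

my $n = 0;
for my $item (@items) {
    print STDERR show_item(++$n, $item);    # inspect exactly what is being tested
    if ( $item =~ /$errmsg/ ) {
        print "item $n matched\n";
    }
    else {
        print "item $n did NOT match\n";
    }
}
```

If the dump shows that an item contains more (or less) than you expected, the splitting logic is the place to look.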
Hope that helps,