http://www.perlmonks.org?node_id=986587



You make a good point about splitting on a record separator within possibly malformed records. Judging by the OP's regex, the format looks stable, with exactly one space after each semi-colon. Even so, we can have split tolerate variations in that spacing by allowing optional whitespace on either side of the separator, like this:

my $info = (split /\s*;\s*/, $dat)[1];

This will return the info the OP wants, whether there are spaces before or after the semi-colon, or not.
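To make that claim concrete, here is a minimal sketch that runs the same split against a few spacing variants. The extra records and the second_field helper are my own illustrations, not something from the OP's data:

```perl
use strict;
use warnings;

# Hypothetical sample records: the OP's line plus two spacing variants
# (the variants are assumed for illustration, not taken from the thread).
my @records = (
    'DR Pfam; PF00070; Pyr_redox; 2.',
    'DR Pfam;PF00070;Pyr_redox;2.',
    'DR Pfam ;  PF00070 ; Pyr_redox ; 2.',
);

# Hypothetical helper: return the second semicolon-delimited field,
# discarding any whitespace on either side of each semicolon.
sub second_field {
    my ($dat) = @_;
    return ( split /\s*;\s*/, $dat )[1];
}

# Each record should yield the same accession field.
print second_field($_), "\n" for @records;
```

Note that a record with fewer than two fields would give undef here, so real code would probably want to check the result before using it.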

And within a regex on the OP's data:

use Modern::Perl;

my $dat = 'DR Pfam; PF00070; Pyr_redox; 2.';
$dat =~ /;\s*(\w+)\s*;.+;/ and say $1;    # prints PF00070

It was a good call to address this issue...