You make a good point about splitting on a record separator within possibly malformed records. Based upon the OP's regex, it appears that the pattern is stable, with one space after the semicolon. However, we can make split tolerate variations in the format of the input, like this:
my $info = (split /\s*;\s*/, $dat)[1];
This returns the field the OP wants (the second one), whether or not there are spaces before or after the semicolon.
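To check that claim, here is a minimal sketch that runs the same split over a few spacing variants. The second_field helper and the variant records are made up for illustration; only the first record is the OP's sample.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the second
# semicolon-delimited field, or undef if the record has fewer than two fields.
sub second_field {
    my ($record) = @_;
    my @fields = split /\s*;\s*/, $record;
    return $fields[1];
}

# The OP's sample record, plus two made-up spacing variants of it.
my @samples = (
    'DR Pfam; PF00070; Pyr_redox; 2.',
    'DR Pfam;PF00070;Pyr_redox; 2.',
    'DR Pfam ;  PF00070 ;Pyr_redox; 2.',
);

# Prints PF00070 once for each sample.
print second_field($_), "\n" for @samples;
```

Note that a record with no semicolon at all yields undef here, so a caller working with possibly malformed records would want to test the result with defined before using it.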
And within a regex on the OP's data:
use feature 'say';    # say needs Perl 5.10+ and this feature enabled
my $dat = 'DR Pfam; PF00070; Pyr_redox; 2.';
$dat =~ /;\s*(\w+)\s*;.+;/ and say $1;    # prints PF00070
It was a good call to address this issue...