Re: text extraction question

by ikegami (Pope)
on Dec 05, 2006 at 19:06 UTC

in reply to text extraction question

use strict;
use warnings;

my $templateformat = 'w<NM>b<NM>cm<CH>sw<SW>';
my $inputexample   = 'w8b8cm512swno';

my %type_matchers = (
   NM => qr/\d+/,
   CH => qr/\d+/,
   SW => qr/yes|no/,
);

my @field_names;
my @field_types;
while ($templateformat =~ /([^<]+)<([^>]+)>/g) {
   push(@field_names, $1);
   push(@field_types, $2);
}

# my $re = '^';
# $re .= "\Q$field_names[$_]\E((?:(?!\Q$field_names[$_+1]\E).)*)"
#    for 0..$#field_names-1;
# $re .= "\Q$field_names[-1]\E(.*)\\z";
# $re = qr/$re/s;

my $re = '^';
$re .= "\Q$field_names[$_]\E($type_matchers{$field_types[$_]})"
   for 0..$#field_names;
$re = qr/$re/s;

my @field_values = $inputexample =~ $re
   or die("Input \"$inputexample\" doesn't match the format defined by template \"$templateformat\"\n");

local $, = "\t";
local $\ = "\n";
print(@field_names);
print(@field_values);


w	b	cm	sw
8	8	512	no

Updated: Replaced the commented-out paragraph with the one that follows it, after getting a better understanding of the question. Both give the same answer for this input.
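
If the same template has to be applied to many inputs, the code above could be wrapped in a sub along these lines. This is only a sketch: compile_template is a hypothetical name, not something from the original post, and the trailing \z anchor and the check for unknown field types are my own additions (the posted regex would also accept trailing text after the last field).

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): turn a template such as
# 'w<NM>b<NM>cm<CH>sw<SW>' into a list of field names plus an anchored regex.
sub compile_template {
    my ($template, $matchers) = @_;

    # Split the template into literal field names and their type tags.
    my (@names, @types);
    while ($template =~ /([^<]+)<([^>]+)>/g) {
        push @names, $1;
        push @types, $2;
    }

    # One capture group per field, using the matcher registered for its type.
    my $re = '^';
    for my $i (0 .. $#names) {
        my $matcher = $matchers->{ $types[$i] }
            or die "Unknown field type \"$types[$i]\" in template \"$template\"\n";
        $re .= "\Q$names[$i]\E($matcher)";
    }
    $re .= '\z';    # assumption: reject trailing text after the last field

    return (\@names, qr/$re/s);
}

my %type_matchers = (
    NM => qr/\d+/,
    CH => qr/\d+/,
    SW => qr/yes|no/,
);

my ($names, $re) = compile_template('w<NM>b<NM>cm<CH>sw<SW>', \%type_matchers);

for my $input ('w8b8cm512swno', 'w8b8cm512swnope') {
    if (my @values = $input =~ $re) {
        # Pair each field name with its captured value.
        my %record;
        @record{@$names} = @values;
        print join(', ', map { "$_=$record{$_}" } @$names), "\n";
    }
    else {
        print "no match: $input\n";
    }
}
```

With the \z in place, the first input should print w=8, b=8, cm=512, sw=no and the second should be reported as no match. Drop the \z line if trailing text is meant to be allowed.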

