Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Re: Parsing a Variable Format String

by pc88mxer (Vicar)
on Jul 10, 2008 at 03:46 UTC ( [id://696603]=note: print w/replies, xml ) Need Help??


in reply to Parsing a Variable Format String

Combine regular expression with a dispatch table. For each regular expression you define what action should take place. In this case, the actions will add values to a hash.

Parsing is performed from left to right. As matches are found the matched text is removed from the input string. Parsing stops when no patterns match or the input string is reduced to the empty string.

use Data::Dumper; my %patterns = ( qr/SS\s+(\d+)/ => sub { $_->{SS} = $1 }, qr/PL\s+([=#]?)(\d+)([=#])(\d+)/ => sub { $_->{SP} = $2; $_->{LP} = $4 }, qr/PV\s+([a-z]?\d+)(\.\d+)?/ => sub { }, # to be written... qr/RL\s+(\d+)([,'"])/ => sub { $_->{RL} = $1; }, qr/SA\s+(\d+)/ => sub { $_->{SA} = $1 }, qr/DS\s+(\d+)/ => sub { $_->{DS} = $1 }, qr/CT\s+([#^]?)\s*(\d+)\s+(\S+)\s+(\S+)/, => sub { }, qr/CL\s+([#^]?)(\d+)\s+(\S+)/, => sub { }, ); sub MAGIC { my $line = $_[0]; $_ = {}; $line =~ s/^\s*//; LOOP: while (length($line)) { for my $k (keys %patterns) { if ($line =~ s/^$k(?:\s+|\z)//) { $patterns{$k}->(); next LOOP; } } warn "got stuck on: $line\n"; $_->{leftover} = $line; last; } $_; } print Dumper( MAGIC("SS 21 PL 2#3 PV 51.3 CL #110 +0 RL 126' SA 1 +06 DS 93")); __END__ $VAR1 = { 'SA' => '106', 'DS' => '93', 'RL' => '126', 'SP' => '2', 'SS' => '21', 'LP' => '3' };

Replies are listed 'Best First'.
Re^2: Parsing a Variable Format String
by ozboomer (Friar) on Jul 10, 2008 at 06:13 UTC
    OOooo, that's certainly elegant :)

    I've been trying to get things happening... with some success.

    My regex is still woeful... but I have a vague idea of what's going on here.. I still can't get the CT/CL items "usable" though, viz:
    use Data::Dumper; @opts = ( "SS 21M- PL 2#3 PV 51.3 CL #110 +0 RL 126' SA 106 DS 93", "SS 21 + PL 2.3 PVa51.3 CT^ 110 +0 RL126, SA 106 DS 93", "SS 21F PL2#3# PV 51.3 CL #110 +0 RL 126' SA 106 DS 93" +, "SS 21#+ PL2.3# PV 51.3 CL 110 -13 RL 126' SA 106 DS 93 +", "SS 32 - PL2.3# PV 51.3 CL #110 +0 RL 126' SA 106 DS 93" ); my ($ss, $mstate, $mvote); my %patterns = ( qr/SS\s+(\d+)([MF# ])([\+\- ])/ => sub { $ss = $1; $mstate = $2; $mvote = $3; }, # qr/PL\s+([=#]?)(\d+)([=#]?)(\d+)/ # qr/PL\s+(\d+)([\.\#])(\d+)([\.\#])/ qr/PL\s*(\d+)([\#\.])(\d+)([\#\.]?)/ => sub { $sp = $1; $spl = $2; $spl = "" if ($spl ne "#"); $lp = $3; $lpl = $4; $lpl = "" if ($lpl ne "#"); }, qr/PV\s*([a-z]?\d+)(\.)(\d+)/ => sub { $vsp = $1; $vlp = $3 }, qr/RL\s+(\d+)([,'"]?)/ => sub { $rl = $1; $rlflg = $2; }, qr/SA\s+(\d+)/ => sub { $csa = $1 }, qr/DS\s+(\d+)/ => sub { $cds = $1 }, qr/CT\s+([#^]?)\s*(\d+)\s+(\S+)\s+(\S+)/, => sub { }, qr/CL\s+([#^]?)(\d+)\s+(\S+)/, => sub { }, ); sub MAGIC { my $line = $_[0]; $_ = {}; $line =~ s/^\s*//; printf("Input: >%s<\n", $line); LOOP: while (length($line)) { for my $k (keys %patterns) { if ($line =~ s/^$k(?:\s+|\z)//) { $patterns{$k}->(); next LOOP; } } warn "got stuck on: $line\n"; $_->{leftover} = $line; last; } printf(" \$ss = $ss\n"); printf("\$mstate = >$mstate<\n"); printf(" \$mvote = >$mvote<\n"); printf(" \$sp = $sp\n"); printf(" \$spl = $spl\n"); printf(" \$lp = $lp\n"); printf(" \$lpl = $lpl\n"); printf(" \$ctl = ...\n"); printf(" \$ct = ...\n"); printf(" \$ctr = ...\n"); printf(" \$vsp = $vsp\n"); printf(" \$vlp = $vlp\n"); printf(" \$rl = $rl\n"); printf(" \$rlflg = $rlflg\n"); printf(" \$csa = $csa\n"); printf(" \$cds = $cds\n"); $_; } foreach $buf (@opts) { printf("START -----\n"); MAGIC($buf); # print Dumper( MAGIC($buf)); printf(" \n"); }
    For example, for the 4th item in @opts, I would like to capture the info that:

    Buf: "SS 21#+ PL2.3# PV 51.3 CL 110 +13 RL 126' SA 106 DS 93" Cycle Time = CT (or CL) = 110 Locks = (no #) = OFF Rotation = "that number" = 13
    Many thanks for the assistance; I appreciate it a lot.

      Something like:
      qr/(CT|CL)\s+([#]?)\s*(\d+)\s+(\S+)/, => sub { $_->{CTorCL} = $1; $_->{CycleTime} = $3; $_->{Locks} = ($2 eq "#" ? "ON" : "OFF"); $_->{Rotation} = $4; }
      Note I put the values in a hash. Otherwise you'll have to explicitly declare or reset your variables so one call won't interfere with subsequent calls.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://696603]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others cooling their heels in the Monastery: (8)
As of 2025-06-18 09:02 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?
    erzuuliAnonymous Monks are no longer allowed to use Super Search, due to an excessive use of this resource by robots.