Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Re^3: Parsing issue

by vitoco (Friar)
on Sep 11, 2009 at 14:43 UTC ( #794793=note: print w/ replies, xml ) Need Help??


in reply to Re^2: Parsing issue
in thread Parsing issue

I forgot to mention in my previous post that the if (pat1) {} elsif (pat2) {} elsif ... inside a while loop method is useful when data records is not always in the same order.

In this case, where your "output" format is fixed, it's better (faster) to use a per line parsing method (no while):

#!perl use strict; use warnings; use Data::Dumper; my %hash = (); $_ = <DATA>; /:\s(.*?)\s+\w+\s+:\s(.*?)\s*$/; $hash{'VLAN'} = $1; $hash{'STAT'} = $2; $_ = <DATA>; /:\s(.*?)\s+\w+\s+:\s(.*?)\s*$/; $hash{'FID'} = $1; $hash{'NAME'} = $2; $_ = <DATA>; /:\s(.*?)\s+\w+\s\w+:\s(.*?)\s*$/; $hash{'VTYPE'} = $1; $hash{'LASTM'} = $2; $_ = <DATA>; $_ = <DATA>; /^\s*(.*?)\s*$/; $hash{'EP'} = $1; $_ = <DATA>; $_ = <DATA>; /^\s*(.*?)\s*$/; $hash{'FEP'} = $1; $_ = <DATA>; $_ = <DATA>; /^\s*(.*?)\s*$/; $hash{'UP'} = $1; print Dumper( \%hash ); __DATA__ VLAN : 1 Status : Enabled FID : 1 Name : Some VLAN with spaces VLAN Type: Permanent Last change: 2009-08-31 16:48:45 Egress Ports: host.0.1 Forbidden Egress Ports: ge.3.39 Untagged Ports: host.0.1

This way, you can control other things like key names:

$VAR1 = { 'NAME' => 'Some VLAN with spaces', 'LASTM' => '2009-08-31 16:48:45', 'VTYPE' => 'Permanent', 'FID' => '1', 'VLAN' => '1', 'STAT' => 'Enabled', 'UP' => 'host.0.1', 'EP' => 'host.0.1', 'FEP' => 'ge.3.39' };

Thinking a bit more on my first script, I realize that the string can be also modified in the following way: add a new delimiter just before what we detect as a field name (words separated by exactly one space, before any colon followed by a space), then split:

#!perl use strict; use warnings; use Data::Dumper; my $string = ""; while(<DATA>) { $_ =~ s/\n/ /g; $string .= $_; } $string =~ s/((\w+\s)*\w+)\s*:\s+/\%$1%/g; $string =~ s/^\%(.*)/$1\%/; $string =~ s/\s+\%/\%/g; my %hash = split (/\%/, $string); print Dumper( \%hash ); __DATA__ VLAN : 1 Status : Enabled FID : 1 Name : Some VLAN with spaces VLAN Type: Permanent Last change: 2009-08-31 16:48:45 Egress Ports: host.0.1 Forbidden Egress Ports: ge.3.39 Untagged Ports: host.0.1
$VAR1 = { 'Last change' => '2009-08-31 16:48:45', 'Status' => 'Enabled', 'Forbidden Egress Ports' => 'ge.3.39', 'FID' => '1', 'VLAN' => '1', 'Untagged Ports' => 'host.0.1', 'Egress Ports' => 'host.0.1', 'Name' => 'Some VLAN with spaces', 'VLAN Type' => 'Permanent' };

I also changed the delimiter to another unused char, to differentiate it from the colon inside the time value when trimming out extra spaces.

BTW, this was fun!


Comment on Re^3: Parsing issue
Select or Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://794793]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (6)
As of 2014-12-20 12:24 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (95 votes), past polls