
Re^3: Parsing issue

by vitoco (Friar)
on Sep 11, 2009 at 14:43 UTC


in reply to Re^2: Parsing issue
in thread Parsing issue

I forgot to mention in my previous post that the if (pat1) {} elsif (pat2) {} elsif ... chain inside a while loop is useful when the data records are not always in the same order.
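For reference, a minimal sketch of that order-independent loop. The sub name parse_any_order, the three patterns and the sample text are made up for illustration, and it assumes one field per line, which is simpler than the layout in this thread:

```perl
use strict;
use warnings;

# Hypothetical helper: fill a hash from lines that may arrive in any order.
sub parse_any_order {
    my ($text) = @_;
    my %rec;

    # Read the string through an in-memory filehandle, line by line.
    open my $fh, '<', \$text or die "cannot read in-memory text: $!";
    while ( my $line = <$fh> ) {
        if    ( $line =~ /^\s*VLAN\s*:\s*(\S+)/ )     { $rec{VLAN} = $1 }
        elsif ( $line =~ /^\s*Status\s*:\s*(\S+)/ )   { $rec{STAT} = $1 }
        elsif ( $line =~ /^\s*Name\s*:\s*(.*?)\s*$/ ) { $rec{NAME} = $1 }
        # lines that match none of the patterns are ignored
    }
    close $fh;
    return \%rec;
}

# The fields are deliberately out of order here.
my $rec = parse_any_order("Name : Some VLAN with spaces\nStatus : Enabled\nVLAN : 1\n");
print "$_ => $rec->{$_}\n" for sort keys %$rec;
```

Each line is tested against every pattern until one matches, so the cost grows with the number of patterns, but the order of the records stops mattering.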

In this case, where your "output" format is fixed, it's better (faster) to parse it line by line, with no while loop:

#!perl
use strict;
use warnings;
use Data::Dumper;

my %hash = ();

# line 1: two "name : value" pairs
$_ = <DATA>;
/:\s(.*?)\s+\w+\s+:\s(.*?)\s*$/;
$hash{'VLAN'} = $1;
$hash{'STAT'} = $2;

# line 2: same shape as line 1
$_ = <DATA>;
/:\s(.*?)\s+\w+\s+:\s(.*?)\s*$/;
$hash{'FID'} = $1;
$hash{'NAME'} = $2;

# line 3: the second field name is two words
$_ = <DATA>;
/:\s(.*?)\s+\w+\s\w+:\s(.*?)\s*$/;
$hash{'VTYPE'} = $1;
$hash{'LASTM'} = $2;

# each port list: skip the header line, then trim the value line
$_ = <DATA>;
$_ = <DATA>;
/^\s*(.*?)\s*$/;
$hash{'EP'} = $1;

$_ = <DATA>;
$_ = <DATA>;
/^\s*(.*?)\s*$/;
$hash{'FEP'} = $1;

$_ = <DATA>;
$_ = <DATA>;
/^\s*(.*?)\s*$/;
$hash{'UP'} = $1;

print Dumper( \%hash );

__DATA__
VLAN : 1              Status : Enabled
FID : 1               Name : Some VLAN with spaces
VLAN Type: Permanent  Last change: 2009-08-31 16:48:45
Egress Ports:
  host.0.1
Forbidden Egress Ports:
  ge.3.39
Untagged Ports:
  host.0.1

This way, you can control other things like key names:

$VAR1 = {
          'NAME' => 'Some VLAN with spaces',
          'LASTM' => '2009-08-31 16:48:45',
          'VTYPE' => 'Permanent',
          'FID' => '1',
          'VLAN' => '1',
          'STAT' => 'Enabled',
          'UP' => 'host.0.1',
          'EP' => 'host.0.1',
          'FEP' => 'ge.3.39'
        };
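One caveat with the script above: it never checks whether a match succeeded, and after a failed match $1 and $2 still hold whatever the previous successful match left in them. Here is a hedged sketch of the same line-by-line idea with the patterns kept in a table and a die on any mismatch. The @layout table and the sub name parse_fixed are my own invention; the patterns and key names are the ones used above.

```perl
use strict;
use warnings;

# Hypothetical layout table: one entry per input line, in order.
# undef means "skip this line"; otherwise [ pattern, keys for $1, $2, ... ].
my @layout = (
    [ qr/:\s(.*?)\s+\w+\s+:\s(.*?)\s*$/,   'VLAN',  'STAT'  ],
    [ qr/:\s(.*?)\s+\w+\s+:\s(.*?)\s*$/,   'FID',   'NAME'  ],
    [ qr/:\s(.*?)\s+\w+\s\w+:\s(.*?)\s*$/, 'VTYPE', 'LASTM' ],
    undef,
    [ qr/^\s*(.*?)\s*$/, 'EP' ],
    undef,
    [ qr/^\s*(.*?)\s*$/, 'FEP' ],
    undef,
    [ qr/^\s*(.*?)\s*$/, 'UP' ],
);

sub parse_fixed {
    my (@lines) = @_;
    my %hash;
    for my $i ( 0 .. $#layout ) {
        my $spec = $layout[$i] or next;    # header line, nothing to capture
        my ( $re, @keys ) = @$spec;
        my $line = $lines[$i];
        die "missing line " . ( $i + 1 ) . "\n" unless defined $line;

        # In list context a match returns its captures, or an empty list on failure.
        my @vals = $line =~ $re
            or die "line " . ( $i + 1 ) . " does not match: $line\n";
        @hash{@keys} = @vals;
    }
    return \%hash;
}

my @lines = split /\n/, <<'END';
VLAN : 1              Status : Enabled
FID : 1               Name : Some VLAN with spaces
VLAN Type: Permanent  Last change: 2009-08-31 16:48:45
Egress Ports:
  host.0.1
Forbidden Egress Ports:
  ge.3.39
Untagged Ports:
  host.0.1
END

my $h = parse_fixed(@lines);
print "$_ => $h->{$_}\n" for sort keys %$h;
```

The table costs a few more lines than the straight-line version, but a changed report layout then fails loudly instead of filling the hash with stale values.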

Thinking a bit more about my first script, I realized that the string can also be modified in the following way: put a new delimiter just before whatever we detect as a field name (words separated by exactly one space, before any colon followed by a space) and another in place of its colon, then split:

#!perl
use strict;
use warnings;
use Data::Dumper;

my $string = "";
while (<DATA>) {
    # Join the lines with two spaces, so that a value ending one line can
    # never sit exactly one space away from the field name on the next.
    $_ =~ s/\n/  /g;
    $string .= $_;
}

# "Field name : " becomes "%Field name%"
$string =~ s/((\w+\s)*\w+)\s*:\s+/\%$1%/g;
# move the leading delimiter to the end of the string
$string =~ s/^\%(.*)/$1\%/;
# trim the padding in front of every delimiter
$string =~ s/\s+\%/\%/g;

my %hash = split( /\%/, $string );

print Dumper( \%hash );

__DATA__
VLAN : 1              Status : Enabled
FID : 1               Name : Some VLAN with spaces
VLAN Type: Permanent  Last change: 2009-08-31 16:48:45
Egress Ports:
  host.0.1
Forbidden Egress Ports:
  ge.3.39
Untagged Ports:
  host.0.1

Output:

$VAR1 = {
          'Last change' => '2009-08-31 16:48:45',
          'Status' => 'Enabled',
          'Forbidden Egress Ports' => 'ge.3.39',
          'FID' => '1',
          'VLAN' => '1',
          'Untagged Ports' => 'host.0.1',
          'Egress Ports' => 'host.0.1',
          'Name' => 'Some VLAN with spaces',
          'VLAN Type' => 'Permanent'
        };

I also changed the delimiter to another char that is not used anywhere in the data (%), to differentiate it from the colons inside the time value when trimming out extra spaces.
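If picking a safe delimiter is a worry, the same field-name rule can probably be applied with a global match in list context instead of substitute-and-split. This is a different technique from the script above, not a rewrite of it. The sub name pairs_from_string is hypothetical, and the sketch assumes the input is already one long string in which every value is followed by at least two whitespace characters or by the end of the string:

```perl
use strict;
use warnings;

# Hypothetical helper: pull "name: value" pairs out of one long string.
# A field name is one or more words separated by exactly one space; a value
# runs up to the next gap of two or more whitespace characters (or the end).
sub pairs_from_string {
    my ($string) = @_;

    # With /g in list context the match returns every captured
    # (name, value) pair as one flat list, which loads straight into a hash.
    my %hash = $string =~ /((?:\w+ )*\w+)\s*:\s+(.*?)(?=\s{2,}|\s*\z)/g;
    return \%hash;
}

my $line = 'VLAN : 1    Status : Enabled  Last change: 2009-08-31 16:48:45  Egress Ports:    host.0.1';
my $pairs = pairs_from_string($line);
print "$_ => $pairs->{$_}\n" for sort keys %$pairs;
```

The colons inside the time value are left alone because none of them is followed by whitespace. A field with an empty value would confuse this version, so treat it as a starting point only.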

BTW, this was fun!

