PerlMonks  

Re: Regex to extract certain lines only from command output/text file.

by arnaud99 (Beadle)
on Mar 07, 2013 at 19:53 UTC (#1022302)


in reply to Regex to extract certain lines only from command output/text file.

Reformatted the code and added a few comments so it looks cleaner.

use strict;
use warnings;

my @tmp;
my @keep;

while (my $a_line = <DATA>) {
    chomp $a_line;
    if ( $a_line =~ /^\d+\s+\D+/ ) {
        #we have a new host master record
        #process the data from the previous host
        process_previous(\@tmp);
        #on return @tmp may be empty
        @keep = (@keep, @tmp);
        @tmp = (); #now we empty it anyway
    }
    push @tmp, $a_line;
}
process_previous(\@tmp);
@keep = (@keep, @tmp);
@tmp = ();

print "$_\n" for @keep;
exit(0);

#----------------- SUBS ----------------------------
sub process_previous {
    my $array_ref = shift;
    my $keep_this_data = 0;
    foreach my $elem (@$array_ref) {
        if ($elem !~ /\d:\d:\d$/ ) {
            #found a line NOT terminating in 3 digits, each separated by a
            #colon, so we want to keep the whole info about this host
            $keep_this_data = 1;
            last;
        }
    }
    if (!$keep_this_data) {
        #empty the array
        @$array_ref = ();
    }
}

__DATA__
7 hostname12 Generic-legacy 10000000AB210ACF6 ---
10000000AB210ACF4 2:5:4
9 hostname13 Generic 10000000AB2A3006A 3:5:2
10000000AB2A30068 2:5:2
23 srvernam Generic-legacy 5001438002A3004A 3:3:3
5001438002A3004A ---
5001438002A30048 2:3:3
5001438002A30048 2:5:2
5001438002A30048 2:5:2
9 hostname13 Generic 10000000AB2A3006A 3:5:2
10000000AB2A30068 2:5:2
21 HOSTNAME Generic
9 hostname13 Generic 10000000AB2A3006A 3:5:2
10000000AB2A30068 2:5:2


Re^2: Regex to extract certain lines only from command output/text file.
by Anonymous Monk on Mar 08, 2013 at 01:02 UTC

    Your second "approach" did not fully solve the OP's problem either. See the output:

    7 hostname12 Generic-legacy 10000000AB210ACF6 ---
    10000000AB210ACF4 2:5:4
    10000000AB210ACF4 2:3:4
    10000000AB210ACF6 3:5:4
    9 hostname13 Generic 10000000AB2A3006A 3:5:2
    10000000AB2A30068 2:5:2
    20 hostname14 Generic-legacy 10000000AB2A3000C ---
    10000000AB2A3000E 3:3:1
    21 HOSTNAME Generic
    23 srvernam Generic-legacy 5001438002A3004A 3:3:3
    5001438002A3004A ---
    5001438002A30048 2:3:3
    5001438002A30048 2:5:2
    5001438002A30048 2:5:2

Re^2: Regex to extract certain lines only from command output/text file.
by arnaud99 (Beadle) on Mar 08, 2013 at 03:07 UTC

    Hi

    Thanks for your comments. The error is due to an empty line at the end of the __DATA__ section. I must have added it when I pasted the code.

    Since the empty line did not match /\d:\d:\d$/, the data for that host was considered to be worth keeping.
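To make the failure mode concrete, here is a minimal sketch of the same test applied to one host block, with and without a stray empty line. The helper name `all_paths_present` and the `@block` sample are illustrative assumptions, not part of the posted script; only the regex is taken from it.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the posted script): returns 1 when every
# line of a host block ends in digit:digit:digit, meaning the block
# would be dropped, and 0 as soon as one line fails that test.
sub all_paths_present {
    my @lines = @_;
    for my $line (@lines) {
        return 0 if $line !~ /\d:\d:\d$/;
    }
    return 1;
}

# Sample block copied from the thread's data.
my @block = (
    '9 hostname13 Generic 10000000AB2A3006A 3:5:2',
    '10000000AB2A30068 2:5:2',
);

# Every line ends in d:d:d, so this block is dropped.
print all_paths_present(@block) ? "drop\n" : "keep\n";

# An empty (already chomped) line fails the regex, so the whole block
# is kept by mistake.
print all_paths_present(@block, '') ? "drop\n" : "keep\n";
```

The second call shows why one blank line at the end of the data was enough to keep a host that should have been filtered out.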

    I trust the issue is now sorted. I simply added

    next if $a_line =~ /^\s*$/; #ignore empty lines

    after reading each line.
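As a small illustration of what that check skips, the sketch below applies the same pattern to a few sample lines. The `is_blank` helper is a hypothetical wrapper introduced here for demonstration. Because `\s` also matches the newline, the test works even before `chomp` is called.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the pattern from the post: true for lines
# that are empty or contain only whitespace (including the newline).
sub is_blank {
    my ($line) = @_;
    return $line =~ /^\s*$/ ? 1 : 0;
}

# Sample lines as they would arrive from <DATA>, newline still attached.
for my $line ("\n", "  \t\n", "21 HOSTNAME Generic\n", "5001438002A3004A ---\n") {
    my $verdict = is_blank($line) ? 'skip' : 'keep';
    chomp(my $shown = $line);
    print "$verdict: [$shown]\n";
}
```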

    Here is the full code, with the extra check added and the empty line still present at the end of __DATA__.

    use strict;
    use warnings;
    use 5.010;

    my @tmp;
    my @keep;

    while (my $a_line = <DATA>) {
        next if $a_line =~ /^\s*$/; #ignore empty lines
        chomp $a_line;
        if ( $a_line =~ /^\d+\s+\D+/ ) {
            #we have a new host master record
            #process the data from the previous host
            process_previous(\@tmp);
            #on return @tmp may be empty
            @keep = (@keep, @tmp);
            @tmp = (); #now we empty it anyway
        }
        push @tmp, $a_line;
    }
    process_previous(\@tmp);
    @keep = (@keep, @tmp);
    @tmp = ();

    print "$_\n" for @keep;
    exit(0);

    #----------------- SUBS ----------------------------
    sub process_previous {
        my $array_ref = shift;
        my $keep_this_data = 0;
        foreach my $elem (@$array_ref) {
            if ($elem !~ /\d:\d:\d$/ ) {
                #found a line NOT terminating in 3 digits, each separated by a
                #colon, so we want to keep the whole info about this host
                $keep_this_data = 1;
                last;
            }
        }
        if (!$keep_this_data) {
            #empty the array
            @$array_ref = ();
        }
    }

    __DATA__
    7 hostname12 Generic-legacy 10000000AB210ACF6 ---
    10000000AB210ACF4 2:5:4
    9 hostname13 Generic 10000000AB2A3006A 3:5:2
    10000000AB2A30068 2:5:2
    23 srvernam Generic-legacy 5001438002A3004A 3:3:3
    5001438002A3004A ---
    5001438002A30048 2:3:3
    5001438002A30048 2:5:2
    5001438002A30048 2:5:2
    9 hostname13 Generic 10000000AB2A3006A 3:5:2
    10000000AB2A30068 2:5:2
    21 HOSTNAME Generic
    9 hostname13 Generic 10000000AB2A3006A 3:5:2
    10000000AB2A30068 2:5:2

    Kind regards

    Arnaud.
