
File Manipulation

by Mark.Allan (Sexton)
on Aug 23, 2013 at 10:44 UTC ( #1050631=perlquestion )
Mark.Allan has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks

In a file I have text in the following format:

[server1]
/tmp/location1/file.log
/tmp/location2/file.log
[server2]
/usr/loc1/file.log
/usr/loc2/file.log
[server3]
/citrix/dir3/file.log

What I would like to do is read each server from the file and associate the logs with that server. For example:

server1:/tmp/location1/file.log,/tmp/location2/file.log
server2:/usr/loc1/file.log,/usr/loc2/file.log
server3:/citrix/dir3/file.log

I just need to know the best way to map each server to its logs, probably via a hash.

Code so far

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my @lines;
my @data;

open (my $file, '<', 'log') or die $!;

while (my $line = <$file>) {
    if ($line =~ /\[(.*)\]/) {
        push (@lines, $1);
        next;
    }
    if ($line =~ /(\/.*)/) {
        push (@data, $1);
        next;
    }
}

my $val = join(",",@data);
close $file;

foreach(@lines){
    print "$_:$val\n";
}

Thanks in advance

Replies are listed 'Best First'.
Re: File Manipulation
by hdb (Monsignor) on Aug 23, 2013 at 11:04 UTC

    I would use a simple array of arrays where the first element of each sub-array is the server, like this:

    use strict;
    use warnings;

    my @logs;

    while(<DATA>){
        push @logs, [ $1 ] if /\[(.*)\]/;
        push @{$logs[-1]}, $1 if /^(\/.*)/;
    }

    print shift @$_, ":", join( ",", @$_ ), "\n" for @logs;

    __DATA__
    [server1]
    /tmp/location1/file.log
    /tmp/location2/file.log
    [server2]
    /usr/loc1/file.log
    /usr/loc2/file.log
    [server3]
    /citrix/dir3/file.log

Re: File Manipulation
by kcott (Chancellor) on Aug 23, 2013 at 11:23 UTC

    G'day Mark.Allan,

    Assuming your input file isn't so large that reading all its data at once causes memory issues, you can do something like this:

    $ perl -Mstrict -Mwarnings -Mautodie -e '
        open my $fh, "<", "./pm_1050631_in.txt";
        my $data = do { local $/; <$fh> };
        close $fh;

        my %server;
        my $re = qr{\[(\w+)\]\s+([^[]*)};

        while ($data =~ /$re/g) {
            $server{$1} = join "," => split /\s+/ => $2;
        }

        for (sort keys %server) {
            print "$_:$server{$_}\n";
        }
    '
    server1:/tmp/location1/file.log,/tmp/location2/file.log
    server2:/usr/loc1/file.log,/usr/loc2/file.log
    server3:/citrix/dir3/file.log

    -- Ken

Re: File Manipulation
by 2teez (Vicar) on Aug 23, 2013 at 12:46 UTC

    You have been given great solutions, but in the spirit of TIMTOWTDI, you could also check this (a slight modification of the solutions already given):

    use strict;
    use warnings;

    my %logger;
    my $key;

    while(<DATA>){
        s/\s+$//;
        if(/\[(.*)\]/){
            $key = $1;
        }else{
            push @{$logger{$key}}, $_;
        }
    }

    print $_,":", join ("," => @{$logger{$_}}),$/ for sort {$a cmp $b} keys %logger;

    __DATA__
    [server1]
    /tmp/location1/file.log
    /tmp/location2/file.log
    [server2]
    /usr/loc1/file.log
    /usr/loc2/file.log
    [server3]
    /citrix/dir3/file.log

    Since the key changes only when a server name is seen, and stays the same until the next one appears, it works perfectly well.
    You could also see perldsc

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
Re: File Manipulation
by ww (Archbishop) on Aug 23, 2013 at 10:58 UTC
    Try reading about the input record separator, $/, which is explained under the second subhead, "The Field Record Separators."

    Take note of the fact, however, that you can't use a regex to set the value.
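
    A minimal sketch of that point, assuming an in-memory filehandle stands in for the real log file; read_records is a hypothetical helper name, not something from this thread. Whatever string is assigned to $/ is matched literally, so characters such as "[" and "." carry no pattern meaning:

```perl
use strict;
use warnings;

# Hypothetical helper: split $text into records on a literal separator string.
sub read_records {
    my ($text, $sep) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = $sep;            # treated as a plain string, never as a pattern
    my @records;
    while (my $rec = <$fh>) {
        chomp $rec;             # chomp removes one trailing copy of $/
        push @records, $rec;
    }
    close $fh;
    return @records;
}

my $text = "a[b.c[d";
# "[" is an ordinary character here, not the start of a character class.
print join("|", read_records($text, "[")), "\n";    # prints a|b.c|d
# "." matches only a literal dot, so the text is not split on every character.
print join("|", read_records($text, ".")), "\n";    # prints a[b|c[d
```

    If the separator really has to be a pattern, one option is to slurp the file and use split with a regex instead.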

    Update: added sample code (below)

    #!/usr/bin/perl
    use 5.016;
    use warnings;
    use Data::Dumper;

    my ($para, @para, $val, @val);
    my $serverid = '';

    local $/ = "[server";

    while ($para = <DATA>) {
        chomp $para;
        if ( $para =~ /(\d+\])(.*)(?:\[server)*/s ) {
            chomp $1;
            $serverid = $1;
            push (@para, "server" . "$serverid: ");
            $val = $2;
            if ( $val =~ /^\n(.*)/s ) {
                $val = $1;
            }
            $val =~ s/\n/, /gs;
            push (@val, $val);
        }
    }

    my $i;
    for $i( 0 .. $#para ) {
        $para[$i] =~ s/[\]]//;    # get rid of square brackets (if you must)
        say "$para[$i]$val[$i]";
    }

    =head OUTPUT:

    C:\>
    server1: /tmp/location1/file.log, /tmp/location2/file.log,
    server2: /usr/loc1/file.log, /usr/loc2/file.log,
    server3: /citrix/dir3/file.log, ,
    server17: /etc/bin/dat/file.log, /etc/misc/logs/files3.log, ,

    =cut

    __DATA__
    [server1]
    /tmp/location1/file.log
    /tmp/location2/file.log
    [server2]
    /usr/loc1/file.log
    /usr/loc2/file.log
    [server3]
    /citrix/dir3/file.log

    [server17]
    /etc/bin/dat/file.log
    /etc/misc/logs/files3.log

    [server0Xff]
    /won't be seen/file.log
    /nor/this/file.log
    /because/server_name/does not match regex in Ln13

    :-(   ... and now, even more belatedly, I see johngg beat me to it, in time and elegance! ++

    My apologies to all those electrons which were inconvenienced by the creation of this post.
Re: File Manipulation
by johngg (Abbot) on Aug 23, 2013 at 13:26 UTC

    Just to bring another bottle to the party and to take up ww's suggestion on input record separator.

    $ perl -Mstrict -Mwarnings -MData::Dumper -e '
    open my $inFH, q{<}, \ <<EOD or die $!;
    [server1]
    /tmp/location1/file.log
    /tmp/location2/file.log
    [server2]
    /usr/loc1/file.log
    /usr/loc2/file.log
    [server3]
    /citrix/dir3/file.log
    EOD

    my %assoc;
    {
        local $/ = q{[};
        scalar <$inFH>;    # Get rid of first '['
        while ( <$inFH> )
        {
            chomp;
            my( $server, $fileStr ) = split m{]\n};
            $assoc{ $server } = [ split m{\n}, $fileStr ];
        }
    }

    print Data::Dumper->Dumpxs( [ \ %assoc ], [ qw{ *assoc } ] );'
    %assoc = (
               'server3' => [
                              '/citrix/dir3/file.log'
                            ],
               'server2' => [
                              '/usr/loc1/file.log',
                              '/usr/loc2/file.log'
                            ],
               'server1' => [
                              '/tmp/location1/file.log',
                              '/tmp/location2/file.log'
                            ]
             );
    $

    I hope this is helpful.



Re: File Manipulation
by sundialsvc4 (Abbot) on Aug 23, 2013 at 13:12 UTC

    Adding my personal “toady” to this, I view such problems in an awk-like sort of way.   There are two “kinds of” lines here:   “those that look like [servername],” and, in the simplest case, “those that don’t.”  There is one thing to be done in each case.

    The data-structure of choice is a hashref, whose elements are arrayrefs containing the file-names.   Perl’s “auto-vivification” feature does, as intended, most of the work, viz:

    (extemporaneous coding follows ... your syntax may vary ... stripped to the bare parts for clarity)

    my $server_name;
    my $results;

    # $file is assumed to be an open read handle, as in the original question
    while (my $line = <$file>) {
        if ($line =~ /\[(.*)\]/) {
            $server_name = $1;
        }
        elsif ($line =~ /(\/.*)/) {
            die "file doesn't begin with servername line!"
                unless defined($server_name);
            push @{ $results->{$server_name} }, $1;
        }
    }

    # keys needs a real hash, so dereference the hashref explicitly
    foreach my $k (keys %$results) {
        print "$k: " . join(" ", @{ $results->{$k} } ) . "\n";
    }

    Notice how, in the push statement, we simply rely upon Perl to create a new hash-bucket, if one does not yet exist, and to treat the whole thing as an arrayref upon which we can push things.   This is the “auto-vivification” of which I was speaking.   Notice that the program will die if it detects (and that it does look for ...) that the first line in the file is not a server-name record.   The other bits of writing things on multiple source-lines and so forth are just my personal style.
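
    The auto-vivification described above can be seen in isolation with a short sketch. The server names and paths are taken from the sample data in the question; the layout is only an illustration, not the poster's own code:

```perl
use strict;
use warnings;

my $results;    # starts out undef: no hash and no arrays exist yet

# The first push creates the hash, the 'server1' key, and its array.
push @{ $results->{server1} }, '/tmp/location1/file.log';
# Later pushes reuse whatever already exists.
push @{ $results->{server1} }, '/tmp/location2/file.log';
push @{ $results->{server2} }, '/usr/loc1/file.log';

for my $server (sort keys %$results) {
    print "$server: ", join(",", @{ $results->{$server} }), "\n";
}

# exists on a top-level key only tests for it; it does not create the entry.
print "server9 is absent\n" unless exists $results->{server9};
```

    Sorting the keys is a choice made here so the output order is stable; plain keys returns them in no particular order.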

Re: File Manipulation
by Laurent_R (Canon) on Aug 23, 2013 at 18:10 UTC

    If your file is as nicely ordered as the sample you have shown, you probably don't even need any data structure but can print as you read the lines. Something like this:

    use strict;
    use warnings;

    my $line;

    while (<DATA>) {
        chomp;
        if (/\[(server\d+)\]/) {
            print $line, "\n" if defined $line;
            $line = $1 . ": ";
        } else {
            $line .= $_;
            $line .= ',';
        }
    }
    print $line, "\n";

    __DATA__
    [server1]
    /tmp/location1/file.log
    /tmp/location2/file.log
    [server2]
    /usr/loc1/file.log
    /usr/loc2/file.log
    [server3]
    /citrix/dir3/file.log


    $ perl
    server1: /tmp/location1/file.log,/tmp/location2/file.log,
    server2: /usr/loc1/file.log,/usr/loc2/file.log,
    server3: /citrix/dir3/file.log,
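
    The output above ends each line with a trailing comma, whereas the question asked for none. A possible variant, sketched under the same assumption of a well-ordered file, keeps only the current server's paths and joins them when the next server appears. format_servers is a hypothetical helper name, and it returns the lines instead of printing them so that it is easy to check:

```perl
use strict;
use warnings;

# Hypothetical helper: turn the bracketed listing into "server:path,path" lines.
sub format_servers {
    my ($fh) = @_;
    my @out;
    my ($server, @paths);
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;               # skip blank lines
        if ($line =~ /^\[(.+)\]$/) {
            # Flush the previous server before starting a new one.
            push @out, "$server:" . join(",", @paths) if defined $server;
            ($server, @paths) = ($1);           # new name, empty path list
        }
        else {
            push @paths, $line;
        }
    }
    # Flush the last server, which has no following header to trigger it.
    push @out, "$server:" . join(",", @paths) if defined $server;
    return @out;
}

my $input = "[server1]\n/tmp/location1/file.log\n/tmp/location2/file.log\n"
          . "[server3]\n/citrix/dir3/file.log\n";
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";
print "$_\n" for format_servers($fh);
close $fh;
```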

Node Type: perlquestion [id://1050631]
Approved by hdb
Front-paged by Corion