http://www.perlmonks.org?node_id=466833

blackadder has asked for the wisdom of the Perl Monks concerning the following question:

Hi Guys;

I have data like this;
server_A, Perl, UK
server_A, Word, UK
server_A, Outlook, UK
server_A, Excel, UK
server_B, Reuters, NL
server_B, TradeXL, NL
server_B, Thompsons, NL
server_B, Bloomberg, NL
server_B, Tibco, NL
server_c, BasketLink, USA
server_c, Evolution, USA
server_c, Lotus, USA
server_c, TIB, USA
server_A, Python, UK
And I need to produce this output;
server_A
    Perl UK
    Word UK
    Outlook UK
    Excel UK
    Python UK
server_B
    Reuters NL
    TradeXL NL
    Thompsons NL
    Bloomberg NL
    Tibco NL
server_c
    BasketLink USA
    Evolution USA
    Lotus USA
    TIB USA
So, I wrote this code;
#! c:/perl/bin/perl.exe
use strict;

open (LST,"c:/work/test_data.lst") || die "\n$!\n";
chomp (my @data_array = <LST>);
my $snap_shot;
my %seen;
my @cleaned_data;
foreach my $data ( @data_array) {
    my $rec;
    my ($server, @info) = split (/,/,$data);
    if (! $seen{$server}) {
        $rec->{Server_Name} = $server;
        print "$server\n";
        $seen{$server}++;
        print "\t@info\n";
        @{$rec->{Apps_Info}} = @info;
    }
    else {
        print "\t@info\n";
        @{$rec->{Apps_Info}} = @info;
    }
    push (@cleaned_data, $rec);
}
Basically, I need help with the logic of the script. It works fine if all entries for a given server are listed consecutively. However, if they are not, the script does not produce the required output. I suppose the easy way around this is to sort the list alphabetically first and then run the script. The approach I tried was to store and index the server names in an array, then loop through that array and the list to pluck out the related items, but this was taking ages and involved a lot of coding. So I wondered: how would you do this, or how could it be achieved more easily and quickly, from within the code?

Thanks
Blackadder

Re: Extracting non-consecutive but related items from an array.
by robartes (Priest) on Jun 15, 2005 at 08:55 UTC

    This is just the thing hashes exist for:

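    #Untested
    use strict;
    my %data;
    # assume the file is already open on the INPUT filehandle
    while (<INPUT>) {
        chomp;
        my ($server, @apps_info) = split /,\s*/;
        # store each record as an array ref under its server name
        push @{$data{$server}}, \@apps_info;
    }
    foreach my $server ( keys %data ) {
        print "$server\n";
        for ( @{$data{$server}} ) {
            # $_ is an array ref here, so dereference it rather than split it
            print "\t", join( " ", @$_ ), "\n";
        }
    }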

    Basically, pull all of the data into a hash with the server name as keys.

    Update: Updated as per Tomtom's remark

    CU
    Robartes-
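
A minimal sketch of the "hash keyed by server name" idea described above, with one addition that is not in the original post: a separate @order array that remembers the order in which servers were first seen, since keys on a hash returns them in no particular order. The sub name group_records, the @order array, and the three inline sample records are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: groups "server, app, country" records by server.
# Returns a hash ref (server => list of [app, country]) and an array ref
# holding the servers in first-seen order.
sub group_records {
    my @records = @_;
    my ( %data, @order );
    for my $line (@records) {
        my ( $server, @apps_info ) = split /\s*,\s*/, $line;
        next unless defined $server && @apps_info;

        # Record the server the first time it appears
        push @order, $server unless exists $data{$server};

        # Autovivifies $data{$server} as an array ref on the first push
        push @{ $data{$server} }, \@apps_info;
    }
    return ( \%data, \@order );
}

# Sample records taken from the question, deliberately non-consecutive
my ( $data, $order ) = group_records(
    'server_A, Perl, UK',
    'server_B, Reuters, NL',
    'server_A, Python, UK',
);

for my $server (@$order) {
    print "$server\n";
    print "\t@$_\n" for @{ $data->{$server} };
}
```

If alphabetical order is good enough, the @order array can be dropped and the final loop can walk sort keys %$data instead.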

      Isn't it supposed to be $server instead of $_ ?
        Isn't it supposed to be $server instead of $_ ?

        Yup - that's why it says #Untested :). Thanks!

        CU
        Robartes-

        Yes, I think it's $server too. Thanks, guys.

        But can I have a simple breakdown of this code?
        print $_, $/, map { "\t@$_$/"} @{$clean_data{$_}} for sort keys %clean_data;
        And why did we have to take a reference with \@apps_info? Couldn't we use the array without referencing it?
        Blackadder
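
On the two questions above, here is a sketch that assumes %clean_data has the shape built in PodMaster's reply in this thread (server name => list of array refs). The one-liner is unrolled into explicit loops: $/ is the input record separator, which defaults to a newline, and an array dereference inside double quotes interpolates its elements joined with spaces. The reference is needed because pushing a plain array onto a list flattens it, so the boundary between one record and the next is lost. The sub name render and the variables @flat and @nested are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: the one-liner unrolled into explicit loops.
# Takes a hash ref of server => [ [app, country], ... ] and returns the text.
sub render {
    my ($clean_data) = @_;
    my $out = '';
    for my $server ( sort keys %$clean_data ) {          # ... for sort keys %clean_data
        $out .= $server . $/;                            # print $_, $/
        for my $row ( @{ $clean_data->{$server} } ) {    # map { ... } @{$clean_data{$_}}
            $out .= "\t@{$row}$/";                       # "\t@$_$/"
        }
    }
    return $out;
}

# Why the backslash: without it, push flattens the arrays together.
my ( @flat, @nested );
for my $info ( [ 'Perl', 'UK' ], [ 'Word', 'UK' ] ) {
    my @apps_info = @$info;
    push @flat,   @apps_info;     # four separate scalars, record boundaries lost
    push @nested, \@apps_info;    # two array refs, one per record
}
print scalar(@flat), " elements vs ", scalar(@nested), " records\n";

print render( { server_A => \@nested } );
```

Because @apps_info is declared with my inside the loop, each pass gets a fresh array, so every stored reference points at its own record.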
Re: Extracting non-consecutive but related items from an array.
by PodMaster (Abbot) on Jun 15, 2005 at 08:59 UTC
    Unless I'm overlooking something, you're overcomplicating it
    use strict;
    use warnings;
    my %clean_data;
    chomp (my @data_array = <DATA>);
    for my $data ( @data_array) {
        my ($server, @info) = split (/,/,$data);
        $data = \@info;
        push @{$clean_data{$server}}, $data;
    }
    #use Data::Dumper;die Dumper\%clean_data;
    print $_, $/, map { "\t@$_$/"} @{$clean_data{$_}} for sort keys %clean_data;
    __DATA__
    server_A, Perl, UK
    server_A, Word, UK
    server_A, Outlook, UK
    server_A, Excel, UK
    server_B, Reuters, NL
    server_B, TradeXL, NL
    server_B, Thompsons, NL
    server_B, Bloomberg, NL
    server_B, Tibco, NL
    server_c, BasketLink, USA
    server_c, Evolution, USA
    server_c, Lotus, USA
    server_c, TIB, USA
    server_A, Python, UK


Re: Extracting non-consecutive but related items from an array.
by Anonymous Monk on Jun 15, 2005 at 09:35 UTC
    my %servers;
    while (<>) {
        chomp;
        # skip any line that does not have exactly three fields
        my @fields = split /,\s*/;
        next unless @fields == 3;
        my ($server, $loc, $country) = @fields;
        push @{$servers{$server}}, [$loc, $country];
    }
    local $" = " ";
    while (my ($server, $info) = each %servers) {
        print "$server\n";
        print "\t@$_\n" foreach @$info;
    }
      use strict;
      use warnings;

      my %serverdet;
      open my $spooler, '<', 'data.dat' or die "Error opening the file: $!\n";
      while ( my $spool = <$spooler> ) {
          chomp $spool;
          my ( $server, $dat1, $dat2 ) = split /,\s*/, $spool;
          # keep one ';'-separated string of "app,country" pairs per server
          if ( $serverdet{$server} ) {
              $serverdet{$server} .= ";$dat1,$dat2";
          }
          else {
              $serverdet{$server} = "$dat1,$dat2";
          }
      }
      close $spooler;
      foreach my $key ( sort keys %serverdet ) {
          print "Server $key\n";
          my @val = split /;/, $serverdet{$key};
          foreach my $val ( sort @val ) {
              print "$val\n";
          }
      }
      Hope this helps!


Re: Extracting non-consecutive but related items from an array.
by TedPride (Priest) on Jun 15, 2005 at 20:22 UTC
    use strict;
    use warnings;
    my %h;
    while (<DATA>) {
        chomp;
        @_ = split /, /, $_, 2;
        push @{$h{lc($_[0])}}, $_[1];
    }
    for (sort keys %h) {
        print "$_\n";
        print " $_\n" for sort @{$h{$_}};
        print "\n";
    }
    __DATA__
    server_A, Perl, UK
    server_A, Word, UK
    server_A, Outlook, UK
    server_A, Excel, UK
    server_B, Reuters, NL
    server_B, TradeXL, NL
    server_B, Thompsons, NL
    server_B, Bloomberg, NL
    server_B, Tibco, NL
    server_c, BasketLink, USA
    server_c, Evolution, USA
    server_c, Lotus, USA
    server_c, TIB, USA
    server_A, Python, UK