Extracting non-consecutive but related items from an array.

blackadder has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys,

I have data like this:

server_A, Perl, UK
server_A, Word, UK
server_A, Outlook, UK
server_A, Excel, UK
server_B, Reuters, NL
server_B, TradeXL, NL
server_B, Thompsons, NL
server_B, Bloomberg, NL
server_B, Tibco, NL
server_c, BasketLink, USA
server_c, Evolution, USA
server_c, Lotus, USA
server_c, TIB, USA
server_A, Python, UK

And I need to produce this output:

server_A
         Perl  UK
         Word  UK
         Outlook  UK
         Excel  UK 
         Python  UK

server_B
         Reuters  NL
         TradeXL  NL
         Thompsons  NL
         Bloomberg  NL
         Tibco  NL

server_c
         BasketLink  USA
         Evolution  USA
         Lotus  USA
         TIB  USA

So I wrote this code:

#! c:/perl/bin/perl.exe
use strict;
open (LST,"c:/work/test_data.lst") || die "\n$!\n";
chomp (my @data_array = <LST>);
my $snap_shot;
my %seen;
my @cleaned_data;
foreach my $data ( @data_array)
{
    my $rec;
    my ($server, @info) = split (/,/,$data);
    if (! $seen{$server})
    {
        $rec->{Server_Name} = $server;
        print "$server\n";
        $seen{$server}++;
        print "\t@info\n";
        @{$rec->{Apps_Info}} = @info;        
    }
    else
    {
        print "\t@info\n";
        @{$rec->{Apps_Info}} = @info;
    }    
    push (@cleaned_data, $rec);
}

Basically, I need help with the logic of the script. It works fine if all entries for one server appear consecutively, but if they do not, it does not produce the required output. I suppose the easy way around this is to sort the list alphabetically first and then run the script. My other idea was to store the server names in an array and then loop through both that array and the list to pluck out the related items, but that approach was taking ages and involved a lot of coding. So I wondered: how would you do this, and can it be done more easily and quickly from within the code?
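For illustration, the sort-first workaround mentioned above might look like the sketch below. The name render_sorted_groups is hypothetical, and the field separator is assumed to be a comma with optional surrounding spaces. Note that a plain sort also reorders the applications within each server, which may not be wanted.

```perl
use strict;
use warnings;

# Sort-first variant: once the lines are sorted, every server's entries are
# adjacent, so a header is emitted only when the server name changes.
# (render_sorted_groups is a hypothetical name used for this sketch.)
sub render_sorted_groups {
    my @lines = sort @_;      # default string sort groups equal server names
    my $out   = '';
    my $prev  = '';
    for my $line (@lines) {
        my ($server, @info) = split /\s*,\s*/, $line;
        next unless defined $server && @info;   # skip blank or malformed lines
        if ($server ne $prev) {
            $out .= "\n" if $prev ne '';        # blank line between servers
            $out .= "$server\n";
            $prev = $server;
        }
        $out .= "\t@info\n";
    }
    return $out;
}

print render_sorted_groups(
    'server_A, Perl, UK',
    'server_B, Reuters, NL',
    'server_A, Python, UK',
);
```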

Thanks

Blackadder


Re: Extracting non-consecutive but related items from an array.
by robartes (Priest) on Jun 15, 2005 at 08:55 UTC

This is just the thing hashes exist for:

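#Untested
use strict;
my %data;
#assume file is opened as <INPUT>
while (<INPUT>) {
  chomp;
  my ($server,@apps_info)=split /,\s*/;
  # one array ref of fields per input line, grouped under the server name
  push @{$data{$server}}, \@apps_info;
}
foreach my $server ( sort keys %data ) {
  print "$server\n";
  for my $info ( @{$data{$server}} ) {
    # each element is an array ref, so dereference it rather than split again
    print "\t", join(" ", @$info), "\n";
  }
}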

Basically, pull all of the data into a hash with the server name as keys.
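A minimal sketch of that idea follows, with one assumption added on top of it: since hash keys come back in no particular order, a separate array records the order in which servers first appear. The names group_by_server and format_groups are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# Group "server, app, country" lines by server, remembering the order in
# which each server first appears (hash keys alone come back unordered).
sub group_by_server {
    my @lines = @_;
    my (%apps_for, @order);
    for my $line (@lines) {
        my ($server, @info) = split /\s*,\s*/, $line;
        next unless defined $server && @info;   # skip blank or malformed lines
        push @order, $server unless exists $apps_for{$server};
        push @{ $apps_for{$server} }, [@info];  # one array ref per input line
    }
    return (\@order, \%apps_for);
}

# Render the grouped data: server name, then one tab-indented line per entry.
sub format_groups {
    my ($order, $apps_for) = @_;
    my $out = '';
    for my $server (@$order) {
        $out .= "$server\n";
        $out .= "\t@$_\n" for @{ $apps_for->{$server} };
        $out .= "\n";
    }
    return $out;
}

my @sample = (
    'server_A, Perl, UK',
    'server_B, Reuters, NL',
    'server_A, Python, UK',
);
print format_groups(group_by_server(@sample));
```

Unlike sorting the input first, this keeps each server's entries in their original file order.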

Update: Updated as per Tomtom's remark

CU
Robartes-


Re^2: Extracting non-consecutive but related items from an array.

by Tomtom (Scribe) on Jun 15, 2005 at 08:59 UTC

Isn't it supposed to be $server instead of $_ ?


Re^3: Extracting non-consecutive but related items from an array.

by robartes (Priest) on Jun 15, 2005 at 09:18 UTC

Isn't it supposed to be $server instead of $_ ?

Yup - that's why it says #Untested :). Thanks!

CU
Robartes-


Re^3: Extracting non-consecutive but related items from an array.

by blackadder (Hermit) on Jun 15, 2005 at 09:29 UTC

print
    $_,
    $/,
    map { "\t@$_$/"} @{$clean_data{$_}}
    for
        sort
            keys %clean_data;

Blackadder


Re^4: Extracting non-consecutive but related items from an array.

by robartes (Priest) on Jun 15, 2005 at 09:46 UTC

simple breakup of this code

by Anonymous Monk on Jun 15, 2005 at 13:43 UTC

Re: Extracting non-consecutive but related items from an array.
by PodMaster (Abbot) on Jun 15, 2005 at 08:59 UTC

use strict;
use warnings;

my %clean_data;
chomp (my @data_array = <DATA>);


for my $data ( @data_array)
{
    my ($server, @info) = split (/,/,$data);
    $data = \@info;
    push @{$clean_data{$server}}, $data;
}

#use Data::Dumper;die Dumper\%clean_data;
print
    $_,
    $/,
    map { "\t@$_$/"} @{$clean_data{$_}}
    for
        sort
            keys %clean_data;

__DATA__
server_A, Perl, UK
server_A, Word, UK
server_A, Outlook, UK
server_A, Excel, UK
server_B, Reuters, NL
server_B, TradeXL, NL
server_B, Thompsons, NL
server_B, Bloomberg, NL
server_B, Tibco, NL
server_c, BasketLink, USA
server_c, Evolution, USA
server_c, Lotus, USA
server_c, TIB, USA
server_A, Python, UK

MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
** The third rule of perl club is a statement of fact: pod is sexy.


Re: Extracting non-consecutive but related items from an array.
by Anonymous Monk on Jun 15, 2005 at 09:35 UTC

my %servers;
while (<>) {
    chomp;
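    # parenthesize the list assignment: == binds tighter than =, so without
    # the parentheses this line does not compile
    next unless 3 == (my ($server, $loc, $country) = split /,\s*/);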
    push @{$servers{$server}}, [$loc, $country];
}
local $" = " ";
while (my ($server, $info) = each %servers) {
    print "$server\n";
    print "\t@$_\n" foreach @$info;
}


Re^2: Extracting non-consecutive but related items from an array.

by sapnac (Beadle) on Jun 15, 2005 at 14:11 UTC

use strict;
use warnings;

# "app,country" pairs per server, joined with ';'
my %serverdet;

open (my $spooler, '<', "data.dat") or die "Error opening the file: $!\n";

while (my $spool = <$spooler>) {
    chomp($spool);
    # split on the comma and any spaces that follow it
    my ($server,$dat1,$dat2) = split(/,\s*/,$spool);
    if($serverdet{$server}) {
        $serverdet{$server} .= ";".$dat1.",".$dat2;
    }else{
        $serverdet{$server} = $dat1.",".$dat2;
    }
}
close $spooler;

foreach my $key (sort keys %serverdet) {
    print "Server $key\n";
    my @val = split(/;/,$serverdet{$key});
    foreach my $val (sort @val) {
     print $val."\n";
    }
}

code tags added by holli


Re: Extracting non-consecutive but related items from an array.
by TedPride (Priest) on Jun 15, 2005 at 20:22 UTC

use strict;
use warnings;

my %h;
while (<DATA>) {
    chomp; @_ = split /, /, $_, 2;
    push @{$h{lc($_[0])}}, $_[1];
}
for (sort keys %h) {
    print "$_\n";
    print "         $_\n" for sort @{$h{$_}};
    print "\n";
}

__DATA__
server_A, Perl, UK
server_A, Word, UK
server_A, Outlook, UK
server_A, Excel, UK
server_B, Reuters, NL
server_B, TradeXL, NL
server_B, Thompsons, NL
server_B, Bloomberg, NL
server_B, Tibco, NL
server_c, BasketLink, USA
server_c, Evolution, USA
server_c, Lotus, USA
server_c, TIB, USA
server_A, Python, UK

