
Consolidating data from lots of different sources

by blackadder (Hermit)
on Jun 08, 2005 at 21:42 UTC (#464850=perlquestion)
blackadder has asked for the wisdom of the Perl Monks concerning the following question:

Dear Gods

I wonder if you have a moment.

I have a data sheet with the following column headers:

Srv_Name, SAN_Info, Criticality_rating, Commission Date

The data in the table looks like this:

Srv_0A,xxxxxxxxx,,xxxxxxxxxxx
Srv_0A,,,
Srv_0G,,,
Srv_0B,xxxxxxxxxx,,xxxxxxxxx
Srv_0A,,,xxxxxxxxxxxxxx
Srv_0G,xxxxxxxxxxx,xxxxxxxxxxxxxx,
Srv_0B,,xxxxxxxxxxxxxx,

I need to consolidate the information pertaining to each server into a single, distinct record, producing output such as:

Srv_0A,xxxxxxxxxxxxx,xxxxxxxxxxxx,xxxxxxxxxxxxxxxxxx
Srv_0G,xxxxxxxxxxxxx,xxxxxxxxxxxx,
Srv_0B,xxxxxxxxxxxxx,xxxxxxxxxxxx,xxxxxxxxxxxxxxxxxx

Any entries that were not matched with anything should be stored elsewhere. I have spent a lot of time thinking about this, but haven't managed to come up with any ideas.

Your divine intervention is highly regarded....Thanks.

Replies are listed 'Best First'.
Re: Consolidating data from lots of different sources
by Zaxo (Archbishop) on Jun 08, 2005 at 21:57 UTC

    A hash of hashes will do that nicely,

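    my %hoh;
    my @labels = qw/SAN_Info Criticality_rating Commission_Date/;
    while (<>) {
        chomp;    # drop the newline so it cannot land in the last field
        my ($key, @dat) = split ',';
        # Fill each labelled slot only if it is still empty.
        $hoh{$key}{$labels[$_]} ||= $dat[$_] for 0..$#labels;
    }
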
    The ||= assignment prevents a filled slot from being overwritten by undef.
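
To make the `||=` behaviour concrete, here is a minimal sketch of the same merge wrapped in a subroutine. The name `merge_rows` and the sample values are hypothetical, invented only for illustration; the rows are passed in as strings rather than read from a file.

```perl
use strict;
use warnings;

# Hypothetical helper: merge comma-separated rows into a hash of hashes.
# ||= assigns only when the slot is still false (undef, '' or 0), so the
# first true value seen for a slot is the one that is kept.
sub merge_rows {
    my ($labels, @rows) = @_;
    my %hoh;
    for my $row (@rows) {
        # The -1 limit keeps trailing empty fields.
        my ($key, @dat) = split /,/, $row, -1;
        $hoh{$key}{ $labels->[$_] } ||= $dat[$_] for 0 .. $#$labels;
    }
    return \%hoh;
}

my @labels = qw/SAN_Info Criticality_rating Commission_Date/;
my $merged = merge_rows(
    \@labels,
    'Srv_0A,san1,,2005-01-01',    # made-up sample values
    'Srv_0A,,,',
    'Srv_0A,,high,',
);

# Print one consolidated line for Srv_0A.
print join(',', 'Srv_0A', map { $merged->{Srv_0A}{$_} // '' } @labels), "\n";
```

One caveat with this approach: because `||=` tests for truth rather than definedness, a legitimate field value of `0` would be treated as empty and could be overwritten by a later row.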

    After Compline,

Re: Consolidating data from lots of different sources
by ikegami (Pope) on Jun 08, 2005 at 21:54 UTC

    Your data contains

    Srv_0A,xxxxxxxxx,,xxxxxxxxxxx
    Srv_0A,,,
    Srv_0A,,,xxxxxxxxxxxxxx

    Should that be

    Srv_0A,xxxxxxxxx,,xxxxxxxxxxx
    Srv_0A,,,
    Srv_0A,,xxxxxxxxxxxxxx,

    If so,

    use strict;
    use warnings;

    my %data;

    while (<DATA>) {
        s/^\s+//;
        s/\s+$//;
        my @fields = split(/\s*,\s*/);
        my $server = shift(@fields);

        $data{$server} ||= ['', '', ''];

        # Merge data.
        foreach (0..$#fields) {
            $data{$server}[$_] = $fields[$_]
                if length $fields[$_];
        }
    }

    print(join(',', $_, @{$data{$_}}), "\n")
        foreach keys(%data);

    __DATA__
    Srv_0A,Axxxxxxxx,,Bxxxxxxxxxx
    Srv_0A,,,
    Srv_0G,,,
    Srv_0B,Cxxxxxxxxx,,Dxxxxxxxx
    Srv_0A,,Exxxxxxxxxxxxx,
    Srv_0G,Fxxxxxxxxxx,Gxxxxxxxxxxxxx,
    Srv_0B,,Hxxxxxxxxxxxxx,

    If not, how do you merge the two Commission Dates?
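
If the data really can contain two different values for the same field, one possible approach is to report the clash rather than silently overwrite. The following is only a sketch under that assumption: the subroutine name `merge_with_conflicts` is hypothetical, and keeping the first value seen is an arbitrary choice, not something the original question specifies.

```perl
use strict;
use warnings;

# Hypothetical helper: merge rows per server, and record any field that
# receives two different non-empty values instead of overwriting it.
sub merge_with_conflicts {
    my (@rows) = @_;
    my (%data, @conflicts);
    for my $row (@rows) {
        my ($server, @fields) = split /\s*,\s*/, $row, -1;
        $data{$server} ||= ['', '', ''];
        for my $i (0 .. $#fields) {
            next unless length($fields[$i]);
            my $old = $data{$server}[$i];
            if (defined($old) && length($old) && $old ne $fields[$i]) {
                push @conflicts,
                    sprintf("%s field %d: '%s' vs '%s'",
                            $server, $i, $old, $fields[$i]);
                next;    # keep the first value seen (an arbitrary choice)
            }
            $data{$server}[$i] = $fields[$i];
        }
    }
    return (\%data, \@conflicts);
}

my ($data, $conflicts) = merge_with_conflicts(
    'Srv_0A,Axxxxxxxx,,Bxxxxxxxxxx',
    'Srv_0A,,,Cxxxxxxxxxxxxx',
);
print "$_\n" for @$conflicts;
```

The list of conflicts could then be written to a separate file for manual review.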

Re: Consolidating data from lots of different sources
by Transient (Hermit) on Jun 08, 2005 at 21:47 UTC
    Is this a file or a database? What have you actually tried? split? push? Hashes of Arrays?
Re: Consolidating data from lots of different sources
by tlm (Prior) on Jun 08, 2005 at 21:52 UTC

    The following code assumes that the different data lines are consistent (i.e. for a given ID there is a unique non-empty value for each field):

    use strict;
    use warnings;

    my %data;

    while ( <DATA> ) {
        my ( $id, @fields ) = split /\s*,\s*/, $_, -1;
        for my $i ( 0..$#fields ) {
            $data{ $id }[ $i ] = $fields[ $i ] if length $fields[ $i ];
        }
    }

    __END__

    Update: I've attempted to fix one of the problems pointed out by ikegami below (specifically, I added the -1 argument to split). I still focus only on the problem of consolidating data from various lines, which I think is what the OP was having trouble with, and ignore the issue of output.

    the lowliest monk

      Your solution doesn't output anything.

      It leaves the newlines (and any other trailing whitespace) in the data.

      It also creates only two fields for Srv_0G, causing the trailing comma to be omitted from the output (which I suppose could be fixed at print-time).
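
The print-time fix mentioned above could look something like the sketch below, which pads each record to a fixed number of fields before joining. The name `format_record` and its `$width` parameter are hypothetical, introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: format one merged record with a fixed number of
# fields, so that missing or undefined fields still produce their commas.
sub format_record {
    my ($id, $fields, $width) = @_;
    my @padded = map { defined $fields->[$_] ? $fields->[$_] : '' }
                 0 .. $width - 1;
    return join(',', $id, @padded);
}

# Srv_0G has only two stored fields; the third is padded with ''.
print format_record('Srv_0G', ['Fxxxxxxxxxx', 'Gxxxxxxxxxxxxx'], 3), "\n";
```

With three columns expected, a record that stored only two fields is printed with its trailing comma intact.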
