
Re: Joining separate data files to make one.

by BrowserUk (Pope)
on Oct 06, 2010 at 11:54 UTC (#863763)


in reply to Joining separate data files to make one.

From your meagre description of the files, I assume that each file's data is keyed by date & time?

If so, then rather than loading all the data into arrays, accumulate it in a hash:

    my %data;

    open FILE, '<', 'gravity' or die "gravity: $!";
    while( <FILE> ) {
        my @fields = split ' ', $_;
        ## Key on "date time"; quoted so it matches the keys used below
        $data{ "@fields[ 0, 1 ]" } = join "\t", @fields;
    }
    close FILE;

    ## ASSUMED placeholder (the '???' of the original post): how many
    ## fields are added by the magnetics? Set this to suit your file.
    my $nMag = 3;

    open FILE, '<', 'magnetics' or die "magnetics: $!";
    while( <FILE> ) {
        my @fields = split ' ', $_;
        ## Pad the hash if we didn't see this date/time in the gravity file
        $data{ "@fields[ 0, 1 ]" } //= join "\t", @fields[ 0,1 ], ('n/a') x 3;
        $data{ "@fields[ 0, 1 ]" } .= "\t" . join "\t", @fields[ 2 .. $#fields ];
    }
    close FILE;

    open FILE, '<', 'bathymetry' or die "bathymetry: $!";
    while( <FILE> ) {
        my @fields = split ' ', $_;
        ## Pad the hash if we didn't see this date/time before
        $data{ "@fields[ 0, 1 ]" } //= join "\t", @fields[ 0,1 ], ('n/a') x ( 3 + $nMag );
        $data{ "@fields[ 0, 1 ]" } .= "\t" . join "\t", @fields[ 2 .. $#fields ];
    }
    close FILE;

    for my $key ( sort keys %data ) {
        print $data{ $key }, "\n";
    }

Depending upon your date & time formats, you might need a more sophisticated sort.
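
As a rough illustration of what a "more sophisticated sort" might look like: the plain `sort keys %data` compares the keys as strings, which only gives chronological order if the date is written year-first. The sketch below assumes, purely for illustration, keys of the form "DD/MM/YYYY HH:MM:SS" (the thread never states the real format), and the helper names `sortable_key` and `chronological` are made up for this example.

```perl
use strict;
use warnings;

# ASSUMPTION: keys look like "DD/MM/YYYY HH:MM:SS" with zero-padded parts.
# The real file format is not given in the thread; adjust the regex to suit.
sub sortable_key {
    my ($key) = @_;
    my ( $d, $m, $y, $time ) = $key =~ m{^(\d\d)/(\d\d)/(\d{4}) (\S+)$}
        or die "Unrecognised date/time key: '$key'";
    return "$y$m$d $time";    # year-first string sorts chronologically
}

# Schwartzian transform: build each sortable form once, sort on it,
# then hand back the original keys in the new order.
sub chronological {
    my @keys = @_;
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ sortable_key($_), $_ ] } @keys;
}

my @sorted = chronological(
    '01/02/2010 00:00:00',
    '15/01/2010 12:30:00',
    '15/01/2010 09:00:00',
);
print "$_\n" for @sorted;
```

In the output loop you would then write `for my $key ( chronological( keys %data ) )` in place of `sort keys %data`.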


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^2: Joining separate data files to make one.
by msexton (Initiate) on Oct 07, 2010 at 09:23 UTC

    Hi,

    Thanks for this, I learned a lot.

    Actually it took me nearly all day to figure it out. There are two typos ( [ should be { ). A reply after yours constructed some sample data files and I used them as input to your script. The only drawback with your script is that if one of the files ends before a later file, the "n/a" is not appended to the hash for the file before. Your script handles the situation where files start after the one before.

    I at least "understand" your script, but I am having trouble with the other one. From your script I can see how to add varying fields from each of the files (i.e. 4 from gravity and magnetics, 3 from bathymetry). With the other script, I can't see how to vary the number of fields to read.

      There are two typos ( [ should be { ).

      Sorry. It was typed directly into the edit box and so was never tested. I apologise for that. I wanted to describe a viable alternative approach to the problem--and I find describing with code far more efficient and clear than using words. I was aware that it wasn't a complete working solution as posted.

      The only drawback with your script is that if one of the files ends before a later file, the "n/a" is not appended to the hash for the file before.

      I would handle that in the output loop. If, when you come to write a record, it is "too short", pad it with the appropriate number of 'n/a's. Of course, as coded with concatenating strings, determining how much to add is a pain.

      You could split on "\t" to get the field count, then pad and rejoin, but that would be a bit silly. Better to build up the records as (a hash of) arrays, pushing the fields as you go, and then just join them at the end, after padding if necessary.

      Something like:

          my %data;

          ## ASSUMED placeholders (the '???'s of the original post);
          ## set both to suit your files.
          my $nMag   = 3;     ## No of fields added by the magnetics
          my $nTotal = 11;    ## total number of fields in a full record

          open FILE, '<', 'gravity' or die "gravity: $!";
          while( <FILE> ) {
              my @fields = split ' ', $_;
              ## Quoted so the key matches the "date time" keys used below
              $data{ "@fields[ 0, 1 ]" } = \@fields;
          }
          close FILE;

          open FILE, '<', 'magnetics' or die "magnetics: $!";
          while( <FILE> ) {
              my @fields = split ' ', $_;
              ## Pad the hash if we didn't see this date/time in the gravity file
              $data{ "@fields[ 0, 1 ]" } //= [ @fields[ 0,1 ], ('n/a') x 3 ];
              push @{ $data{ "@fields[ 0, 1 ]" } }, @fields[ 2 .. $#fields ];
          }
          close FILE;

          open FILE, '<', 'bathymetry' or die "bathymetry: $!";
          while( <FILE> ) {
              my @fields = split ' ', $_;
              ## Pad the hash if we've never seen it before
              $data{ "@fields[ 0, 1 ]" } //= [ @fields[ 0,1 ], ('n/a') x ( 3 + $nMag ) ];
              ## We saw it in gravity, but not magnetics
              ## (2 key fields + 3 gravity fields + the magnetics fields).
              push @{ $data{ "@fields[ 0, 1 ]" } }, ('n/a') x $nMag
                  if @{ $data{ "@fields[ 0, 1 ]" } } < 2 + 3 + $nMag;
              push @{ $data{ "@fields[ 0, 1 ]" } }, @fields[ 2 .. $#fields ];
          }
          close FILE;

          for my $key ( sort keys %data ) {
              my $nFields = @{ $data{ $key } };
              ## Pad short records up to the total number of fields
              push @{ $data{ $key } }, ('n/a') x ( $nTotal - $nFields );
              print join( "\t", @{ $data{ $key } } ), "\n";
          }


        Hi, once again

        I love your work

        I spent a fair part of the day on this and got it to work just as I wanted. Your contribution was great. The biggest problem I had was extracting the elements from the final hash to output them to the final desired file.

        I think I proved that an infinite number of monkeys typing on an infinite number of typewriters will eventually end up typing the Bible.

        The thing that amazes me is that I could never come up with the solution you did, even reading all the texts and going to courses. Every time I have read about hashes or attended a course, we end up using Barney Rubble, Fred Flintstone, etc as examples and then extracting their surname or a Bedrock phone number.

        One last question. I can see what the construct //= does, but I cannot find a reference for it. Presumably the // is a match against null? Is that correct? I asked a couple of people at work who count themselves as good Perl programmers and even they said that they had seen nothing like it.

        Many thanks for your help. You should see the dog's breakfast that it replaced.

        Mike
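
On the `//=` question: it is not a pattern match. `//` is Perl's defined-or operator, documented in perlop and available from Perl 5.10, and `$x //= $y` assigns `$y` to `$x` only when `$x` is undefined. That is why the scripts above use it to create the padded record only for a date/time that has not been seen yet. A small sketch of how it differs from the older `||=` (the hash names are made up for this example):

```perl
use strict;
use warnings;
use 5.010;    # // and //= need Perl 5.10 or later; this also enables say

my %count = ( seen => 0 );

# //= assigns only when the left-hand side is undefined
$count{seen}    //= 99;    # stays 0: defined, even though false
$count{missing} //= 99;    # becomes 99: the key had no value yet

# ||= assigns whenever the left-hand side is false, so it clobbers the 0
my %count_or = ( seen => 0 );
$count_or{seen} ||= 99;    # becomes 99

say "seen=$count{seen} missing=$count{missing} or_seen=$count_or{seen}";
```

The same distinction matters for the data files: a record whose value happens to be false (such as an empty string or 0) would be overwritten by `||=` but is left alone by `//=`.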
