Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Merge two files with similar column entries

by pratapsingh
on Apr 08, 2012 at 20:56 UTC ( #964028=perlquestion: print w/ replies, xml ) Need Help??
pratapsingh has asked for the wisdom of the Perl Monks concerning the following question:

Hi , I have few files which contains user name and data transfer rate in MBs and this data is collected for year and for each month report is saved in 12 different files I have to merge all the files to prepare the final report Files are as below
Filename1 : January #User Name #Data Transferred A. Paul 300004 Jason 600000 Mayur Pandey 40000 Kelly H 459000 Ryan M 349000
Filename2 : March #User Name #Data Transferred Senthl V R 600000 Mayur Pandey 40000 Kelly H 459000 Pratap S 349000 A. Paul 300004
Similarly I have 10 more files I have to merge all these files to one file and final output should be like below one:
Final Report: #Username #January .... #March ......... #December +#Total A. Paul 300004 300004 Not avail +able 600008 Pratap S Not availanble 30000 32000 + 7899887 Kelly H 459000 459000 459000 + 3424448274 Mayur Pandey 40000 400000 400000 + 242424442 Senthl V R Not available 600000 34544 + 53546464 Jason 600000 Not available 3434343 + 43434355
I need a perl script to automate this rather using excel or doing manually . These reports are generated monthly and names are not stored in sorted order. Names are stored in recent user who has used data transfer facility. And also there are occasions when a user have not used data transfer at all for particular month in this case name of the user will not appear in monthly report file. In this case I have to add not available in the month name column. Thank you

Comment on Merge two files with similar column entries
Select or Download Code
Re: Merge two files with similar column entries
by CountZero (Bishop) on Apr 08, 2012 at 21:39 UTC
    I need a perl script to automate this rather using excel or doing manually
    Perlmonks is not a "write my code for me" service.

    What have you tried and where did you have problems?

    CountZero

    A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: Merge two files with similar column entries
by Anonymous Monk on Apr 09, 2012 at 00:36 UTC
Re: Merge two files with similar column entries
by Marshall (Prior) on Apr 09, 2012 at 21:17 UTC
    I'll attempt to give some big hints..
    I would suggest a HashOfArray (HoA) for the data structure. Each name is a hash key that points to an array of "monthly data"
    If a user name doesn't appear in the %Spreadsheet hash table, then create a new "blank" entry with 12 zeroes for the months - I just used 4 months to demo the technique.

    Your file format is space separated. I didn't use the most efficient technique, but it is a tool for your toolbox and it is straight-forward.

    I used Perl "here-docs" to represent the 2 files. That changes the code a bit.
    have fun and good luck!

    #!/usr/bin/perl -w use strict; use Data::Dump qw(pp); my %Month2Index = (January => 0, Febuary =>1, March =>2); my $january =<<END; #filename January A. Paul 300004 Jason 600000 Mayur Pandey 40000 Kelly H 459000 Ryan M 349000 END my $march =<<END; #filename March Senthl V R 600000 Mayur Pandey 40000 Kelly H 459000 Pratap S 349000 A. Paul 300004 END my %spreadsheet; # real code would have a file name which # states the month - I embedded the file name # into the heredoc variable for this example # a Perl variable can be opened just like a file # for reading (or even perhaps writing) # foreach my $fileref (\$january, \$march) { open my $file, '<', $fileref or die "$!"; my $comment = <$file>; #first line - throw away in real thing my $month_name = (split ' ',$comment)[-1]; my $month_index = $Month2Index{$month_name}; process_monthly_file ($file, $month_index); } sub process_monthly_file { my ($file, $month_index) = @_; while (<$file>) { s/^\s*//; #remove leading spaces # this is a space separated format, but we # want the last column and all of the columns # before that should be "squished into one column" # one way is to reverse the line, limit the # split and then reverse again. my $reversed = reverse $_; my ($data, $name) = split (' ', $reversed,2); # note: the split does an implicit "chomp" $data = reverse $data; $name = reverse $name; # if this name not seen before, create a new # hash entry with a blank array, here just 4 # columns (jan,feb,mar,april) $spreadsheet{$name} ||= [0,0,0,0]; #or perhaps... $spreadsheet{$name} ||= [qw(NA NA NA NA)]; #now enter the data into correct column # @{$spreadsheet{$name}}[$month_index] = $data; } } print pp \%spreadsheet; __END__ prints: { "A. Paul" => [300004, 0, 300004, 0], Jason => [600000, 0, 0, 0], "Kelly H" => [459000, 0, 459000, 0], "Mayur Pandey" => [40000, 0, 40000, 0], "Pratap S" => [0, 0, 349000, 0], "Ryan M" => [349000, 0, 0, 0], "Senthl V R" => [0, 0, 600000, 0], }

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://964028]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (12)
As of 2014-09-01 08:15 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (299 votes), past polls