Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

How can one merge multiple excel files sheet-wise?

by supriyoch_2008 (Monk)
on Mar 29, 2020 at 13:45 UTC ( #11114773=perlquestion: print w/replies, xml ) Need Help??

supriyoch_2008 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Perlmonks,

I am interested in merging multiple (3) excel files sheet-wise i.e. cereal.xls, pulse.xls and fruit.xls,each having two sheets i.e. sheet1 and sheet2, into a single outputfile like result.xls. The sheet1 of result file must contain the contents of sheet1 of 3 input files in sequential rows. Similarly, the sheet2 of result file must contain the contents of sheet2 of 3 input files. While searching the web I have come across a program, given below, to merge excel files but the result file obtained from this program contains the values of sheet1 of 3 input files as sheet1, sheet2 and sheet3; rather the contents of sheet1, sheet2 and sheet3 should be in sheet1 of result.xls. I am using a Windows 7 64-bit pc and comfortable with .xls and not .xlsx. I have tried to get the desired output but failed. I request Perlmonks to suggest me how to sort out this problem.

#!/usr/bin/perl use warnings; use strict; use Spreadsheet::ParseExcel; use Spreadsheet::WriteExcel; my $parser = Spreadsheet::ParseExcel->new(); #Creating Object to Read +Worksheet my @files = qw/ cereal.xls pulse.xls fruit.xls /; my $write_workbook = new Spreadsheet::WriteExcel('Result.xls'); my $n = 1; foreach my $file (@files) { my $parse_workbook = $parser->parse($file); my $write_worksheet = $write_workbook->add_worksheet("Sheet$n"); if ( !defined $parse_workbook) { die $parser->error(), ".\n"; } for my $parse_worksheet ( $parse_workbook->worksheets() ) { my ( $row_min, $row_max ) = $parse_worksheet->row_range(); my ( $col_min, $col_max ) = $parse_worksheet->col_range(); for my $row ( $row_min .. $row_max ) { for my $col ( $col_min .. $col_max ) { my $cell = $parse_worksheet->get_cell( $row, $col ); next unless $cell; my $cell_value = $cell->value(); my $cell_format = $cell->get_format(); $write_worksheet->write($row, $col, $cell_value); } } } $n++; } print "\n\n ####################################### Result produced\n #######################################\n"; exit;
The contents of cereal.xls are as follows: Sheet1 No. Name Sugar (g/100g) 1 Wheat 0.3 2 Maize 7.7 3 Rice 0.1 Sheet2 No. Name Protein (g/100g) 1 Wheat 10 2 Maize 7 3 Rice 2.6
The contents of pulse.xls are as follows: Sheet1 No. Name Sugar (g/100g) 1 Pea 6 2 Chickpea 11 3 Pigeonpea 1.8 Sheet2 No. Name Protein (g/100g) 1 Pea 5 2 Chickpea 8.4 3 Pigeonpea 7.6
The contents of fruit.xls are as follows: Sheet1 No. Name Sugar (g/100g) 1 Apple 10 2 Pear 10 3 Pineapple 10 Sheet2 No. Name Protein (g/100g) 1 Apple 0.3 2 Pear 0.4 3 Pineapple 0.5
The desired result.xls file should look like: Sheet1 No. Name Sugar (g/100g) 1 Wheat 0.3 2 Maize 7.7 3 Rice 0.1 1 Pea 6 2 Chickpea 11 3 Pigeonpea 1.8 1 Apple 10 2 Pear 10 3 Pineapple 10
Sheet2 No. Name Protein (g/100g) 1 Wheat 10 2 Maize 7 3 Rice 2.6 1 Pea 5 2 Chickpea 8.4 3 Pigeonpea 7.6 1 Apple 0.3 2 Pear 0.4 3 Pineapple 0.5

Replies are listed 'Best First'.
Re: How can one merge multiple excel files sheet-wise?
by roboticus (Chancellor) on Mar 29, 2020 at 15:56 UTC

    supriyoch_2008:

    That should be easy enough. One way to do it would be to read all the data from your worksheets into a hash structure. Then after reading all your data, loop over your hash structure and build the new workbook.

    Alternatively, you could build a hash to hold your output sheets and next available row number. Then, as you process each input sheet, access that hash to fetch the output sheet and next available row to use. It'll be essentially the same a the code you have, but with some logic to create and/or fetch the sheet, something like:

    if (! exists $OutSheets{$SheetName}) { # We don't have that sheet yet, create a new one my $Sheet = $OutWB->add_worksheet($SheetName); $OutSheetCache{$SheetName}{Sheet} = $Sheet; $OutSheetCache{$SheetName}{NextRow} = 0; } # Fetch the sheet and next available row from last time my $OutSheet = $OutSheetCache{$SheetName}{Sheet}; my $OutRow = $OutSheetCache{$SheetName}{NextRow}; . . . copy sheet data ... # Save the next available row for next time... $OutSheetCache{$SheetName}{NextRow}= $OutRow;

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Hi roboticus,

      Thank you for your suggestions. I shall learn how to use hash and try to sort out the problem.

      With regards,

        Hi Perlmonks

        This has reference to my earlier question on merging of multiple excel files sheet-wise. Since the use of hash is difficult for me, I have written a script wx.pl to produce the result.xls file. The script runs well in cmd and shows all cell values for sheet 1 from three input excel files. But the output file i.e. result.xls contains only the cell values of sheet 1 from cereal.xls file and not others. I need your suggestions and guidance to solve this problem so that result.xls contains the cell values of all three files sheet-wise.

        I have given the code of wx.pl in the following:

        # Name: wx.pl #!/usr/bin/perl use warnings; use strict; use OLE::Storage_Lite; use Spreadsheet::ParseExcel; use Spreadsheet::WriteExcel; use Digest::MD5; # Create a new Excel workbook my $workbookx = Spreadsheet::WriteExcel->new('result.xls'); # Add a worksheet my $worksheet1x = $workbookx->add_worksheet(); ############################################# my @arr=qw/cereal.xls pulse.xls fruit.xls/; my $row_min=0; my $row_max=11000; # out of 65536 rows in xls my $col_min=0; my $col_max=255; # out of 256 columns in xls print "\n row_min: $row_min; row_max: $row_max col_min: $col_min; col_max: $col_max\n"; ######################################### foreach my $item (@arr) { # foreach LOOP starts my $parser = Spreadsheet::ParseExcel->new(); my $workbook = $parser->parse($item); if ( !defined $workbook ) { die $parser->error(), ".\n"; } my $worksheet1=$workbook->worksheet('Sheet1'); # say from sheet +1 for my $row ( $row_min .. $row_max ) { # for LOOP2 starts for my $col ( $col_min .. $col_max ) { # for LOOP3 starts my $cell = $worksheet1->get_cell( $row, $col ); next unless $cell; print "\n"; print "### $item\n"; print " Row, Column = ($row, $col)\n"; my $value=$cell->value(); print " value: $value\n"; $worksheet1x->write($row,$col,$value); } # for LOOP3 ends } # for LOOP2 ends } # foreach LOOP ends ################### print "\n ###### Over ##########\n"; exit;

        Here goes the result in cmd for wx.pl:

        row_min: 0; row_max: 11000 col_min: 0; col_max: 255 ### cereal.xls Row, Column = (0, 0) value: No. ### cereal.xls Row, Column = (0, 1) value: Name ### cereal.xls Row, Column = (0, 2) value: Sugar (g /100g) ### cereal.xls Row, Column = (1, 0) value: 1 ### cereal.xls Row, Column = (1, 1) value: Wheat ### cereal.xls Row, Column = (1, 2) value: 0.3 ### cereal.xls Row, Column = (2, 0) value: 2 ### cereal.xls Row, Column = (2, 1) value: Maize ### cereal.xls Row, Column = (2, 2) value: 7.7 ### cereal.xls Row, Column = (3, 0) value: 3 ### cereal.xls Row, Column = (3, 1) value: Rice ### cereal.xls Row, Column = (3, 2) value: 0.1 ### pulse.xls Row, Column = (0, 0) value: No. ### pulse.xls Row, Column = (0, 1) value: Food ### pulse.xls Row, Column = (0, 2) value: Sugar (g /100g) ### pulse.xls Row, Column = (1, 0) value: 1 ### pulse.xls Row, Column = (1, 1) value: Pea ### pulse.xls Row, Column = (1, 2) value: 6 ### pulse.xls Row, Column = (2, 0) value: 2 ### pulse.xls Row, Column = (2, 1) value: Chickpea ### pulse.xls Row, Column = (2, 2) value: 11 ### pulse.xls Row, Column = (3, 0) value: 3 ### pulse.xls Row, Column = (3, 1) value: Pigeonpea ### pulse.xls Row, Column = (3, 2) value: 1.8 ### fruit.xls Row, Column = (0, 0) value: No. ### fruit.xls Row, Column = (0, 1) value: Name ### fruit.xls Row, Column = (0, 2) value: Sugar (g /100g) ### fruit.xls Row, Column = (1, 0) value: 1 ### fruit.xls Row, Column = (1, 1) value: Apple ### fruit.xls Row, Column = (1, 2) value: 10.12 ### fruit.xls Row, Column = (2, 0) value: 2 ### fruit.xls Row, Column = (2, 1) value: Pear ### fruit.xls Row, Column = (2, 2) value: 10 ### fruit.xls Row, Column = (3, 0) value: 3 ### fruit.xls Row, Column = (3, 1) value: Pineapple ### fruit.xls Row, Column = (3, 2) value: 10.0009 ###### Over ##########

        Hi roboticus,

        This has reference to my earlier post. I have not been successful in bringing several excel files in a single excel file sheet-wise. I need your help to solve this problem.

        supriyoch_2008

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://11114773]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (6)
As of 2020-07-10 09:54 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?