Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

parse only one sheet at time In Spreadsheet::ParseExcel

by cibien (Novice)
on Jan 30, 2013 at 22:38 UTC ( [id://1016168]=perlquestion: print w/replies, xml ) Need Help??

cibien has asked for the wisdom of the Perl Monks concerning the following question:

Goodmorning Monks, In Spreadsheet::ParseExcel, how I can parse one big excel file(with a lot of sheet) and parse only one sheet at time, for reducing the memory usage and speed up parsing?
#!/usr/bin/perl -w use strict; use Spreadsheet::ParseExcel; my $parser = Spreadsheet::ParseExcel->new( CellHandler => \&cell_handler, NotSetCell => 1 ); my $workbook = $parser->parse('file.xls'); sub cell_handler { my $workbook = $_[0]; my $sheet_index = $_[1]; my $row = $_[2]; my $col = $_[3]; my $cell = $_[4]; #### thise 'if' parse only the first sheet if ( $sheet_index > 1) { $workbook->ParseAbort(1); return; } } ### I think must add up something like this for(my $iSheet=0; $iSheet < $workbook->{SheetCount} ; $iSheet++) { my $oWkS = $workbook->{Worksheet}[$iSheet];
can you help me? thanks

Replies are listed 'Best First'.
Re: parse only one sheet at time In Spreadsheet::ParseExcel
by Kenosis (Priest) on Jan 30, 2013 at 23:39 UTC

    Here are two examples, adapted from the Spreadsheet::ParseExcel documentation, showing how to iterate through all sheets and to access just one sheet:

    use strict; use warnings; use Spreadsheet::ParseExcel; my $parser = Spreadsheet::ParseExcel->new(); my $workbook = $parser->parse('Book1.xls'); if ( !defined $workbook ) { die $parser->error(), ".\n"; } # Iterate through all worksheets, show value at A1 for my $worksheet ( $workbook->worksheets() ) { next unless my $cell = $worksheet->get_cell( 0, 0 ); # row, col print 'Worksheet: ' . $worksheet->get_name() . '; A1: ' . $cell->value() . "\n"; } print "\n"; # Get only one sheet, show value at A1 if ( my $worksheet = $workbook->worksheet('Sheet5') ) { # Can be in +dex 0-4, in this example if ( my $cell = $worksheet->get_cell( 0, 0 ) ) { # row, col print 'Worksheet: ' . $worksheet->get_name() . '; A1: ' . $cell->value() . "\n"; } }

    Output:

    Worksheet: Sheet1; A1: 42 Worksheet: Sheet2; A1: 21 Worksheet: Sheet3; A1: 17 Worksheet: Sheet4; A1: 58 Worksheet: Sheet5; A1: 100 Worksheet: Sheet5; A1: 100

    Hope this helps!

Re: parse only one sheet at time In Spreadsheet::ParseExcel
by Neighbour (Friar) on Jan 31, 2013 at 07:29 UTC

      Use Spreadsheet::ParseExcel::Stream which does not parse entire document to the memory and creates no memory overhead.

      For more information check the below link:

      https://metacpan.org/module/Spreadsheet::ParseExcel::Stream

      Happy Programming.

        Thing is, it *does* parse the entire document to memory and yet it doesn't :). It loads the binary OLE-object and parses it. This creates a bit of memory overhead. However, this is (much) less memory than Spreadsheet::ParseExcel uses.
        The difference is in the fact that it doesn't *keep* your entire document in memory. As soon as you've read a row or sheet, it is removed from memory. There's also another bunch of things that it doesn't do with data you haven't read yet from the stream, but I don't know the details of exactly what all that is.
        Bottom line: Spreadsheet::ParseExcel::Stream is not perfect, but it's a whole lot better concerning memory usage compared to Spreadsheet::ParseExcel.

        Please see the memory benchmarking results comparing Spreadsheet::ParseExcel::Stream against Spreadsheet::ParseExcel.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1016168]
Approved by kcott
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others scrutinizing the Monastery: (6)
As of 2024-04-19 12:29 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found