Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Re: parse only one sheet at time In Spreadsheet::ParseExcel

by Neighbour (Friar)
on Jan 31, 2013 at 07:29 UTC ( #1016240=note: print w/ replies, xml ) Need Help??

Comment on Re: parse only one sheet at time In Spreadsheet::ParseExcel
Re^2: parse only one sheet at time In Spreadsheet::ParseExcel
by Rahul6990 (Beadle) on Jan 31, 2013 at 07:36 UTC

    Use Spreadsheet::ParseExcel::Stream which does not parse entire document to the memory and creates no memory overhead.

    For more information check the below link:

    https://metacpan.org/module/Spreadsheet::ParseExcel::Stream

    Happy Programming.

      Thing is, it *does* parse the entire document to memory and yet it doesn't :). It loads the binary OLE-object and parses it. This creates a bit of memory overhead. However, this is (much) less memory than Spreadsheet::ParseExcel uses.
      The difference is in the fact that it doesn't *keep* your entire document in memory. As soon as you've read a row or sheet, it is removed from memory. There's also another bunch of things that it doesn't do with data you haven't read yet from the stream, but I don't know the details of exactly what all that is.
      Bottom line: Spreadsheet::ParseExcel::Stream is not perfect, but it's a whole lot better concerning memory usage compared to Spreadsheet::ParseExcel.

        A memory benchmarking of the two modules supports better memory usage by Spreadsheet::ParseExcel::Stream for the task below on a 1.9M SS, 20 sheets, each having 500 x 26 cells filled:

        use strict; use warnings; use Memchmark qw(cmpthese); use Spreadsheet::ParseExcel; use Spreadsheet::ParseExcel::Stream; my $xls_file = 'Book1.xls'; cmpthese( Spreadsheet_ParseExcel_Stream => sub { my $xls = Spreadsheet::ParseExcel::Stream->new($xls_file); while ( my $sheet = $xls->sheet() ) { my $cellA1 = $sheet->row->[0]; } }, Spreadsheet_ParseExcel => sub { my $parser = Spreadsheet::ParseExcel->new(); my $workbook = $parser->parse($xls_file); for my $worksheet ( $workbook->worksheets() ) { my $cellA1 = $worksheet->get_cell( 0, 0 )->value; } } );

        Results:

        test: Spreadsheet_ParseExcel, memory used: 199147520 bytes test: Spreadsheet_ParseExcel_Stream, memory used: 17633280 bytes

        As a side note, Spreadsheet::ParseExcel::Stream is a front end for Spreadsheet::ParseExcel, and its author asserts that its memory management is optimized.

      Please see the memory benchmarking results comparing Spreadsheet::ParseExcel::Stream against Spreadsheet::ParseExcel.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1016240]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (4)
As of 2014-07-25 00:24 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (167 votes), past polls