Beefy Boxes and Bandwidth Generously Provided by pair Networks httptech
"be consistent"
 
PerlMonks  

Re^3: parse only one sheet at time In Spreadsheet::ParseExcel

by Neighbour (Friar)
on Jan 31, 2013 at 08:50 UTC ( #1016257=note: print w/ replies, xml ) Need Help??


in reply to Re^2: parse only one sheet at time In Spreadsheet::ParseExcel
in thread parse only one sheet at time In Spreadsheet::ParseExcel

Thing is, it *does* parse the entire document to memory and yet it doesn't :). It loads the binary OLE-object and parses it. This creates a bit of memory overhead. However, this is (much) less memory than Spreadsheet::ParseExcel uses.
The difference is in the fact that it doesn't *keep* your entire document in memory. As soon as you've read a row or sheet, it is removed from memory. There's also another bunch of things that it doesn't do with data you haven't read yet from the stream, but I don't know the details of exactly what all that is.
Bottom line: Spreadsheet::ParseExcel::Stream is not perfect, but it's a whole lot better concerning memory usage compared to Spreadsheet::ParseExcel.


Comment on Re^3: parse only one sheet at time In Spreadsheet::ParseExcel
Re^4: parse only one sheet at time In Spreadsheet::ParseExcel
by Kenosis (Priest) on Feb 01, 2013 at 00:57 UTC

    A memory benchmarking of the two modules supports better memory usage by Spreadsheet::ParseExcel::Stream for the task below on a 1.9M SS, 20 sheets, each having 500 x 26 cells filled:

    use strict; use warnings; use Memchmark qw(cmpthese); use Spreadsheet::ParseExcel; use Spreadsheet::ParseExcel::Stream; my $xls_file = 'Book1.xls'; cmpthese( Spreadsheet_ParseExcel_Stream => sub { my $xls = Spreadsheet::ParseExcel::Stream->new($xls_file); while ( my $sheet = $xls->sheet() ) { my $cellA1 = $sheet->row->[0]; } }, Spreadsheet_ParseExcel => sub { my $parser = Spreadsheet::ParseExcel->new(); my $workbook = $parser->parse($xls_file); for my $worksheet ( $workbook->worksheets() ) { my $cellA1 = $worksheet->get_cell( 0, 0 )->value; } } );

    Results:

    test: Spreadsheet_ParseExcel, memory used: 199147520 bytes test: Spreadsheet_ParseExcel_Stream, memory used: 17633280 bytes

    As a side note, Spreadsheet::ParseExcel::Stream is a front end for Spreadsheet::ParseExcel, and its author asserts that its memory management is optimized.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1016257]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (6)
As of 2014-04-17 03:59 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (439 votes), past polls