PerlMonks  

Re^3: parse only one sheet at time In Spreadsheet::ParseExcel

by Neighbour (Friar)
on Jan 31, 2013 at 08:50 UTC (#1016257)


in reply to Re^2: parse only one sheet at time In Spreadsheet::ParseExcel
in thread parse only one sheet at time In Spreadsheet::ParseExcel

Thing is, Spreadsheet::ParseExcel::Stream *does* parse the entire document into memory, and yet it doesn't :). It loads the binary OLE object and parses it, which creates a bit of memory overhead. However, this is (much) less memory than Spreadsheet::ParseExcel uses.
The difference is that it doesn't *keep* your entire document in memory. As soon as you've read a row or sheet, it is removed from memory. There are also a bunch of other things it doesn't do with data you haven't read from the stream yet, but I don't know the details of exactly what those are.
Bottom line: Spreadsheet::ParseExcel::Stream is not perfect, but its memory usage is a whole lot better than that of Spreadsheet::ParseExcel.
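
To illustrate the read-and-discard behaviour described above, here is a minimal sketch of handling just one sheet, row by row. The helper name process_sheet, the callback argument, and the sheet name 'Sheet2' are made up for illustration. It also assumes that sheet() can move on to the next sheet before all rows of the current one have been read, which I have not verified for a sheet that was skipped entirely.

```perl
use strict;
use warnings;
use Spreadsheet::ParseExcel::Stream;

# Hypothetical helper: call $on_row for every row of the sheet named $wanted
# and return the number of rows seen (undef if no such sheet exists).
sub process_sheet {
    my ( $xls_file, $wanted, $on_row ) = @_;

    my $xls = Spreadsheet::ParseExcel::Stream->new($xls_file);

    while ( my $sheet = $xls->sheet() ) {

        # Assumption: skipping a sheet without reading its rows is allowed.
        next unless $sheet->name() eq $wanted;

        my $count = 0;

        # Each call to row() returns an array ref for one row; nothing
        # here holds on to earlier rows, so they can be freed.
        while ( my $row = $sheet->row() ) {
            $on_row->($row);
            $count++;
        }
        return $count;
    }
    return;
}

# Example call: print the rows of 'Sheet2' as tab-separated text.
if ( @ARGV ) {
    my $rows = process_sheet(
        $ARGV[0], 'Sheet2',
        sub { print join( "\t", map { defined $_ ? $_ : '' } @{ $_[0] } ), "\n" },
    );
    print defined $rows ? "$rows rows read\n" : "sheet not found\n";
}
```

Run it as "perl script.pl Book1.xls". Because the callback sees one row at a time, the script itself never builds up a copy of the whole sheet.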


Re^4: parse only one sheet at time In Spreadsheet::ParseExcel
by Kenosis (Priest) on Feb 01, 2013 at 00:57 UTC

    A memory benchmark of the two modules supports the claim that Spreadsheet::ParseExcel::Stream uses less memory, at least for the task below, run on a 1.9M spreadsheet with 20 sheets, each having 500 x 26 cells filled:

        use strict;
        use warnings;
        use Memchmark qw(cmpthese);
        use Spreadsheet::ParseExcel;
        use Spreadsheet::ParseExcel::Stream;

        my $xls_file = 'Book1.xls';

        cmpthese(
            Spreadsheet_ParseExcel_Stream => sub {
                my $xls = Spreadsheet::ParseExcel::Stream->new($xls_file);
                while ( my $sheet = $xls->sheet() ) {
                    my $cellA1 = $sheet->row->[0];
                }
            },
            Spreadsheet_ParseExcel => sub {
                my $parser   = Spreadsheet::ParseExcel->new();
                my $workbook = $parser->parse($xls_file);
                for my $worksheet ( $workbook->worksheets() ) {
                    my $cellA1 = $worksheet->get_cell( 0, 0 )->value;
                }
            }
        );

    Results:

        test: Spreadsheet_ParseExcel, memory used: 199147520 bytes
        test: Spreadsheet_ParseExcel_Stream, memory used: 17633280 bytes

    That is roughly 199 MB against 18 MB, about an 11-fold difference.

    As a side note, Spreadsheet::ParseExcel::Stream is a front end for Spreadsheet::ParseExcel, and its author asserts that its memory management is optimized.
