PerlMonks
Spreadsheet::XLSX memory and speed
by runrig (Abbot) on Jun 08, 2012 at 19:24 UTC ( [id://975226]=perlquestion )
runrig has asked for the wisdom of the Perl Monks concerning the following question:
I've been trying to see whether Spreadsheet::XLSX can be made to consume less memory. I've implemented the CellHandler and NotSetCell attributes from Spreadsheet::ParseExcel, which helps somewhat, but part of the problem is that each file in the zip archive is extracted into an in-memory variable (an xlsx file is a zip archive of many xml files).
E.g., one of the worksheet xml files is about 8MB, and the code parses it like so:
$member_sheet is an Archive::Zip object for the worksheet xml file, and contents() returns the entire contents of the file. The entire file is then parsed into a list of tags and text that the foreach loop processes. To save memory, I first tried processing the contents with a while loop like so:
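The difference between the two loop forms can be sketched as follows. This is a minimal illustration, not the code from Spreadsheet::XLSX: the sample `$xml` string and the `$token_re` pattern are assumptions made up for the example, standing in for the worksheet contents and the module's real tokenizing regex.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the worksheet XML; in the real code the string
# would come from $member_sheet->contents().
my $xml = '<row><c r="A1"><v>1</v></c><c r="B1"><v>2</v></c></row>';

# Hypothetical tokenizer: either a tag, or a run of text up to the next tag.
my $token_re = qr/(<[^>]*>|[^<]+)/;

# List context: the match returns every token at once, so the whole list
# exists in memory before the loop body runs for the first time.
my @all_at_once;
foreach my $token ($xml =~ /$token_re/g) {
    push @all_at_once, $token;
}

# Scalar context: each evaluation of the //g match resumes at pos($xml)
# and yields a single token in $1.
my @one_at_a_time;
while ($xml =~ /$token_re/g) {
    push @one_at_a_time, $1;
}

print scalar(@all_at_once), " tokens from foreach, ",
      scalar(@one_at_a_time), " tokens from while\n";
```

Note that the while form has to match against a variable: the position that //g resumes from is stored with the scalar being matched, so the string must be held in one variable for the duration of the loop.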
While this seems to work, it's about 100 times slower. I don't know for sure why it's 100 times slower, but I've tried to make a benchmark that shows it should only be about 50% slower:
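A comparison of this kind can be set up with the core Benchmark module. The sketch below is not the benchmark from the original post; the generated `$xml` data, the `$token_re` pattern, and the `count_foreach` / `count_while` helper names are all hypothetical, chosen only to show the shape of such a test.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 20,000 small cell elements in one string.
my $xml = join '', map { qq{<c r="A$_"><v>$_</v></c>} } 1 .. 20_000;

# Hypothetical tokenizer: either a tag, or a run of text up to the next tag.
my $token_re = qr/(<[^>]*>|[^<]+)/;

# Count tokens by building the full list first (list-context match).
sub count_foreach {
    my $n = 0;
    $n++ foreach $xml =~ /$token_re/g;
    return $n;
}

# Count tokens one match at a time (scalar-context match).
sub count_while {
    my $n = 0;
    $n++ while $xml =~ /$token_re/g;
    return $n;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    foreach_list => \&count_foreach,
    while_scalar => \&count_while,
});
```

A synthetic benchmark like this only measures the regex loop itself. If the real code differs in pattern, in data size, or in how the string is obtained on each iteration, the ratio it reports may not carry over.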
Is there something wrong with my benchmark (or just something wrong with trying to benchmark this)? Is something else going on in Spreadsheet::XLSX? Does anyone have enough tuits to look at or comment on this? TIA for any insights.