PerlMonks  

Re: Delete parse data spreadsheet::parseexcel

by moritz (Cardinal)
on Feb 05, 2013 at 18:58 UTC (#1017257)


in reply to Delete parse data spreadsheet::parseexcel

Once you're done parsing a file, simply stop keeping a reference to the parser object. The memory allocated for that object is then freed.
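
A minimal sketch of what "stop keeping a reference" can look like in practice. It uses a hypothetical stand-in class, `FakeParser` (not Spreadsheet::ParseExcel itself), whose destructor only counts how often it runs. A `my` variable declared inside the loop body goes out of scope at the end of each iteration, so Perl's reference counting can free the object before the next file is handled:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parser object; it only records its own destruction.
package FakeParser;
our $destroyed = 0;
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub DESTROY { $destroyed++ }

package main;

for my $file (qw(a.xls b.xls c.xls)) {
    # Lexical to this iteration: the last reference disappears at the closing brace.
    my $parser = FakeParser->new(file => $file);
    # ... work with $parser here ...
}

print "objects destroyed: $FakeParser::destroyed\n";
```

The same effect can be had with an explicit `undef $parser;`. Either way it only works if no other variable (a global, a cache, a closure) still holds the object.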


Re^2: Delete parse data spreadsheet::parseexcel
by cibien (Novice) on Feb 05, 2013 at 19:58 UTC

    Sorry, but I'm a novice in Perl.

    How do I "simply stop keeping a reference to the parser object" in Spreadsheet::ParseExcel?

    Thanks

        The first part of my code:
        use Encode;
        use utf8;
        use XML::LibXML;
        use Spreadsheet::ParseExcel;

        system("cls");

        # Read command line arguments
        #----------------------------------------------------------------------------------------
        my $materialmapping_file = shift;
        my $data_folder = shift;
        print "Material mapping generator - Version 0.0.6 - Final \nmaterialmapping_file:$materialmapping_file\ndata_folder:$data_folder\n";

        # subroutine
        sub getSQLTimeStamp {
          my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)=localtime(time);
          return sprintf "%4d-%02d-%02d %02d:%02d:%02d",$year+1900,$mon+1,$mday,$hour,$min,$sec;
        }

        # Main
        #----------------------------------------------------------------------------------------
        unless (defined $materialmapping_file and defined $data_folder) {
          print "Usage: $0 <materialmapping_file> <data_folder>\n";
          exit;
        }
        opendir ( DIR, $data_folder ) || die "Error in opening dir $data_folder\n";
        my $materialmapping_table_xml = XML::LibXML->createDocument( "1.0", "UTF-8");
        my $materialmapping_table_xml_root = $materialmapping_table_xml->createElement("masterdata");
        $materialmapping_table_xml_root->setAttribute('version',getSQLTimeStamp());
        $materialmapping_table_xml->setDocumentElement($materialmapping_table_xml_root);
        print "loading...\n";
        while(($filename = readdir(DIR))){
          next unless $filename =~ /\.xls$/i;
          print " - $filename \n";
          my $parser = Spreadsheet::ParseExcel->new();
          my $oBook = $parser->parse($data_folder.$filename);
          next unless defined $oBook;
          for(my $iSheet=0; $iSheet < $oBook->{SheetCount} ; $iSheet++) {
            $oWkS = $oBook->{Worksheet}[$iSheet];
            .........................................................
        Please help me :-(

        My Spreadsheet::ParseExcel version is 0.59 and my Perl version is 5.10.1.

        Sorry, here is the corrected and simplified code, with no errors:

        #!/usr/bin/perl -w
        my $version="0.0.2";
        use strict;
        use Encode;
        use utf8;
        use XML::LibXML;
        use Spreadsheet::ParseExcel;
        system("cls");

        # Read command line arguments
        #----------------------------------------------------------------------------------------
        my $materialmapping_file = shift;
        my $data_folder = shift;
        print "OPTIONS\nmaterialmapping_file:$materialmapping_file\ndata_folder:$data_folder\n";

        # subroutine
        sub getSQLTimeStamp {
          my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)=localtime(time);
          return sprintf "%4d-%02d-%02d %02d:%02d:%02d",$year+1900,$mon+1,$mday,$hour,$min,$sec;
        }

        # Main
        #----------------------------------------------------------------------------------------
        if(length($materialmapping_file) gt 0 and length($data_folder) gt 0) {
          opendir ( DIR, $data_folder ) || die "Error in opening dir $data_folder\n";
          my $materialmapping_table_xml = XML::LibXML->createDocument( "1.0", "UTF-8");
          my $materialmapping_table_xml_root = $materialmapping_table_xml->createElement("masterdata");
          $materialmapping_table_xml_root->setAttribute('version',getSQLTimeStamp());
          $materialmapping_table_xml->setDocumentElement($materialmapping_table_xml_root);
          print "Parsing files...\n";
          while( my $filename = readdir(DIR)){
            if($filename =~ /\.xls$/i) {
              my $parser = Spreadsheet::ParseExcel->new();
              my $oBook = $parser->parse($data_folder.$filename);
              if (defined $oBook ) {
                for(my $iSheet=0; $iSheet < $oBook->{SheetCount} ; $iSheet++) {
                  my $oWkS = $oBook->{Worksheet}[$iSheet];
                  # find the needed columns
                  my $map_Cmin = -1; # the first is internal
                  my $map_Cmax = -1;
                  my $artno_col = -1;
                  my $active_col = -1;
                  my $colorzones_col = -1;
                  my $family_col = -1;
                  my $tec_desc_col = -1;
                  my $brand_Cmin = -1;
                  my $brand_Cmax = -1;
                  my $customer_Cmin = -1;
                  my $customer_Cmax = -1;
                  my $market_Cmin = -1;
                  my $market_Cmax = -1;
                  my $title_row = -1;
                  for (my $iC = $oWkS->{MinCol}; defined $oWkS->{MaxCol} && $iC <= $oWkS->{MaxCol} ; $iC++) {
                    my $oWkC = $oWkS->{Cells}[$oWkS->{MinRow}][$iC];
                    if(defined $oWkC) {
                      if(decode('cp1252',$oWkC->{Val}) eq "Configurator mapping") {
                        for (my $iR = $oWkS->{MinRow} +1; defined $oWkS->{MaxRow} && $iR <= $oWkS->{MaxRow} ; $iR++) {
                          my $oWkC_int = $oWkS->{Cells}[$iR][$iC];
                          if(defined $oWkC_int and decode('cp1252',$oWkC_int->{Val}) eq "internal") {
                            $title_row = $iR;
                          }
                        }
                        if($title_row > 0) {
                          foreach my $area ( @{ $oWkS->{MergedArea} } ) {
                            if($area->[1] eq $iC and $area->[0] eq $oWkS->{MinRow}){
                              $map_Cmax = $area->[3];
                            }
                          }
                          $map_Cmin = $iC;
                          $artno_col = $map_Cmin -2;
                          $active_col = $map_Cmin -1;
                          $family_col = $map_Cmin +1;
                          $colorzones_col = $map_Cmax +2;
                          $tec_desc_col = $map_Cmax +1;
                        }
                      }
                      elsif(decode('cp1252',$oWkC->{Val}) eq "Market"){
                        foreach my $area ( @{ $oWkS->{MergedArea} } ) {
                          if($area->[1] eq $iC and $area->[0] eq $oWkS->{MinRow}){
                            $market_Cmax = $area->[3];
                          }
                        }
                        $market_Cmin = $iC;
                      }
                      elsif(decode('cp1252',$oWkC->{Val}) eq "Customer"){
                        foreach my $area ( @{ $oWkS->{MergedArea} } ) {
                          if($area->[1] eq $iC and $area->[0] eq $oWkS->{MinRow}){
                            $customer_Cmax = $area->[3];
                          }
                        }
                        $customer_Cmin = $iC;
                      }
                      elsif(decode('cp1252',$oWkC->{Val}) eq "Brand"){
                        foreach my $area ( @{ $oWkS->{MergedArea} } ) {
                          if($area->[1] eq $iC and $area->[0] eq $oWkS->{MinRow}){
                            $brand_Cmax = $area->[3];
                          }
                        }
                        $brand_Cmin = $iC;
                      }
                    }
                  }
                  if($map_Cmin >= 0 and $map_Cmin >= 0 and $title_row >= 0){
                    for(my $iR = $title_row +1; defined $oWkS->{MaxRow} && $iR <= $oWkS->{MaxRow} ; $iR++){
                      my $internal_cell = $oWkS->{Cells}[$iR][$map_Cmin];
                      my $active_cell = $oWkS->{Cells}[$iR][$active_col];
                      my $family_cell = $oWkS->{Cells}[$iR][$family_col];
                      if(defined $internal_cell and defined $active_cell and defined $family_cell and length(decode('cp1252',$internal_cell->{Val})) gt 0 and !(length(decode('cp1252',$active_cell->{Val})) gt 0) and length(decode('cp1252',$family_cell->{Val})) gt 0) {
                        #for every family print a line
                        my @families = split(/\,/, decode('cp1252',$family_cell->{Val}));
                        foreach my $family(@families){
                          my $materialmapping_item = $materialmapping_table_xml->createElement("item");
                          #set internal
                          $materialmapping_item->setAttribute("internal",decode('cp1252',$internal_cell->{Val}));
                          #set family
                          $materialmapping_item->setAttribute("pr_family",$family);
                          #set item_number
                          my $item_number_cell = $oWkS->{Cells}[$iR][$artno_col];
                          if(defined $item_number_cell and length(decode('cp1252',$item_number_cell->{Val})) gt 0) {
                            $materialmapping_item->setAttribute("item_number",decode('cp1252',$item_number_cell->{Val}));
                          }
                          #set colorzones
                          my $colorzones_cell = $oWkS->{Cells}[$iR][$colorzones_col];
                          if(defined $colorzones_cell and length(decode('cp1252',$colorzones_cell->{Val})) gt 0) {
                            $materialmapping_item->setAttribute("colorzone",decode('cp1252',$colorzones_cell->{Val}));
                          }
                          #set properties
                          for(my $prC = $map_Cmin+2; $prC <= $map_Cmax ; $prC++) {
                            my $pr_cell_name = $oWkS->{Cells}[$title_row][$prC];
                            my $pr_cell = $oWkS->{Cells}[$iR][$prC];
                            if(defined $pr_cell and defined $pr_cell_name and length(decode('cp1252',$pr_cell->{Val})) gt 0 and length(decode('cp1252',$pr_cell_name->{Val})) gt 0) {
                              $materialmapping_item->setAttribute(decode('cp1252',$pr_cell_name->{Val}),decode('cp1252',$pr_cell->{Val}));
                            }
                          }
                          #set description
                          my $description_cell = $oWkS->{Cells}[$iR][$tec_desc_col];
                          if(defined $description_cell and length(decode('cp1252',$description_cell->{Val})) gt 0) {
                            $materialmapping_item->setAttribute("description",decode('cp1252',$description_cell->{Val}));
                          }
                          #set Brands
                          if($brand_Cmin gt 0 and $brand_Cmax gt 0) {
                            my $brand_string = "";
                            my $ignored_brand_string = "";
                            for(my $brandC = $brand_Cmin; $brandC <= $brand_Cmax ; $brandC++) {
                              my $brand_cell_name = $oWkS->{Cells}[$title_row][$brandC];
                              if(defined $brand_cell_name and decode('cp1252',$brand_cell_name->{Val}) ne "Epta std") {
                                my $brand_cell = $oWkS->{Cells}[$iR][$brandC];
                                if(defined $brand_cell and length(decode('cp1252',$brand_cell->{Val})) gt 0) {
                                  if(decode('cp1252',$brand_cell->{Val}) eq '0') {
                                    $ignored_brand_string = $ignored_brand_string . decode('cp1252',$brand_cell_name->{Val}) . ",";
                                  }
                                  else {
                                    $brand_string = $brand_string . decode('cp1252',$brand_cell_name->{Val}) . ",";
                                  }
                                }
                              }
                            }
                            if(length($brand_string) gt 0){
                              $materialmapping_item->setAttribute('brand',$brand_string);
                            }
                            if(length($ignored_brand_string) gt 0){
                              $materialmapping_item->setAttribute('ignore_brand',$ignored_brand_string);
                            }
                          }
                          #set Customer
                          if($customer_Cmin gt 0 and $customer_Cmax gt 0) {
                            my $customer_string = "";
                            my $ignored_customer_string = "";
                            for(my $customerC = $customer_Cmin; $customerC <= $customer_Cmax ; $customerC++) {
                              my $customer_cell_name = $oWkS->{Cells}[$title_row][$customerC];
                              my $customer_cell = $oWkS->{Cells}[$iR][$customerC];
                              if(defined $customer_cell_name and defined $customer_cell and length(decode('cp1252',$customer_cell->{Val})) gt 0 and length(decode('cp1252',$customer_cell_name->{Val})) gt 0) {
                                if(decode('cp1252',$customer_cell->{Val}) eq '0') {
                                  $ignored_customer_string = $ignored_customer_string . decode('cp1252',$customer_cell_name->{Val}) . ",";
                                }
                                else {
                                  $customer_string = $customer_string . decode('cp1252',$customer_cell_name->{Val}) . ",";
                                }
                              }
                            }
                            if(length($customer_string) gt 0){
                              $materialmapping_item->setAttribute('customer',$customer_string);
                            }
                            if(length($ignored_customer_string) gt 0){
                              $materialmapping_item->setAttribute('ignore_customer',$ignored_customer_string);
                            }
                          }
                          #set Market
                          if($market_Cmin gt 0 and $market_Cmax gt 0) {
                            my $market_string = "";
                            my $ignored_market_string = "";
                            for(my $marketC = $market_Cmin; $marketC <= $market_Cmax ; $marketC++) {
                              my $market_cell_name = $oWkS->{Cells}[$title_row][$marketC];
                              my $market_cell = $oWkS->{Cells}[$iR][$marketC];
                              if(defined $market_cell_name and defined $market_cell and length(decode('cp1252',$market_cell->{Val})) gt 0 and length(decode('cp1252',$market_cell_name->{Val})) gt 0) {
                                if(decode('cp1252',$market_cell->{Val}) eq '0') {
                                  $ignored_market_string = $ignored_market_string . decode('cp1252',$market_cell_name->{Val}) . ",";
                                }
                                else {
                                  $market_string = $market_string . decode('cp1252',$market_cell_name->{Val}) . ",";
                                }
                              }
                            }
                            if(length($market_string) gt 0){
                              $materialmapping_item->setAttribute('market',$market_string);
                            }
                            if(length($ignored_market_string) gt 0){
                              $materialmapping_item->setAttribute('ignore_market',$ignored_market_string);
                            }
                          }
                          $materialmapping_table_xml_root->addChild($materialmapping_item);
                        };
                      }
                    }
                  }
                }
              }
              print "Parsed $filename \n";
            }
          }
          $materialmapping_table_xml->toFile($materialmapping_file,2);
        }

        Thank you very much,

        Andrea

        Yes, you are right: Spreadsheet::ParseExcel does not keep the parsed data once it starts parsing a new file. The problem is LibXML ($materialmapping_table_xml), which is strange because the XML output is only 60 MB. Yes, I would like to try your solution "that writes each $materialmapping_item to a file". Can you help me do it? :) Thank you, Moritz.
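
One way to read "writes each $materialmapping_item to a file" is to print every `<item>` element as soon as it is built, instead of attaching it to the root element, so that only one item is in memory at a time. The sketch below is an assumption about how that could look, not the solution from the thread: it uses core Perl only, the helpers `xml_escape` and `write_item` are hypothetical names, and it writes to an in-memory buffer so it can be run anywhere (a real script would open the output file instead).

```perl
use strict;
use warnings;

# Escape the characters that are special inside a double-quoted XML attribute.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: print one <item .../> element straight to the filehandle.
# Attribute names are assumed to be valid XML names already.
sub write_item {
    my ($fh, %attr) = @_;
    my $attrs = join ' ',
        map { qq{$_="} . xml_escape($attr{$_}) . q{"} } sort keys %attr;
    print {$fh} "  <item $attrs/>\n";
    return;
}

# In the real script, open the output file here instead of a scalar buffer.
my $xml = '';
open my $fh, '>', \$xml or die "Cannot open output: $!";
print {$fh} qq{<?xml version="1.0" encoding="UTF-8"?>\n<masterdata>\n};

# In the real script this call would sit where addChild() is called now.
write_item($fh, internal => 'A&B', pr_family => 'F1');
write_item($fh, internal => 'C',   pr_family => 'F2');

print {$fh} "</masterdata>\n";
close $fh or die "Cannot close output: $!";

print $xml;
```

If staying with XML::LibXML is preferred, printing `$materialmapping_item->toString` to the filehandle at the same spot should have a similar effect, since the element is then never added to the document tree.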

Re^2: Delete parse data spreadsheet::parseexcel
by runrig (Abbot) on Feb 06, 2013 at 00:19 UTC
    simply stop keeping a reference to the parser object. The memory allocated for that object is then freed.

    No, not for Spreadsheet::ParseExcel objects...they contain circular references (workbooks reference worksheets and vice versa, etc.), so they don't go away until the end of your program. Best thing to do is to parse each spreadsheet in a separate process (using Parallel::ForkManager or just plain fork or just launch a separate system command or something).
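
A rough sketch of the "separate process" idea with plain `fork`, assuming a Unix-like system. `process_file` is a hypothetical placeholder for "parse one spreadsheet and return a short result"; the child sends that result back through a pipe and exits, so whatever memory it used is returned to the operating system regardless of circular references:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical worker: stands in for "parse one spreadsheet and return a summary".
sub process_file {
    my ($file) = @_;
    return "processed $file";
}

# Run process_file() in a child process and return what the child wrote.
sub run_in_child {
    my ($file) = @_;
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                  # child
        close $reader;
        print {$writer} process_file($file);
        close $writer;                # flushes the result into the pipe
        POSIX::_exit(0);              # leave immediately; the child's memory goes with it
    }

    close $writer;                    # parent
    my $result = do { local $/; <$reader> };
    close $reader;
    waitpid($pid, 0);
    return $result;
}

my @results = map { run_in_child($_) } qw(a.xls b.xls);
print "$_\n" for @results;
```

On Windows, Perl emulates `fork` inside a single process, so launching a separate command with `system` is probably the closer equivalent there.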

      they contain circular references (workbooks reference worksheets and vice versa, etc.), so they don't go away until the end of your program

      If that's really the case, please open a bug report.

      However the source does use Scalar::Util::weaken, and my very basic testing shows that the objects do get released when I set the reference to the workbook to undef.
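
To illustrate the difference a weak back-reference makes, here is a small self-contained sketch. `Demo::Workbook` and `Demo::Worksheet` are hypothetical stand-ins, not the real Spreadsheet::ParseExcel classes; they only mimic a workbook that holds its worksheets while each worksheet points back at the workbook:

```perl
use strict;
use warnings;
use Scalar::Util ();

# Hypothetical stand-in classes; they only mimic the parent/child back-reference.
package Demo::Workbook;
our $destroyed = 0;
sub new     { my ($class) = @_; return bless { sheets => [] }, $class }
sub DESTROY { $destroyed++ }

package Demo::Worksheet;
sub new {
    my ($class, $book, $weak) = @_;
    my $self = bless { book => $book }, $class;
    # A weakened reference does not keep the workbook alive.
    Scalar::Util::weaken($self->{book}) if $weak;
    push @{ $book->{sheets} }, $self;
    return $self;
}

package main;

sub build_and_drop {
    my ($weak) = @_;
    my $book = Demo::Workbook->new;
    Demo::Worksheet->new($book, $weak);
    undef $book;                      # drop the only outside reference
    return;
}

build_and_drop(0);                    # strong back-reference: the cycle keeps the workbook alive
my $after_strong = $Demo::Workbook::destroyed;

build_and_drop(1);                    # weak back-reference: the workbook is destroyed at once
my $after_weak = $Demo::Workbook::destroyed;

print "destroyed after strong cycle: $after_strong\n";
print "destroyed after weak cycle:   $after_weak\n";
```

With the strong back-reference the destructor does not run until the program ends; with the weakened one it runs as soon as the outside reference is set to undef.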

        Looks like you're right. This has been fixed since around 2007 in v0.30.
