
Re: Parsing huge XML file

by zentara (Archbishop)
on Sep 03, 2011 at 15:53 UTC (#924005)

in reply to Parsing huge XML file

In case you haven't seen them, see "How to Parse Huge XML Files ?" and "Katrina Parseing perl script" for some methods.

I'm not really a human, but I play one on earth.
Old Perl Programmer Haiku ................... flash japh

Re^2: Parsing huge XML file
by Gangabass (Vicar) on Sep 04, 2011 at 00:40 UTC

    Thank you for pointing me in the right direction. I have rewritten it:

    use strict;
    use warnings;
    use XML::Twig;

    my %bedrooms;    # house code => summed DivisionQuantity
    my @bedrooms;    # house codes, in the order they are seen
    my @good_division_numbers = qw( 30 31 32 35 38 );

    # Build twigs only for DivisionHouseRoom elements
    my $xml = XML::Twig->new(
        twig_roots => {
            DivisionHouseRoom => \&count_bedrooms,
        }
    );

    $xml->parsefile('divisionhouserooms-v3.xml');
    #$xml->parsefile('test.xml');

    print "=" x 40, "\n";

    # Append one tab-separated line per house code
    open my $fh, ">>", "Result.csv" or die $!;
    foreach my $house_code (@bedrooms) {
        print $fh join( "\t", $house_code, $bedrooms{$house_code} ), "\n";
    }
    close $fh;

    sleep 1;

    # Handler called for each DivisionHouseRoom element
    sub count_bedrooms {
        my ( $twig, $element ) = @_;

        my $house_code = $element->first_child_text('HouseCode');
        print $house_code, "\n";

        unless ( exists $bedrooms{$house_code} ) {
            push @bedrooms, $house_code;
        }

        my ($divisions) = $element->children('Divisions');
        my @divisions = $divisions->children('Division');

        for my $division (@divisions) {
            # Count only divisions whose number is in the wanted list
            next
                unless grep { $_ eq $division->first_child_text('DivisionNumber') }
                @good_division_numbers;
            $bedrooms{$house_code} += $division->first_child_text('DivisionQuantity');
        }

        # Release the memory used by this element once it has been processed
        $element->purge;
    }
