PerlMonks
Re^2: Memory errors while processing 2GB XML file with XML:Twig on Windows 2000
by nan (Novice) on May 16, 2005 at 10:53 UTC
Hi Zaxo,
Thank you for the advice. My Perl is v5.8.6 built for MSWin32-x86-multi-thread. A sample of my XML file is shown below. Basically, the file has two key kinds of sibling nodes: <Topic/> and <ExternalPage/>. If a <Topic/> has a <link/> child, there is a corresponding <ExternalPage/> node that gives more detail about that link, such as <d:Title/> and <d:Description/>. However, not every <Topic/> has a <link/> child, so I need a loop that checks each <Topic/> for <link/> children. If there are any, I check each <ExternalPage/> to output the extra information.
<RDF>
  <Topic r:id="Top">
    <catid>1</catid>
  </Topic>
  <ExternalPage about="">
    <topic>Top/</topic>
  </ExternalPage>
  <Topic r:id="Top/Arts">
    <catid>2</catid>
  </Topic>
  <Topic r:id="Top/Arts/Movies/Titles/1/10_Rillington_Place">
    <catid>205108</catid>
    <link r:resource="http://www.britishhorrorfilms.co.uk/rillington.shtml"/>
    <link r:resource="http://www.shoestring.org/mmi_revs/10-rillington-place.html"/>
  </Topic>
  <ExternalPage about="http://www.britishhorrorfilms.co.uk/rillington.shtml">
    <d:Title>British Horror Films: 10 Rillington Place</d:Title>
    <d:Description>Review which looks at plot especially the shocking features of it.</d:Description>
    <topic>Top/Arts/Movies/Titles/1/10_Rillington_Place</topic>
  </ExternalPage>
  <ExternalPage about="http://www.shoestring.org/mmi_revs/10-rillington-place.html">
    <d:Title>MMI Movie Review: 10 Rillington Place</d:Title>
    <d:Description>Review includes plot, real life story behind the film and realism in the film.</d:Description>
    <topic>Top/Arts/Movies/Titles/1/10_Rillington_Place</topic>
  </ExternalPage>
</RDF>

My code is shown below; it is quite straightforward:
#!/usr/bin/perl
use warnings;
use strict;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parsefile("./content.example.txt");
my $root = $twig->root;

chdir "F:/httpserv";    # set initial directory

foreach my $topic ($root->children('Topic')) {
    if ($topic->children('link')) {
        # if element <link/> is a child of <Topic/>, change directory for index writing
        chdir $topic->att('r:id');
        foreach my $link ($topic->children('link')) {
            foreach my $extpage ($root->children('ExternalPage')) {
                if ($link->att('r:resource') eq $extpage->att('about')) {
                    print $extpage->first_child_text('d:Title'), "\n";
                    print $extpage->first_child_text('d:Description'), "\n";
                    $twig->purge;    # I'm not sure if I need to purge in each loop.
                }
            }
            $twig->purge;
        }
        $twig->purge;
        chdir "F:/httpserv";    # reset directory pointer to local root directory
    }
}

Thanks again,
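
For comparison, here is a minimal sketch of a handler-based variant. It is not the code from this post. Because the script above calls parsefile with no handlers, the whole tree is built before the loops run, so the purge calls cannot save any memory. The sketch uses XML::Twig's twig_handlers option so that each <Topic/> and <ExternalPage/> is processed and purged as soon as it has been parsed. The helper name scan_pages, its callback argument, and the %pending hash are hypothetical. The sketch also assumes that every <ExternalPage/> appears after the <Topic/> that links to it, as in the sample. If the real file is ordered differently, it would need a different approach.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not from the original post): streams the document and
# calls $on_match->($topic_id, $title, $description) for each linked page.
# Assumes every <ExternalPage/> comes after the <Topic/> that links to it.
sub scan_pages {
    my ($xml, $on_match) = @_;
    my %pending;    # link URL => r:id of the Topic that referenced it

    my $twig = XML::Twig->new(
        twig_handlers => {
            Topic => sub {
                my ($t, $topic) = @_;
                for my $link ($topic->children('link')) {
                    my $url = $link->att('r:resource');
                    $pending{$url} = $topic->att('r:id') if defined $url;
                }
                $t->purge;    # free everything parsed so far
            },
            ExternalPage => sub {
                my ($t, $page) = @_;
                my $about    = $page->att('about');
                my $topic_id = defined $about ? delete $pending{$about} : undef;
                $on_match->(
                    $topic_id,
                    $page->first_child_text('d:Title'),
                    $page->first_child_text('d:Description'),
                ) if defined $topic_id;
                $t->purge;    # free everything parsed so far
            },
        },
    );
    $twig->parse($xml);    # for a real file: $twig->parsefile($path)
    return;
}

# Small excerpt of the sample document, used only to exercise the sketch.
my $sample = <<'XML';
<RDF>
  <Topic r:id="Top/Arts/Movies/Titles/1/10_Rillington_Place">
    <catid>205108</catid>
    <link r:resource="http://www.britishhorrorfilms.co.uk/rillington.shtml"/>
  </Topic>
  <ExternalPage about="http://www.britishhorrorfilms.co.uk/rillington.shtml">
    <d:Title>British Horror Films: 10 Rillington Place</d:Title>
    <d:Description>Review which looks at plot especially the shocking features of it.</d:Description>
    <topic>Top/Arts/Movies/Titles/1/10_Rillington_Place</topic>
  </ExternalPage>
</RDF>
XML

# Print the topic id, title and description of each matched page.
scan_pages($sample, sub { print join("\n", @_), "\n" });
```

The hash lookup replaces the inner loop over every <ExternalPage/>, and each entry is deleted once matched, so %pending only holds links whose pages have not been seen yet.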