
XML processing

by shreya (Novice)
on Aug 05, 2005 at 22:18 UTC ( #481380=perlquestion )
shreya has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks,

After a lot of documentation reading and testing, and almost 2 days spent trying to understand the XML::Parser module, I finally have to ask for some help here.

I receive some data in an XML file and am required to extract part of the data:

    <rootNode>
      <a> ...text + other XML </a>
      <a> ...text + other XML </a>
    </rootNode>

Now I need to extract everything between the <a> and </a> tags, and I need to do this using XML::Parser.
Any suggestions on how I could do this?
Below is my non-working code:
    use XML::Parser;

    my $XmlFile = "foo.xml";
    die "Can't find file \"$XmlFile\"" unless -f $XmlFile;

    my $pFoundFlag = 0;
    my $NewFile;

    my $parser = new XML::Parser(Style => 'Debug');
    $parser->setHandlers(
        Start => \&startElement(),
        Char  => \&char_handler(),
        End   => \&end_Handler,
    );
    $parser->parsefile($XmlFile);

    sub startElement {
        my ($parserInst, $element, %attr) = @_;
        if ($element eq "a") {
            $pFoundFlag = 1;
        }
        if ($pFoundFlag) {
            if (not $NewFile) {
                $NewFile = $element;
            }
            else {
                $NewFile .= $element;
            }
        }
    }

    sub char_handler {
        my ($parserInst, $data);
        if ($pFoundFlag) {
            $NewFile .= $data;
        }
    }

    sub end_Handler {
        my ($parserInst, $element);
        if ($pFoundFlag) {
            $NewFile .= $element;
        }
        if ($element eq "a") {
            $pFoundFlag = 0;
        }
    }

    $OpFile = "foo_outpur.xml";
    open (OP, ">$OpFile") or die ("can't open output file!");
    print OP $NewFile;
    close(OP);

Replies are listed 'Best First'.
Re: XML processing
by borisz (Canon) on Aug 05, 2005 at 22:55 UTC
    Do yourself a favor and use another module to filter your XML. I suggest XML::Twig or XML::SAX. The tool xml_grep from XML::Twig does exactly what you want.
    xml_grep a your_xmlfile.xml
Re: XML processing
by Tanktalus (Canon) on Aug 06, 2005 at 02:44 UTC

    Seconding borisz's advice, here's some (untested) XML::Twig code to do this, in case you want to continue doing stuff with the results in perl:

    use XML::Twig;
    # ...
    my $twig = XML::Twig->new();
    $twig->parsefile($XmlFile);
    my @a_elements = $twig->get_xpath("//a");
    foreach my $el (@a_elements) {
        my $text = $el->text();
        # do stuff with $text.
    }
    It's actually pretty easy, once you know which APIs you want ;-)

    Update: Changed loop variable from $a to $el when reminded by graff. Tks.

      But it would be better not to use "$a" as the name of the iterator variable in the "for" loop, in case "doing stuff" includes using "sort".

        something like $ElementA .= $element->sprint;?

        Thanks Mirod. This worked. I am still trying to get used to this response method. I didn't realise that someone had replied to my old posts. I wish there was a way to be notified by email when someone replies to your posts. Later
Re: XML processing
by mrborisguy (Hermit) on Aug 06, 2005 at 03:26 UTC
    sub char_handler { my ($parserInst, $data);
    sub end_Handler { my ($parserInst, $element);
    You may want to include the important = @_ here as well.
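
    The effect of the missing = @_ can be seen without a parser at all. The following is a minimal sketch, not the poster's code: the two handler names and the $fake_parser stand-in are hypothetical, and the direct calls only imitate the way a parser hands its own instance and the text to a Char handler.

```perl
use strict;
use warnings;

# Buggy form, as quoted above: the variables are declared but never
# assigned, so $data stays undef no matter what the caller passes in.
sub char_handler_buggy {
    my ($parserInst, $data);
    return defined $data ? $data : '';
}

# Fixed form: unpack the arguments from @_.
sub char_handler_fixed {
    my ($parserInst, $data) = @_;
    return $data;
}

# Hypothetical stand-in for the parser instance passed as first argument.
my $fake_parser = {};

my $buggy = char_handler_buggy($fake_parser, 'some text');
my $fixed = char_handler_fixed($fake_parser, 'some text');

print "buggy: '$buggy'\n";
print "fixed: '$fixed'\n";
```

    The same applies to end_Handler. Separately from this reply, note that \&startElement() with parentheses calls the sub immediately and takes a reference to what it returns, whereas \&startElement is a reference to the sub itself, which is the form a handler registration needs.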


Re: XML processing
by shreya (Novice) on Aug 08, 2005 at 13:50 UTC
    Thanks borisz, Tanktalus, graff and mrborisguy for your comments.

    I did a little bit of reading on XML::Twig before posting. However, I really wanted to use XML::Parser instead of XML::Twig.

    I don't have rights to install modules on our company server. I guess for now I am going to run XML::Twig from my own directory, since I need to get a demo done by EOD.

    Thanks again guys.
Re: XML processing
by shreya (Novice) on Aug 10, 2005 at 22:59 UTC
    I started using XML::Twig. However, I am facing a problem.

    my $twig = XML::Twig->new(twig_handlers => { 'a' => \&Test });

    # Parse the file
    $twig->parsefile($XmlFile);

    # Handler subroutine
    sub Test {
        my ($parser, $element) = @_;

        # This prints element <a> along with its contents to some output file
        $element->print(\*OP);

        # However, what I really want done here is to have element <a>, along
        # with its subelements and text, copied to another variable instead of
        # being printed on screen
    }
    Problem: What I really want done here is to have element <a>, along with its subelements and text, copied to another variable instead of being printed on screen.
    Something like
    $ElementA .= $element->print;
    Please advise. Thanks,
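
    The sprint suggestion made elsewhere in this thread can be sketched as follows. This is an illustration under assumptions rather than the poster's final code: the inline $xml sample and the $collected variable are made up for the example, and the document is parsed from a string instead of a file so that the snippet is self-contained.

```perl
use strict;
use warnings;
use XML::Twig;

# Made-up sample input standing in for the real file.
my $xml = '<rootNode><a>one <b>x</b></a><a>two</a></rootNode>';

# Accumulates the serialized <a> elements.
my $collected = '';

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once for each <a> element after it has been fully parsed.
        a => sub {
            my ($t, $element) = @_;
            # sprint returns the element, tags included, as a string
            # instead of printing it.
            $collected .= $element->sprint;
        },
    },
);

$twig->parse($xml);

print "$collected\n";
```

    With a real file, $twig->parsefile($XmlFile) would take the place of the parse call, and $collected could then be written to the output file in one go.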
