best xml parser to use

by ftumsh (Scribe)
on Jul 06, 2006 at 18:03 UTC ( #559636=perlquestion )
ftumsh has asked for the wisdom of the Perl Monks concerning the following question:

Lo all,
Given that
1) the xml I have to use is quite simple (no cdata or PI etc)
2) but possibly large (< 5meg)
3) I need to know a tag before its children are parsed
4) have a small memory footprint
5) be very fast
6) Linux only

Should I be using XML::Parser or XML::LibXML?

Thx
John
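
Point 3 (seeing a tag before its children are parsed) is what a stream parser's start-of-element callback provides. Here is a minimal sketch using XML::Parser's Start and End handlers. The sample document and the element names in it are made up for illustration; only the handler mechanism is the point.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample input; the element names are invented for this sketch.
my $xml = '<orders><order id="1"><item>apple</item><item>pear</item></order></orders>';

my @stack;   # names of the elements that are currently open
my @seen;    # path of every element, recorded as its opening tag is read

my $parser = XML::Parser->new(
    Handlers => {
        # Start fires when the opening tag is read, before any children.
        Start => sub {
            my ($expat, $name, %attr) = @_;
            push @stack, $name;
            push @seen, join('/', @stack);
        },
        # End fires once all children of the element have been handled.
        End => sub { pop @stack },
    },
);

$parser->parse($xml);
print "$_\n" for @seen;
```

For a file on disk, `$parser->parsefile($path)` takes the place of `parse`. Nothing is kept in memory beyond what the handlers themselves store, which is why this style suits the small-footprint requirement.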

Re: best xml parser to use
by Tanktalus (Canon) on Jul 06, 2006 at 18:07 UTC

    My general decision tree goes sorta like this (non-XML parts removed):

     /----------\          +-----------+
    < Parse XML? > --NO>-- | (removed) |
     \----------/          +-----------+
            |
           YES
            V
            |
    +---------------+
    | Use XML::Twig |
    +---------------+
    Hopefully this decision tree helps you decide what is best.

    ;-)
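
For anyone following the tree to its one leaf, here is a rough sketch of how XML::Twig can cover the "know a tag before its children" requirement while keeping memory down. The `order` and `item` element names and the sample string are assumptions for illustration, not taken from the original question.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample input; element names are invented for this sketch.
my $xml = '<orders><order id="1"><item>apple</item></order>'
        . '<order id="2"><item>pear</item></order></orders>';

my (@opened, @items);

my $twig = XML::Twig->new(
    # Called as soon as the opening tag is parsed, before the children.
    start_tag_handlers => {
        order => sub {
            my ($t, $elt) = @_;
            push @opened, $elt->att('id');
        },
    },
    # Called when the element is complete, children included.
    twig_handlers => {
        order => sub {
            my ($t, $elt) = @_;
            push @items, $elt->first_child_text('item');
            $t->purge;    # release everything parsed so far
        },
    },
);

$twig->parse($xml);
print "opened: @opened\n";
print "items:  @items\n";
```

Calling `purge` in the handler is what keeps the footprint small on a multi-megabyte file; without it the whole tree is built in memory.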

      heh :)
      I normally always use XML::Twig, but I don't actually know the format of the XML, so I can't in this case. I've been googling and I'm going to try XML::SAX.

      I appear to have only got XML::LibXML::SAX; is this good enough? The POD mentions it might not be any good for production use. What others could I use?
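
On the SAX question: handler code written against XML::SAX::Base does not depend on which SAX parser is installed, because XML::SAX::ParserFactory hands back whichever one it finds. A minimal sketch follows; the `TagLogger` package name and the sample string are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical handler class; it records element names as they open.
package TagLogger;
use base 'XML::SAX::Base';

sub start_element {
    my ($self, $el) = @_;
    # $el->{Name} is the element name as written in the document.
    push @{ $self->{tags} }, $el->{Name};
}

package main;
use XML::SAX::ParserFactory;

my $handler = TagLogger->new;

# The factory returns whichever SAX parser is available on this system.
my $parser = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string('<a><b/><c>text</c></a>');

print join(' ', @{ $handler->{tags} }), "\n";
```

Since the handler is parser-agnostic, the same class can be tried with a different SAX parser later without changes.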

        What do you mean by "I don't actually know the format of xml"? How will switching parsers (though, technically, XML::Twig is a front-end, not a parser itself) fix that?
      if ($xml->is_simple() and $xml->is_table_like()) {
          use XML::RAX;
      } elsif (size($xml) < too_big()) {
          use XML::Simple;
      } else {
          use XML::Twig;
          ...
          my $Data = $DataObj->simplify(forcearray => [...], keyattr => { ... }, group_tags => {...});
      }

      Update 2007-2-6: Tastes change. While I would probably still use XML::RAX for some tasks with table-like XML and XML::Simple for very simple XML, I'd most probably use my XML::Rules now. It may look a bit twisted at first, but it's convenient and powerful. IMHO of course ;-)

Re: best xml parser to use
by coreolyn (Parson) on Jul 06, 2006 at 19:01 UTC

    It seems to me that performance is so much better just parsing XML with regexes that I quit caring that it's XML. I might add some coding time to my development, but I really don't see the overall benefit of XML modules unless I have to provide XML output -- even then it's 'iffy'.

Re: best xml parser to use
by planetscape (Canon) on Jul 06, 2006 at 19:28 UTC

      Agreed, although I sometimes still use XML::TreeBuilder.

      Love Tanktalus's decision tree though. :)


      DWIM is Perl's answer to Gödel
Re: best xml parser to use
by xdg (Monsignor) on Jul 10, 2006 at 12:54 UTC

    I'm not sure how it stacks up against these criteria, but if you're not 100% sure that your incoming data is entirely valid, you might want to check out XML::Liberal (a stand-in for LibXML). Avoiding re-parsing a large XML file because of a small nit might be a worthwhile form of efficiency.

    The author gave a nice Lightning Talk about it at YAPC::NA.

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.
