PerlMonks  

Re: XML parsing vs regex

by Jim (Curate)
on May 14, 2013 at 05:53 UTC (note #1033419)


in reply to XML parsing vs regex

Here are a few rudimentary points that summarize my personal take on the classic XML parser versus regular expressions debate.

  • Perl is a general-purpose scripting language that is especially well-suited for text processing using arbitrarily complex regular expression patterns.
  • XML is plain text. Its inventors chose this simple format intentionally. (At least one of its inventors was a Perl hacker.)
  • All the XML I've ever had to work with has been data-oriented rather than document-oriented. It has been generated by stable software, so its format was uniform, constant and predictable. For as long as I've had to work with any particular XML data structure, its format has never changed.
  • I've mostly had to do just two things with XML data using Perl: make small changes to XML files, or extract small amounts of specific data from them.
  • I know Perl regular expressions well because I use them all the time, for all kinds of applications. I don't know any of the various XML parsing modules (XML::Parser, XML::LibXML, XML::Twig, etc.) very well because I rarely have to use them.
  • If the XML changes over time, it seems most likely to change in ways that would require the Perl script that parses it to be updated anyway, whether the script uses a proper XML parser such as XML::LibXML or regular expression patterns.
  • If you need to parse a whole XML data structure into a whole Perl data structure, don't try to write your own XML parser in Perl, silly! That would be senseless and foolhardy.
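
To make the two tasks from the list above concrete (extracting a small amount of specific data, and making a small change), here is a minimal sketch using only core Perl. The sample XML, the element and attribute names, and the variable names are all made up for illustration. It assumes the uniform, machine-generated layout described above, with each order on a single line; it is not a general XML parser and would need revisiting if the layout changed.

```perl
use strict;
use warnings;

# Hypothetical, machine-generated XML with a uniform layout (an assumption
# for this sketch; real data would normally be read from a file).
my $xml = <<'XML';
<orders>
  <order id="1001"><status>pending</status><total>19.95</total></order>
  <order id="1002"><status>shipped</status><total>42.00</total></order>
</orders>
XML

# Task 1: extract a small amount of specific data (order id => total).
# Without /s, .*? cannot cross a newline, so each match stays on one line.
my %total_for;
while ($xml =~ m{<order id="(\d+)">.*?<total>([\d.]+)</total>}g) {
    $total_for{$1} = $2;
}

# Task 2: make a small change (mark order 1001 as shipped),
# working on a copy so the original string is left untouched.
(my $updated = $xml) =~ s{(<order id="1001"><status>)[^<]*(</status>)}{${1}shipped${2}};

print "$_ => $total_for{$_}\n" for sort keys %total_for;
print $updated;
```

The patterns lean entirely on the predictability of the input: fixed attribute order, no extra whitespace inside tags, one record per line. That is the trade-off being argued for here, not a substitute for a real parser when the whole structure is needed.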

Jim

