
(fongsaiyuk)Re: Re: Which is the Best Perl XML Tool?

by fongsaiyuk (Pilgrim)
on Jan 18, 2001 at 09:22 UTC (#52704)

in reply to Re: Which is the Best Perl XML Tool?
in thread Which is the Best Perl XML Tool?

What I know is that I _HATE_ the DOM. Not only is it clumsy and verbose, I also think it leads to insecure programming (a well placed comment in the XML can usually crash a DOM program). So I guess you have my position on Xerces ;--)

Doesn't Xerces validate the XML document? If a document is validated can it still cause a DOM program to crash?

I do not deny that DOM processing has very high overhead compared to straight XML::Parser, especially with very large XML documents, but that is the nature of the DOM: the tree model of the data is easy to understand but costly, in terms of resources, to build.

TIA for your comments,


Re: (fongsaiyuk)Re: Re: Which is the Best Perl XML Tool?
by mirod (Canon) on Jan 18, 2001 at 12:30 UTC

    If a document is validated can it still cause a DOM program to crash?

    The problem is that IMHO the DOM only offers one safe method to select elements: getElementsByTagName. This will always behave properly in a DWIM way. All the navigation functions, such as getFirstChild, getLastChild, getPreviousSibling and getNextSibling, return a node. Now what if that node is a comment? How many of the DOM scripts out there check, every time they use one of those methods, that the result is really what they expected, usually an element? Remember that even if the DTD says that a dt is always followed by a dd, there can be any number of comments and processing instructions in between. Who codes that defensively and systematically writes this?

    my $dd= $dt->getNextSibling; $dd= $dd->getNextSibling while( $dd && $dd->getNodeType != ELEMENT_NODE);

    Hence my guess that a well placed comment can probably wreak havoc in most of the DOM code around (and certainly in most of my own tries at taming the DOM). Practically I suspect people code for a subset of XML, one that excludes comments and processing instructions. This is dangerous for the exact same reasons I described in On XML parsing.
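
To make the failure mode concrete, here is a minimal sketch of the defensive check wrapped in a helper. It assumes XML::DOM from CPAN is installed; next_element_sibling and the sample document are hypothetical names made up for this illustration, not part of XML::DOM or of the original post.

```perl
use strict;
use warnings;
use XML::DOM;   # CPAN module; exports ELEMENT_NODE and the other node type constants

# Hypothetical helper: return the next sibling that is an element,
# skipping comments, processing instructions and text nodes.
# Returns undef when no element sibling follows.
sub next_element_sibling {
    my ($node) = @_;
    my $sib = $node->getNextSibling;
    $sib = $sib->getNextSibling
        while $sib && $sib->getNodeType != ELEMENT_NODE;
    return $sib;
}

# Made-up sample: a comment and a PI sit between the dt and its dd.
my $xml = '<dl><dt>term</dt><!-- a well placed comment --><?pi data?> <dd>definition</dd></dl>';

my $doc = XML::DOM::Parser->new->parse($xml);
my $dt  = $doc->getElementsByTagName('dt')->item(0);

my $naive = $dt->getNextSibling;         # the comment node, not the dd
my $dd    = next_element_sibling($dt);   # the dd element

printf "naive sibling is an element: %s\n",
    $naive->getNodeType == ELEMENT_NODE ? 'yes' : 'no';
printf "safe sibling: %s\n", $dd ? $dd->getTagName : 'none';
```

Code that calls getTagName or getAttribute directly on the naive result is the kind that breaks on this input, since the comment node has no such element methods. The helper also returns undef at the end of the sibling list, so callers still have to check for that case.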
