PerlMonks  

Re^2: Best XML library to validate XML from untrusted source

by Jenda (Abbot)
on Oct 20, 2014 at 15:07 UTC ( [id://1104450]=note )


in reply to Re: Best XML library to validate XML from untrusted source
in thread Best XML library to validate XML from untrusted source

XML::LibXML::Reader is way too low-level, and while the pull style tends to lead to (very slightly) more readable code than ordinary node-level push, it's still nothing I would dare to recommend ... to anyone.

XML::Rules and XML::Twig give you the file in bite-sized chunks, which IMNSHO works much better than forcing a decomposition into individual atoms.

Speaking of XML::Rules ... it's based on XML::Parser::Expat and lets you set Expat's handlers, so I think pointing Expat's ExternEnt handler at your own code should give vsespb the protection he's after.
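To make the ExternEnt idea concrete, here is a minimal sketch. It uses XML::Parser directly rather than XML::Rules (both sit on top of XML::Parser::Expat), so how the handler gets passed through XML::Rules is left open here. The sample document and the @refused list are made up for illustration. Per the XML::Parser documentation, an ExternEnt handler that returns undef signals that the entity could not be found, which should surface as a parse error instead of a fetch.

```perl
use strict;
use warnings;
use XML::Parser;

# Collects the system identifier of every external entity the document asks for.
my @refused;

my $parser = XML::Parser->new(
    Handlers => {
        # Replaces the default file/LWP loader, so nothing is ever fetched.
        ExternEnt => sub {
            my ($expat, $base, $sysid, $pubid) = @_;
            push @refused, $sysid;
            return undef;    # "entity not found" as far as Expat is concerned
        },
    },
);

# Hypothetical hostile input that tries to pull in a local file.
my $xml = <<'XML';
<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY ext SYSTEM "file:///etc/passwd">
]>
<doc>&ext;</doc>
XML

# parse() dies on errors, so trap it and report what happened.
my $ok = eval { $parser->parse($xml); 1 };
print $ok ? "parsed\n" : "rejected: $@";
print "refused external entity: $_\n" for @refused;
```

Whether you reject the document outright (as above) or return an empty string to silently drop the entity is a policy choice; for untrusted input, rejecting seems the safer default.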

Jenda
Enoch was right!
Enjoy the last years of Rome.

Replies are listed 'Best First'.
Re^3: Best XML library to validate XML from untrusted source
by ikegami (Patriarch) on Oct 20, 2014 at 15:16 UTC

    I have no idea why you wouldn't recommend

    use XML::LibXML::Reader qw( );
    my $reader = XML::LibXML::Reader->new(
        location        => $file_or_url,
        load_ext_dtd    => 0,
        expand_entities => 0,
    );
    1 while $reader->read;

    Wrapping this up just so you get something you can call higher-level is simply pure waste.

      Say, because it doesn't do anything? I mean, yes, it does some kind of basic format validation, but once you actually need to extract some data out of the file, things start getting complicated very quickly.

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.

        Nothing? What are you talking about? It validates the file as the OP requested.

        ...Actually, that's not what he requested. He requested

        use XML::LibXML::Reader qw( XML_READER_TYPE_ENTITY_REFERENCE );
        my $reader = XML::LibXML::Reader->new(
            location        => $file_or_url,
            load_ext_dtd    => 0,
            expand_entities => 0,
        );
        while ($reader->read) {
            die if $reader->nodeType == XML_READER_TYPE_ENTITY_REFERENCE;
        }

        What exactly is the problem???
