It might be difficult but I'll try anyway ;--).

At least here are a few hints:

  • document size: big documents exclude most tree-oriented modules, such as XML::Simple, XML::DOM and XML::XPath, because these load the whole document into memory.
    "Big" depends on your RAM and on the expansion factor of the module (memory used relative to the size of the file), typically between 7 and 10.
  • type of XML: document-oriented XML excludes modules such as XML::Simple and XML::SimpleObjects.
    Those modules don't deal with mixed content (<p>this is <b>mixed</b> content</p>).
  • ease of use: although this is highly subjective, XML::Simple seems to be considered really easy to use, as it completely masks the XML by loading it into a Perl data structure (a pretty convoluted data structure IMHO, use Data::Dumper!). Tree-based modules (XML::XPath, XML::DOM, XML::Twig) are generally easier to use than stream-based ones, although for simple data extraction XML::PYX is very convenient.
  • speed: at the moment XML::Parser is the fastest (all other modules are based on it), but modules based on libXML should be faster soon (XML::XPath 2.0 for example). Stream-based modules are usually faster than tree-based ones.
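
To make the mixed-content and tree-versus-stream points a bit more concrete, here is a minimal sketch, assuming XML::Simple and XML::Parser are both installed from CPAN. It feeds the mixed-content fragment from the list above to each module; the variable names ($xml, $data, @events) are mine, chosen for illustration only.

```perl
use strict;
use warnings;
use XML::Simple;     # exports XMLin by default
use XML::Parser;
use Data::Dumper;

# The mixed-content fragment quoted in the list above.
my $xml = '<p>this is <b>mixed</b> content</p>';

# Data-oriented approach: XML::Simple loads the XML into a plain Perl
# data structure. Dump it and look at where the text around <b> ends up:
# a hash cannot record how text and child elements were interleaved.
my $data = XMLin($xml);
print Dumper($data);

# Stream-oriented approach: XML::Parser calls a handler for each event,
# in document order, so the interleaving of text and markup is preserved.
my @events;
my $parser = XML::Parser->new(
    Handlers => {
        Start => sub { my ($expat, $name) = @_; push @events, "start:$name" },
        # Note: the parser may deliver one run of text in several calls.
        Char  => sub { my ($expat, $text) = @_; push @events, "text:$text" },
        End   => sub { my ($expat, $name) = @_; push @events, "end:$name" },
    },
);
$parser->parse($xml);
print "$_\n" for @events;
```

The stream version needs more code for the same input, which is the ease-of-use trade-off mentioned above, but it never holds more than the current event in memory, which is why that style copes with big documents.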

In reply to Re: XML Module decision tree? by mirod
in thread XML Module decision tree? by John M. Dlugosz
