http://www.perlmonks.org?node_id=217797


in reply to XML log files

Entities to the rescue!

You can just create a wrapper that will include just the root element and a call to an entity referencing the log file, which itself has no root tag:

log.xml is:

<?xml version="1.0"?>
<!DOCTYPE log [
  <!ENTITY data SYSTEM "log.data">
]>
<log>&data;</log>

log.data is:

<event time='1234' type='this'>
  <detail>blah</detail><detail>blahblah</detail>
</event>
<event time='1236' type='this'>
  <detail>blah</detail><detail>blahblah</detail>
</event>
<event time='2234' type='that'>
  <detail>weeble</detail><detail>blahblah</detail>
</event>

XML processors should have no problem with this (tested with perl -MXML::Simple -MData::Denter -e'print Denter XMLin( "log.xml");'). You just output your log data to log.data and use log.xml when you want to do XML processing on it.
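As a rough illustration of the "output your log data to log.data" half of this, here is a minimal sketch of an appender. The names xml_escape() and append_event() and the argument order are assumptions made up for this example; nothing in the thread prescribes them. It uses only core Perl.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/'/&apos;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Append one <event> element to the rootless data file.
# append_event() is a hypothetical helper, for illustration only.
sub append_event {
    my ($path, $time, $type, @details) = @_;

    my $xml = sprintf "<event time='%s' type='%s'>\n",
        xml_escape($time), xml_escape($type);
    for my $detail (@details) {
        $xml .= '  <detail>' . xml_escape($detail) . "</detail>\n";
    }
    $xml .= "</event>\n";

    # Open in append mode so existing events are left untouched.
    open my $fh, '>>', $path or die "Cannot append to $path: $!";
    print {$fh} $xml;
    close $fh or die "Cannot close $path: $!";
    return $xml;
}
```

A call such as append_event('log.data', 1234, 'this', 'blah', 'blahblah') would add one event in the layout shown above. Because log.data never carries a closing root tag, appending needs no rewriting of the end of the file.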

Re: Re: XML log files
by gjb (Vicar) on Dec 05, 2002 at 16:27 UTC

    This is a very neat trick, I like it a lot.

    There might be one catch though: according to the XML specs, a non-validating parser may, but doesn't have to, include the external entity (i.e. the file the URI is referring to).

    If I interpret the specs correctly, this means that this feature is implementation dependent.

    Just my 2 cents, -gjb-

      Indeed. XML::Parser will do it (and thus so will all the modules based on it, such as XML::Simple, XML::Twig, XML::DOM, XML::XPath...). I suspect XML::LibXML will do it too, along with the modules based on it (you can base most of the SAX modules on it), but I don't think XML::SAX::PurePerl will.
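If the parser at hand turns out not to expand the external entity, one possible fallback (not something proposed in the thread) is to add the root element yourself before parsing. The sketch below uses only core Perl; wrap_log_data() is a hypothetical helper name, and it reads the whole file into memory, so it only suits logs of modest size.

```perl
use strict;
use warnings;

# Return the contents of a rootless log file wrapped in a single <log> root,
# so the result is a well-formed document that needs no external entity.
# wrap_log_data() is a hypothetical helper, not part of any module.
sub wrap_log_data {
    my ($path) = @_;

    open my $fh, '<', $path or die "Cannot read $path: $!";
    my $data = do { local $/; <$fh> };    # slurp the whole file
    close $fh;
    $data = '' unless defined $data;      # treat an empty file as no events

    return qq{<?xml version="1.0"?>\n<log>\n$data</log>\n};
}
```

The returned string can then be handed to any parser that accepts XML text rather than a file name, for instance XMLin(wrap_log_data('log.data')) with XML::Simple.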

Re: Re: XML log files
by dingus (Friar) on Dec 05, 2002 at 16:27 UTC
    Entities to the rescue!

    I knew there would be a nice XML way to do this. Merci Beaucoup, Muchos Gracias, Vielen Dank, Spacebo, Arigatou, Kiitos, Tusen Takk, Obrigado and Sanctuary match!

    In fact, at least for temporary hacking, it is possible to omit log.xml. The following code works:

    use XML::Simple;
    use Data::Dumper;

    # The heredoc interpolates $logfn into the entity declaration.
    $logfn = '/path/to/log.data';
    print Dumper (XMLin(<<EOENT ));
    <?xml version="1.0"?>
    <!DOCTYPE log [
      <!ENTITY data SYSTEM "$logfn">
    ]>
    <log>&data;</log>
    EOENT

    Dingus


    Enter any 47-digit prime number to continue.
      That is "Tusen Tack", not "Takk".

      *Grin*


      You have moved into a dark place.
      It is pitch black. You are likely to be eaten by a grue.

      You actually forgot xièxie, m'goi sai, khop-khun krap, terima kasih, salamat and Danggschee? What an ungrateful person... ;-)

Re: Re: XML log files
by grantm (Parson) on Dec 05, 2002 at 20:35 UTC

    That is a cool trick. In real life of course XML::Simple (or any of the DOM modules) would probably be unsuitable for processing the log files since they all create a tree representing the whole contents of the file in memory. XML::Twig would be a fine choice or alternatively a SAX approach would work too.
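To make the XML::Twig suggestion concrete, here is a minimal sketch that handles one event at a time instead of building the whole tree. It assumes the wrapper file from the parent post and that the external entity gets expanded, as reported elsewhere in this thread; for_each_event() and its callback signature are made up for this example.

```perl
use strict;
use warnings;
use XML::Twig;

# Call $callback->($time, $type, @details) for each <event>, then free it.
# for_each_event() is a hypothetical helper name, for illustration only.
sub for_each_event {
    my ($wrapper_file, $callback) = @_;

    my $twig = XML::Twig->new(
        twig_handlers => {
            # Runs each time an <event> element has been completely parsed.
            event => sub {
                my ($t, $event) = @_;
                my @details = map { $_->text } $event->children('detail');
                $callback->($event->att('time'), $event->att('type'), @details);
                $t->purge;    # release the parts of the tree parsed so far
            },
        },
    );
    $twig->parsefile($wrapper_file);
    return;
}
```

For example, for_each_event('log.xml', sub { my ($time, $type, @details) = @_; print "$time $type @details\n" }) would print one line per event. The purge call is what should keep memory use roughly flat as the log grows.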

    On the other hand, as merlyn pointed out, YAML might be a better fit. While the extra 'fluff' of XML can compress well, you will have to uncompress the whole file to process it (i.e. without root elements, you couldn't parse the XML from an unzip stream).