http://www.perlmonks.org?node_id=560820

geektron has asked for the wisdom of the Perl Monks concerning the following question:

When I first started looking for XML generation modules, XML::Generator looked like the cleanest, simplest tool for the job: convert a database row (after some data massaging) into an XML document.

Unfortunately, the file I need to generate has some 250 lines, and my code is almost a method-for-attribute match. While annoying, that is not the biggest problem with the code.

The real problem comes when I need to generate the following "looped" section:

<imageinfo>
  <imageurl1>http://www.company.com/images/photo1.jpg</imageurl1>
  <imagecaption1>Caption 1</imagecaption1>
  <imageurl2>http://www.company.com/images/photo2.jpg</imageurl2>
  <imagecaption2>Caption 2</imagecaption2>
  <imageurl3>http://www.company.com/images/photo3.jpg</imageurl3>
  <imagecaption3>Caption 3</imagecaption3>
  <imageurl4>http://www.company.com/images/photo4.jpg</imageurl4>
  <imagecaption4>Caption 4</imagecaption4>
</imageinfo>

XML::Generator can handle that with a simple looping construct wrapped around the calls, i.e.:

my $imageSection = '';
foreach my $imageId ( 1 .. 4 ) {
    my $urlName    = "imageurl${imageId}";
    my $urlCaption = "imagecaption${imageId}";
    # Each call returns one generated element; append both to the fragment.
    $imageSection .= $generator->$urlName();
    $imageSection .= $generator->$urlCaption();
}

This generated fragment is added to a larger document later, via:
$generator->imageinfo( $imageSection ),

The problem comes when trying to avoid the native escaping of data. As expected, I can turn off escaping ... but that's not something I feel I should trust. Two generator objects would *help* -- but the larger part of the generation would be forced to *not* use the escaping.

I have a feeling there's a much better way of doing this, and I'd rather discover that now before I spend the next N days fighting with the mappings needed with some of the DB columns ... only to 'discover' a better way of generating this later on ...

Re: nested loops and escaping in XML::Generator
by Tanktalus (Canon) on Jul 12, 2006 at 21:11 UTC

    I'm confused. On more than one aspect of the question.

    I'm confused about why you want to avoid XML::Generator's auto-escaping of data. Because when you use an XML parser, the data should be auto-unescaped.

    I'm also confused about your data format. Using imageurl# as tag names seems to be the equivalent of, well, using $var1, $var2, $var3, etc., as variable names and then trying to access them via eval STRING. Instead, I would expect:

    <imageinfo>
      <image number="1">
        <url>http://www.company.com/images/photo1.jpg</url>
        <caption>Caption 1</caption>
      </image>
      <image number="2">
        <url>http://www.company.com/images/photo2.jpg</url>
        <caption>Caption 2</caption>
      </image>
      <image number="3">
        <url>http://www.company.com/images/photo3.jpg</url>
        <caption>Caption 3</caption>
      </image>
      <image number="4">
        <url>http://www.company.com/images/photo4.jpg</url>
        <caption>Caption 4</caption>
      </image>
    </imageinfo>

    This should make things easier to deal with in many ways. (The number attribute probably shouldn't be there either.)

    I'm just not sure why you're doing things this way - both the attempt to bypass escaping, and the data layout.

      "I'm confused about why you want to avoid XML::Generator's auto-escaping of data. Because when you use an XML parser, the data should be auto-unescaped."

      Because I'm trying (for lack of a better way of doing it) to insert pre-rolled XML into a larger XML document. Without disabling escaping, I get:

      <imageinfo>
      &lt;imageurl1 /&gt;&lt;imagecaption1 /&gt;&lt;imageurl2 /&gt;&lt;imagecaption2 /&gt;&lt;imageurl3 /&gt;&lt;imagecaption3 /&gt;&lt;imageurl4 /&gt;&lt;imagecaption4 /&gt;
      </imageinfo>

      From:
      $generator->imageinfo( $imageSection ),
      (remember that $imageSection was generated within a loop). Not a pretty sight, not valid XML ...

      "I'm also confused about your data format."

      So am I. It's the requirement from a third party, and cannot be changed. Essentially, I'm trying to work *around* someone else's bizarro-world XML format.

        Ok, that's extremely helpful information. The data format is an "I wanna slap the guy who forced this on the world" answer. I can accept that. There are many people I know who deserve slapping. Of course, I say this with a severe Captain-Jack-Sparrow accent ;-)

        The first part of your node, however, is really helpful. I didn't understand the problem at first. Now, however, it looks like you want to do this (based only on the docs, not on trying it myself):

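
        use XML::Generator;
        # escape => 1 (any true value other than 'always') keeps escaping on for plain strings
        my $generator = XML::Generator->new( escape => 1 );
        # a scalar reference is meant to be inserted as-is (per the docs, untested)
        $generator->imageinfo( \$imageSection );
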
        To me, this seems like a reasonable tradeoff. You're saying, "Trust me, this is valid XML here." And you're right - it is. Other than abandoning XML::Generator, this seems like your best option. You get escaping most of the time (which is what you should want), and can override it in the rare circumstance where you really want to.
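
A minimal sketch of that tradeoff in core Perl, not XML::Generator's actual implementation. The helpers `escape_xml` and `element` are hypothetical stand-ins, written only to show why a pre-built fragment passed as a plain string gets escaped a second time, and how treating a scalar reference as trusted markup avoids that while escaping stays on for ordinary data. The loop is cut to two images for brevity.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that matter in element content.
sub escape_xml {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # ampersand first, so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: plain strings are data and get escaped;
# scalar references are taken as already-valid markup and inserted as-is.
sub element {
    my ( $name, $content ) = @_;
    my $body = ref($content) eq 'SCALAR' ? $$content : escape_xml($content);
    return "<$name>$body</$name>";
}

# Build the numbered fragment in a loop, escaping each leaf value.
my $fragment = '';
for my $id ( 1 .. 2 ) {
    $fragment .= element( "imageurl$id",     "http://www.company.com/images/photo$id.jpg" );
    $fragment .= element( "imagecaption$id", "Caption $id" );
}

my $escaped = element( 'imageinfo', $fragment );     # markup is escaped a second time
my $trusted = element( 'imageinfo', \$fragment );    # markup is inserted untouched

print "$escaped\n$trusted\n";
```

The opt-out is only as safe as the code that built the fragment, so every value inside it should already have gone through escaping before the reference is passed along.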

        Personally, I wouldn't have used XML::Generator, but XML::Twig, based on my XML decision tree. But the wonderful thing about Perl is that there are so many ways to do things - do it the way that makes the most sense to you ;-)