Try to do a merge sort on two streams that give you records via call backs. It is just plain impossible. Callbacks are superficially/psychologically different than iterators, true. But they are also fundamentally less flexible. So your comparison only applies in the simple cases. In the complex cases, you find your hands tied and your work much more difficult.

But, there is no reason why XML modules can't be cast as iterators instead of forcing you to use callbacks. That is just too-lazy/not-smart-enough module design (it just takes a bit more work and knowing enough to realize that it is possible and can be important). With an interator-based XML module, things will be better.

But that would still force a linear approach which isn't as flexible as the (often inefficient) data structure approach. Providing ways to 'seek' your XML iterators can help with some cases but not others.

You can also come from the other end of the spectrum and make the data structure versions more efficient by having them compute things only as they are needed which is also one step on the road to being able to not keep everthing in RAM at once. On-demand data structures for XML is probably about the closest you can come to "the best of both worlds". Of course, the complexity involved means that it will never be as efficient/convenient when a much simpler approach fits what needs to be done. But that difference in efficiency is not something that I find worth worrying about (though I do prefer having choices for the sake of convenience).

It appears that XML::Twig tries to be several of these points on the spectrum. I parse XML with regular expressions1 so I've never used it, but I hear good things about it. (:

                - tye

1And I suspect many that say you shouldn't parse XML with a regex don't know how to do it right. For example, "ways to rome" just does it wrong. Of course you shouldn't do it that way!

In reply to Re: Are you looking at XML processing the right way? (merge) by tye
in thread is XML too hard? by thraxil

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":