in reply to [SOLVED] Moving elements with XML::Twig: Document root closing tag is duplicated on output

Even if i dunno exactly what happen with your code, i'm sure it is wrong. the way you set up the handler made me immediatly suspicious. In fact I never seen a syntax like your

'/shipment/box' => sub { _move(@_); 1; }

Even the SYSNOPSIS for the XML::Twig reports the example on how to call a subroutine to process the part intercepted by th handler:

list    => \&my_list_process,

The, perfectly correct and valid, style to call a sub with &subname is described in perlsub where is stated &NAME;       # Makes current @_ visible to called subroutine. If someone tell you is Perl4 rubbish.. let me know.. ;)

So changing your

# '/shipment/box' => sub { _move(@_); 1; } #wrong # to '/shipment/box' => \&_move,

makes everything much clearer and Twiggy.

For the rest I can say that, also you have an unneeded 1; as last thing in your sub. Seems more logical to have an exit condition if no slip are found (..ooops):

sub _move { unless ($_[1]->descendants('slip')){ $_[0]->purge(); return; } foreach my $descendant ...

This way the program runs as we expected:

#!/usr/bin/perl # use 5.020; unneeded aside from the say use strict; use warnings; use XML::Twig; ## ## This works as expected ## my $working_xml = XML::Twig->new( twig_handlers => { #'/shipment/box' => sub { _move(@_); 1; }, #why? and why 1; a +t the end? '/shipment/box' => \&_move, }, pretty_print => 'indented', )->safe_parse( <<"XML" <?xml version="1.0" encoding="UTF-8"?> <shipment> <box> <packing> <slip>potato</slip> <slip>pear</slip> <slip>peach</slip> </packing> </box> <box> <packing> <slip>apple</slip> </packing> </box> </shipment> XML ); print "\n\n---------------------\n\n"; ## ## This ALSO WORKS FINE NOW!! ## my $broken_xml = XML::Twig->new( twig_handlers => { '/shipment/box' => \&_move, }, pretty_print => 'indented', )->safe_parse( <<"XML" <?xml version="1.0" encoding="UTF-8"?> <shipment> <box> <packing> <slip>potato</slip> <slip>pear</slip> <slip>peach</slip> </packing> </box> <box> <packing> <note>nothing here!</note> </packing> </box> </shipment> XML ); # ---------- sub _move { unless ($_[1]->descendants('slip')){ $_[0]->purge(); return; } foreach my $descendant ( $_[1]->descendants('slip') ) { $descendant->set_att('_TYPE' => 'produce'); $descendant->move('before', $_[1]); } $_[1]->delete; $_[0]->flush(); #1; # why? } ## OUTPUT <?xml version="1.0" encoding="UTF-8"?> <shipment> <slip _TYPE="produce">potato</slip> <slip _TYPE="produce">pear</slip> <slip _TYPE="produce">peach</slip> <slip _TYPE="produce">apple</slip> </shipment> --------------------- <?xml version="1.0" encoding="UTF-8"?> <shipment> <slip _TYPE="produce">potato</slip> <slip _TYPE="produce">pear</slip> <slip _TYPE="produce">peach</slip> </shipment>

L*

There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
  • Comment on Re: Moving elements with XML::Twig: Document root closing tag is duplicated on output -- solution
  • Select or Download Code

Replies are listed 'Best First'.
Re^2: Moving elements with XML::Twig: Document root closing tag is duplicated on output -- solution
by Tanktalus (Canon) on Feb 21, 2016 at 03:54 UTC

    There's nothing wrong with '/shipment/box' => sub { _move(@_); 1; }. This is perfectly valid code. It means, call _move with all my parameters, but always return 1 - and this is all wrapped up with a code reference (it's an anonymous, or unnamed, sub). That return code, if you read the XML::Twig documentation, is relevant: "If a handler returns a true value and other handlers apply, then the next applicable handler will be called." I'm not sure what other handler there is, but there may be something trying to flush again here that I'm not seeing. When you change the return value, you may be changing the calling of this other handler, and that may stop the flushing. However, since handlers are documented to have meaningful return codes, it's probably best if you provide said meaningful return code.

      thanks Tanktalus to have taken the time to clarify this matter to me.

      Anyway maybe mirod can explain the details. You know (or well maybe not..) that my Perl knoweledge is a little.. empyric and if by one hand i tend to try many times unusefull and dangerous things with Perl, by other hand i tend to play gently when using modules, trying to keep me more close that i can to what the module's docs say. Because i ignore what XML::Twig does under the wood. Someone can learn Perl, no one CPAN (at least not me).

      L*

      There are no rules, there are no thumbs..
      Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re^2: Moving elements with XML::Twig: Document root closing tag is duplicated on output -- solution
by ateague (Monk) on Feb 22, 2016 at 19:59 UTC

    Pretty much what Tanktalus said earlier. This code is a vastly simplified example piece that reproduces the issue in question. I tried to trim out as much unnecessary cruft to make the example as small and concise as possible while still remaining relevant

    In answer to some of your questions:

    1. the way you set up the handler made me immediatly suspicious.

      the live code has some additional processing that happens after the twig is processed. It would look something like the following simplified example:

      sub { my $local_state = _move(@_); $global_state = _do_stuff_with_local_state($local_state->{SUMMARY}, \%VARS);  1; },
    2. The, perfectly correct and valid, style to call a sub with &subname

      Correct. The actual code passes extra variables to the twig processing subroutine in addition to the default TWIG and ELEMENT variables passed to the handler:

      sub { _move(@_, $local_state, \%OPTIONS, $warning_flag); 1; },
    3. For the rest I can say that, also you have an unneeded 1; as last thing in your sub.

      Pretty much what Tanktalus said earlier.

      The function called by the handler returns '1' in my simplified example (instead of variables) because it is an explicit 'true' return value

      The anonymous sub returns '1' because there are multiple twig handlers that may apply to a given twig:

      '/shipment/box' => sub { _move(@_); 1; }, '/shipment/box[@location='Vault_111']' => sub { _moveAgain(@_); 1; },

    Hope that clarifies things.

      yes clarify enough..

      I see you have more experience that me. I just tried to bypass the unwanted behaviour cleaning as much as possible your code.

      As for my first sentence i dunno what happen with your code. Anyway i msgd also the XML::Twig author about the possibility to spend some spare time with this problem.

      Anyway the general advise of keep things simple is still valid. A stream of XML and the parser involved is something that can block even rough stomachs, at least mine. I'm sure (at this point) that you read carefully the docs, but in the part dedicated to twig_handlers is declared:

      Warning: if you have used purge or flush on the twig the element might not be complete, some of its children might have been entirely flushed or purged, and the start tag might even have been printed (by flush) already, so changing its gi might not give the expected result.

      that might be relevant in your case.

      Have you noticed that changing $_[0]->flush(); to $_[0]->print(); in the sub does not provoke the unwanted behaviour?

      But in the docs is said that print must be used AFTER the parse, so I've commented it in the sub and I've added it after the safe_parse $working_xml->print(); and $broken_xml->print(); respectively: the bad doubled closing tag does not appears anymore.

      Share your further discoveries and good luck.

      L*

      There are no rules, there are no thumbs..
      Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.