http://www.perlmonks.org?node_id=62784


in reply to XML::Parser Tutorial

This is nice, but I would rather know how to use XML::Parser by subclassing it. All of my attempts to do this ended up as very unclean, OOP-unfriendly code: I ended up storing results in package-global variables rather than in object attributes. This is both ugly and thread-unsafe.

Is there a clean way to subclass XML::Parser?

Re: Re: XML::Parser Tutorial
by mirod (Canon) on Mar 07, 2001 at 21:29 UTC

    The problem is probably that XML::Parser is an object factory: it generates XML::Parser::Expat objects with each parse or parsefile call. The handlers then receive XML::Parser::Expat objects and not XML::Parser objects.
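
    A minimal sketch to make this visible, assuming nothing beyond XML::Parser itself: the handler records the class of its first argument on each call, and the `@seen` array is only an illustrative name.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Parser;

    my @seen;    # class of the first handler argument, one entry per Start event

    my $p = XML::Parser->new(
        Handlers => {
            # The first argument is the per-parse Expat object, not $p itself
            Start => sub { my ($pe) = @_; push @seen, ref $pe; },
        },
    );

    $p->parse('<a />');
    $p->parse('<b />');

    print "parser:  ", ref($p), "\n";    # parser:  XML::Parser
    print "handler: $_\n" for @seen;     # handler: XML::Parser::Expat (twice)
    ```

    So anything you stash in a subclass of XML::Parser is not directly visible from inside a handler, which is why subclassing alone does not help.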

    There is a way to store data in the XML::Parser object and to access it in the handlers though: use the 'Non-Expat-Options' argument when creating the XML::Parser:

    #!/bin/perl -w
    use strict;
    use XML::Parser;

    my $p= new XML::Parser( 'Non-Expat-Options' => { my_option => "toto" },
                            Handlers            => { Start => \&start, }
                          );
    $p->parse( '<a />');

    sub start
      { my( $pe, $elt, %atts)= @_;
        print "my option: ", $pe->{'Non-Expat-Options'}->{my_option}, "\n";
      }

    This is certainly ugly but it works!

    Update: note that the data is still stored in the XML::Parser object though, as shown by this code:

    #!/bin/perl -w
    use strict;
    use XML::Parser;

    my $p= new XML::Parser( 'Non-Expat-Options' => { my_option => "1" },
                            Handlers            => { Start => \&start, }
                          );
    $p->parse( '<a />');
    $p->parse( '<b />');

    sub start
      { my( $pe, $elt, %atts)= @_;
        print "element: $elt - my option: ", $pe->{'Non-Expat-Options'}->{my_option}++, "\n";
        $p->parse( '<c />') unless( $pe->{'Non-Expat-Options'}->{my_option} > 3);
      }

    Which outputs:

    element: a - my option: 1
    element: c - my option: 2
    element: c - my option: 3
    element: b - my option: 4
Re: Re: XML::Parser Tutorial
by merlyn (Sage) on Mar 07, 2001 at 21:11 UTC
    Why do you want to subclass it? It works much better as a "has-a" than an "is-a", unless you want to get very cozy with the base class implementation, which is a maze of twisty tiny packages all alike.

    Just delegate the methods that you want to provide in your interface, and handle the rest. Make a hash with one of the elements being your "inherited" parser. I believe it's called the "wrapper" pattern, but I don't name my patterns—I just use them!
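
    A rough sketch of that delegation idea, not a definitive implementation: `MyParser`, its `elements` accessor and the `_start` method are hypothetical names invented for illustration. The wrapper keeps an XML::Parser in its own hash, forwards `parse` and `parsefile` to it, and collects results in object attributes. The weakened reference is an added precaution so that the closure stored inside the parser does not keep the wrapper alive forever.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    package MyParser;    # hypothetical wrapper class ("has-a" XML::Parser)
    use XML::Parser;
    use Scalar::Util qw(weaken);

    sub new {
        my ($class) = @_;
        my $self = bless { elements => [] }, $class;

        # Weak copy: the handler closure lives inside $self->{parser},
        # so a strong reference to $self here would create a cycle.
        weaken( my $weak = $self );

        $self->{parser} = XML::Parser->new(
            Handlers => { Start => sub { $weak->_start(@_) } },
        );
        return $self;
    }

    # Delegate only the methods we want to expose
    sub parse     { my $self = shift; $self->{parser}->parse(@_);     return $self }
    sub parsefile { my $self = shift; $self->{parser}->parsefile(@_); return $self }

    # Results live in the object, not in package globals
    sub elements { return @{ $_[0]{elements} } }

    sub _start {
        my ( $self, $expat, $element, %attrs ) = @_;
        push @{ $self->{elements} }, $element;
    }

    package main;

    my $mp = MyParser->new;
    $mp->parse('<doc><item /><item /></doc>');
    print join( ', ', $mp->elements ), "\n";    # doc, item, item
    ```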

    -- Randal L. Schwartz, Perl hacker

      Well, but .... (there is always a 'but') :-)

      Suppose I do not subclass XML::Parser. How, then, do I pass parameters to the XML::Parser handler methods and collect their results without using global variables in the XML::Parser package? The only object the handler methods receive is the Expat object itself, and it has no place for additional parameters or for the handlers' results.

      And if I subclass XML::Parser, the only advantage I gain is that the global variables live in my own package namespace instead of XML::Parser's. This does not look to me like a good example of object-oriented programming style.

      A possible solution is the one mirod suggested, using Non-Expat-Options, but it is only a little less ugly than these two.

      The best solution would be to force XML::Parser to use my custom subclass of XML::Parser::Expat instead of XML::Parser::Expat itself. Is there a way to do that?

        The way to do this, without relying on the fact that $p is a hashref, is to pass closures as the handlers, with an object that you created captured in each closure. This is how PerlSAX is implemented.

        Witness:

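        use XML::Parser;

        # $handler is captured by the closure, so no global state is needed
        my $handler = bless {}, "MyHandler";
        my $p = XML::Parser->new(Handlers => {
           Start => sub { $handler->handle_start(@_) }
        });

        package MyHandler;

        sub handle_start {
          # $p here is the XML::Parser::Expat object that every handler receives
          my ($handler, $p, $element, %attribs) = @_;
          ...
        }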
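
        For a version that runs end to end, here is a minimal sketch under some assumptions: the `count_elements` helper, the `MyHandler->new` constructor and the `count` attribute are all made up for illustration. Each call builds its own handler object, so results from separate parses never share state.

        ```perl
        #!/usr/bin/perl
        use strict;
        use warnings;
        use XML::Parser;

        package MyHandler;

        sub new { my ($class) = @_; return bless { count => {} }, $class }

        sub handle_start {
            # $expat is the XML::Parser::Expat object; the results go into $self
            my ( $self, $expat, $element, %attribs ) = @_;
            $self->{count}{$element}++;
        }

        package main;

        # Hypothetical helper: parse a string, return a hashref of element counts
        sub count_elements {
            my ($xml) = @_;
            my $handler = MyHandler->new;
            my $p = XML::Parser->new(
                Handlers => { Start => sub { $handler->handle_start(@_) } },
            );
            $p->parse($xml);
            return $handler->{count};
        }

        my $first  = count_elements('<a><b /><b /></a>');
        my $second = count_elements('<c />');

        print "first:  a=$first->{a} b=$first->{b}\n";    # first:  a=1 b=2
        print "second: c=$second->{c}\n";                 # second: c=1
        ```

        Because the handler object is a lexical inside the helper, nothing is stored in package globals, and two parses cannot step on each other's results.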