 
PerlMonks  


What you're missing is a grasp of "event-driven" programming. It's a distinct style of programming, just as OOP is (though the two are not mutually exclusive; XML::Twig is both OO and event-driven). An event-driven parser is a good example of this programming model. (It is also the norm in GUI programming.) Such a parser has a core functionality (namely, parsing text according to some syntax), but the programmer can customize it by "registering" subroutines with the parser, each associated with a specific parsing event (e.g. finding a closing tag). The parser then invokes these pre-registered subroutines, with a pre-specified set of arguments, at the appropriate times during the parsing. The subroutines one "registers" with the parser are called "callbacks" or "handlers" [1].
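To make the model concrete, here is a minimal sketch in plain Perl. It is not XML::Twig and not how XML::Twig is implemented; ToyParser, its on method, and the event names "open", "close" and "text" are all made up for this illustration. It only shows the division of labour: the parser owns the parsing loop, and the programmer supplies subs to be called back.

```perl
use strict;
use warnings;

# Hypothetical toy parser: NOT XML::Twig, just an illustration of the model.
package ToyParser;

sub new {
    my ($class) = @_;
    return bless { handlers => {} }, $class;
}

# Register a callback for a named event ("open", "close", or "text").
sub on {
    my ($self, $event, $callback) = @_;
    $self->{handlers}{$event} = $callback;
    return $self;
}

# Core functionality: split the input into tags and text, and invoke
# whichever callbacks were registered, passing each a fixed argument list.
sub parse {
    my ($self, $input) = @_;
    while ($input =~ m{\G(?:<(/?)(\w+)>|([^<]+))}gc) {
        my ($slash, $name, $text) = ($1, $2, $3);
        my ($event, $arg) = defined $text ? ('text',  $text)
                          : $slash        ? ('close', $name)
                          :                 ('open',  $name);
        my $callback = $self->{handlers}{$event} or next;
        $callback->($self, $arg);
    }
    return;
}

package main;

my @log;
my $parser = ToyParser->new;
$parser->on(close => sub { my ($p, $name) = @_; push @log, "closed $name" });
$parser->on(text  => sub { my ($p, $text) = @_; push @log, "text '$text'" });

# Nothing happens until parse() runs; it drives all the callbacks.
$parser->parse('<a>hi</a><b>yo</b>');
print "$_\n" for @log;
```

No handler is registered for "open" here, so opening tags are silently skipped; the script logs the two text events and the two close events in the order the parser meets them.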

The subs topic and extpage are two such handlers. They get invoked by the parser whenever it finishes parsing a Topic or ExternalPage section. They each receive two arguments from the parser: the XML::Twig object and the XML element that the parser just finished parsing. (This answers your first question.)

These two subroutines run separately from each other; in other words, neither of them calls the other one. This rules out direct communication between the two subs. One way around this is for them to communicate through a shared variable (here, %links). In this case indirect communication is necessary, since extpage cannot backtrack over the XML to see what links, if any, were found by topic. In the code I wrote only the keys of %links are used; saving the actual link objects as the values corresponding to these keys is just there for some potential future use. The code would work just as well if those values were all 1, say.

Note that these two subroutines run multiple times during the parsing operation. This is a key point. It is not the case that all the calls to topic happen first, and then all the calls to extpage. The calls to the two subs are interleaved, following the order in which the Topic and ExternalPage sections appear in the file.
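The following sketch shows the interleaving and the shared-variable communication on a tiny document. The XML layout, the attribute names (id, resource, about), the @trace array and the bodies of the two subs are assumptions invented for this example, not the code from earlier in the thread; only the handler names and %links come from the discussion above. The values in %links are simply 1, as mentioned. The call to purge is likewise an assumption, included because memory use is what the thread is about.

```perl
use strict;
use warnings;
use XML::Twig;

# Made-up sample data; the real file's layout is an assumption here.
my $xml = <<'XML';
<RDF>
  <Topic id="Top/Arts"><link resource="http://a.example/"/></Topic>
  <ExternalPage about="http://a.example/"><Title>A</Title></ExternalPage>
  <Topic id="Top/Games"><link resource="http://b.example/"/></Topic>
  <ExternalPage about="http://b.example/"><Title>B</Title></ExternalPage>
  <ExternalPage about="http://c.example/"><Title>C</Title></ExternalPage>
</RDF>
XML

my %links;   # shared state: written by topic(), read by extpage()
my @trace;   # records the order in which the parser calls the handlers

# Called by the parser each time it finishes a <Topic> element.
sub topic {
    my ($twig, $topic) = @_;
    $links{ $_->att('resource') } = 1 for $topic->children('link');
    push @trace, 'topic ' . $topic->att('id');
    $twig->purge;    # free the part of the document parsed so far
}

# Called by the parser each time it finishes an <ExternalPage> element.
sub extpage {
    my ($twig, $page) = @_;
    my $url   = $page->att('about');
    my $known = exists $links{$url} ? 'linked' : 'unlinked';
    push @trace, "extpage $url ($known)";
    $twig->purge;
}

my $twig = XML::Twig->new(
    twig_handlers => { Topic => \&topic, ExternalPage => \&extpage },
);
$twig->parse($xml);    # this call is what triggers every handler invocation

print "$_\n" for @trace;
```

The trace comes out in document order: topic, extpage, topic, extpage, extpage. The last page is reported as unlinked because no Topic seen before it mentioned its URL, which is exactly the kind of information extpage can only get through %links.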

...coz I'm really doesn't know how these two subroutine connect with each other or what is the run order of them?

The parser takes care of invoking the subroutines at the right time during the parsing; in this case, each one gets invoked once the parser finishes parsing a Topic or an ExternalPage section, respectively. This all happens as the result of the call to $twig->parsefile('./sample.xml'); it is this call that sets off the whole sequence of events that ultimately causes the parser to invoke the handlers.

[1] Sometimes they are also called "hooks", although I have also seen the term "hook" used to refer to the places in the source code for the parser (or whatever) where the callbacks are invoked. You can think of these "hooks" as places provided by the author of the parser where the programmer using the parser can "hang" custom code.

Update: The first chapter of Higher-Order Perl (HOP) has a nice discussion of callbacks.

the lowliest monk


In reply to Re^5: Memory errors while processing 2GB XML file with XML:Twig on Windows 2000 by tlm
in thread Memory errors while processing 2GB XML file with XML:Twig on Windows 2000 by nan
