WWW::Mechanize::TreeBuilder and WWW::Mechanize. Following links but can't return without error
by mdro79 (Initiate)
on Jan 04, 2013 at 23:07 UTC
mdro79 has asked for the
wisdom of the Perl Monks concerning the following question:
Hello, I am trying to better my understanding of WWW::Mechanize. I have built a simple website of a few pages to practice traversing with WWW::Mechanize and reading HTML tags, attributes and content with WWW::Mechanize::TreeBuilder.
The website I built is quite simple for now. It contains a top-level index.html, which contains a single table. Each table row has a few cells containing text and links. I am trying to read the links, follow each one to the next page, gather some data, print it, then come back to the next row of the table.
Ultimately I would like to traverse a large table and decide row by row whether to store data from that row and follow its link to the next page, or to skip the row because it doesn't meet my criteria and move on to the next one with no action taken.
I am starting with a simple test skeleton: my index.html page, with rows and links leading to a few other pages (s1.html, s2.html, s3.html).
I run into problems after leaving the current page while looping through the list of links. I would like to leave, gather and print some data, then come back and continue my loop with the next link. What actually happens is that my program crashes at this point, complaining of uninitialized values in /path/to/HTML/Element.pm. With all that said, here is the code I am having problems with. If I can get my page following and retreating logic nailed down properly, that will be a big step for me.
What I believe is happening is the program runs the main loop OK, but when it leaves the current page, something happens to @list. I don't know what, but leaving the page with $mech->get() seems to break my program.
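One possible explanation, offered as a guess rather than a diagnosis: if @list holds HTML::Element objects taken from the index page, those objects belong to a parse tree that WWW::Mechanize::TreeBuilder appears to discard when $mech->get() loads a new page, which would leave them empty. A minimal sketch of one way around that is to copy plain strings out of each row first and only then follow the links. The helper name collect_linked_titles, the $wanted callback, and the choice to record each page's title are illustrative assumptions, not part of the original code.

```perl
use strict;
use warnings;
use URI;
use WWW::Mechanize;
use WWW::Mechanize::TreeBuilder;

# Hypothetical helper (not from the original post): read every table row
# on $index_url, keep the rows accepted by $wanted, then visit each link.
sub collect_linked_titles {
    my ( $index_url, $wanted ) = @_;
    $wanted ||= sub { 1 };    # default criterion: accept every row

    my $mech = WWW::Mechanize->new( autocheck => 1 );
    WWW::Mechanize::TreeBuilder->meta->apply($mech);
    $mech->get($index_url);

    # Pass 1: while the index page's tree is still current, store only
    # plain strings (row text and absolute URL), never element objects.
    my @rows;
    for my $tr ( $mech->look_down( _tag => 'tr' ) ) {
        my $link = $tr->look_down( _tag => 'a' ) or next;    # row has no link
        my $href = $link->attr('href')           or next;
        my $text = $tr->as_text;
        next unless $wanted->($text);
        push @rows, {
            text => $text,
            url  => URI->new_abs( $href, $mech->base )->as_string,
        };
    }

    # Pass 2: the rows are plain data now, so leaving the index page
    # cannot invalidate anything the loop still needs.
    for my $row (@rows) {
        $mech->get( $row->{url} );
        $row->{title} = $mech->title;    # may be undef if the page has no title
    }
    return @rows;
}
```

A call such as `my @rows = collect_linked_titles( $url, sub { $_[0] =~ /keep/ } );` would then follow only rows whose text contains "keep" (the pattern is a placeholder). Because nothing from the index page's tree is needed after the first pass, there is no need to navigate back to it between rows.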