PerlMonks
Re^3: Parsing HTML
by mirod (Canon) on Jun 12, 2012 at 11:56 UTC (#975759)
It's a bit of a pain to figure out where to look, but the as_text method comes from HTML::Element. If you look at the docs, you'll see that in addition to as_text there is also an as_trimmed_text method. It looks like you could use it.

The second foreach loop comes from looking at the HTML source for the page. The data you want is in the p with a class of itinerari-info, in consecutive span elements. Some of the spans can be discarded: the ones with classes of note and strike. That's what the XPath expression returns.

Each span includes a b element with the title, which I get in $info_title, display, then detach to get it out of the way. The rest of the span is the information itself. Does this help?
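For reference, here is a minimal, self-contained sketch of that loop. The HTML snippet, the XPath expression and every variable name except $info_title are assumptions made up for illustration; they are not copied from the real page or from the code earlier in the thread, so adjust them to what you actually have.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Made-up sample markup (an assumption), shaped like the description above:
# a p.itinerari-info holding consecutive span elements, some to be skipped.
my $html = <<'HTML';
<p class="itinerari-info">
  <span><b>Distance:</b> 12 km</span>
  <span class="note">ignored note</span>
  <span><b>Duration:</b>   3 h  </span>
  <span class="strike">old value</span>
</p>
HTML

my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

# Assumed XPath: keep every span except those with class note or strike.
my $xpath = '//p[@class="itinerari-info"]'
          . '/span[not(@class="note") and not(@class="strike")]';

my @rows;
foreach my $span ( $tree->findnodes($xpath) ) {
    # The b element carries the title of this piece of information.
    my ($title_elt) = $span->look_down( _tag => 'b' );
    next unless $title_elt;

    my $info_title = $title_elt->as_trimmed_text;

    # Detach the title so that only the information is left in the span.
    $title_elt->detach;
    my $info = $span->as_trimmed_text;

    push @rows, [ $info_title, $info ];
    print "$info_title $info\n";
}

$tree->delete;    # free the tree once done
```

as_trimmed_text does the same job as as_text, but strips leading and trailing whitespace and collapses runs of whitespace, which saves you a regexp on each value.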