Re^3: Extracting HTML content between the h tags
by vagabonding electron (Chaplain) on Aug 05, 2012 at 14:09 UTC
Thank you very much!
I just tried both approaches, and they work even if the last h2 tag is missing. That happens in about 10 of the more than 400 pages, for which I had used the following workaround:
... with substr as before ...
Fortunately, those pages have only one hr tag each :-)
With your approach the workaround is no longer necessary.
BTW the content after the <h2> is not important.