Well, instead of trolling, why not supply a working example to help?
It's always the Anonymous Monk lacking the courage to put a name to a comment.
Well, instead of trolling, why not supply a working example to help?
It's always the Anonymous Monk lacking the courage to put a name to a comment.
How is it trolling to point out the shortcomings of a "solution"? Maybe you should look up the definition of troll?
What courage is required to point out a simple fact about HTML::Parser? Are you under the impression that HTML::Parser is a high-level parser?
Your "solution" doesn't fetch the portion of the page from class = lastUnit to class = line margin10, so it's incomplete. It is a lot easier/shorter/simpler to use m{\Q$start\E(.+?)\Q$end\E}i instead of that HTML::Parser low-levelness.
Have you seen Re: How to grab a portion of file with regex (don't)? It's not unlike at least three different tutorials/walkthroughs/step-by-step instructions on extracting from the DOM with XPath; some even compare and contrast with HTML::Parser.
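For reference, here is a minimal sketch of the regex approach mentioned above. The sample markup and the exact start/end strings are assumptions made up for illustration (only the class names come from this thread), and the /s modifier is added on top of the pattern quoted above so that the capture can span multiple lines.

```perl
use strict;
use warnings;

# Hypothetical sample page: the class names are from the thread,
# the surrounding markup and content are invented for this sketch.
my $html = <<'HTML';
<div class="header">ignored</div>
<div class="lastUnit"><p>First paragraph</p>
<p>Second paragraph</p></div>
<div class="line margin10">footer</div>
HTML

# Assumed literal start and end markers; adjust to the real page.
my $start = '<div class="lastUnit">';
my $end   = '<div class="line margin10">';

# \Q...\E quotes the markers so they match literally,
# (.+?) captures as little as possible between them,
# /i ignores case and /s lets . match newlines.
my ($portion) = $html =~ m{\Q$start\E(.+?)\Q$end\E}is;

print defined $portion ? $portion : "no match\n";
```

This only holds up while the markers appear in the page exactly as written; a change in attribute order or whitespace breaks the match, which is the usual argument for a DOM-based approach instead.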
You make some valid points. The example in the question didn't seem to need the content of the div.
I do agree that working with the DOM is a much better way to parse HTML.
And for HTML files that are 9,000 GB in size?
There are always limits to everything. I must remind you that I am not the one wanting to parse HTML. I am simply trying to offer guidance. I understand that HTML parsing is a hot topic. However, as a solution to the question asked, HTML::Parser works fine.
And for HTML files that are 9,000 GB in size?
Never mind that 9,000 GB HTML files don't exist; you can still use XML::Twig, naturally.