WWW::Mechanize reading HTML

vit has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I am trying to find out how to feed WWW::Mechanize HTML instead of a URL, without changing the module. One way is to read it from a file, like

my $url  = "file:///D:/webpage.html";

but then I need to specify a root path, which I do not want because it keeps changing, and I also need to write the HTML to this file, which is not good either.

Basically what I need is to parse HTML and get all links from there with some conditional filtering.


Replies are listed 'Best First'.
Re: WWW::Mechanize reading HTML by afoken (Chancellor) on Sep 12, 2011 at 18:09 UTC
"Basically what I need is to parse HTML and get all links from there with some conditional filtering."

Use an HTML parser, like WWW::Mechanize does. WWW::Mechanize uses HTML::Form and HTML::TokeParser. Involving WWW::Mechanize in analysing an HTML file is just nonsense.

Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re^2: WWW::Mechanize reading HTML by vit (Friar) on Sep 12, 2011 at 23:15 UTC
Thanks a lot! I used HTML::TokeParser, which takes the HTML itself as input. Works perfectly.
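For readers landing here later, a minimal sketch of that approach follows. It assumes the HTML is already in a string; the sample markup, the extract_links helper, and the /docs/ filter are made up for illustration and are not from vit's actual code.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample markup; in practice $html comes from wherever the page is produced.
my $html = <<'HTML';
<html><body>
<a href="/docs/intro.html">Intro</a>
<a href="http://example.com/about">About</a>
<a name="top">No href here</a>
<a href="/docs/api.html">API <b>reference</b></a>
</body></html>
HTML

# Hypothetical helper: return [href, text] pairs for every <a href=...>
# whose href passes the $keep callback.
sub extract_links {
    my ($html, $keep) = @_;

    # Passing a scalar reference makes the parser read from the string, not a file.
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot create parser for HTML string";

    my @links;
    while (my $tag = $parser->get_tag('a')) {
        my $href = $tag->[1]{href};          # attribute hash is the second element
        next unless defined $href;           # skip anchors without an href
        next unless $keep->($href);          # conditional filtering
        my $text = $parser->get_trimmed_text('/a');  # link text up to </a>
        push @links, [$href, $text];
    }
    return @links;
}

# Example filter (an assumption): keep only site-relative links under /docs/.
my @links = extract_links($html, sub { $_[0] =~ m{^/docs/} });
print "$_->[0]\t$_->[1]\n" for @links;
```

The filter is passed as a code reference so the same loop can be reused with any condition on the href.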
Re: WWW::Mechanize reading HTML by ikegami (Patriarch) on Sep 12, 2011 at 18:19 UTC
Relative to the current working directory:

use URI::file qw( );
my $url = URI::file->new_abs("webpage.html");

Relative to the script's directory (assuming you didn't change the working directory):

use Cwd qw( realpath );
use URI::file qw( );
# abs() expects a base URI, so the script path is turned into a file: URI first
my $url = URI::file->new("webpage.html")->abs(URI::file->new(realpath($0)));
Re: WWW::Mechanize reading HTML by Corion (Patriarch) on Sep 12, 2011 at 18:08 UTC
Have you looked at the WWW::Mechanize documentation? Especially the `->update_html` method sounds like what you need.

