PerlMonks  

WWW::Mechanize reading HTML

by vit (Friar)
on Sep 12, 2011 at 18:03 UTC ( [id://925528]=perlquestion )

vit has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I am looking for a way to feed WWW::Mechanize an HTML string instead of a URL, without changing the module. One way is to read it from a file, like
my $url = "file:///D:/webpage.html";
but then I have to specify the root path, which I do not want because it keeps changing, and I also have to write the HTML to that file first, which is not good either.

Basically what I need is to parse HTML and get all links from there with some conditional filtering.

Replies are listed 'Best First'.
Re: WWW::Mechanize reading HTML
by afoken (Chancellor) on Sep 12, 2011 at 18:09 UTC
    Basically what I need is to parse HTML and get all links from there with some conditional filtering.

    Use an HTML parser, like WWW::Mechanize does. WWW::Mechanize uses HTML::Form and HTML::TokeParser. Involving WWW::Mechanize in analysing an HTML file is just nonsense.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
      Thanks a lot! I used HTML::TokeParser, which takes the HTML directly as input. Works perfectly.
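For readers who want to see what that looks like, here is a minimal sketch of pulling links out of an HTML string with HTML::TokeParser and filtering them. The sample HTML, the helper name filtered_links, and the ".pdf" condition are all assumptions made up for illustration; they are not from the original posts.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample input; in practice this is whatever HTML string you already hold.
my $html = <<'HTML';
<html><body>
<a href="http://example.com/a.pdf">Report</a>
<a href="/about.html">About</a>
<a name="top">No href here</a>
<a href="http://example.com/b.pdf">Slides</a>
</body></html>
HTML

# Hypothetical helper: return [url, text] pairs for every <a href>
# whose URL passes the caller-supplied condition $keep.
sub filtered_links {
    my ($html, $keep) = @_;

    # A reference to a scalar makes the parser read the string itself, not a file.
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";

    my @links;
    while (my $tag = $parser->get_tag('a')) {
        my $href = $tag->[1]{href};
        next unless defined $href;    # skip anchors without an href
        my $text = $parser->get_trimmed_text('/a');
        push @links, [$href, $text] if $keep->($href);
    }
    return @links;
}

# Example condition: keep only links whose URL ends in .pdf.
my @pdf_links = filtered_links($html, sub { $_[0] =~ /\.pdf\z/i });
print "$_->[0] ($_->[1])\n" for @pdf_links;
```

Passing the condition as a code reference keeps the parsing loop unchanged when the filtering rule changes.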
Re: WWW::Mechanize reading HTML
by ikegami (Patriarch) on Sep 12, 2011 at 18:19 UTC

    Relative to current work dir:

    use URI::file qw( );
    my $url = URI::file->new_abs("webpage.html");

    Relative to script dir (assuming you didn't change work directory):

    use Cwd qw( realpath );
    use URI::file qw( );
    # abs() expects a base URI, so turn the script path into a file: URI first
    my $url = URI::file->new("webpage.html")->abs(URI::file->new(realpath($0)));
Re: WWW::Mechanize reading HTML
by Corion (Patriarch) on Sep 12, 2011 at 18:08 UTC

    Have you looked at the WWW::Mechanize documentation? Especially the ->update_html method sounds like what you need.
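A minimal sketch of that approach, assuming a freshly created WWW::Mechanize object accepts update_html without a prior get() (the documentation describes it as replacing the HTML the mech has already found, so treat this usage as an assumption). The sample HTML and the ".pdf" condition are made up for illustration. Since no page was fetched there is no base URL, so relative links stay relative.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical sample input; in practice this is the HTML string you already hold.
my $html = <<'HTML';
<html><body>
<a href="http://example.com/a.pdf">Report</a>
<a href="/about.html">About</a>
<a href="http://example.com/b.pdf">Slides</a>
</body></html>
HTML

my $mech = WWW::Mechanize->new;

# Hand the HTML string to the mech directly: no URL and no temporary file.
$mech->update_html($html);

# Conditional filtering through find_all_links; url_regex matches the link URL.
my @links = $mech->find_all_links(url_regex => qr/\.pdf\z/i);
print $_->url, "\n" for @links;
```

If link extraction is all that is needed, a plain HTML parser is the lighter tool; this route mainly makes sense when the rest of the code already works with WWW::Mechanize link objects.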

Node Type: perlquestion [id://925528]
Approved by ikegami