
Re^3: Need help with WWW::Mechanize and Chrome cookies

by bakiperl (Beadle)
on Jul 09, 2021 at 11:33 UTC (#11134847)

in reply to Re^2: Need help with WWW::Mechanize and Chrome cookies
in thread Need help with WWW::Mechanize and Chrome cookies

I also wonder why this code in WWW::Mechanize::Chrome (WMC)

    my $file_map = $mech->saveResources_future(
        target_file => 'this_page.html',
        target_dir  => 'this_page_files/',
        wanted      => sub { $_[0]->{url} =~ m!^https?:!i },
    )->get();

downloads the files referenced by these two elements

    <link rel="stylesheet" href="css/file.css" type="text/css" />
    <img id="logo" src="/images/image.gif" alt="" title="" />

but not the file referenced by this one

    <a class="txt" href="file.txt"> Text File </a>

Replies are listed 'Best First'.
Re^4: Need help with WWW::Mechanize and Chrome cookies
by marto (Cardinal) on Jul 09, 2021 at 11:42 UTC

    The latter is a hyperlink to another page or resource. You would never want the 'Save Complete page' method to follow links like that; it is not what the method is for. Saving the same page in a browser does not save hyperlink targets either.

      In this case, is there a different approach to download the hyperlink targets from within WMC?

        Either find the links, get them, and save them; or inject something like this and call it from the page for each target you've identified; or submit a patch to add the required functionality to this module; or choose something else to achieve your goal. Unless you need JavaScript there should be alternatives, but your post lacks enough detail to expand on that.
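
The first option (find the links, get them, save them) could be sketched roughly as follows. This is only an illustration under stated assumptions, not part of WWW::Mechanize::Chrome: the helper names `link_target_urls`, `local_name_for`, and `save_link_targets` are hypothetical, and the sketch assumes that `$mech->links` returns WWW::Mechanize::Link objects offering `tag` and `url_abs`. The download step swaps in the core module HTTP::Tiny, which does not share Chrome's cookies or session, so it would only suit targets that need no login.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use HTTP::Tiny;

# Hypothetical helper: collect the absolute http(s) targets of <a> elements.
# Assumes $mech->links returns objects with tag() and url_abs().
sub link_target_urls {
    my ($mech) = @_;
    my (%seen, @urls);
    for my $link ($mech->links) {
        next unless lc($link->tag // '') eq 'a';   # skip <link>, <iframe>, ...
        my $url = $link->url_abs;
        next unless defined $url;
        $url = "$url";                             # stringify a URI object
        next unless $url =~ m!^https?:!i;          # same filter as 'wanted' above
        push @urls, $url unless $seen{$url}++;     # drop duplicates
    }
    return @urls;
}

# Hypothetical helper: derive a local file name from an absolute URL.
sub local_name_for {
    my ($url) = @_;
    (my $path = $url) =~ s/[?#].*\z//s;            # drop query string and fragment
    $path =~ s{^[a-z][a-z0-9+.-]*://[^/]*}{}i;     # drop scheme and host
    my ($name) = $path =~ m{([^/]+)\z};            # last path segment, if any
    return defined $name ? $name : 'index.html';
}

# Hypothetical helper: download every hyperlink target of the current page.
# Files with the same name overwrite each other in this simple sketch.
sub save_link_targets {
    my ($mech, $dir) = @_;
    make_path($dir);
    my $http = HTTP::Tiny->new;                    # separate client, no Chrome cookies
    my @saved;
    for my $url (link_target_urls($mech)) {
        my $file = "$dir/" . local_name_for($url);
        my $res  = $http->mirror($url, $file);
        if ($res->{success}) {
            push @saved, $file;
        }
        else {
            warn "Could not fetch $url: $res->{status} $res->{reason}\n";
        }
    }
    return @saved;
}

# Possible usage, after $mech->get(...) has loaded the page:
#   my @files = save_link_targets($mech, 'this_page_links');
```

If the targets do require the browser session, the cookies would have to be carried over to the second client in some way, or the files fetched through the browser itself; this sketch does not attempt either.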
