PerlMonks  

Re^13: Need help with WWW::Mechanize and Chrome cookies

by bakiperl (Beadle)
on Jul 11, 2021 at 16:50 UTC ( [id://11134919] )


in reply to Re^12: Need help with WWW::Mechanize and Chrome cookies
in thread Need help with WWW::Mechanize and Chrome cookies

I modified the code to eliminate the loop and changed the URLs to absolute ones:

    my @urls = map { $_->url_abs } @links;

The error message went away, but these calls still don't return anything:

    $mech->get($foo, ':content_file'=>$filename);
    my $file_content = $mech->get($foo);

Re^14: Need help with WWW::Mechanize and Chrome cookies
by Corion (Patriarch) on Jul 11, 2021 at 19:07 UTC

    Why do you expect the following to work for WWW::Mechanize::Chrome? This is not a documented call:

    $mech->get($foo, ':content_file'=>$filename);

    I wonder why you say that the following doesn't "return anything":

    my $file_content = $mech->get($foo);

    ->get() is documented to return a response, so I suggest that you print it, or inspect it using Data::Dumper.

      Corion,
      I get an empty string when I print the file content.
      my $file_content = $mech->get($foo);
      print $file_content;
      Here is the result returned by the Data::Dumper
      $VAR1 = bless( {
          '_headers' => bless( {
              '::std_case' => {
                  'x-frame-options' => 'X-Frame-Options',
                  'expect-ct' => 'Expect-CT',
                  'content-security-policy' => 'Content-Security-Policy',
                  'x-xss-protection' => 'X-XSS-Protection',
                  'x-content-type-options' => 'X-Content-Type-Options',
                  'strict-transport-security' => 'Strict-Transport-Security',
                  'referrer-policy' => 'Referrer-Policy'
              },
              'content-security-policy' => 'default-src \'self\' data: https: \'unsafe-eval\' \'unsafe-inline\'',
              'date' => 'Sun, 11 Jul 2021 19:46:11 GMT',
              'strict-transport-security' => 'max-age=31536000; includeSubDomains',
              'etag' => '"06c81313776d71:0"',
              'expect-ct' => 'enforce, max-age=30, report-uri="https://{$subdomain}.report-uri.com/r/d/ct/enforce"',
              'x-frame-options' => 'SAMEORIGIN',
              'server' => '',
              'x-content-type-options' => 'nosniff',
              'x-xss-protection' => '1;mode=block',
              'accept-ranges' => 'bytes',
              'referrer-policy' => 'no-referrer'
          }, 'HTTP::Headers' ),
          '_request' => undef,
          '_content' => '',
          '_rc' => 304,
          '_msg' => 'Not Modified'
      }, 'HTTP::Response' );
        Corion,
        It looks like the issue is related to the file type. If the .csv file is replaced with an .html file, ->get() returns the content of the file.
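The dump above reports status 304 (Not Modified) with an empty _content, which would be consistent with the empty string printed earlier. As a minimal sketch of reading those fields through the HTTP::Response accessors instead of dumping the whole object, the snippet below builds a stand-in response by hand so that it runs without a browser. The helper name describe_response is hypothetical, not part of any module.

```perl
use strict;
use warnings;
use HTTP::Response;

# Stand-in for the object returned by $mech->get($foo). It is built by hand
# here (an assumption for illustration) with the status seen in the dump.
my $response = HTTP::Response->new(
    304, 'Not Modified',
    [ 'Accept-Ranges' => 'bytes' ],
    '',
);

# Hypothetical helper: summarise the fields that matter when debugging.
sub describe_response {
    my ($res) = @_;
    return sprintf '%s (success: %s, body bytes: %d)',
        $res->status_line,
        $res->is_success ? 'yes' : 'no',
        length $res->content;
}

print describe_response($response), "\n";
```

With a real $mech, the same helper could be called on the return value of ->get() to see at a glance whether a body was delivered at all.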
      Corion says: Did you set the download directory (download_directory option) in the constructor?
      -----------

      The download_directory option finally worked once I found out that Chrome does not support paths in the format c:/path...
      The download path has to use backslashes instead ( c:\path... ).
      my $downloads = "C:\\path\\";
      $mech->set_download_directory($downloads);
      $mech->get($foo);
      My final question is how to stop the Chrome browser from loading some documents so that the download can be executed with WMC. If a document (such as a jpg) is loaded in the browser, the file does not download.
      Thank you for your patience.
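For paths that arrive with forward slashes, the conversion described in this post could be done in code rather than by hand. This is only a sketch: to_backslash_path is a hypothetical helper name, and it does nothing beyond swapping the separator character.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every forward slash with a backslash,
# leaving the rest of the path untouched.
sub to_backslash_path {
    my ($path) = @_;
    ( my $converted = $path ) =~ tr{/}{\\};
    return $converted;
}

my $downloads = to_backslash_path('C:/path/');
print "$downloads\n";

# The result could then be handed to the call used in the post:
#   $mech->set_download_directory($downloads);
```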

        If a file does not download, have you tried inspecting the HTTP::Response object you receive from the ->get() call?

        my $response = $mech->get($url);
        open my $output, '>:raw', '/tmp/output.jpg';
        print { $output } $response->decoded_content;

        Edit: You might need to touch the ->content of the browser first so that everything has time to initialize:

        my $response = $mech->get($url);
        my $c = $mech->content; # Dummy request to initialize everything
        open my $output, '>:raw', '/tmp/output.jpg';
        print { $output } $response->decoded_content;
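A slightly more defensive variant of the snippet above might check the response status and the result of open before writing. The sketch below is an assumption-laden illustration, not the approach confirmed in this thread: save_response is a hypothetical helper, the response is a hand-built stand-in for $mech->get($url), and the file goes to a temporary directory instead of /tmp/output.jpg.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use HTTP::Response;

# Hypothetical helper: write the response body to $path and return the
# number of bytes on disk; dies if the request did not succeed.
sub save_response {
    my ( $response, $path ) = @_;
    die 'Request failed: ' . $response->status_line . "\n"
        unless $response->is_success;
    open my $output, '>:raw', $path or die "Cannot open $path: $!";
    print {$output} $response->decoded_content;
    close $output or die "Cannot close $path: $!";
    return -s $path;
}

# Stand-in for $mech->get($url): four bytes with a JPEG content type.
my $response = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'image/jpeg' ],
    "\xFF\xD8\xFF\xE0",
);

my $file  = tempdir( CLEANUP => 1 ) . '/output.jpg';
my $bytes = save_response( $response, $file );
print "Wrote $bytes bytes to $file\n";
```

Dying on a non-2xx status makes a case like the earlier 304 visible immediately instead of leaving an empty file behind.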
