PerlMonks  

WWW::Mechanize cannot connect to HTTPS site

by Marshall (Canon)
on May 29, 2024 at 06:04 UTC

Marshall has asked for the wisdom of the Perl Monks concerning the following question:

One of my users runs this program and gets the error shown below:

    use strict;
    use warnings;
    use WWW::Mechanize;

    my $mech = WWW::Mechanize->new();

    $mech->get('https://www.cpan.org');
    my $html_string = $mech->content();
    print $html_string;

    ### Can't connect to www.cpan.org:443 at test2.pl line 7.

The browser on his system can display that CPAN url and any other https site without problems. Mechanize works fine in other Perl programs on regular http sites.

He is using Perl 5.24 from ActiveState (I am also). We have verified that these modules are installed:
Crypt::SSLeay
Net::SSLeay
IO::Socket::SSL
Net::SSL
LWP::Protocol::https (update: forgot to mention in the original post)

I can't think of anything else that might be needed. Of course, this works fine on my machine. Any ideas?

Update: OS is Windows 10 and LWP::Protocol::https is installed.

Replies are listed 'Best First'.
Re: WWW::Mechanize cannot connect to HTTPS site
by Corion (Patriarch) on May 29, 2024 at 07:55 UTC

    The classic thing that trips me up when doing this is that I forget to install LWP::Protocol::https. I think the result of $mech->get should return 599 then. For a somewhat verbose output, dump the result of ->get:

    use Data::Dumper; print Dumper $mech->get('https://www.cpan.org');

    If this fails with a non-599 error, consider cross-checking with wget or curl. If these two cannot connect to the outside world, then outgoing https connections are likely blocked by your network somewhere. If wget or curl work, this means that there is likely an environment variable that tells the two to use a proxy to connect to the outside world. This would likely be a SOCKS proxy, since WWW::Mechanize automatically picks up other proxies from the environment variables HTTP_PROXY or HTTPS_PROXY.
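
    One caveat with the Dumper one-liner: assuming autocheck is enabled (the default in recent WWW::Mechanize releases), get dies on failure before anything can be dumped. The sketch below turns autocheck off, prints the status line plus a few client-side headers that LWP may set, and lists any proxy variables in the environment. The sub names proxy_settings and fetch_report, and the script name diag.pl, are made up for illustration; the header names are ones LWP is known to set in some situations, not a complete list.

```perl
use strict;
use warnings;

# Collect proxy-related environment variables (upper and lower case).
# A proxy setting is one way curl and Perl can end up behaving differently.
sub proxy_settings {
    my (%env) = @_;
    my %found;
    for my $name (qw(HTTP_PROXY HTTPS_PROXY ALL_PROXY NO_PROXY)) {
        for my $key ( $name, lc $name ) {
            $found{$key} = $env{$key}
                if defined $env{$key} && length $env{$key};
        }
    }
    return \%found;
}

# Fetch a URL without dying on failure and return a short text report.
# WWW::Mechanize is loaded lazily so proxy_settings() works without it.
sub fetch_report {
    my ($url) = @_;
    require WWW::Mechanize;
    my $mech   = WWW::Mechanize->new( autocheck => 0 );    # do not die on errors
    my $res    = $mech->get($url);
    my $report = "Status: " . $res->status_line . "\n";
    for my $header (qw(Client-Warning Client-SSL-Warning X-Died)) {
        my $value = $res->header($header);
        $report .= "$header: $value\n" if defined $value;
    }
    return $report;
}

# Runs only when a URL is given: perl diag.pl https://www.cpan.org
if (@ARGV) {
    my $proxies = proxy_settings(%ENV);
    print "$_=$proxies->{$_}\n" for sort keys %$proxies;
    print fetch_report( $ARGV[0] );
}
```

    If no proxy variables are printed and the status line is still a bare "Can't connect", the curl cross-check described above is the next thing to compare against.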

      I've emailed the user with your code. Note that the browsers work and can access the URL. I don't know why Chrome would be able to connect and a Perl script would not. I will also ask the user to run the test script in a command window with Admin privileges. Also, this is in a home environment, not a corporate one.

      Update: I did some testing. wget doesn't come with Windows 10, but curl does. curl https://www.cpan.org works on my machine and I suggested this test to the user.

Re: WWW::Mechanize cannot connect to HTTPS site
by marto (Cardinal) on May 29, 2024 at 07:52 UTC
      Yes, LWP::Protocol::https is installed. I can compare exact version numbers with the user, but we both installed modules from PPM (Perl Package Manager) not from CPAN.

        You'll probably experience problems with out-of-date dependencies; things like Mozilla::CA need to be reasonably up to date. IIRC ActiveState cut off PPM access for older perls without a support contract in place.

Re: WWW::Mechanize cannot connect to HTTPS site
by syphilis (Archbishop) on May 29, 2024 at 11:33 UTC
    Any ideas?

    Oddly, I get the same "Can't connect to www.cpan.org:443" error using Strawberry Perl 5.24.1 on Windows 11, whereas Strawberry Perl 5.24.4 (on the very same machine) retrieves the page just fine.
    On SP-5.24.1:
    D:\>perl -le "print $];"
    5.024001
    D:\>\b\pmver WWW::Mechanize
    1.83
    D:\>\b\pmver Crypt::SSLeay
    0.72
    D:\>\b\pmver Net::SSLeay
    1.80
    D:\>\b\pmver IO::Socket::SSL
    2.043
    D:\>\b\pmver Net::SSL
    2.86
    D:\>\b\pmver LWP::Protocol::https
    6.06
    On SP-5.24.4:
    D:\>perl -le "print $];"
    5.024004
    D:\>\b\pmver WWW::Mechanize
    1.88
    D:\>\b\pmver Crypt::SSLeay
    0.72
    D:\>\b\pmver Net::SSLeay
    1.85
    D:\>\b\pmver IO::Socket::SSL
    2.056
    D:\>\b\pmver Net::SSL
    2.86
    D:\>\b\pmver LWP::Protocol::https
    6.07
    ("pmver" is just a local batch file that prints out the version number of its argument.)
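
    For anyone without such a batch file, a rough Perl equivalent is sketched below. It is not the pmver script used above; the sub name module_version and the default module list (the modules mentioned in this thread plus LWP::UserAgent) are choices made for illustration. It also prints the OpenSSL version string that Net::SSLeay reports, if that module loads.

```perl
use strict;
use warnings;

# Return the installed version of a module, or undef if it cannot be loaded.
sub module_version {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    return unless eval { require $file; 1 };
    my $version = eval { $module->VERSION };
    return defined $version ? "$version" : 'unknown';
}

# Modules named on the command line, or a default list if none are given.
my @modules = @ARGV ? @ARGV : qw(
    WWW::Mechanize LWP::UserAgent LWP::Protocol::https
    IO::Socket::SSL Net::SSLeay Net::SSL Crypt::SSLeay Mozilla::CA
);

printf "perl %s\n", $];
for my $module (@modules) {
    my $version = module_version($module);
    printf "%-22s %s\n", $module,
        defined $version ? $version : 'not installed';
}

# Net::SSLeay can report the OpenSSL library it is linked against.
if ( eval { require Net::SSLeay; 1 } ) {
    printf "%-22s %s\n", 'OpenSSL', Net::SSLeay::SSLeay_version();
}
```

    Running it on both the working and the failing installation and comparing the output line by line gives the same kind of comparison as the listings here.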

    <UPDATE>:
    The two also use different versions of openssl: SP-5.24.1 has 1.0.2j and SP-5.24.4 has 1.0.2o.
    </UPDATE>

    Is there anything in there that helps ?

    Cheers,
    Rob
      I checked a little bit further and found the same brokenness in both SP-5.24.2 and SP-5.24.3.
      SP-5.24.3 (still broken) had the following versions installed:
      D:\>perl -le "print $];"
      5.024003
      D:\>\b\pmver WWW::Mechanize
      1.86
      D:\>\b\pmver Crypt::SSLeay
      0.72
      D:\>\b\pmver Net::SSLeay
      1.81
      D:\>\b\pmver IO::Socket::SSL
      2.051
      D:\>\b\pmver Net::SSL
      2.86
      D:\>\b\pmver LWP::Protocol::https
      6.07
      Its openssl version was 1.0.2l

      UPDATE:
      Upgrading IO::Socket::SSL to version 2.056 (on SP-5.24.3) did not fix the problem - so it would seem that the issue lies NOT with any of those perl modules.
      And I couldn't see any flags being waved in 5.24.4 perldelta, so I guess the problem is with either the openssl version, or some other module.

      Cheers,
      Rob
        And Mozilla::CA?
Re: WWW::Mechanize cannot connect to HTTPS site
by sectokia (Pilgrim) on Jun 04, 2024 at 04:50 UTC

    The problem seems to be that 'ISRG Root X1' was only added to Mozilla::CA in version 20180117. Since that is the main root CA for Let's Encrypt, any site using a Let's Encrypt certificate won't work if Mozilla::CA is older than 20180117.

    You should probably update Mozilla::CA anyway, because many dodgy CAs have been removed entirely since 2018.
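
    A quick way to test this theory on the affected machine is sketched below. The 20180117 cutoff is taken from the claim above rather than verified independently, and the sub names are made up for illustration. The bundle check assumes the cacert.pem shipped with Mozilla::CA labels each certificate by name, which is how the bundle is normally laid out.

```perl
use strict;
use warnings;

# Cutoff taken from the post above: ISRG Root X1 is said to be present
# from Mozilla::CA 20180117 onwards.
use constant ISRG_CUTOFF => 20180117;

# True if a Mozilla::CA version string (YYYYMMDD) predates the cutoff.
# A version that cannot be parsed is treated as too old.
sub ca_bundle_too_old {
    my ($version) = @_;
    return 1 unless defined $version && $version =~ /^(\d{8})/;
    return $1 < ISRG_CUTOFF ? 1 : 0;
}

# True if the PEM bundle at $path mentions "ISRG Root X1".
# Returns false if the file cannot be opened.
sub bundle_has_isrg_root {
    my ($path) = @_;
    open my $fh, '<', $path or return 0;
    while ( my $line = <$fh> ) {
        return 1 if $line =~ /ISRG Root X1/;
    }
    return 0;
}

if ( eval { require Mozilla::CA; 1 } ) {
    my $version = Mozilla::CA->VERSION;
    my $file    = Mozilla::CA::SSL_ca_file();
    print "Mozilla::CA $version ($file)\n";
    print "  version predates the cutoff\n" if ca_bundle_too_old($version);
    print "  ISRG Root X1 ",
        ( bundle_has_isrg_root($file) ? "found" : "NOT found" ), "\n";
}
else {
    print "Mozilla::CA is not installed\n";
}
```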

Re: WWW::Mechanize cannot connect to HTTPS site
by Anonymous Monk on May 29, 2024 at 09:15 UTC

    I vaguely remember that, in a similar case, Mozilla::CA needed to be fresh, and that (unusual or not) a reboot was needed after updating it.

      Such a problem with Mozilla::CA should result in "certificate verify failed" rather than just the generic "can't connect" error. Here's what I see when I force that (host redacted):

      Error GETing https://HOST/: Can't connect to HOST:443 (certificate verify failed)

      Note that this isn't on MS Windows.


      🦛
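
      The distinction drawn here, a parenthesised reason after "Can't connect" versus the bare message, can be checked mechanically. The sketch below sorts a failure message into a rough category. The sub name classify_failure, the category labels, and the patterns are illustrative guesses based on messages quoted in this thread and ones LWP is known to produce; they are not an exhaustive list.

```perl
use strict;
use warnings;

# Map an LWP / WWW::Mechanize failure message to a rough category.
# More specific reasons are tested before the generic "Can't connect".
sub classify_failure {
    my ($message) = @_;
    return 'certificate' if $message =~ /certificate verify failed/i;
    return 'no-https'    if $message =~ /Protocol scheme 'https' is not supported/i;
    return 'dns'         if $message =~ /Bad hostname|Name or service not known/i;
    return 'connect'     if $message =~ /Can't connect to/i;
    return 'other';
}

# Classify messages passed on the command line, one per argument.
print classify_failure($_), ": $_\n" for @ARGV;
```

      By this rough scheme the message from the original post falls into 'connect', that is, the generic case with no stated reason, while the forced failure quoted above falls into 'certificate'.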

      Windows loves being rebooted. It's not illegal or fattening, so it's worth a try.
