
"406 not acceptable" errors with LWP::UserAgent::get

by jimhenry (Acolyte)
on Aug 17, 2020 at 23:31 UTC ( [id://11120857] : perlquestion )

jimhenry has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing a podcatcher and I ran into a problem where one particular podcast (whose RSS feed downloads fine in Firefox or wget) was failing to download when I sent the same default HTTP headers I use with all the other podcasts I've tested. It would give a "406 Not Acceptable" error, which research suggested was caused by bad Accept headers. I ran wget -d to see what headers it was using and copied them into my script (or rather the config file for my script), and got the same error with this podcast. Below, find a simplified version of the code that reproduces the problem.

#! /usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->agent('Mozilla/5.0');
$ua->show_progress( 1 );

my %headers = (
    'Accept' => '*/*',
    'Accept-Encoding' => 'identity',
    'Connection' => 'Keep-Alive',
#    'Host' => '' );
    'Host' => '' );
#my $url = '';
my $url = '';

my $response = $ua->get( $url, %headers );
if ( $response->is_success ) {
    print "Okay!\n";
} else {
    my $status = $response->status_line;
    print "failed to download $url: $status\n";
}

If I use the commented-out lines instead for the Host and $url values (checking a different podcast's RSS feed), everything works fine. I also tried using the default Firefox Accept: header, based on, and got the same 406 error. The same podcast also gives me a 406 error when I try to get an individual mp3 file.

Any ideas how to narrow the problem down further, if not fix it?

I'm using Perl v5.26.1 and LWP::UserAgent version 6.31.

Replies are listed 'Best First'.
Re: "406 not acceptable" errors with LWP::UserAgent::get
by Your Mother (Archbishop) on Aug 17, 2020 at 23:44 UTC

    Seems the main thing it doesn’t like is the agent. Your code gave me the same error, even tweaked and pruned down until I swapped the agent for something that isn’t, apparently, blacklisted.

    use 5.10.0;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;
    $ua->agent('DomoArigato/3.0');
    my $url = '';
    my $response = $ua->get($url);
    say $response->is_success ? "OK!" : $response->as_string;
    __END__
    OK!

      Usually the easiest way to get around these kinds of blocks is to set the user agent string to the same one your browser uses.

      "Excuse me for butting in, but I'm interrupt-driven..."

        Disagree in the main. :P People blacklist agents, not whitelist, and a one-off for an agent that is NEVER abusive to a service is less likely to get crapcanned than one that shares a (base) name with 30% of the traffic.

        Thanks for the replies. I tried setting the agent to the user-agent string used by my current version of Firefox and had no trouble. (I'm still not sure why I was getting a 406 error for this podcast instead of a 403 Forbidden error, which I was getting on a number of podcasts with the default agent of "libwww-perl".)
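
One way to narrow this kind of problem down is to request the same URL with several agent strings and compare the status lines. The following is only a sketch: try_agents is a made-up helper name, 'ExamplePodcatcher/0.1' is an invented agent string, and the URL is whatever you pass on the command line. Which agents a given server rejects is not something this thread establishes beyond the cases reported above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Fetch $url once per candidate agent string and return a list of
# [ agent, status line ] pairs. try_agents is a hypothetical helper.
sub try_agents {
    my ( $ua, $url, @agents ) = @_;
    my @results;
    for my $agent (@agents) {
        $ua->agent($agent);
        my $response = $ua->get($url);
        push @results, [ $agent, $response->status_line ];
    }
    return @results;
}

# Only runs when a URL is given on the command line.
if ( my $url = shift @ARGV ) {
    my $ua     = LWP::UserAgent->new( timeout => 30 );
    my @agents = (
        $ua->agent,                 # LWP's default agent string
        'Mozilla/5.0',              # the string from the original script
        'ExamplePodcatcher/0.1',    # invented, unique-looking name
    );
    printf "%-28s %s\n", @$_ for try_agents( $ua, $url, @agents );
}
```

If only some of the strings draw a 406 or 403, the agent is the likely trigger and the Accept headers can be left alone.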
Re: "406 not acceptable" errors with LWP::UserAgent::get
by tobyink (Canon) on Aug 19, 2020 at 11:42 UTC

    As an aside, there's no need to set a Host header. Yes, it's required by HTTP (unless you're still using HTTP 1.0), but LWP::UserAgent will do that for you. As would HTTP::Tiny or any other HTTP client library worth its salt.

    The only benefit you get from specifying it manually is the wonderful opportunity to screw things up.
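
A minimal sketch of the same request with the Host header left out, assuming the feed URL is passed on the command line and using an invented agent string ('ExamplePodcatcher/0.1'). The response_done handler is there only to print each response's status line and headers to STDERR, which may help when a server answers with something unexpected like a 406.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->agent('ExamplePodcatcher/0.1');    # invented agent string

# Print the status line and headers of every response as it arrives.
$ua->add_handler(
    response_done => sub {
        my ($response) = @_;
        print STDERR $response->status_line, "\n",
            $response->headers->as_string, "\n";
        return;
    }
);

# No 'Host' entry here: it is left for LWP to fill in.
my %headers = (
    'Accept'          => '*/*',
    'Accept-Encoding' => 'identity',
);

# Only runs when a URL is given on the command line.
if ( my $url = shift @ARGV ) {
    my $response = $ua->get( $url, %headers );
    if ( $response->is_success ) {
        print "Okay!\n";
    }
    else {
        print "failed to download $url: ", $response->status_line, "\n";
    }
}
```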