PerlMonks  

Re^3: Fetch Problem uri

by 1nickt (Canon)
on Jul 04, 2015 at 14:45 UTC ( [id://1133154]=note )


in reply to Re^2: Fetch Problem uri
in thread Fetch Problem uri

(Scroll down for an answer to your latest question ...)

Strange that there is no available module that seems to cope rightly with this URI. Or maybe the URI is "non-standard"?

Your URI is fine (until you added a space at the start, heh). It is the module that may be "non-standard," I am afraid.

First, there is the problem addressed by Perlbotics' patch: $ff->output_file is not a method for setting the value, although it looks as if it should be.

Then the ungraceful handling of a problem URI (e.g. with a leading space as in your OP):

    use strict;
    use warnings;
    use feature 'say';
    use File::Fetch;

    my $url = ' http://www.perlmonks.com/foo?bar=baz';
    print "Downloading >$url<\n"; # note use of delimiters to make a stray
                                  # leading space more visible in your debug
    my $ff = File::Fetch->new(uri => $url);
    say "scheme: " . $ff->scheme;
    say "host: " . $ff->host;
    say "path: " . $ff->path;
    say "file: " . $ff->file;
    say "output_file: " . $ff->output_file;

    ## outputs:
    Use of uninitialized value in concatenation (.) or string at ./foo.pl line 10.
    scheme:                        # <- error
    host: http:                    # <- error
    path: //www.perlmonks.com/     # <- error
    file: foo?bar=baz
    output_file: foo

These two things alone would make me consider looking for a different solution on CPAN.
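
Whatever module you end up using, it may be worth cleaning and sanity-checking the URL before you hand it over. Here is a minimal sketch, assuming the URI module is installed; clean_url is a hypothetical helper name of mine, and the "http or https only" rule is an assumption about what you want to accept:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use URI;

# Hypothetical helper: trim stray whitespace, then insist on an
# http(s) scheme and a non-empty host. Returns undef otherwise.
sub clean_url {
    my ($raw) = @_;
    (my $url = $raw) =~ s/^\s+|\s+$//g;

    my $uri    = URI->new($url);
    my $scheme = $uri->scheme;
    return undef unless defined $scheme && $scheme =~ /^https?$/;
    return undef unless defined $uri->host && length $uri->host;

    return $uri->as_string;
}

my $url = clean_url(' http://www.perlmonks.com/foo?bar=baz');
print defined $url ? "Downloading >$url<\n" : "Not a usable URL\n";
```

With the leading space gone, the scheme, host and path should at least have a chance of being parsed the way you expect.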

Now I just have to figure out how to get the right file name out of the URI.

You are on the right track with a path-parsing module. But if all your files are of the format you showed, you might want to use a regexp:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    my $url = 'http://www.ekey.net/downloads-475?download=2132cbe2-2fb1-eeff-583c-50a39b6aba6c&name=v2_ITA_12-Seiter_Programm_1207_web.pdf';
    # keep only what follows "name="
    (my $output_name = $url) =~ s/^.*name=(.*)$/$1/;
    print "$output_name\n";
    __END__

    ## outputs:
    v2_ITA_12-Seiter_Programm_1207_web.pdf
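
If the URIs are less uniform than that, the regexp can be swapped for the URI module's own parsing. This is only a sketch under two assumptions of mine: that a "name" query parameter, when present, holds the file name, and that the last non-empty path segment is an acceptable fallback. filename_from_url is a hypothetical helper, not part of any module:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use URI;

# Hypothetical helper: guess a local file name for a download URL.
sub filename_from_url {
    my ($url) = @_;
    my $uri = URI->new($url);

    # query_form returns a flat key/value list, already percent-decoded
    my %query = $uri->query_form;
    return $query{name} if defined $query{name} && length $query{name};

    # Fallback: last non-empty path segment, or undef if there is none
    my @segments = grep { length } $uri->path_segments;
    return @segments ? $segments[-1] : undef;
}

my $url = 'http://www.ekey.net/downloads-475?download=2132cbe2-2fb1-eeff-583c-50a39b6aba6c&name=v2_ITA_12-Seiter_Programm_1207_web.pdf';
print filename_from_url($url), "\n";   # v2_ITA_12-Seiter_Programm_1207_web.pdf
```

Since the name comes from the remote side, you may want to strip any directory separators from it before using it as a local file name.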
Remember: Ne dederis in spiritu molere illegitimi!

Replies are listed 'Best First'.
Re^4: Fetch Problem uri
by Anonymous Monk on Jul 04, 2015 at 16:00 UTC

    Thank you very much for your valuable feedback. I finally decided to roll back to LWP, using the following:

        # last path segment of the URL (needs the URI module)
        my $filename = (URI->new($url)->path_segments)[-1];
        print "Filename is $filename\n";
        # save the response body straight to disk
        $response = $browser->get($url, ':content_file' => $filename);
        if ($response->is_error()) {
            my $error = "\nCould not open $url\n" . $response->status_line;
        }

    The problem with the file name remains (unfortunately the URIs can have any possible form, so I'm not sure your regex will match every URI).
