How To Download DAT Files From Unsecured Website

Marjan
I need to download some data files from a website, and would welcome any suggestions.

The website has three pull-down menus: firm name, date, and file format. I want to download data for every firm and date. In other words, firm1-June2001, firm1-July2001, …, firm1-December2011, firm2-June2001, firm2-July2001, …, firm2-December2011. I would also like to choose “dat” from the format pull-down menu, and need to press the download button to download the file to my machine.

I also would like to slow the download speed down so I don’t overload the website’s server, and have a file that indicates which firm-date files are downloaded and whether errors occurred. For instance, I want to distinguish between a missing file and a download error.

I am running this program on a Windows machine with Chrome.

I found the following code at and am looking for any suggestions on how to adapt it. The notations are my additions.
#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;

# $ is a scalar variable, key-->value, LWP is a virtual browser
my $ua   = LWP::UserAgent->new;
my $user = 'username';
my $pass = 'password';
my $URL  = '';

# Creating a file name from the URL
my $filename = substr( $URL, rindex( $URL, "/" ) + 1 );
# Prints and \n adds new line
print "$filename\n";
# Output filename into IN
open( IN, ">$filename" ) or die $!;
print "Fetching $URL\n";

my $expected_length;
my $bytes_received = 0;

# Fetches a file from a website
my $req = HTTP::Request->new(GET => $URL);
$req->authorization_basic($user, $pass);
my $res = $ua->request($req,
    sub {
        # @_ is plural of $_ (default variable)
        my ( $chunk, $res ) = @_;
        # = assigns a variable, length is a length function, bytes_received number
        $bytes_received += length($chunk);
        # printf is a special print function, STD error stream, decimal number with percent symbol
        unless ( defined $expected_length ) {
            $expected_length = $res->content_length || 0;
        }
        if ($expected_length) {
            printf STDERR "%d%% - ", 100 * $bytes_received / $expected_length;
        }
        print STDERR "$bytes_received bytes received\n";
        # XXX Should really do something with the chunk itself
        print IN $chunk;
    }
);
print $res->status_line, "\n";
# I think IN holds the file
close IN;
exit;

Re: How To Download DAT Files From Unsecured Website
    I'm not sure that plucking code from the Internet (that you may or may not understand) and asking 'can you refactor this to do x and y?' is a brilliant strategy. Also, grabbing a whole bunch of files from somebody's website is probably a violation of their terms of service. Nevertheless, you might try something like this:

    linux> perl -e 'for my $month (qw|June July|) { for (1..3) { my $url = qq|http://localhost/foo| . $_ . qq|.tar&dt=| . qq|$month-2001|; print qq|Doing >>> $url\n|; system(q|wget|, $url) } }'
    Replace the start of the URL with your target. Of course you'll need wget, but you can even download that for Windows nowadays. Please don't let this discourage you from learning Perl and trying to understand code that you find on the web. It just isn't a very good place to start if your intent is to learn this language. If your intent isn't to learn the language then you are in the wrong place. There are websites that will write scripts for you for a modest sum.
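
If you would rather stay inside Perl than shell out to wget, the same loop can be sketched with LWP::UserAgent, with a pause between requests and a log line per firm-date pair. This is only a rough sketch under several assumptions: the parameter names firm, date and format, the "June2001" label style, the log file name download_log.txt and the 5-second pause are all made up for illustration, and the real form may use POST rather than GET. It treats HTTP 404 as "missing" and any other failure as "error"; a site that answers 200 with an HTML "no data" page would need an extra check.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Month labels from a start to an end month, e.g. "June2001" .. "December2011".
# The label format is a guess; match whatever the site's date menu submits.
sub month_labels {
    my ($y, $m, $end_y, $end_m) = @_;
    my @names = qw(January February March April May June July
                   August September October November December);
    my @labels;
    while ($y < $end_y || ($y == $end_y && $m <= $end_m)) {
        push @labels, $names[$m - 1] . $y;
        ($y, $m) = $m == 12 ? ($y + 1, 1) : ($y, $m + 1);
    }
    return @labels;
}

# Map an HTTP status code to a log category.
# LWP reports connection failures as a 500 response, so they land in 'error'.
sub classify_status {
    my ($code) = @_;
    return 'ok'      if $code >= 200 && $code < 300;
    return 'missing' if $code == 404;
    return 'error';
}

sub run {
    my ($base_url, @firms) = @_;
    require LWP::UserAgent;    # loaded here so the helpers above work without it
    require URI;

    my $ua = LWP::UserAgent->new(timeout => 60);
    open(my $log, '>>', 'download_log.txt') or die "Cannot open log: $!";

    for my $firm (@firms) {
        for my $date (month_labels(2001, 6, 2011, 12)) {
            my $uri = URI->new($base_url);
            # Hypothetical parameter names; check the page's form for the real ones.
            $uri->query_form(firm => $firm, date => $date, format => 'dat');

            # :content_file saves the body to disk when the request succeeds.
            my $res = $ua->get($uri, ':content_file' => "$firm-$date.dat");

            printf {$log} "%s\t%s\t%s\t%s\n",
                $firm, $date, classify_status($res->code), $res->status_line;

            sleep 5;           # pause between requests to go easy on the server
        }
    }
    close $log;
}

# Runs only when arguments are given: base URL first, then firm names.
run(@ARGV) if @ARGV;
```

Saved as, say, fetch_dat.pl, it would be called as: perl fetch_dat.pl http://localhost/download firm1 firm2. Firm names are used directly in the output file names here, so names with slashes or other awkward characters would need cleaning up first.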

      Thank you for your help. The website makes the data available for downloading; as I'm a student I'm using it for research purposes.

