PerlMonks  

Re^5: Making an array from a downloaded web page

by moklevat (Priest)
on Jan 18, 2007 at 16:40 UTC


in reply to Re^4: Making an array from a downloaded web page
in thread Making an array from a downloaded web page

It's a reasonable question, but you still haven't posted any code, so you may indeed get dinged. In the monastery, code begets code. If your original post had included some kind of code, you probably would have had a lot more input from monks and might have a working solution by now.

On to your question.

The Net::FTP documentation for get(REMOTE_FILE [, LOCAL_FILE [, WHERE]]) says that LOCAL_FILE may be a filename or a filehandle. If you open() one filehandle for writing and pass it to every get() call, you can fetch as many index files as you want and they will be concatenated in the order they were written. WHERE is optional, but you could use it to skip the unneeded header bytes at the start of each index file. You can also open() an "in memory" filehandle that writes to a scalar instead of a file on disk, which is probably what you will ultimately want. Here is a quick script that grabs the index files for all 4 quarters of 2 years and writes the concatenated indices to a file. I have also included a commented-out option that uses a scalar as the filehandle.

#!/usr/bin/perl
use strict;
use warnings;
use Net::FTP;

my $host         = "ftp.sec.gov";
my $username     = 'anonymous';
my $password     = 'yourmail@domain.com';
my $indexdir     = '/edgar/full-index';
my @years        = qw/2005 2006/;
my @quarters     = qw/QTR1 QTR2 QTR3 QTR4/;
my $indexbyfirm  = 'company.idx';
my $indexoutfile = "./complete_index";

## This opens an "in memory" filehandle as a scalar
#open my $indexsave, '>', \ my $pseudo_file
#    or die "Couldn't open memory handle: $!";

open my $indexsave, '>', $indexoutfile
    or die "Couldn't open filehandle: $!";

my $ftp = Net::FTP->new($host, Timeout => 30, Debug => 1)
    or die "Couldn't connect: $@\n";

# Net::FTP reports server errors through $ftp->message, not $!
$ftp->login($username, $password)
    or die "Couldn't authenticate: ", $ftp->message;

for my $year (@years) {
    for my $quarter (@quarters) {
        $ftp->cwd("$indexdir/$year/$quarter")
            or die "Couldn't change directories: ", $ftp->message;
        # Every get() writes to the same handle, so the indices are concatenated
        $ftp->get($indexbyfirm, $indexsave)
            or die "Couldn't fetch $indexbyfirm: ", $ftp->message;
    }
}

close $indexsave or die "Couldn't close filehandle: $!";

## You can work with the "in memory" file like any scalar
# print $pseudo_file;

$ftp->quit();
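If the "in memory" filehandle is new to you, here is a rough, network-free sketch of how it could get you from concatenated index files to an array. The sample records and the @fake_indices name are made up for illustration; they are not real EDGAR data. The print calls stand in for what the get() calls in the script above would write to the handle.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Open a write handle backed by the scalar $pseudo_file instead of a disk file.
my $pseudo_file = '';
open my $indexsave, '>', \$pseudo_file
    or die "Couldn't open memory handle: $!";

# Hypothetical stand-ins for two downloaded index files.
# In the script above, $ftp->get($indexbyfirm, $indexsave) would do the writing.
my @fake_indices = (
    "ACME CORP|10-K|2005-03-01\nBETA INC|8-K|2005-03-02\n",
    "GAMMA LLC|10-Q|2005-06-15\n",
);
print {$indexsave} $_ for @fake_indices;
close $indexsave or die "Couldn't close memory handle: $!";

# The writes were appended in order, so the scalar holds the concatenation.
# Split it on newlines to get one array element per index line.
my @lines = split /\n/, $pseudo_file;

printf "%d lines\n", scalar @lines;
print "$_\n" for @lines;
```

Once the data is in @lines you can grep, sort, or split each record further without ever touching the disk.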

