http://www.perlmonks.org?node_id=1066904

bowei_99 has asked for the wisdom of the Perl Monks concerning the following question:

I have the following script, which is supposed to recurse through Dell's ftp site and download text files. It gets in, but doesn't download anything. Any thoughts on why?

Below is the code and my analysis/questions.

#!/usr/bin/perl
use strict;
use warnings;
use Carp;
use Data::Dumper;
use Net::hostent;
#use Net::Ping;
use Net::FTP::Recursive;

my %params = (
    site => "ftp.dell.com",
    dir  => "Browse_For_Drivers/Servers, Storage & Networking/PowerEdge",
);

FTPConnect (\%params);

sub FTPConnect {
    my $ref_params = shift @_;
    my $retval = "";
    my $ftp = Net::FTP::Recursive->new($ref_params->{site}, Debug => 1, Timeout => 15);
    if ($ftp) {
        $retval = "OK: connected via FTP to " . $ref_params->{site} . "\n\n" ;
        $ftp->login("anonymous",'me@here.there');
        $ftp->binary;
        $ftp->cwd($ref_params->{dir});
        #$ftp->rget( ParseSub => \&GetFiles($ftp),
        $ftp->rget( ParseSub => \&GetFiles,
                    FlattenTree => 1,
                    #MatchDirs => qr/PowerEdge (R810|R610|R720|R620|M620|M1000E)/,
                    #MatchFiles => qr/\.txt/,
        );
        $ftp->quit;
    }
    else {
        $retval = "ERROR: FTP for host $ref_params->{site}\n\n"
    }
    return $retval;
}

sub GetFiles {
    #my $my_ftp = shift @_;
    #$my_ftp->get("*.txt");
    return;
}

With nothing in GetFiles (only a return statement), I get this when I run it:

Net::FTP::Recursive=GLOB(0xa4bee8)<<< 250 CWD command successful.
Net::FTP::Recursive=GLOB(0xa4bee8)>>> PWD
Net::FTP::Recursive=GLOB(0xa4bee8)<<< 257 "/Browse_For_Drivers/Servers, Storage & Networking/PowerEdge" is current directory.
Net::FTP::Recursive=GLOB(0xa4bee8)>>> PASV
Net::FTP::Recursive=GLOB(0xa4bee8)<<< 227 Entering Passive Mode (143,166,147,12,231,73)
Net::FTP::Recursive=GLOB(0xa4bee8)>>> LIST
Net::FTP::Recursive=GLOB(0xa4bee8)<<< 150 Opening BINARY mode data connection.
Net::FTP::Recursive=GLOB(0xa4bee8)<<< 226 Transfer complete.
drwxrwxrwx 1 owner group 0 Aug 29 2012 Dell KVM 1081AD
drwxrwxrwx 1 owner group 0 Sep 25 2012 Dell KVM 1082DS
...
Net::FTP::Recursive=GLOB(0xa4bee8)>>> PWD
Net::FTP::Recursive=GLOB(0xa4bee8)<<< 257 "/Browse_For_Drivers/Servers, Storage & Networking/PowerEdge" is current directory.
Net::FTP::Recursive=GLOB(0xa4bee8)>>> QUIT
Net::FTP::Recursive=GLOB(0xa4bee8)<<< 221 Thank you for using the Dell FTP site, please come again.

When I add the code to get files, I get the same, except for:

Net::FTP::Recursive=GLOB(0x1054fa8)<<< 250 CWD command successful.
Net::FTP::Recursive=GLOB(0x1054fa8)>>> PASV
Net::FTP::Recursive=GLOB(0x1054fa8)<<< 227 Entering Passive Mode (143,166,135,12,255,118)
Net::FTP::Recursive=GLOB(0x1054fa8)>>> RETR *.txt
Can't use an undefined value as a symbol reference at /usr/share/perl5/Net/FTP/dataconn.pm line 54.

While I could put in code to search for and download only text files, I would have thought that would already be built into the Recursive module. I get the feeling even modifying the GetFiles subroutine may be overkill, as the CPAN page seems to imply (to me, at least) that it's not needed. From that page:

If you'd like to provide your own function for parsing the data retrieved from this command (in case the ftp server does not understand the "dir" command), all you need do is provide a function to one of the Recursive method calls.

However, I'm not parsing the data, and I'd think I shouldn't need to, if the correct filter parameter is set. I'm referring to the FlattenTree => 1. From the cpan page:

The FlattenTree optional argument will retrieve all of the files from the remote directory structure and place them in the current local directory. This option will resolve filename conflicts by retrieving files with the same name and renaming them in a "$filename.$i" fashion, where $i is the number of times it has retrieved a file with that name.
...
MatchFiles - Only transfer plainish (not a directory or a symlink) files that match this pattern.
...
MatchDirs - Only recurse into directories that match this pattern.

I thought it was something with my regexes, so I commented those out, but still the same thing.

What am I missing here? Why isn't the script downloading the text files?

UPDATE: Thanks all for your help. Splitting the code out into separate calls for each directory and removing the MatchDirs specification works.

-- Burvil

Re: Net::FTP::Recursive code not downloading files
by wazat (Monk) on Dec 12, 2013 at 20:39 UTC

    I suspect your problem is your GetFiles() sub. From what I understand from the module docs, the ParseSub argument is meant to provide an alternative parser for directory listing output. Since you are not parsing the directory listing output then DO NOT include the ParseSub argument.

    I see a commented out line specifying MatchFiles.

    $ftp->rget( FlattenTree => 1, MatchFiles => qr/\.txt/, );

    What happens when you use this?

    Also, that pattern would be more correct if you anchored it at the end of the file name.

    MatchFiles => qr/\.txt$/,
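
The effect of the trailing anchor can be checked without touching the FTP server. The snippet below is only a sketch: the sub name matching_files and the sample file names are made up for illustration, and it simply applies both patterns to a list of names with grep.

```perl
use strict;
use warnings;

# Return the names from the list that match the given pattern.
# (matching_files is a hypothetical helper, not part of Net::FTP::Recursive.)
sub matching_files {
    my ($pattern, @names) = @_;
    return grep { $_ =~ $pattern } @names;
}

# Hypothetical file names, chosen to show the difference.
my @names = ('readme.txt', 'notes.txt.bak', 'index.html');

print "unanchored: ", join(', ', matching_files(qr/\.txt/,  @names)), "\n";
print "anchored:   ", join(', ', matching_files(qr/\.txt$/, @names)), "\n";
```

The unanchored pattern also accepts notes.txt.bak, because ".txt" appears somewhere in the name; the anchored one accepts only readme.txt.
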
      Thanks. If I replace the rget call with this:
      $ftp->rget( FlattenTree => 1, MatchDirs => qr/PowerEdge (R810|R610|R720|R620|M620|M1000E)/, MatchFiles => qr/\.txt$/, );
      I get it downloading directories, but it doesn't seem to find any text files, as it doesn't look like it's recursing deep enough. Is there a setting for how deep to go? I get this (snippet shown):
      Net::FTP::Recursive=GLOB(0x2355f08)>>> LIST
      Net::FTP::Recursive=GLOB(0x2355f08)<<< 150 Opening BINARY mode data connection.
      Net::FTP::Recursive=GLOB(0x2355f08)<<< 226 Transfer complete.
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Chassis System Management
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Diagnostics
      -rwxrwxrwx 1 owner group 144462 Aug 29 2012 index.html
      drwxrwxrwx 1 owner group 0 Sep 25 2012 Legacy
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Network
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Rack Solutions
      drwxrwxrwx 1 owner group 0 Aug 29 2012 SAS Drive
      drwxrwxrwx 1 owner group 0 Aug 29 2012 SCSI non-RAID
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Serial ATA
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Systems Management
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Tape Automation
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Tape Drives
      Net::FTP::Recursive=GLOB(0x2355f08)>>> PWD
      Net::FTP::Recursive=GLOB(0x2355f08)<<< 257 "/Browse_For_Drivers/Servers, Storage & Networking/PowerEdge/PowerEdge M1000E" is current directory.
      Returned from rget in /Browse_For_Drivers/Servers, Storage & Networking/PowerEdge.
      Net::FTP::Recursive=GLOB(0x2355f08)>>> CDUP
      Net::FTP::Recursive=GLOB(0x2355f08)<<< 250 CDUP command successful.
      Net::FTP::Recursive=GLOB(0x2355f08)>>> CWD PowerEdge M620
      Net::FTP::Recursive=GLOB(0x2355f08)<<< 250 CWD command successful.
      Calling rget in /Browse_For_Drivers/Servers, Storage & Networking/PowerEdge
      Net::FTP::Recursive=GLOB(0x2355f08)>>> PASV
      Net::FTP::Recursive=GLOB(0x2355f08)<<< 227 Entering Passive Mode (143,166,135,12,247,80)
      Net::FTP::Recursive=GLOB(0x2355f08)>>> LIST
      Net::FTP::Recursive=GLOB(0x2355f08)<<< 150 Opening BINARY mode data connection.
      Net::FTP::Recursive=GLOB(0x2355f08)<<< 226 Transfer complete.
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Application
      drwxrwxrwx 1 owner group 0 Aug 29 2012 BIOS
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Chipset
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Diagnostics
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Drivers for OS Deployment
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Enterprise Solutions
      drwxrwxrwx 1 owner group 0 Aug 29 2012 ESM
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Fibre Channel
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Firmware
      -rwxrwxrwx 1 owner group 147115 Aug 29 2012 index.html
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Lifecycle Controller
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Network
      drwxrwxrwx 1 owner group 0 Sep 25 2012 PCIe SSS
      drwxrwxrwx 1 owner group 0 Aug 29 2012 SAS Drive
      drwxrwxrwx 1 owner group 0 Aug 29 2012 SAS RAID
      drwxrwxrwx 1 owner group 0 Aug 29 2012 SCSI non-RAID
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Serial ATA
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Systems Management
      drwxrwxrwx 1 owner group 0 Aug 29 2012 Video
      Net::FTP::Recursive=GLOB(0x2355f08)>>> PWD
      Net::FTP::Recursive=GLOB(0x2355f08)<<< 257 "/Browse_For_Drivers/Servers, Storage & Networking/PowerEdge/PowerEdge M620" is current directory.
      Returned from rget in /Browse_For_Drivers/Servers, Storage & Networking/PowerEdge.
      Net::FTP::Recursive=GLOB(0x2355f08)>>> CDUP
      Net::FTP::Recursive=GLOB(0x2355f08)<<< 250 CDUP command successful.
      Net::FTP::Recursive=GLOB(0x2355f08)>>> CWD PowerEdge R610

      -- Burvil

        I suspect your MatchDirs may be preventing the recursion. If the matching doesn't use the full path then the subdirectory names won't include the parent directory names.

        MatchDirs => qr/PowerEdge (R810|R610|R720|R620|M620|M1000E)/,

        If that is the case, then you need to run your code on these 6 directories separately.

        As wazat points out the problem is with MatchDirs, though not with path names, as you can see from the output that it is entering the directories you have in MatchDirs. Once it gets in to one of those directories though, it won't go any further because subsequent directories are not in MatchDirs - it checks every directory for a match before recursing, not just the top level.
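
To see why recursion stops one level down, the MatchDirs pattern can be tried against directory names taken from the debug output earlier in the thread. This is only an illustration and assumes, as described above, that the module tests each bare directory name against the pattern before entering it. The sub name dirs_allowed is made up for this sketch.

```perl
use strict;
use warnings;

# The MatchDirs pattern from the rget call in this thread.
my $match_dirs = qr/PowerEdge (R810|R610|R720|R620|M620|M1000E)/;

# Return the directory names the pattern would let rget recurse into.
# (dirs_allowed is a hypothetical helper, not part of Net::FTP::Recursive.)
sub dirs_allowed {
    my ($pattern, @names) = @_;
    return grep { $_ =~ $pattern } @names;
}

# Names copied from the LIST and CWD lines in the debug output.
my @top_level   = ('Dell KVM 1081AD', 'PowerEdge M620', 'PowerEdge R610');
my @inside_m620 = ('Application', 'BIOS', 'Chipset', 'Firmware', 'Network');

print "top level:   ", join(', ', dirs_allowed($match_dirs, @top_level)),   "\n";
print "inside M620: ", join(', ', dirs_allowed($match_dirs, @inside_m620)), "\n";
```

The first line lists PowerEdge M620 and PowerEdge R610. The second lists nothing, because none of the subdirectory names contains "PowerEdge", so under that assumption rget never descends far enough to reach any .txt files.
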

        The solution is to get rid of the MatchDirs and call your code for each directory, setting the path in $params{dir}.
        e.g. dir => "Browse_For_Drivers/Servers, Storage & Networking/PowerEdge/PowerEdge R810"
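
A rough sketch of that approach follows; it has not been run against the Dell server. The sub names model_dirs and fetch_txt_files and the --fetch switch are made up for illustration, and the leading "/" on the base path is an assumption based on the PWD output above. Without --fetch the script only prints the directories it would visit.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Base path and model list are taken from the thread; the leading "/" is assumed.
my $base   = "/Browse_For_Drivers/Servers, Storage & Networking/PowerEdge";
my @models = qw(R810 R610 R720 R620 M620 M1000E);

# Build one absolute remote path per model directory.
sub model_dirs {
    my ($base_dir, @names) = @_;
    return map { "$base_dir/PowerEdge $_" } @names;
}

# Fetch every .txt file below each directory, one rget call per directory.
sub fetch_txt_files {
    my ($site, @dirs) = @_;
    require Net::FTP::Recursive;    # loaded lazily so a dry run needs no CPAN module
    my $ftp = Net::FTP::Recursive->new($site, Timeout => 15)
        or die "Cannot connect to $site: $@\n";
    $ftp->login('anonymous', 'me@here.there')
        or die "Login failed: ", $ftp->message;
    $ftp->binary;
    for my $dir (@dirs) {
        unless ($ftp->cwd($dir)) {
            warn "Skipping $dir: ", $ftp->message;
            next;
        }
        # No MatchDirs here, so every subdirectory is eligible for recursion.
        $ftp->rget( FlattenTree => 1, MatchFiles => qr/\.txt$/ );
    }
    $ftp->quit;
    return;
}

my @dirs = model_dirs($base, @models);
if (@ARGV && $ARGV[0] eq '--fetch') {
    fetch_txt_files('ftp.dell.com', @dirs);
}
else {
    print "$_\n" for @dirs;    # dry run: list the directories only
}
```

Changing to an absolute path before each rget call means the loop does not depend on which directory the previous call left the connection in.
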