http://www.perlmonks.org?node_id=638752

Gangabass has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks

I need a way to parse the results of the FTP ls command, because for some reason I can't use Net::FTP::Common. The results look like this:

03-26-07 05:23AM <DIR> html
03-26-07 04:27AM <DIR> ibm-laptop
03-26-07 03:16AM <DIR> images
03-27-07 11:00PM 6397 index.html
03-26-07 03:45AM 10186 index1.html

So I need a way to get each item's name and its type (directory or file), but I can't write a regex for that :-(. Maybe a regex is the wrong approach in this case?

Re: Parsing FTP ls command results
by DrHyde (Prior) on Sep 13, 2007 at 09:36 UTC

    People would be able to help you more if you showed us what you've already tried and explained how the results differ from what you expect and from what you want.

    I would also note that the text you get back from ls is dependent on both the server OS and the particular ftp server in use. It might even, because of ftp's "helpful" line ending conversion, depend on the client OS too.

Re: Parsing FTP ls command results
by dwm042 (Priest) on Sep 13, 2007 at 11:06 UTC
    Though others have pointed out the FTP text is variable, one way to parse the provided text into directory and file listings is as follows:

    #!/usr/bin/perl
    use warnings;
    use strict;

    while (<DATA>) {
        chomp;
        my @elements = split /\s+/, $_;
        if ( $elements[2] eq '<DIR>' ) {
            print "Directory: ", $elements[3], "\n";
        }
        else {
            print "The size of $elements[3] is $elements[2] bytes.\n";
        }
    }
    __DATA__
    03-26-07 05:23AM <DIR> html
    03-26-07 04:27AM <DIR> ibm-laptop
    03-26-07 03:16AM <DIR> images
    03-27-07 11:00PM 6397 index.html
    03-26-07 03:45AM 10186 index1.html

    And if this code is run:

    ~/perl/monks$ ./parselist.pl
    Directory: html
    Directory: ibm-laptop
    Directory: images
    The size of index.html is 6397 bytes.
    The size of index1.html is 10186 bytes.

      I'm not sure how file names with spaces will be displayed by the FTP command but I would probably use the three argument form of split just to be safe. I would also use a list on the LHS rather than an array for readability.

      my ($date, $time, $typeOrSize, $name) = split /\s+/, $_, 4;

      Cheers,

      JohnGG
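      A minimal sketch of the three-argument split suggested above, assuming the DOS-style listing from the original question. The helper name parse_dos_line, the hash keys, and the "My Documents" sample line are made up for illustration. It uses split ' ' rather than /\s+/ so that leading whitespace does not produce an empty first field.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one line of a DOS-style FTP listing into a hash reference.
# The limit of 4 keeps everything after the third field, embedded
# spaces included, in $name.
sub parse_dos_line {
    my ($line) = @_;
    chomp $line;
    my ($date, $time, $type_or_size, $name) = split ' ', $line, 4;
    return unless defined $name;    # blank or malformed line
    my $is_dir = $type_or_size eq '<DIR>';
    return {
        name   => $name,
        is_dir => $is_dir ? 1 : 0,
        size   => $is_dir ? undef : $type_or_size,
    };
}

# Hypothetical sample lines; "My Documents" is invented to show a
# name that contains a space.
my @lines = (
    '03-26-07 05:23AM <DIR> My Documents',
    '03-27-07 11:00PM 6397 index.html',
);

for my $line (@lines) {
    my $entry = parse_dos_line($line) or next;
    if ( $entry->{is_dir} ) {
        print "Directory: $entry->{name}\n";
    }
    else {
        print "File: $entry->{name} ($entry->{size} bytes)\n";
    }
}
```

      Without the limit, "My Documents" would be split into two fields and only "My" would be reported as the name.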

        Thanks, John... your comment helped me so much. Thanks again!

      dwm042:

      Yes, but some FTP servers will give you some even uglier output, like:

      125 List started OK.
      Volume Unit  Date     Ext Used Recfm Lrecl BlkSz Dsorg Dsname
      APCSPL 3380D 07/16/97 1   1    FB    80    8800  PS    ETC.RPC
      APCSPL 3380D 08/03/97 1   1    FB    80    3200  PS    ETC.SERVICES
      APCSPL 3380D 08/03/97 1   1    FB    80    3120  PS    FTP.DATA
      APCSPL 3380D 08/02/97 1   1    F     158   158   PS    HOSTS.ADDRINFO
      APCSPL 3380D 08/03/97 1   1    FB    80    3120  PS    HOSTS.LOCAL
      APCSPL 3380D 07/30/97 1   1    F     56    56    PS    HOSTS.SITEINFO
      250 List completed successfully.

      So if you're trying to be tolerant of multiple FTP servers, be aware that you may have to have multiple parsers. (Note: The above was snipped from the IBM help pages.)

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.
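      One way to arrange the "multiple parsers" idea is a small table of regexes tried in turn. This is only a sketch: the names @PARSERS and parse_line are hypothetical, the two patterns are written against the two sample listings in this thread and nothing else, and the mainframe parser leaves the directory flag undefined because the sample does not show how to tell the two apart.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One entry per known listing format: a regex that recognises a data
# line and a callback that turns the captures into a common record.
my @PARSERS = (
    {
        format => 'dos',
        regex  => qr{^(\d\d-\d\d-\d\d)\s+(\d\d:\d\d[AP]M)\s+(<DIR>|\d+)\s+(.+)\z},
        build  => sub {
            my ($date, $time, $type_or_size, $name) = @_;
            return { name => $name, is_dir => $type_or_size eq '<DIR>' ? 1 : 0 };
        },
    },
    {
        format => 'mvs',
        # Volume Unit Date Ext Used Recfm Lrecl BlkSz Dsorg Dsname
        regex  => qr{^(\S+)\s+(\S+)\s+(\d\d/\d\d/\d\d)\s+(\d+)\s+(\d+)
                      \s+(\S+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\S+)\z}x,
        build  => sub {
            my @fields = @_;
            # is_dir stays undef: this sketch does not try to classify
            # mainframe data sets.
            return { name => $fields[9], is_dir => undef };
        },
    },
);

# Try each parser in turn; return a record for the first one that
# matches, or nothing for status lines, headers and unknown formats.
sub parse_line {
    my ($line) = @_;
    $line =~ s/\s+\z//;    # drop trailing newline or carriage return
    for my $parser (@PARSERS) {
        if ( my @fields = $line =~ $parser->{regex} ) {
            return { format => $parser->{format}, %{ $parser->{build}->(@fields) } };
        }
    }
    return;
}

# Lines from both sample listings, mixed together only for the demo.
my @listing = (
    '125 List started OK.',
    '03-26-07 05:23AM <DIR> html',
    '03-27-07 11:00PM 6397 index.html',
    'APCSPL 3380D 07/16/97 1 1 FB 80 8800 PS ETC.RPC',
    '250 List completed successfully.',
);

for my $line (@listing) {
    my $entry = parse_line($line) or next;
    my $kind = !defined $entry->{is_dir} ? 'entry'
             : $entry->{is_dir}          ? 'directory'
             :                             'file';
    print "[$entry->{format}] $kind: $entry->{name}\n";
}
```

      Supporting another server then means adding one more entry to the table rather than touching the loop.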

Re: Parsing FTP ls command results
by roboticus (Chancellor) on Sep 13, 2007 at 10:21 UTC
    Gangabass:

    Yes, it's unfortunate that the FTP protocol doesn't specify the format for the ls and dir commands. Certainly there would be differences between the available information on different operating systems, but rather than leaving it totally unspecified, I'd've preferred that they have a fixed set of information on the front of each line, with "extra" information in free-format afterwards so this wouldn't be a problem.

    For example, the first line should be a header line (to describe the columns). Then each remaining line should have the filename, the size (if known), and date/time last changed (if known) with other information changing by OS after that. Were it specified that way, then your client could *always* use the same code to find the filenames available if it didn't care about the other information.

    I have that problem frequently, as many of my programs must interface with an FTP server running on a z/OS mainframe with its own peculiarities....

    So you'll likely want to read 'perldoc perlfunc' and look at the entries for unpack and substr to see how to parse it.

    ...roboticus
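    As a rough illustration of the unpack approach for a fixed-width listing, here is a sketch. The column widths in $TEMPLATE, the column names, and the helper parse_fixed_line are assumptions made up for this example; a real server's layout has to be measured from its actual output. Data rows are recognised by a date at a fixed offset, and substr is used for that check.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed layout: left-aligned columns separated by one space each.
# 'A' fields strip trailing spaces; 'x' skips the separator byte.
my $TEMPLATE = 'A6 x A5 x A8 x A3 x A4 x A5 x A5 x A5 x A5 x A*';
my @COLUMNS  = qw(volume unit date ext used recfm lrecl blksz dsorg dsname);

sub parse_fixed_line {
    my ($line) = @_;
    $line =~ s/\s+\z//;    # drop trailing newline or carriage return
    # Skip status lines, the header, and anything too short to unpack.
    return unless length($line) > 55
        && substr( $line, 13, 8 ) =~ m{^\d\d/\d\d/\d\d\z};
    my %row;
    @row{@COLUMNS} = unpack $TEMPLATE, $line;
    return \%row;
}

# Sample lines laid out to match the assumed widths above.
my @listing = (
    '125 List started OK.',
    'Volume Unit  Date     Ext Used Recfm Lrecl BlkSz Dsorg Dsname',
    'APCSPL 3380D 07/16/97 1   1    FB    80    8800  PS    ETC.RPC',
    'APCSPL 3380D 08/02/97 1   1    F     158   158   PS    HOSTS.ADDRINFO',
    '250 List completed successfully.',
);

my @rows = grep { defined } map { scalar parse_fixed_line($_) } @listing;

for my $row (@rows) {
    print "$row->{dsname}: recfm $row->{recfm}, lrecl $row->{lrecl}, changed $row->{date}\n";
}
```

    Compared with split, unpack keeps working when a column is empty or contains spaces, but it breaks as soon as the server changes a column width.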

Re: Parsing FTP ls command results
by randyk (Parson) on Sep 13, 2007 at 14:31 UTC
    You could use the parse_dir function of File::Listing:

    use strict;
    use warnings;
    use Net::FTP;
    use File::Listing qw(parse_dir);

    my $ftp = Net::FTP->new('some.host');
    $ftp->login('user', 'password');
    my $ls = $ftp->dir('/etc');
    foreach my $entry (parse_dir($ls)) {
        my ($name, $type, $size, $mtime, $mode) = @$entry;
        if ($type eq 'd') {
            print "The directory is $name\n";
        }
        else {
            next unless ($type eq 'f' and $size > 0);
            print "$name has size $size\n";
        }
    }
    $ftp->quit;

Re: Parsing FTP ls command results
by princepawn (Parson) on Sep 13, 2007 at 14:39 UTC