http://www.perlmonks.org?node_id=548567

avo has asked for the wisdom of the Perl Monks concerning the following question:

Hi there! I've been looking around for an explanation of how to get the MD5 sum of a remote file with Perl. I have a script that downloads a file using Net::FTP and stores it locally. Obviously, if the transfer is incomplete, the file stored locally is not the right one. So I've been thinking: what is the best way to get the MD5 sum of the remote file using either Net::FTP, Net::SFTP or (maybe) Net::SSH? Thanks for the help!

Note: I am not creating the files on the ftp server, which means that I have no way to create a checksum file there either.

Regards

Re: get md5 sum of a remote file via net::ftp / net::sftp
by BrowserUk (Patriarch) on May 11, 2006 at 00:32 UTC

    As pointed out, unless there is a precalculated md5 of the file published at the remote end that you can compare your locally calculated value against, calculating an md5 locally doesn't buy you anything.

    For a simple check, but one that will detect common errors like ASCII-vs-binary transfers and partial transfers, use the size method of Net::FTP to get the remote size of the file and compare it against the local size after the transfer. It won't detect in-transit corruption or deliberate replacements, but then neither will a local MD5 without something reliable to compare against.
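A minimal sketch of that size comparison, assuming the server answers the SIZE command and the transfer runs in binary mode. The helper names `sizes_match` and `fetch_and_check` are made up for illustration; `size`, `binary` and `get` are Net::FTP methods.

```perl
use strict;
use warnings;
use Net::FTP;

# True when the local file exists and its byte count equals the remote size.
sub sizes_match {
    my ($remote_size, $local_file) = @_;
    return 0 unless defined $remote_size;      # server may not support SIZE
    my $local_size = (stat $local_file)[7];    # undef if the file is missing
    return 0 unless defined $local_size;
    return $local_size == $remote_size ? 1 : 0;
}

# Hypothetical wrapper: download $fname in binary mode, then compare sizes.
sub fetch_and_check {
    my ($host, $user, $pass, $dir, $fname) = @_;
    my $ftp = Net::FTP->new($host) or die "Cannot connect to $host: $@";
    $ftp->login($user, $pass) or die "Login failed: ", $ftp->message;
    $ftp->cwd($dir)           or die "cwd failed: ", $ftp->message;
    $ftp->binary;                              # avoid line-ending conversion
    my $remote_size = $ftp->size($fname);
    $ftp->get($fname)         or die "get failed: ", $ftp->message;
    $ftp->quit;
    return sizes_match($remote_size, $fname);
}
```

Calling `fetch_and_check("ftp.mysite.com", "user", "pass", "/www/htdocs", "robots.txt")` would return 1 only when both sizes agree. As noted above, this catches truncated and ASCII-mode transfers but not corruption that leaves the length unchanged.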


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Very good point. I think I will write a daemon that computes the MD5 for me over TCP/IP, so that I have the MD5 of the remote file before I start the FTP transfer. SSH is a good idea as well. Now I just have to decide which is quickest. I have a POE server in mind, which should be quick enough. I will post the script here when done.
Re: get md5 sum of a remote file via net::ftp / net::sftp
by TedPride (Priest) on May 10, 2006 at 22:57 UTC
    use strict;
    use warnings;
    use Net::FTP;
    use Digest::MD5;

    my $host  = "ftp.mysite.com";
    my $user  = "user";
    my $pass  = "pass";
    my $dir   = "/www/htdocs";
    my $fname = "robots.txt";

    # Download the file in binary mode so line endings are not rewritten.
    my $ftp = Net::FTP->new($host, Debug => 0) or die "Unable to connect: $@";
    $ftp->login($user, $pass) or die "Bad login";
    $ftp->cwd($dir)           or die "Unable to change directories";
    $ftp->binary;
    $ftp->get($fname)         or die "Unable to download file";
    $ftp->quit;

    # Hash the local copy as raw bytes, without slurping it into memory.
    open(my $handle, '<', $fname) or die "Unable to open $fname: $!";
    binmode($handle);
    my $hash = Digest::MD5->new->addfile($handle)->hexdigest;
    close($handle);

    print "$hash\n";
    You might want something more advanced that gets a directory list, queues all the files, downloads each one and hashes it, and requeues files that fail to download (up to x number of tries per file), but this should get you started. I don't feel like writing a whole application right now :)
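The requeue-on-failure idea could look roughly like the sketch below. It is only an outline: `download_with_retries` is a made-up name, and `$fetch` stands in for whatever code actually downloads and hashes one file (for example a wrapper around the Net::FTP calls above), returning true on success.

```perl
use strict;
use warnings;

# Work through a queue of file names, requeueing failures until each file
# has been tried $max_tries times. $fetch is a hypothetical callback that
# takes a file name and returns true on success.
sub download_with_retries {
    my ($files, $max_tries, $fetch) = @_;
    my @queue = map { +{ name => $_, tries => 0 } } @$files;
    my (@done, @failed);

    while (my $job = shift @queue) {
        $job->{tries}++;
        if ($fetch->($job->{name})) {
            push @done, $job->{name};
        }
        elsif ($job->{tries} < $max_tries) {
            push @queue, $job;              # requeue at the back
        }
        else {
            push @failed, $job->{name};     # give up on this file
        }
    }
    return (\@done, \@failed);
}
```

Putting failed files at the back of the queue means the other downloads are not held up by one troublesome file.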
Re: get md5 sum of a remote file via net::ftp / net::sftp
by thor (Priest) on May 11, 2006 at 00:03 UTC
    Keep in mind that successfully ftping a file doesn't guarantee that it'll have the same checksum at the source and destination. If you ftp in ASCII mode, ftp will convert line endings for you (which I consider a good feature). This will destroy your scheme.
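A small demonstration of why line-ending conversion breaks a checksum comparison. The two strings are made-up sample data: the same text, once with LF and once with CRLF line endings, as an ASCII-mode transfer between Unix and Windows might produce.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# The same two lines of text, with Unix (LF) and DOS (CRLF) line endings.
my $unix_text = "line one\nline two\n";
(my $dos_text = $unix_text) =~ s/\n/\r\n/g;

my $unix_md5 = md5_hex($unix_text);
my $dos_md5  = md5_hex($dos_text);

print "LF   : $unix_md5\n";
print "CRLF : $dos_md5\n";
print "Checksums ", ($unix_md5 eq $dos_md5 ? "match" : "differ"), "\n";
```

Calling `$ftp->binary` before `get` in Net::FTP avoids the conversion, so the bytes on both ends stay identical.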

    thor

    The only easy day was yesterday

Re: get md5 sum of a remote file via net::ftp / net::sftp
by eXile (Priest) on May 11, 2006 at 14:24 UTC
    If the remote machine has the 'md5' command, you could ssh into the machine and run it on the specified files. If not, you could write your own MD5 script in Perl (using the examples above) and run that on the remote machine (if you can write files there).

    I'd use Expect to script the ssh-ing and running 'md5'.
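A rough sketch of running the checksum command over ssh. Note that it uses a plain pipe to the `ssh` client rather than Expect, which assumes key-based (non-interactive) login is set up. The function names are made up, and the parser assumes the usual output styles of GNU `md5sum` and BSD `md5`.

```perl
use strict;
use warnings;

# Extract the digest from one line of output in either style:
#   GNU md5sum:  "<digest>  <file>"
#   BSD md5:     "MD5 (<file>) = <digest>"
sub parse_md5_output {
    my ($line) = @_;
    return unless defined $line;
    return lc $1 if $line =~ /^([0-9a-fA-F]{32})\s/;
    return lc $1 if $line =~ /=\s*([0-9a-fA-F]{32})\s*$/;
    return;
}

# Run the checksum command on the remote host and return the hex digest.
# Assumption: ssh passes the command to a remote shell, so a $path with
# spaces or shell metacharacters would need quoting for that shell.
sub remote_md5 {
    my ($host, $path, $cmd) = @_;
    $cmd ||= 'md5sum';                      # pass 'md5' on BSD-style systems
    open(my $ssh, '-|', 'ssh', $host, $cmd, $path)
        or die "Cannot run ssh: $!";
    my $line = <$ssh>;
    close($ssh) or die "Remote $cmd failed (exit status $?)";
    return parse_md5_output($line);
}
```

The returned digest can then be compared with `Digest::MD5`'s hex digest of the downloaded copy.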

Re: get md5 sum of a remote file via net::ftp / net::sftp
by TedPride (Priest) on May 11, 2006 at 07:58 UTC
    Hmm. I guess the simplest thing to do then would be to connect via ssh, ask for a directory listing, then FTP over just the files updated since the last run.
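One way to pick out "just the ones updated since the last run", assuming the modification times have already been collected into a hash of file name to epoch seconds. How they are collected is left open: parsing the ssh listing as suggested, or asking the server per file with Net::FTP's `mdtm` method where it is supported. `files_updated_since` is a made-up helper name.

```perl
use strict;
use warnings;

# %$mtimes maps file name => modification time in epoch seconds.
# Returns the names modified after $last_run, sorted alphabetically.
sub files_updated_since {
    my ($mtimes, $last_run) = @_;
    my @changed = grep { $mtimes->{$_} > $last_run } keys %$mtimes;
    @changed = sort @changed;
    return @changed;
}
```

This only narrows down which files to fetch. It does not verify the transfers, so the size or checksum checks discussed earlier in the thread still apply.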