Perl - Socket and Data Compression

by tptass (Sexton)
on Jul 17, 2009 at 22:12 UTC ( [id://781195] : perlquestion )

tptass has asked for the wisdom of the Perl Monks concerning the following question:

I am looking to transfer a large amount of data through a socket from one machine to another. Usually I would just use scp or ssh, but this all needs to be done automatically, and setting up ssh keys is not an option. I am trying to find a good way to do something like the following:

tar czf - <directory> | ssh <username>@<hostname> tar xzf -

Currently I am using sysread and syswrite to read and write to the socket and can transfer a single file at a time. However, I have directories that are 6 - 10 GB in size, so I was trying to reduce the transfer cost by tarring the directory to be sent. I could tar the directory prior to sending, but that seems foolish and I may run out of space on some machines if I do that. Is there a way to pipe tar into a socket, rather than to ssh? If so, can you please provide a small snippet. Thanks!

Replies are listed 'Best First'.
Re: Perl - Socket and Data Compression
by rcaputo (Chaplain) on Jul 17, 2009 at 22:24 UTC

    There's netcat, and I hear there are other, possibly better alternatives.

    tar czf - <directory> | nc <hostname> <port>

    You can mix in pipeview to monitor your progress.

    tar czf - <directory> | pv | netcat <hostname> <port>

Re: Perl - Socket and Data Compression
by jethro (Monsignor) on Jul 17, 2009 at 22:59 UTC

    If you want a Perl solution, you could use something like the following to get the tar-gzipped data into Perl:

    open(my $tar, '-|', 'tar', 'czf', '-', $dir) or die "Cannot run tar: $!";
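
To connect that pipe to a socket, a minimal sketch along the following lines might work. It assumes a plain TCP connection made with IO::Socket::INET; the names `pump` and `send_directory` are hypothetical, and the host and port are whatever the receiving side listens on. The inner loop is there because syswrite may write fewer bytes than requested.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Copy everything from $in to $out with sysread/syswrite.
# Returns the number of bytes copied.
sub pump {
    my ($in, $out) = @_;
    my $total = 0;
    my $buf;
    while (1) {
        my $n = sysread($in, $buf, 64 * 1024);
        die "read failed: $!" unless defined $n;
        last if $n == 0;                      # end of input
        my $off = 0;
        while ($off < $n) {                   # handle partial writes
            my $w = syswrite($out, $buf, $n - $off, $off);
            die "write failed: $!" unless defined $w;
            $off += $w;
        }
        $total += $n;
    }
    return $total;
}

# Hypothetical sender: tar and gzip $dir on the fly, stream it to $host:$port.
# Assumption: unencrypted TCP; nothing is written to local disk.
sub send_directory {
    my ($dir, $host, $port) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
    ) or die "connect to $host:$port failed: $@";
    binmode $sock;
    open(my $tar, '-|', 'tar', 'czf', '-', $dir) or die "Cannot run tar: $!";
    binmode $tar;
    my $bytes = pump($tar, $sock);
    close($tar) or die "tar exited with status $?";
    close($sock);
    return $bytes;
}
```

Because the archive is read in 64 KB chunks straight from the pipe, the sender never needs room for the whole tarball.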
Re: Perl - Socket and Data Compression
by Marshall (Canon) on Jul 18, 2009 at 00:01 UTC
    I have found that 7-Zip compresses far better than .zip, tar, or their variations.

    This is not a solution to encryption, just a data compression solution, but it often does about 30% better.

    I think your problem is to distribute the SW and get it downloaded. Put up a site that can be accessed via SSL. Let your clients download this humongous this level of bandwidth required, you may not even have to worry about data compression! Or at least for what your clients pay their ISP.

      There are a number of possibilities:

      You could use Expect to fake a password interaction with SSH. That's probably your best bet because it does everything you want.

      You could read stdin and then send it to a socket, so your script has a similar user interface to ssh, e.g.

      $ tar -czf - /path/to/stuff | ./your_script

      You could use named pipes (AKA fifo).

      $ mkfifo mypipe
      $ tar -czf mypipe /path/to/stuff &
      $ ./your_script --infile mypipe

      The best place to read about all this and more is perlipc.

      If you're going to do your own networking, and not use SSH, check out IO::Socket::SSL.
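
For the named-pipe option, the reading side inside the script could look roughly like this. It is only a sketch: `open_fifo_reader` is a hypothetical helper, and it assumes a Unix-like system where POSIX::mkfifo is available. The returned handle can then be read and forwarded to a socket like any other filehandle.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);

# Create the FIFO at $path if it does not exist yet, then open it for reading.
# Note: open() blocks until some writer (for example tar) opens the FIFO too.
sub open_fifo_reader {
    my ($path) = @_;
    unless (-p $path) {
        mkfifo($path, 0600) or die "mkfifo $path failed: $!";
    }
    open(my $fh, '<', $path) or die "cannot open $path: $!";
    binmode $fh;
    return $fh;
}
```

The reader sees end-of-file once the writer closes its end, so a normal read loop terminates when tar finishes.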

        My post was just "how to compress the bits". How to securely access the compressed bits is a different question which I didn't address.

        From my experience the .7z format is cool. But I've found my typical users can't install 7-Zip even though it's freeware. However, I've also found that I can use it to make a .zip file about 10x as fast and 20% smaller than the Windows .zip program. The Windows "unzip" can read this 7-Zip-made .zip file.

Re: Perl - Socket and Data Compression
by tptass (Sexton) on Jul 18, 2009 at 13:53 UTC

    I am currently using IO::Socket::SSL to create the socket connection between the two machines. I will try to open a filehandle for the tar command and see if I can get it to the other end compressed.
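
On the other end, the incoming stream could be fed straight into `tar xzf -` so the archive never lands on disk. The following is a sketch under assumptions, not a tested deployment: `feed_command` and `receive_directory` are hypothetical names, and plain IO::Socket::INET is used only to keep the example free of non-core modules. File::Copy's `copy` accepts filehandles for both arguments, which saves writing the read/write loop by hand.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use IO::Socket::INET;

# Feed everything readable from $in to the standard input of @cmd.
sub feed_command {
    my ($in, @cmd) = @_;
    open(my $child, '|-', @cmd) or die "cannot run $cmd[0]: $!";
    binmode $in;
    binmode $child;
    copy($in, $child) or die "copy failed: $!";
    close($child) or die "$cmd[0] exited with status $?";
    return 1;
}

# Hypothetical receiver: accept one connection on $port and unpack the
# gzipped tar stream into the existing directory $dest.
sub receive_directory {
    my ($port, $dest) = @_;
    my $listener = IO::Socket::INET->new(
        LocalPort => $port,
        Listen    => 1,
        ReuseAddr => 1,
        Proto     => 'tcp',
    ) or die "cannot listen on port $port: $@";
    my $conn = $listener->accept or die "accept failed: $!";
    feed_command($conn, 'tar', 'xzf', '-', '-C', $dest);
    close($conn);
    return 1;
}
```

Checking the return value of `close` on the pipe matters here, since that is where a failed tar extraction would show up.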

    Using Expect would be fine if I were at home and didn't really care for password protection. However, I do not like leaving passwords in files that could later be broken into, even with file permissions and what not.

    netcat could have been a usable solution, but I don't think a secure SSL socket can be created in Perl with it. I will have to look into this more to see if this could actually be done.

    Thanks, everyone for the ideas.
GNU tar has remote option
by unixtechie (Initiate) on Jul 20, 2009 at 10:21 UTC
    GNU tar and, as far as I remember, proprietary Unix tars have the ability to pour contents to a _remote machine_.
    With GNU tar it is implemented via the "rmt" program.

    The default is to use "rsh" for transferring data, but there is a tar option that assigns the "rsh-command", so I presume you could redirect it through ssh, too.

    Remember though that ssh encryption/decryption takes its toll, and with sizable backups you will be set back very considerably, both in transmission times and in CPU load at least.

    P.S. For processing without human intervention one could use "ssh-agent". After the initial sending of keys, everything should automate routinely.