PerlMonks  

Reading a GZIP network stream

by weismat (Friar)
on May 13, 2009 at 08:17 UTC ( #763699=perlquestion )
weismat has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am trying to write a script which reads data from a GZip-compressed network stream. The first byte is defined as a "Z"; the data that follows is compressed. My current attempt looks as follows:
my $s = new IO::Socket::INET(PeerAddr => "$server", PeerPort => $port);
print $s "C\n"; # ask for the challenge
my $chall;
my $status;
my $status = $s->recv($temp, 1);
$status = $s->recv($temp, 1024);
$chall = Compress::Zlib::uncompress($chall);
Unfortunately, $chall is always undef, indicating a decompression error. Given the sizes of $temp and $chall, I am not even sure that I am reading the data from the network correctly. So far I have used recv only with text; does recv also work for binary data?
What is the correct way to decompress a GZip stream?

Re: Reading a GZIP network stream
by przemo (Scribe) on May 13, 2009 at 08:33 UTC
    I am already not sure about the reading of the data from the network for binary data - I have used recv so far only with text - does recv also work for binary data?

    According to the docs, unless you have set something special on the socket with binmode(), it will receive bytes.

    I don't know if it is important, but does the whole message fit in 1024 bytes?

    I'd also check compression/decompression starting with a simpler test: take a string, compress it, then decompress it and see if you received the original data back again. If it's fine, then try the same method on the socket.
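The round-trip check suggested above could look roughly like this. It is a minimal sketch using only Compress::Zlib's compress and uncompress; the test string is made up for illustration.

```perl
use strict;
use warnings;
use Compress::Zlib;

# Round trip: compress a known string, then uncompress it again.
my $original   = "challenge data " x 10;
my $compressed = Compress::Zlib::compress($original);
defined $compressed or die "compress failed";

my $restored = Compress::Zlib::uncompress($compressed);
defined $restored or die "uncompress failed";

print $restored eq $original ? "round trip ok\n" : "round trip MISMATCH\n";
```

If this works locally but the socket data still fails, the problem is more likely in how the bytes are read, or in the format the server actually sends.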

    You may also try with other modules, like IO::Uncompress::Gunzip.
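As a sketch of that alternative: Compress::Zlib::uncompress expects zlib-format data (RFC 1950), whereas gunzip from IO::Uncompress::Gunzip handles the gzip format (RFC 1952). The example below builds a gzip payload in memory to stand in for received data; whether the server in question really sends gzip is an assumption.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

my $plain = "hello from the server\n";
my ($gz, $out);

# Build a gzip payload in memory (stand-in for bytes read from the socket).
gzip(\$plain => \$gz) or die "gzip failed: $GzipError";

# Decompress from one scalar into another; no files or sockets involved.
gunzip(\$gz => \$out) or die "gunzip failed: $GunzipError";

print $out;
```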

      I have found one error in how I read the bytes: the first parameter of recv is where the actual data goes.
      But I still think that I am not reading the full set of bytes required. What is the correct way to determine whether there are more bytes to read? Do I need to use select?

        If you're not reading all of the bytes, then the decompression is never going to work. I (and you also, apparently) have doubts that all of your data is fitting into 1024 bytes.

        Where is this data coming from? Did you write the server code? If not, then I don't know that I can help you. If yes, then you probably need to modify the server (and client) to follow some sort of sane protocol.

        Typically, a simple way to handle this is to send the length of the message in the first x bytes of the message. You can use 4 bytes (32 bits), or if that's too much or too little, adjust to your liking. Then instead of using recv, you'll have to use read or sysread in a loop until you've got all of your data.

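        ## Warning: Incomplete and untested ##
        my $buffer = '';
        my $got_bytes = 0;
        while ($got_bytes < $total_bytes) {
            my $bytes = sysread $s, $buffer, $total_bytes - $got_bytes, $got_bytes;
            die "read error: $!" unless defined $bytes;
            last if $bytes == 0;     # EOF: peer closed before sending everything
            $got_bytes += $bytes;    # advance, otherwise the loop never ends
        }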
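A fuller sketch of the length-prefix idea might look like the following. The framing (a 4-byte big-endian length followed by the payload) and the helper names read_exactly, read_message and frame_message are assumptions for illustration, not something the server in the question is known to implement. It uses read rather than sysread so that the demo can run on an in-memory handle standing in for the socket.

```perl
use strict;
use warnings;

# Read exactly $want bytes from $fh, looping over short reads.
# Returns the bytes, or dies on error or premature EOF.
sub read_exactly {
    my ($fh, $want) = @_;
    my $buffer = '';
    my $got    = 0;
    while ($got < $want) {
        my $n = read($fh, $buffer, $want - $got, $got);
        die "read error: $!" unless defined $n;
        die "unexpected EOF after $got of $want bytes" if $n == 0;
        $got += $n;
    }
    return $buffer;
}

# Hypothetical framing: 4-byte big-endian length, then the payload.
sub read_message {
    my ($fh) = @_;
    my $length = unpack 'N', read_exactly($fh, 4);
    return read_exactly($fh, $length);
}

# Sender side of the same hypothetical framing.
sub frame_message {
    my ($payload) = @_;
    return pack('N', length $payload) . $payload;
}

# Demo on an in-memory handle standing in for the socket.
my $wire = frame_message("first") . frame_message("second message");
open(my $fh, '<', \$wire) or die "open: $!";
binmode $fh;
print read_message($fh), "\n" for 1 .. 2;
```

With a length prefix the client never has to guess whether 1024 bytes were enough: it reads the header, then keeps reading until the announced number of bytes has arrived.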

Re: Reading a GZIP network stream
by Perlbotics (Abbot) on May 13, 2009 at 18:52 UTC

    What result do you get if you change the last line to the following?

    $chall = Compress::Zlib::uncompress($temp); # ...($chall) -> ...($temp)

    It appears that your script tries to uncompress an undefined scalar.

    HTH, but if it works it is only the first step. See comment by lostjimmy.

      I have now found out that I need to use inflate instead of gzip.
      I would like to use IO::Uncompress::Inflate, but I have not managed to get it working with sockets rather than files.
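One way to inflate data that arrives in pieces, without going through IO::Uncompress::Inflate, is the streaming interface in Compress::Zlib (inflateInit and its inflate method). The sketch below assumes the payload is a zlib stream (RFC 1950); the helper inflate_stream and its chunk callback are hypothetical names. For raw deflate data, inflateInit(-WindowBits => -MAX_WBITS) would be the variant to try.

```perl
use strict;
use warnings;
use Compress::Zlib;

# Feed chunks to an inflation stream as they arrive and return the plain data.
# $next_chunk is a hypothetical callback that returns the next chunk of bytes,
# or undef once the peer has closed the connection.
sub inflate_stream {
    my ($next_chunk) = @_;
    my ($inflater, $status) = inflateInit();
    die "inflateInit failed: $status" unless $status == Z_OK;

    my $plain = '';
    while (defined(my $chunk = $next_chunk->())) {
        my $out;
        ($out, $status) = $inflater->inflate($chunk);
        $plain .= $out if defined $out;
        last if $status == Z_STREAM_END;    # complete stream seen
        die "inflate failed: $status" unless $status == Z_OK;
    }
    die "input ended before the compressed stream was complete"
        unless $status == Z_STREAM_END;
    return $plain;
}

# Demo: split a compressed string into 16-byte chunks to mimic partial reads.
my $text   = "challenge response " x 20;
my @chunks = unpack '(a16)*', Compress::Zlib::compress($text);
my $result = inflate_stream(sub { shift @chunks });
print $result eq $text ? "inflate ok\n" : "inflate MISMATCH\n";
```

With a real socket $s, and after the leading "Z" byte has been consumed, the callback could be something like sub { my $n = sysread($s, my $buf, 4096); $n ? $buf : undef }.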

Node Type: perlquestion [id://763699]
Approved by almut