PerlMonks  

Re^3: File integrity checker

by no_slogan (Deacon)
on Aug 24, 2017 at 22:44 UTC (#1197973)


in reply to Re^2: File integrity checker
in thread File integrity checker

You can shoot out one of the files in a .zip with something like this:
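    open my $ZIP, '+<', 'test.zip' or die "Can't open test.zip: $!";
    binmode $ZIP;                      # zip data is binary
    local $/ = undef;                  # slurp the whole file
    my $data = <$ZIP>;
    my @offset;
    while ($data =~ /PK\x03\x04/g) {   # find all file headers
        push @offset, pos($data) - 4;
    }
    print "Found headers at: @offset\n";
    substr($data, $offset[1]+14, 1) ^= "\x01";   # change the crc of the second file
    seek $ZIP, 0, 0;
    print $ZIP $data;
    close $ZIP or die "Can't write test.zip: $!";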
(This isn't foolproof, since the same four signature bytes can also turn up inside compressed data, but it's probably good enough.) The structure of a zip file is described on Wikipedia: Zip (file format)#File headers
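To see what the `+14` is pointing at, here is a minimal sketch that reads the CRC-32 and file name out of each local file header. The sub name `local_header_crcs` and the hand-packed, header-only record are made up for illustration; the field offsets follow the published local file header layout (CRC-32 at byte 14, name length at byte 26, name starting at byte 30). It scans for the raw signature just like the script above, so it shares the same weakness.

```perl
use strict;
use warnings;

# Return [offset, name, crc] for every local file header signature found.
# Hypothetical helper: a chance "PK\x03\x04" inside compressed data
# would be reported as a header too.
sub local_header_crcs {
    my ($data) = @_;
    my @found;
    while ($data =~ /PK\x03\x04/g) {
        my $offset = pos($data) - 4;
        next if length($data) < $offset + 30;    # fixed part is 30 bytes
        # Skip 14 bytes (signature, version, flags, method, time, date),
        # read the CRC, skip both size fields, read the name length.
        my ($crc, $name_len) = unpack 'x14 V x8 v', substr($data, $offset, 30);
        my $name = substr($data, $offset + 30, $name_len);
        push @found, [$offset, $name, $crc];
    }
    return @found;
}

# Hand-built header with no file data behind it: NOT a valid zip,
# just enough bytes to exercise the parser.
my $name   = 'testfile.txt';
my $record = pack('a4 v5 V3 v2',
    "PK\x03\x04", 20, 0, 0, 0, 0,     # signature, version, flags, method, time, date
    0x75b0ca95, 0, 0,                 # crc, compressed size, uncompressed size
    length($name), 0) . $name;        # name length, extra length, name

printf "%d %s %08x\n", @$_ for local_header_crcs($record);
```

On a real archive you would slurp the file with `binmode` set and pass the contents to the sub instead of the hand-built record.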

Re^4: File integrity checker
by roperl (Beadle) on Aug 29, 2017 at 17:40 UTC
    OK, so now I'm confused. I was able to corrupt the CRC in a zip file. Running the unzip utility with the -t option shows the corruption:
    # unzip -t bad.zip
    Archive:  bad.zip
        testing: testfile.txt             OK
        testing: testfile2.txt            bad CRC 32e1dbe6  (should be 32f0dbe6)
    At least one error was detected in bad.zip.
    But when I run the ziptest.pl utility from the examples directory, it reports the CRCs as good:
    # ./ziptest.pl bad.zip
    Length   Size     Last Modified            CRC-32   Name
    -------- -------- ------------------------ -------- ----
          26       14 Mon Aug 28 18:07:58 2017 75b0ca95 testfile.txt
          15        9 Mon Aug 28 18:08:10 2017 32e1dbe6 testfile2.txt
    All CRCs check OK
    So what exactly is $member->crc32 doing? Shouldn't it be checking against the CRC recorded in the zip file, not just getting the CRC after the file is extracted?
      Buried treasure! A zip file contains two copies of the file metadata: one in the local file header just before each file's data, and one in the central directory at the end of the zip. Unzip uses the first one, and ziptest.pl uses the second. Neither one notices that the two copies don't match. You can modify my little script like this to make ziptest.pl complain:
      while ($data =~ /PK\x01\x02/g) ...   # find all file headers
      substr($data, $offset[1]+16, 1) ^= "\x01";   # change the crc
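Since neither tool compares the two copies, a rough sketch of a check that does might look like the following. The sub name `crc_mismatches` and the hand-packed, header-only test archive are assumptions for illustration; the offsets follow the published central directory header layout (flags at byte 8, CRC-32 at byte 16, name length at byte 28, local header offset at byte 42, name at byte 46). Entries with general-purpose flag bit 3 set are skipped, because their real CRC sits in a data descriptor after the file data rather than in the local header.

```perl
use strict;
use warnings;

# Return [name, local_crc, central_crc] for each entry whose two CRC
# copies disagree. Hypothetical helper; it scans for raw signatures.
sub crc_mismatches {
    my ($data) = @_;
    my @bad;
    while ($data =~ /PK\x01\x02/g) {
        my $cd = pos($data) - 4;
        next if length($data) < $cd + 46;        # fixed part is 46 bytes
        my ($flags, $cd_crc, $name_len, $local) =
            unpack 'x8 v x6 V x8 v x12 V', substr($data, $cd, 46);
        next if $flags & 0x08;                   # CRC is in a data descriptor
        next if length($data) < $local + 18;
        next unless substr($data, $local, 4) eq "PK\x03\x04";
        my $local_crc = unpack 'V', substr($data, $local + 14, 4);
        push @bad, [substr($data, $cd + 46, $name_len), $local_crc, $cd_crc]
            if $local_crc != $cd_crc;
    }
    return @bad;
}

# Header-only stand-in for an archive (NOT a valid zip): one local
# header at offset 0 followed by its central directory header.
my $name  = 'testfile2.txt';
my $crc   = 0x32e1dbe6;
my $local = pack('a4 v5 V3 v2', "PK\x03\x04", 20, 0, 0, 0, 0,
                 $crc, 0, 0, length($name), 0) . $name;
my $cd    = pack('a4 v6 V3 v5 V2', "PK\x01\x02", 20, 20, 0, 0, 0, 0,
                 $crc, 0, 0, length($name), 0, 0, 0, 0, 0, 0) . $name;
my $zip   = $local . $cd;

my @before = crc_mismatches($zip);
print "mismatches before: ", scalar(@before), "\n";

# Flip one bit in the central directory copy only, as in the script above.
substr($zip, length($local) + 16, 1) ^= "\x01";
printf "%s: local %08x, central %08x\n", @$_ for crc_mismatches($zip);
```

This only tells you that the two copies differ, not which one is right; for that you still have to compute the CRC of the extracted data, as unzip -t and ziptest.pl each do against their own copy.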
        OK, that did it. Now it seems to work as expected:
        # ./ziptest.pl test.zip
        Length   Size     Last Modified            CRC-32   Name
        -------- -------- ------------------------ -------- ----
              26       14 Mon Aug 28 18:07:58 2017 75b0ca95 testfile.txt
              15        9 Mon Aug 28 18:08:10 2017 32e1dbe7 testfile2.txt
        Member testfile2.txt CRC error: file says 32e1dbe7 computed: 32e1dbe6
        CRC errors found
        # unzip -t test.zip
        Archive:  test.zip
            testing: testfile.txt             OK
            testing: testfile2.txt            OK
        No errors detected in compressed data of test.zip.
