PerlMonks  

check *.tmp whether it's a zip or not

by Anonymous Monk
on Nov 26, 2004 at 10:52 UTC ( #410553=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm quite happily using Archive::Zip, but what I'm missing is a test of whether a file is a zip file or not.

I receive attached files (*.txt, *.log and *.zip) by email.
My mail program now calls my Perl program
and 'delivers' all attachments as tmp-files (/tmp/../uihm7934.tmp),
so I lose the file extensions I would otherwise use to recognize whether a file is a zip file or not.

Now I want to check whether the actual file (kluai8qwz.tmp) is a zip file (-> continue) or not (-> next).

Right now I try to read that file

$status = $zip->read( $tmp );
and I get this (unnecessary) error message:
format error: can't find EOCD signature
    Archive::Zip::Archive::_findEndOfCentralDirectory('Archive::Zip::Archive=HASH(0x85fd7ac)','IO::File=GLOB(0x85fd974)') called at /usr/lib/perl5/site_perl/5.8.3/Archive/Zip.pm line 955
    Archive::Zip::Archive::readFromFileHandle('Archive::Zip::Archive=HASH(0x85fd7ac)','IO::File=GLOB(0x85fd974)','/tmp/kde-cas/kmailn0qacc.tmp') called at /usr/lib/perl5/site_perl/5.8.3/Archive/Zip.pm line 929
    Archive::Zip::Archive::read('Archive::Zip::Archive=HASH(0x85fd7ac)','/tmp/kde-cas/kmailn0qacc.tmp') called at /home/QTS/bin/zipBarC.pl line 52
Well, the error code is set as well, and that alone would be enough.

Does anybody know a more elegant way to check for a zip file?

Thanks in advance
Carl

Re: check *.tmp whether it's a zip or not
by tomhukins (Curate) on Nov 26, 2004 at 11:04 UTC
    You could use something like File::Type to check whether the file looks like an application/zip file or not, before passing it to Archive::Zip.
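
A minimal sketch of that suggestion, assuming File::Type is installed from CPAN. The helper names `mime_type_of` and `looks_like_zip` are hypothetical; `checktype_filename` is the File::Type method that inspects a file on disk. The module is loaded lazily so the script still runs (and simply reports "unknown") where it is missing.

```perl
use strict;
use warnings;

# Return the MIME type File::Type reports for $path, or undef when the
# module is not installed or the file cannot be read.
sub mime_type_of {
    my ($path) = @_;
    return unless eval { require File::Type; 1 };
    return scalar File::Type->new->checktype_filename($path);
}

# Hypothetical helper: true only when File::Type says application/zip.
sub looks_like_zip {
    my ($path) = @_;
    my $type = mime_type_of($path);
    return ( defined $type && $type eq 'application/zip' ) ? 1 : 0;
}
```

In the mail-handling loop this would read something like `next unless looks_like_zip($tmp);` before calling `$zip->read($tmp)`.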
Re: check *.tmp whether it's a zip or not
by guha (Priest) on Nov 26, 2004 at 11:17 UTC

    My quick tests show that a zip file's first four bytes seem to be 50 4B 03 04 in hexadecimal notation. That should allow you to filter out most of the non-zip files.

    Then you could wrap your $zip->read() in a block eval to trap those other files that start with the magic byte sequence.

    However, consider this a hack to be used only if you can't find a better-founded idea.

    Update:
    Well it seems it couldn't be better founded. The byte sequence is described in the format documentation as the "Local file header signature".
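
A minimal sketch of the signature test described above, using core Perl only. The function name `has_zip_signature` is hypothetical. It reads the first four bytes in raw mode and compares them with 50 4B 03 04 (the ASCII letters "PK" followed by 0x03 0x04).

```perl
use strict;
use warnings;

# Return 1 if the file starts with the ZIP local file header signature
# (50 4B 03 04), otherwise 0. Unreadable files count as "not a zip".
sub has_zip_signature {
    my ($path) = @_;
    open my $fh, '<:raw', $path or return 0;
    my $got = read $fh, my $magic, 4;
    close $fh;
    return 0 unless defined $got && $got == 4;
    return $magic eq "PK\x03\x04" ? 1 : 0;
}
```

Usage would be along the lines of `next unless has_zip_signature($tmp);`. Note that this only filters candidates: an archive with no entries does not begin with a local file header, so such a file would be skipped by this test.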

      Thanks,
      for your replies. I've installed File::Type, and that works.
      But I've written to the author of Archive::Zip that such a test might be useful to include.

      Carl

Re: check *.tmp whether it's a zip or not
by Hena (Friar) on Nov 26, 2004 at 11:40 UTC
    If you are using *nix, try the 'file' command.
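
A hedged sketch of calling `file` from Perl. It assumes a file(1) implementation that accepts `-b` (brief output) and `--mime-type`; older or non-GNU versions may need different flags. The function name `file_mime_type` is hypothetical.

```perl
use strict;
use warnings;

# Ask file(1) for the MIME type of $path. Returns undef if the command
# is missing, exits with an error, or prints nothing.
sub file_mime_type {
    my ($path) = @_;
    # List-form pipe open: no shell is involved, so unusual characters
    # in $path cannot be interpreted as shell syntax.
    open my $pipe, '-|', 'file', '-b', '--mime-type', '--', $path
        or return;
    my $type = <$pipe>;
    close $pipe or return;    # false on a non-zero exit status
    return unless defined $type;
    chomp $type;
    return $type;
}
```

A caller could then compare the result with `'application/zip'`, treating an undefined result as "unknown" rather than "not a zip".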
Re: check *.tmp whether it's a zip or not
by Anonymous Monk on Nov 26, 2004 at 10:57 UTC
    hmm,
    I forgot to say that I tried (-T $tmp), but all tmp-files (that had been *.log, *.txt, *.zip) were recognized as non-plain/text files.

    Carl

Re: check *.tmp whether it's a zip or not
by rev_1318 (Chaplain) on Nov 26, 2004 at 12:57 UTC
    One would expect that wrapping the code in eval {...} would catch the error (at least, I would), but it doesn't.
    Does anybody know why? What am I missing here?

    Paul

      Maybe the error was a warning, not a die:

      $ perl -e ' eval { warn "xxx" } '
      xxx at -e line 1.
      $ perl -e ' eval { die "xxx" } '
      $

        In which case, one might consider a local $SIG{__WARN__} handler:

        sub is_zip {
            #Load the file
            local $SIG{__WARN__} = sub { goto CONTINUE };
            $status = $zip->read($file);
            CONTINUE:
            #Check $status
        }

        The code above is bad (it's an idea only), but the basic idea is to have a __WARN__ handler that suppresses printing of the error and then branches to a goto label outside itself, which keeps the default handler from being called.
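
A sketch of the same idea without the goto, in core Perl. A `__WARN__` handler that simply returns already keeps the message off STDERR, so the handler can collect the warning and let the call finish normally. The names `quietly` and `noisy_read` are hypothetical; `noisy_read` is only a stand-in for `$zip->read($file)`, and its status value of 3 is an arbitrary non-zero example.

```perl
use strict;
use warnings;

# Run $code with warnings captured instead of printed.
# Returns the code's (scalar) result followed by the captured warnings.
sub quietly {
    my ($code) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my $result = $code->();
    return ( $result, @warnings );
}

# Stand-in for $zip->read($file): warns, then returns a non-zero status.
sub noisy_read {
    warn "format error: can't find EOCD signature\n";
    return 3;
}

my ( $status, @messages ) = quietly( \&noisy_read );
print "status=$status, captured ", scalar(@messages), " warning(s)\n";
```

The caller then decides from `$status` alone whether to continue, and the captured messages remain available for logging if wanted.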

        However, using mime-magic to determine the file type is probably better.

        #Read data into $file
        if ($file =~ m/^\x50\x4B\x03\x04/) {
            #Parse as ZIP file
        } else {
            #Parse as something else
        }

        Or automate that type of task with the previously-mentioned File::Type module.


        radiantmatrix

        perl -e "{ local $SIG{__WARN__} = sub { die @_;};eval {warn 'xxx'}; print 'a:'. $@}"
        __OUTPUT__
        a:xxx at -e line 1.

        That is, check $@ to see whether the call succeeded.

        If you only check the signature, which my quick look into File::Type confirmed is what it does, there is a risk, even if it is remote, that another binary file could start with these magic bytes.
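
One way to narrow that risk, sketched in core Perl: besides the leading signature, also require the End Of Central Directory (EOCD) record that the Archive::Zip error message refers to. Its signature is 50 4B 05 06, and because the record is 22 bytes plus a comment of at most 65535 bytes, it must begin within the last 65557 bytes of the file. The function name `looks_like_zip_archive` is hypothetical, and this remains a heuristic: only a real `$zip->read()` validates the archive.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_END);

# Return 1 if the file has a plausible ZIP start AND an EOCD record
# near its end, otherwise 0.
sub looks_like_zip_archive {
    my ($path) = @_;
    open my $fh, '<:raw', $path or return 0;
    my $size = -s $fh;
    return 0 unless defined $size;

    # Leading signature: a local file header, or the EOCD record itself
    # when the archive contains no entries.
    ( read( $fh, my $head, 4 ) // 0 ) == 4 or return 0;
    return 0 unless $head eq "PK\x03\x04" || $head eq "PK\x05\x06";

    # Read the tail of the file, where the EOCD record has to live.
    my $tail_len = $size < 65_557 ? $size : 65_557;
    seek( $fh, -$tail_len, SEEK_END ) or return 0;
    ( read( $fh, my $tail, $tail_len ) // 0 ) == $tail_len or return 0;
    close $fh;

    # The signature must leave room for the 22-byte fixed part.
    my $pos = index $tail, "PK\x05\x06";
    return ( $pos >= 0 && length($tail) - $pos >= 22 ) ? 1 : 0;
}
```

Files that pass this test can still be handed to Archive::Zip inside the eval or warning handler discussed above, so that the rare false positive is caught there.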
