PerlMonks  

check *.tmp whether it's a zip or not

by Anonymous Monk
on Nov 26, 2004 at 10:52 UTC (#410553)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm quite happily using Archive::Zip, but what I'm missing is a test for whether a file is a zip file or not.

I receive attached files (*.txt, *.log and *.zip) by email.
My mail program then calls my Perl program
and 'delivers' all attachments as temp files (/tmp/../uihm7934.tmp),
so I'm losing the file extensions I would need to recognize whether a file is a zip file or not.

Now I want to check whether the current file (kluai8qwz.tmp) is a zip file (-> continue) or not (-> next).

Right now I try to access that file

$status = $zip->read( $tmp );
and I get this (unnecessary) error message:
format error: can't find EOCD signature
	Archive::Zip::Archive::_findEndOfCentralDirectory('Archive::Zip::Archive=HASH(0x85fd7ac)','IO::File=GLOB(0x85fd974)') called at /usr/lib/perl5/site_perl/5.8.3/Archive/Zip.pm line 955
	Archive::Zip::Archive::readFromFileHandle('Archive::Zip::Archive=HASH(0x85fd7ac)','IO::File=GLOB(0x85fd974)','/tmp/kde-cas/kmailn0qacc.tmp') called at /usr/lib/perl5/site_perl/5.8.3/Archive/Zip.pm line 929
	Archive::Zip::Archive::read('Archive::Zip::Archive=HASH(0x85fd7ac)','/tmp/kde-cas/kmailn0qacc.tmp') called at /home/QTS/bin/zipBarC.pl line 52
The error code is set as well, and that alone would be enough for me.

Does anybody know a more elegant way to check for a zip file?

Thanks in advance
Carl

Re: check *.tmp whether it's a zip or not
by Anonymous Monk on Nov 26, 2004 at 10:57 UTC
    Hmm,
    I forgot to say that I tried (-T $tmp), but all temp files (originally *.log, *.txt and *.zip) were reported as not being plain-text files.

    Carl

Re: check *.tmp whether it's a zip or not
by tomhukins (Curate) on Nov 26, 2004 at 11:04 UTC
    You could use something like File::Type to check whether the file looks like an application/zip file or not, before passing it to Archive::Zip.
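
A minimal sketch of that suggestion, assuming File::Type is installed from CPAN and that it reports zip archives as application/zip. The helper name is_zip_by_file_type is made up for illustration; it returns undef when the module cannot be loaded, so the caller can fall back to another test.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if File::Type reports application/zip,
# 0 if it reports anything else, and undef if File::Type is not installed.
sub is_zip_by_file_type {
    my ($path) = @_;
    return unless eval { require File::Type; 1 };
    my $type = File::Type->new->checktype_filename($path);
    return ( defined $type && $type eq 'application/zip' ) ? 1 : 0;
}

# Usage: print the names of all arguments that look like zip files.
print "$_ looks like a zip file\n" for grep { is_zip_by_file_type($_) } @ARGV;
```

Only files that pass this test would then be handed to $zip->read().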
Re: check *.tmp whether it's a zip or not
by guha (Priest) on Nov 26, 2004 at 11:17 UTC

    My quick tests show that a zip file's first four bytes seem to be 50 4B 03 04 in hexadecimal notation. That should allow you to filter out most of the non-zip files.

    Then you could use your $zip->read() in a block eval to trap those other files that start with the magic byte sequence.

    However, consider this a hack to be used only if you can't find a better-founded idea.

    Update:
    Well it seems it couldn't be better founded. The byte sequence is described in the format documentation as the "Local file header signature".
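
A sketch of the four-byte test described above, using only core Perl. The names has_zip_signature and ZIP_LOCAL_HEADER are hypothetical. Note that this only looks at the start of the file, so an archive that does not begin with a local file header (an empty archive, or one with data prepended) would be rejected.

```perl
use strict;
use warnings;

# Local file header signature: 50 4B 03 04 ("PK\x03\x04").
use constant ZIP_LOCAL_HEADER => "\x50\x4B\x03\x04";

# Hypothetical helper: returns 1 if the first four bytes of $path match
# the signature, 0 otherwise (including unreadable or too-short files).
sub has_zip_signature {
    my ($path) = @_;
    open my $fh, '<:raw', $path or return 0;
    my $got = read $fh, my $magic, 4;
    close $fh;
    return ( defined $got && $got == 4 && $magic eq ZIP_LOCAL_HEADER ) ? 1 : 0;
}

# Usage: skip everything that does not start with the signature.
for my $tmp (@ARGV) {
    next unless has_zip_signature($tmp);
    print "$tmp starts with the zip signature\n";
}
```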

      Thanks for your replies. I've installed File::Type, and it works.
      I've also written to the author of Archive::Zip to suggest that such a test might be useful to include.

      Carl

Re: check *.tmp whether it's a zip or not
by Hena (Friar) on Nov 26, 2004 at 11:40 UTC
    If you are using *nix, try the 'file' command.
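
One way to call it from Perl, as a rough sketch. It assumes a file(1) that understands the --brief and --mime-type options (older versions may only offer -i) and that zip archives are reported as application/zip. The helper name mime_type_via_file is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the MIME type printed by file(1),
# or undef if the command cannot be run or prints nothing.
sub mime_type_via_file {
    my ($path) = @_;
    no warnings 'exec';    # stay quiet if file(1) is not installed
    open my $pipe, '-|', 'file', '--brief', '--mime-type', '--', $path
        or return;
    my $type = <$pipe>;
    close $pipe;
    return unless defined $type;
    chomp $type;
    return $type;
}

# Usage: report arguments that file(1) considers zip archives.
for my $tmp (@ARGV) {
    my $type = mime_type_via_file($tmp);
    print "$tmp is a zip file\n" if defined $type && $type eq 'application/zip';
}
```

The list form of open avoids passing the temp file name through a shell.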
Re: check *.tmp whether it's a zip or not
by rev_1318 (Chaplain) on Nov 26, 2004 at 12:57 UTC
    One would expect that wrapping the code in eval {...} would catch the error (at least, I would), but it doesn't.
    Does anybody know why? What am I missing here?

    Paul

      Maybe the error was a warning, not a die:

      $ perl -e ' eval { warn "xxx" } '
      xxx at -e line 1.
      $ perl -e ' eval { die "xxx" } '
      $

        In which case, one might consider a local $SIG{__WARN__} handler:

        sub is_zip {
            #Load the file
            local $SIG{__WARN__} = sub { goto CONTINUE };
            $status = $zip->read($file);
            CONTINUE:
            #Check $status
        }

        The code above is bad (it's an idea only), but the basic idea is to have a __WARN__ handler that suppresses printing of the error and then branches to a goto label outside itself, which keeps the default handler from being called.

        However, using mime-magic to determine the file type is probably better.

        #Read data into $file
        if ($file =~ m/^\x50\x4B\x03\x04/) {
            #Parse as ZIP file
        } else {
            #Parse as something else
        }

        Or automate that type of task with the previously-mentioned File::Type module.


        radiantmatrix
        require General::Disclaimer;
        Perl is

        perl -e "{ local $SIG{__WARN__} = sub { die @_;};eval {warn 'xxx'}; print 'a:'. $@}"
        __OUTPUT__
        a:xxx at -e line 1.

        That is, check $@ to see whether the call succeeded.
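
The same trick can be wrapped in a small reusable helper. This is only a sketch; the name try_strict is made up, and it assumes the code being called reports its errors through warn (or something built on it) rather than printing to STDERR directly.

```perl
use strict;
use warnings;

# Hypothetical helper: runs $code with warnings promoted to exceptions.
# Returns (1, '') on success, or (0, $message) if the code warned or died.
sub try_strict {
    my ($code) = @_;
    my $ok = eval {
        local $SIG{__WARN__} = sub { die @_ };
        $code->();
        1;
    };
    return $ok ? ( 1, '' ) : ( 0, $@ );
}

# Usage: a warning inside the block now shows up in $err instead of on STDERR.
my ( $ok, $err ) = try_strict( sub { warn "xxx\n" } );
print $ok ? "ok\n" : "failed: $err";
```

In the zip case the block would contain the $zip->read($tmp) call, and a false first return value would mean "not a readable zip file".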

        If you only check the signature, which is what a quick look into File::Type confirmed it does, there is a risk, however remote, that another binary file could start with these magic bytes.
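
To cover that remaining risk, the cheap signature test can be followed by a full read. The sketch below assumes Archive::Zip is installed and uses its setErrorHandler function to silence the error message from the original question while the read is attempted. The name is_valid_zip is hypothetical; it returns undef if Archive::Zip cannot be loaded.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $path starts with the local file header
# signature AND Archive::Zip can read it, 0 otherwise, and undef if
# Archive::Zip is not installed.
sub is_valid_zip {
    my ($path) = @_;

    # Cheap filter first: compare the first four bytes.
    open my $fh, '<:raw', $path or return 0;
    my $got = read $fh, my $magic, 4;
    close $fh;
    return 0 unless $got && $got == 4 && $magic eq "PK\x03\x04";

    # Full check: let Archive::Zip parse the archive structure.
    return unless eval { require Archive::Zip; 1 };
    my $old    = Archive::Zip::setErrorHandler( sub { } );   # silence messages
    my $status = eval { Archive::Zip->new->read($path) };
    Archive::Zip::setErrorHandler($old);                      # restore handler
    return ( defined $status && $status == Archive::Zip::AZ_OK() ) ? 1 : 0;
}

# Usage: only process temp files that pass both checks.
for my $tmp (@ARGV) {
    next unless is_valid_zip($tmp);
    print "$tmp is a readable zip archive\n";
}
```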
