Re: Determine if a file is password protected

by rubasov (Friar)
on Feb 20, 2010 at 01:44 UTC (#824309)

in reply to Determine if a file is password protected

If you don't want to implement a format-specific test for each of your file extensions, you can look at the randomness of the data instead. Any well-designed encryption scheme produces random-looking ciphertext (to resist statistical analysis). If the encryption is poorly designed, though, this won't help you much.

Beware that this approach has serious caveats: if your data can legitimately be random, pseudo-random, or compressed, it will also look like a pile of random bits, so for example you won't be able to distinguish a plain rar/zip file from an encrypted one. (And don't forget that simple-looking document formats can use compression internally.)
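
To make the caveat concrete, here is a rough sketch (in Python, purely for illustration; the arithmetic ports directly to Perl) that measures the byte-level Shannon entropy of three blobs. The helper name `byte_entropy` and the word-salad sample are my own assumptions, not anything from a real file format.

```python
import math
import os
import random
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


# Hypothetical sample data: word salad stands in for an uncompressed document.
rng = random.Random(0)
words = ["file", "password", "protected", "random", "data", "format", "test"]
plain = " ".join(rng.choice(words) for _ in range(50_000)).encode("ascii")

packed = zlib.compress(plain, 9)   # compressed, but not encrypted
noise = os.urandom(len(packed))    # stand-in for well-encrypted output

for label, blob in (("plain", plain), ("compressed", packed), ("random", noise)):
    print(f"{label:10s} {len(blob):7d} bytes  {byte_entropy(blob):.3f} bits/byte")
```

The plain text should score well below 8 bits per byte, while the compressed and the random blobs should both land close to 8, which is exactly why this statistic alone cannot tell a plain archive from an encrypted one.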

For a concrete implementation, search CPAN for the chi-square test (I haven't looked, but I'm almost sure you'll find an implementation) and experiment to see whether it is good enough for your purpose.
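
If you just want to experiment before hunting for a module, the test is small enough to write by hand. Below is a minimal sketch in Python (not Perl, and not any particular CPAN module's API) of a chi-square test of byte frequencies against a uniform distribution. The names `chi_square_bytes` and `looks_random`, the `z_limit` cutoff of 4.0, and the minimum sample size are all assumptions of mine that you would want to tune on your own files.

```python
import math
import os
from collections import Counter


def chi_square_bytes(data: bytes) -> float:
    """Pearson chi-square statistic of byte counts against a uniform distribution."""
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


def looks_random(data: bytes, z_limit: float = 4.0) -> bool:
    """Heuristic: True if the statistic lies within z_limit standard deviations
    of its mean for uniform bytes (255 degrees of freedom: mean 255, variance 510).
    """
    if len(data) < 256 * 5:  # rule of thumb: expected count of at least 5 per bin
        raise ValueError("sample too small for a meaningful chi-square test")
    z = (chi_square_bytes(data) - 255) / math.sqrt(2 * 255)
    # Both tails are suspicious: too skewed, or too perfectly even.
    return abs(z) <= z_limit


if __name__ == "__main__":
    print(looks_random(os.urandom(1 << 16)))      # stand-in for encrypted data
    print(looks_random(b"hello world " * 10_000))  # plain text
```

Note that this only answers "does it look uniformly random?". Per the caveat above, a positive result does not by itself mean the file is encrypted.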

Replies are listed 'Best First'.
Re^2: Determine if a file is password protected
by Porculus (Hermit) on Feb 22, 2010 at 00:13 UTC

    +1 because it's a really neat idea, but your caveat applies more widely than you might think; most PDF content is compressed, for example, and MS Office documents in the 2007+ format are literally ZIP files. So in practice it might be more useful to go straight for per-format tests.

Re^2: Determine if a file is password protected
by mpeg4codec (Pilgrim) on Feb 21, 2010 at 21:06 UTC
    ++ for thinking outside the box
