PerlMonks  

Determine if a file is password protected

by slloyd (Hermit)
on Feb 19, 2010 at 18:31 UTC ( #824224=perlquestion )
slloyd has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to find a way to determine whether a Windows file (doc, ppt, pdf, etc.) is password protected before attempting to open it. Is there a way to check this?
s/te/ve/

Re: Determine if a file is password protected
by Anonymous Monk on Feb 19, 2010 at 19:10 UTC
    It's probably an extended file attribute; try Win32::OLE + a wshell script.
      A PDF file, not being specific to Windows, is unlikely to expose its password-protection status in a Windows extended file attribute, and even if it did, the OS would at the very least have had to open the file to peek inside!

      CountZero

      "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Determine if a file is password protected
by SuicideJunkie (Priest) on Feb 19, 2010 at 19:41 UTC
    Do you mean file permissions, or encrypted file contents?
    If you're talking about the contents, then you will have to either ask an app or module that knows how to read that particular file format, or consider the specs for each file format (if there are any).
    A password in a .zip file will be completely different from a password in a word document, for example.
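
To illustrate how format-specific such a check is, here is a minimal sketch for the ZIP case only. It assumes the file begins with a local file header and reads the general purpose bit flag, where bit 0 marks an encrypted entry. The sub names are made up for this example, and only the first entry is inspected, so an archive whose later entries are encrypted would be missed.

```perl
use strict;
use warnings;

# Returns 1 if the first ZIP entry is flagged as encrypted, 0 if it is not,
# and undef if the data does not start with a ZIP local file header.
# (Hypothetical helper name; not part of any CPAN module.)
sub zip_first_entry_encrypted {
    my ($bytes) = @_;
    return undef if length($bytes) < 8;

    # Local file header layout: 4-byte signature, 2-byte "version needed",
    # 2-byte general purpose bit flag (both 16-bit fields little-endian).
    my ($sig, $version, $flags) = unpack 'a4 v v', $bytes;
    return undef unless $sig eq "PK\x03\x04";

    return ($flags & 0x0001) ? 1 : 0;    # bit 0 set means "entry is encrypted"
}

# Convenience wrapper: read just the first 8 bytes of a file in raw mode.
sub zip_file_encrypted {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    read $fh, my $head, 8;
    close $fh;
    return zip_first_entry_encrypted($head);
}
```

A Word document needs an entirely different test, which is the point being made above.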
Re: Determine if a file is password protected
by rubasov (Friar) on Feb 20, 2010 at 01:44 UTC
    If you don't want to implement a format-specific test for each of your extensions, you can look at the randomness of your data. Any well-designed encryption scheme produces random-looking encrypted data (to resist statistical analysis). If the encryption is poorly designed, though, this won't be much help to you.

    Beware that this approach has serious caveats: if your data can be real/pseudo-random or compressed, it will also look like a pile of random bits, so, for example, you won't be able to distinguish between a plain and an encrypted rar/zip file. (And don't forget that simple-looking document formats can use compression internally.)

    For a concrete implementation, search CPAN for the chi-square test (I haven't looked, but I'm almost sure you'll find an implementation) and experiment with it to see whether it is good enough for your purpose.
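
For anyone who wants to experiment before picking a CPAN module, the randomness test described above can be sketched in core Perl. This is only an illustration, assuming the null hypothesis is a uniform distribution over the 256 byte values; the sub name chi_square_bytes is invented for this example. With 255 degrees of freedom, random-looking data should give a statistic somewhere around 255, while structured data such as plain text gives values far larger. Where to draw the line is left to experiment.

```perl
use strict;
use warnings;

# Pearson chi-square statistic of the byte histogram of $bytes, measured
# against a uniform distribution over all 256 byte values.
# Returns undef for empty input. (Hypothetical helper name.)
sub chi_square_bytes {
    my ($bytes) = @_;
    my $n = length $bytes;
    return undef unless $n;

    # Count how often each byte value occurs.
    my @count = (0) x 256;
    $count[$_]++ for unpack 'C*', $bytes;

    # Under the uniform hypothesis every value is expected $n / 256 times.
    my $expected = $n / 256;
    my $chi2     = 0;
    $chi2 += ($_ - $expected)**2 / $expected for @count;
    return $chi2;
}

# Example use on the first 64 KiB of a file (the path is a placeholder):
#   open my $fh, '<:raw', 'some_file.bin' or die $!;
#   read $fh, my $chunk, 65536;
#   printf "%.1f\n", chi_square_bytes($chunk);
```

The sample should be at least a few kilobytes, since the statistic is unreliable when the expected count per byte value is very small.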

      ++ for thinking outside the box

      +1 because it's a really neat idea, but your caveat applies more widely than you might think; most PDF content is compressed, for example, and MS Office documents in the 2007+ format are literally ZIP files. So in practice it might be more useful to go straight for per-format tests.
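
A rough sketch of what per-format tests might look like follows, with several assumptions stated up front. For PDF it simply searches the raw bytes for an /Encrypt key, which can give false positives and also matches files that only carry permission restrictions. For Office 2007+ it relies on the fact that a password-to-open document is, as far as I know, no longer a ZIP file but an OLE compound file holding a stream named EncryptedPackage. Legacy .doc/.ppt files are not handled, and the sub name and return labels are invented for this example.

```perl
use strict;
use warnings;

# Signature of an OLE compound file (legacy Office, or encrypted OOXML).
my $OLE_MAGIC = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";

# "EncryptedPackage" encoded as UTF-16LE, which is how stream names are
# stored in an OLE directory.
my $ENC_STREAM = join '', map { "$_\0" } split //, 'EncryptedPackage';

# Heuristic guess from raw file contents. Returns one of the strings
# 'encrypted', 'not-encrypted', 'zip-container' or 'unknown'.
# (Hypothetical helper; the labels are made up for this sketch.)
sub guess_protection {
    my ($bytes) = @_;

    if ($bytes =~ /\A%PDF-/) {
        # Encrypted PDFs reference an encryption dictionary via /Encrypt.
        return index($bytes, '/Encrypt') >= 0 ? 'encrypted' : 'not-encrypted';
    }
    if (substr($bytes, 0, 8) eq $OLE_MAGIC) {
        # OLE container: only the encrypted-OOXML case is recognised here.
        return index($bytes, $ENC_STREAM) >= 0 ? 'encrypted' : 'unknown';
    }
    if (substr($bytes, 0, 4) eq "PK\x03\x04") {
        # Plain ZIP: an unencrypted OOXML document or an ordinary archive.
        return 'zip-container';
    }
    return 'unknown';
}
```

The whole file has to be slurped in raw mode for this to work, because the /Encrypt key of a PDF normally sits near the end of the file.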
