PerlMonks  

Re: Accessing Meta data from MS WORD

by thmsdrew (Scribe)
on Aug 07, 2012 at 10:46 UTC (#985935)


in reply to Accessing Meta data from MS WORD

Well, a .docx file is actually just a zip archive, and the metadata you speak of is stored inside it as XML files (the core document properties are in docProps/core.xml). In Perl you can open the archive, extract the metadata file, and then parse the XML for what you need. Each of these steps is handled by modules available on CPAN, for example Archive::Zip for reading the archive and XML::LibXML for parsing the XML.
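To make the steps above concrete, here is a minimal sketch. It assumes the CPAN modules Archive::Zip and XML::LibXML are installed; the post does not name specific modules, so these are only one possible choice. The sub name docx_core_properties and the script name are made up for illustration. It reads docProps/core.xml from the archive and returns the core properties as a hash.

```perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);
use XML::LibXML;

# Hypothetical helper: returns a hashref of the core document properties
# (title, creator, and so on, whichever are present) found in
# docProps/core.xml inside a .docx file.
sub docx_core_properties {
    my ($path) = @_;

    # A .docx file is a zip archive, so open it as one.
    my $zip = Archive::Zip->new();
    $zip->read($path) == AZ_OK
        or die "Cannot read '$path' as a zip archive\n";

    # The core metadata is stored in this archive member.
    my $member = $zip->memberNamed('docProps/core.xml')
        or die "No docProps/core.xml in '$path'\n";
    my $xml = $member->contents;

    # Parse the XML and collect every child element of the root,
    # keyed by its local name (namespace prefix dropped).
    my $doc = XML::LibXML->load_xml(string => $xml);
    my %props;
    for my $node ($doc->documentElement->findnodes('*')) {
        $props{ $node->localname } = $node->textContent;
    }
    return \%props;
}

# Run as a script: perl docx_meta.pl some_file.docx
if (!caller() && @ARGV) {
    my $props = docx_core_properties($ARGV[0]);
    printf "%-16s %s\n", $_, $props->{$_} for sort keys %$props;
}
```

Which properties show up depends on what the authoring application wrote into the file, so treat any particular key as optional.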


Re^2: Accessing Meta data from MS WORD
by tobyink (Abbot) on Aug 07, 2012 at 16:13 UTC

    ... assuming of course that the file in question is in Microsoft's OpenXML format. Older versions of Word used the proprietary binary ".doc" format, which is still quite frequently used.
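One way to check that assumption before trying to unzip anything is to look at the first bytes of the file. This is only a sketch using core Perl; the sub name word_container_format is made up. It relies on the fact that a zip archive normally begins with the bytes "PK\x03\x04", while the older binary .doc format is stored in an OLE2 compound file that begins with D0 CF 11 E0 A1 B1 1A E1. It identifies the container only, not whether the content is really a Word document.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 'zip', 'ole2' or 'unknown' depending on
# the leading bytes of the file.
sub word_container_format {
    my ($path) = @_;

    open my $fh, '<:raw', $path
        or die "Cannot open '$path': $!\n";
    my $magic = '';
    my $got = read($fh, $magic, 8);
    close $fh;
    defined $got or die "Cannot read '$path': $!\n";

    # Zip local file header signature (used by .docx).
    return 'zip'  if index($magic, "PK\x03\x04") == 0;

    # OLE2 compound file signature (used by the binary .doc format).
    return 'ole2' if $magic eq "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";

    return 'unknown';
}
```

If this returns 'ole2', the zip-and-XML approach will not work and the file needs a different reader.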

    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
