Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Re: Parsing, searching, replacing content in MS Word documents

by t0mas (Priest)
on Apr 23, 2002 at 06:39 UTC ( #161224=note: print w/ replies, xml ) Need Help??


in reply to Parsing, searching, replacing content in MS Word documents

You could try something like this:

#!/usr/bin/perl -w # Uses use strict; use Win32::OLE; use Win32::OLE::Variant; use Win32::OLE::Const; use File::Find; # Variables use vars qw($MSWord $wd $startdir); # Where to start the doc search $startdir='c:\tmp'; # Create new MSWord object and load constants $MSWord=Win32::OLE->new('Word.Application','Quit') or die "Could not load MS Word"; $wd=Win32::OLE::Const->Load($MSWord); # Find documents find(\&rTxt,$startdir); sub rTxt { # We only want .doc files (no links...) return unless /\.doc$/ && -f && ! -l; # Open document my $doc = $MSWord->Documents->Open({FileName=>$File::Find::name}); # Exit nicely if we couldn't open doc return unless $doc; # Print some info print $File::Find::name,"\n"; # Content object my $content=$doc->Content; # Find object my $find=$content->Find; $find->ClearFormatting; $find->{Text}="Yoo"; $find->Replacement->ClearFormatting; $find->Replacement->{Text}="Hello"; $find->Execute({Replace=>$wd->{wdReplaceAll},Forward=>$wd->{True}} +); # Close document $doc->Close({SaveChanges=>$wd->{wdSaveChanges}}); }
It uses the built-in Find-And-Replace function that comes with MS Word. If you want to do any other more complex things (like s/^Yoo/Hello/i) you'll have to do stuff to the whole Content object.
The major problem with OLE Automation in my opinion is that it is painfully slow. If you have a large ammount of MS Word docs, this stunt will probably take ages.


/brother t0mas


Comment on Re: Parsing, searching, replacing content in MS Word documents
Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://161224]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (5)
As of 2015-07-04 22:09 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (60 votes), past polls