 
PerlMonks  

Using OLE to view given Paragraph in MS Word Document

by Ray Smith (Beadle)
on Nov 21, 2011 at 18:49 UTC (#939296)
Ray Smith has asked for the wisdom of the Perl Monks concerning the following question:

I currently successfully parse MS Word Documents, extracting paragraph style and text. However, I've been unsuccessful in displaying a Word Document, given a specified paragraph number.

My sample program demonstrates the problem. I can read through the Word document using $enumerate->Next() (I'd rather position directly, but Skip() doesn't seem to work), and although it appears that I reach the desired paragraph, the Word display does not move to show it.
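On positioning directly: a minimal sketch, assuming $doc is an already opened Word Document object, of fetching paragraph N through the collection's Item() method instead of stepping an enumerator. The helper name paragraph_info is hypothetical (it is not part of Win32::OLE), and it uses method syntax for the same properties the sample program reads with hash syntax.

```perl
use strict;
use warnings;

# Hypothetical helper (not from Win32::OLE): return the style name and text
# of the $n-th paragraph of an open Word Document object.
sub paragraph_info {
    my ($doc, $n) = @_;
    my $paragraphs = $doc->Paragraphs;

    # Word collections are 1-based: valid indexes are 1 .. Count
    return if $n < 1 || $n > $paragraphs->Count;

    my $paragraph = $paragraphs->Item($n);
    my $style     = $paragraph->Style->NameLocal;
    my $text      = $paragraph->Range->Text;
    $text =~ s/\r\z//;    # drop the trailing paragraph mark Word includes

    return ($style, $text);
}
```

With the sample program's variables this might be called as my ($style, $text) = paragraph_info($doc, $ParaNo);. It only reads the paragraph; it does not move the view.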

I see that Selection may be what I want, but I can't figure out how to make it work. I lack the VBA documentation, and I have not been successful in translating the samples I have seen into Perl / OLE calls.

Thanks for your attention.

#!/usr/bin/perl -w
# Simple case to open MS Word Document and view Nth paragraph
use strict;
use warnings;
use Win32::OLE;
use Win32::OLE::Enum;
use Cwd qw(getcwd abs_path);

my $ParaNo = 10;                        # Default target paragraph
my $InFile = shift if @ARGV > 0;        # Required file name
my $app_name = "Word.Application.8";    # Word's application name
my $app;
eval {$app = Win32::OLE->GetActiveObject($app_name)};   # Use instance if already running
die "Word ($app_name) is not installed" if $@;
if (!defined($app)) {
    $app = Win32::OLE->new($app_name, sub {$_[0]->Quit;})
        || die "Could not connect to $app_name $!";
}
$app->{'Visible'} = 1;
my $abspath = abs_path($InFile);        # Word appears to need absolute path
my $doc = $app->Documents()->Open({
    FileName => $abspath,
    ReadOnly => 0,
});
die "Can't open doc $abspath: $!" if !defined($doc);

my $paragraphs = $doc->Paragraphs();
my $enumerate = new Win32::OLE::Enum($paragraphs);
if (!defined($enumerate)) {
    die "Can't get enumerate for $InFile";
}
my $paragraph;
for (my $i = 0; $i<$ParaNo; $i++) {
    $paragraph = $enumerate->Next();
}
my $style = $paragraph->{Style}->{NameLocal};
my $text = $paragraph->{Range}->{Text};
print "style=$style text=$text\n";
print "Why doesn't the view show this location?\n";
print "ENTER to quit\n";
my $ans = <>;

Re: Using OLE to view given Paragraph in MS Word Document
by ricDeez (Scribe) on Nov 21, 2011 at 21:37 UTC

    I managed to get this to work with the changes shown below:

    #!/usr/bin/perl -w
    # Simple case to open MS Word Document and view Nth paragraph
    use strict;
    use warnings;
    use 5.012;
    use Win32::OLE;
    use Win32::OLE::Enum;
    use Cwd qw(getcwd abs_path);

    my $ParaNo = 10;    # Default target paragraph
    # my $InFile = shift if @ARGV > 0;    # Required file name

    #####################################################################
    # For the purposes of testing, I hard-coded the file name and path
    #####################################################################
    my $InFile = "C:/Users/Ric/Desktop/Report-WirelessSurvey.doc";

    #####################################################################
    # The following makes the code less portable, requiring $app_name to
    # be modified accordingly!
    #####################################################################
    # my $app_name = "Word.Application.8";    # Word's application name
    # my $app;

    #####################################################################
    # This approach will use the active instance or will open word if
    # required
    #####################################################################
    my $doc = Win32::OLE->GetObject ( $InFile )
        or die "Could not load $InFile. \n";
    my $app = $doc->{Application};
    $app->{Visible} = 1;

    #####################################################################
    # This is a good idea
    #####################################################################
    $app->{DisplayAlerts} = 0;

    # eval {$app = Win32::OLE->GetActiveObject($app_name)};   # Use instance if already running
    # die "Word ($app_name) is not installed" if $@;
    # if (!defined($app)) {
    #     $app = Win32::OLE->new($app_name, sub {$_[0]->Quit;})
    #         || die "Could not connect to $app_name $!";
    # }
    # $app->{'Visible'} = 1;
    # my $abspath = abs_path($InFile);    # Word appears to need absolute path
    # my $doc = $app->Documents()->Open({
    #     FileName => $abspath,
    #     ReadOnly => 0,
    # });
    # die "Can't open doc $abspath: $!" if !defined($doc);

    #####################################################################
    # Why are you using Win32::OLE::Enum?
    #####################################################################
    my $paragraphs = $doc->Paragraphs();
    # my $enumerate = new Win32::OLE::Enum($paragraphs);
    # if (!defined($enumerate)) {
    #     die "Can't get enumerate for $InFile";
    # }
    my $paragraph;
    # for (my $i = 0; $i<$ParaNo; $i++) {
    for my $i ( 1 .. $paragraphs->Count()){
        last if $i > $ParaNo;    #Forgot that you wanted to stop here!
        $paragraph = $paragraphs->Item( $i );
        ##################################################################
        # This bit needs to be in the loop!
        ##################################################################
        my $style = $paragraph->{Style}->{NameLocal};
        my $text = $paragraph->{Range}->{Text};
        print "style=$style text=$text\n";
        # print "Why doesn't the view show this location?\n";
        # print "ENTER to quit\n";
        # my $ans = <>;
        # $paragraph = $enumerate->Next();
    }

    Try these changes and let me know how you go. This may still trip up on Unicode characters!

      Thanks for the example.

      I tried it with these changes:
      1. Used my own test file.
      2. Changed the required version to Perl 5.10, because that's what I have.
      3. Ran the input file name through abs_path(), because Word appears to require an absolute path.
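On the absolute path point: Word runs as a separate process, so a relative name is presumably resolved against Word's own current directory rather than the script's. A minimal sketch of resolving the name on the Perl side first, using File::Spec->rel2abs as an alternative to abs_path (it does not require the file to exist). The helper name absolute_doc_path is hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: turn a possibly relative file name into an absolute
# one, resolved against the script's current working directory.
sub absolute_doc_path {
    my ($file) = @_;
    die "No file name given\n" unless defined $file && length $file;

    # rel2abs() works on the path string only; it does not check that the
    # file exists, so test with -e separately if that matters.
    return File::Spec->rel2abs($file);
}
```

The result could then be passed to Win32::OLE->GetObject() or to Documents->Open() in place of the raw command-line argument.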

      Things run without error, but my Windows display still leaves the cursor at the beginning of the file.

      Am I missing something here?

        I don't really understand what you want to do!

        If you need to view the paragraphs being selected you could add the following:

        for my $i ( 1 .. $paragraphs->Count()){
            last if $i > $ParaNo;
            $paragraph = $paragraphs->Item( $i );
            $paragraph->{Range}->Select();    # <<<<<Added
            sleep(1);                         # <<<<<Added
            my $style = $paragraph->{Style}->{NameLocal};
            my $text = $paragraph->{Range}->{Text};
            print "style=$style text=$text\n";
        }

        I have placed the sleep in the loop so that you can see the paragraphs being selected in turn; otherwise it would happen too quickly to see, especially since you are only interested in the first 10 paragraphs!
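If the goal is only to land on paragraph N, the loop and the sleep are probably not needed. A minimal sketch, assuming $doc is an open Word Document object in a visible Word instance, that selects the target paragraph directly. The helper name show_paragraph is hypothetical, and the call to the document's Activate method is an addition that the code above does not make.

```perl
use strict;
use warnings;

# Hypothetical helper: select the $n-th paragraph so that Word moves the
# cursor (and with it the view) there. Returns 1 on success, 0 if $n is
# out of range.
sub show_paragraph {
    my ($doc, $n) = @_;
    my $paragraphs = $doc->Paragraphs;

    # Word collections are 1-based: valid indexes are 1 .. Count
    return 0 if $n < 1 || $n > $paragraphs->Count;

    # Assumption: make this the active document first, in case the Word
    # instance has other documents open.
    $doc->Activate;

    # VBA equivalent: ActiveDocument.Paragraphs(n).Range.Select
    $paragraphs->Item($n)->Range->Select;
    return 1;
}
```

It might be called as show_paragraph($doc, $ParaNo) or warn "No paragraph $ParaNo\n"; once the document is open.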

