PerlMonks  

win 32 OLE Selection property

by wakatana (Novice)
on Aug 22, 2012 at 00:38 UTC (#988909)
wakatana has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I am trying to write a simple Perl script using Win32::OLE that iterates over all paragraphs of an MS Word document (a paragraph being any text that ends with a hard return) and prints only those paragraphs that match a specified condition. The problem is that I need to access the font size property. It seems to me that this property is set only once, for the first paragraph, and is not updated afterwards. Please see the code with comments:
#!/usr/bin/perl
use strict;
use warnings;
use Win32::OLE::Const 'Microsoft Word';
#$Win32::OLE::CP = CP_UTF8;
binmode STDOUT, 'encoding(utf8)';

# OPEN FILE SPECIFIED AS FIRST ARGUMENT
my $fname=$ARGV[0];
my $fnameFullPath = `cygpath.exe -wa $fname`;
$fnameFullPath =~ s/\\/\\\\/g;
$fnameFullPath =~ s/\s*$//;
unless (-e $fnameFullPath) { print "Error: File did not exists\n"; exit 1;}

my $Word = Win32::OLE->GetActiveObject('Word.Application')
    || Win32::OLE->new('Word.Application','Quit')
    or die Win32::OLE->LastError();
$Word->{'Visible'} = 0;
my $doc = $Word->Documents->Open($fnameFullPath);
my $paragraphs = $doc->Paragraphs() ;
my $enumerate = new Win32::OLE::Enum($paragraphs);

while(defined(my $paragraph = $enumerate->Next())) {
    my $text = $paragraph->{Range}->{Text};
    my $sel = $Word->Selection;
    my $font = $sel->Font;
    # THIS DOES NOT WORK CORRECTLY BECAUSE $Word->Selection IS SET ONLY ONCE (FOR FIRST PARAGRAPH)
    # IS THERE ANY METHOD HOW TO FORCE Selection TO POINT WHERE ACTUAL $paragraph IS POINTING ?
    if ($font->{Size} == 24){
        print "Text: ", $text, "\n";
        print "Font Bold: ", $font->{Bold}, "\n";
        print "Font Italic: ", $font->{Italic}, "\n";
        print "Font Name: ", $font->{Name}, "\n";
        print "Font Size: ", $font->{Size}, "\n"; # Size('9pt');
        print "=========\n";
    }
}
$Word->ActiveDocument->Close ;
$Word->Quit;
Thank you for any ideas.

Re: win 32 OLE Selection property
by Ratazong (Prior) on Aug 22, 2012 at 05:43 UTC

    Hi

    It seems to me that this property is set only once in first paragraph and later is not updated.
    If that assumption is correct, you might want to store the font information in a variable outside the loop and fall back to it inside the loop whenever it is not redefined there. Something like:
    my $oldFont; # variable to hold the current font
    while(defined(my $paragraph = $enumerate->Next())) {
        my $text = $paragraph->{Range}->{Text};
        my $sel = $Word->Selection;
        my $font = $sel->Font;
        if (!defined($font)) { $font = $oldFont; }  # use the old font instead
        else { $oldFont = $font; }                  # use this font for future paragraphs
        # ... rest of the loop body unchanged ...
    }
    HTH, Rata

      Hello Rata, thank you for your answer, but it did not help me much :( It seems pretty similar to what I did (or did not do). I tried this:
      #!/usr/bin/perl
      use strict;
      use warnings;
      use Win32::OLE::Const 'Microsoft Word';
      #$Win32::OLE::CP = CP_UTF8;
      binmode STDOUT, 'encoding(utf8)';

      # OPEN FILE SPECIFIED AS FIRST ARGUMENT
      my $fname=$ARGV[0];
      my $fnameFullPath = `cygpath.exe -wa $fname`;
      $fnameFullPath =~ s/\\/\\\\/g;
      $fnameFullPath =~ s/\s*$//;
      unless (-e $fnameFullPath) { print "Error: File did not exists\n"; exit 1;}

      my $Word = Win32::OLE->GetActiveObject('Word.Application')
          || Win32::OLE->new('Word.Application','Quit')
          or die Win32::OLE->LastError();
      $Word->{'Visible'} = 0;
      my $doc = $Word->Documents->Open($fnameFullPath);
      my $paragraphs = $doc->Paragraphs() ;
      my $enumerate = new Win32::OLE::Enum($paragraphs);

      my $oldFont = $Word->Selection->Font; # added line
      while(defined(my $paragraph = $enumerate->Next())) {
          my $text = $paragraph->{Range}->{Text};
          my $sel = $Word->Selection;
          my $font = $sel->Font;
          if (!defined($font)) { $font = $oldFont; }  # use the old font instead
          else { $oldFont = $font; }                  # use this font for future paragraphs
          if ($font->{Size} == 18){
              print "Text: ", $text, "\n";
              print "Font Bold: ", $font->{Bold}, "\n";
              print "Font Italic: ", $font->{Italic}, "\n";
              print "Font Name: ", $font->{Name}, "\n";
              print "Font Size: ", $font->{Size}, "\n";
              print "=========\n";
          }
      }
      $Word->ActiveDocument->Close ;
      $Word->Quit;
      Here is the output:
      Text: This is a doc file containing different fonts and size, document also contain header and footer (Font: TNR, Size: 18)
      Font Bold: 0
      Font Italic: 0
      Font Name: Times New Roman
      Font Size: 18
      =========
      Text: This is a Perl example (Font TNR, Size: 12)
      Font Bold: 0
      Font Italic: 0
      Font Name: Times New Roman
      Font Size: 18
      =========
      Text: This is a Python example(Font: Courier New, Size: 10)
      Font Bold: 0
      Font Italic: 0
      Font Name: Times New Roman
      Font Size: 18
      =========
      As you can see in the output, the font size is 18 everywhere, even though the original document uses different sizes (the font name is not updated either). This leads me to assume that $font is set only once, for the first paragraph that is processed. Thus the following condition
      if ($font->{Size} == 18)
      is in effect only evaluated against the first processed paragraph. This is also supported by the fact that if I change the condition to the following (to match the 2nd paragraph):
      if ($font->{Size} == 12)
      the output is empty. The first paragraph is size 18, not 12, so the condition is false, and since $font is never updated it will never become true. What am I doing wrong? Many thanks
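
The behaviour described above is consistent with `$Word->Selection` never moving: enumerating paragraphs does not change the selection, so `$sel->Font` keeps describing wherever the cursor happens to be. A minimal sketch of one way around this, assuming the Word object model's `Paragraph.Range.Font` property, is to read the font from each paragraph's own Range instead. The helper name `paragraphs_with_size` is hypothetical, and the Word part has not been checked against any particular Word version.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the paragraphs whose own Range reports the
# wanted font size. It only uses hash-style property access
# ({Range}{Font}{Size}), which Win32::OLE objects support for properties.
sub paragraphs_with_size {
    my ($wanted_size, @paragraphs) = @_;
    my @hits;
    for my $paragraph (@paragraphs) {
        my $range = $paragraph->{Range};
        my $font  = $range->{Font};    # font of THIS paragraph, not of Selection
        my $size  = $font->{Size};
        next unless defined $size && $size == $wanted_size;
        push @hits, {
            text => $range->{Text},
            name => $font->{Name},
            size => $size,
        };
    }
    return @hits;
}

# Only talk to Word when a document path is given on the command line
# (assumes Windows with Word installed and a full Windows path).
if (@ARGV) {
    require Win32::OLE;
    my $word = Win32::OLE->GetActiveObject('Word.Application')
        || Win32::OLE->new('Word.Application', 'Quit')
        or die Win32::OLE->LastError();
    my $doc = $word->Documents->Open($ARGV[0])
        or die Win32::OLE->LastError();

    # Win32::OLE::in() flattens the Paragraphs collection into a Perl list.
    for my $hit (paragraphs_with_size(12, Win32::OLE::in($doc->Paragraphs))) {
        print "Text: $hit->{text}\n";
        print "Font Name: $hit->{name}\n";
        print "Font Size: $hit->{size}\n";
        print "=========\n";
    }
    $doc->Close(0);    # 0 = wdDoNotSaveChanges
}
```

If the Selection really has to follow the paragraph, calling `$paragraph->Range->Select` before reading `$Word->Selection` should have a similar effect, at the cost of moving the cursor for every paragraph. Note that a paragraph containing several font sizes will not report a single meaningful Size value, so mixed paragraphs may need to be examined run by run.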
Re: win 32 OLE Selection property
by rpnoble419 (Pilgrim) on Aug 22, 2012 at 13:41 UTC
    Are you working with MS Word 2007 or 2010 docx files? If so, you can open the file (it's a Zip file) and process the /word/document.xml file. The paragraph start tag looks like this: <w:p w:rsidR="00E56F3D" w:rsidRDefault="00E56F3D" w:rsidP="00E56F3D"> and the end tag like this: </w:p>. You will need to combine the sub-tags to build a whole paragraph, as the text is not contiguous (it's broken up by formatting tags). The actual paragraph text is contained in <w:t>...</w:t> tags.
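
The approach above can be sketched roughly as follows. This is an illustration under stated assumptions, not a complete docx reader: the helper name `paragraphs_from_document_xml` is hypothetical, the parsing uses simple regular expressions where a real script would use an XML parser, and it assumes paragraphs are not nested (for example inside text boxes). Run sizes are read from `<w:sz w:val="N"/>`, which stores the size in half-points; sizes inherited from styles do not appear there and are not resolved.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Uncompress::Unzip qw(unzip $UnzipError);    # core module

# Hypothetical helper: split document.xml into paragraphs, join the text of
# all <w:t> runs, and collect any explicit run sizes (converted to points).
sub paragraphs_from_document_xml {
    my ($xml) = @_;
    my @paragraphs;
    # Match <w:p> or <w:p ...> but not <w:pPr> and not self-closing <w:p/>.
    while ($xml =~ m{<w:p(?:\s[^>]*[^/>])?>(.*?)</w:p>}sg) {
        my $body  = $1;
        my $text  = join '', $body =~ m{<w:t\b[^>]*>([^<]*)</w:t>}g;
        my @sizes = map { $_ / 2 } $body =~ m{<w:sz\s+w:val="(\d+)"\s*/>}g;
        push @paragraphs, { text => $text, sizes => \@sizes };
    }
    return @paragraphs;
}

# Only read a file when one is given on the command line.
if (@ARGV) {
    my $xml;
    unzip($ARGV[0] => \$xml, Name => 'word/document.xml')
        or die "Cannot read word/document.xml: $UnzipError\n";
    for my $p (paragraphs_from_document_xml($xml)) {
        next unless grep { $_ == 18 } @{ $p->{sizes} };
        print "Text: $p->{text}\n=========\n";
    }
}
```

The extracted text is still XML-escaped (for example `&amp;`), so it would need unescaping before printing or comparing.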
