Scenario:
The following text belongs to a .doc file
File: check1.asm
Function: Monks
Tag: No
Tag: 001
Tag: Yes
Tag: 002
File: check2.asm
Function: Perl Monks
Tag: Yes
Tag: 003
Tag: No
Tag: 004
File: check3.asm
Function: Experts
Tag: No
Tag: 005
Tag: No
Tag: 006
Function: Perl Experts
Tag: No
Tag: 007
Tag: Yes
Tag: 008
I have to extract the tags that are marked Yes, together with the corresponding function and file name, into an Excel sheet.
The output has to look like this:
Tags  Function      File
002   Monks         check1.asm
003   Perl Monks    check2.asm
008   Perl Experts  check3.asm
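For plain lines of text, the extraction described above might be sketched roughly as follows. This is only a sketch under assumptions: it assumes the paragraphs have already been pulled out of the .doc file into a list of strings, that every line has the "Label: value" shape shown in the scenario, and that the number of a tag sits on the line right after its Yes/No line. The name parse_tags and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: walks the lines in order, remembers the most
# recent File and Function, and collects the tag number that follows
# each "Tag: Yes" line. Returns a list of [tag, function, file] rows.
sub parse_tags {
    my @lines = @_;
    my ($file, $function, $want_next) = ('', '', 0);
    my @rows;
    for my $line (@lines) {
        if ($line =~ /^\s*File\s*:\s*(.+?)\s*$/) {
            $file      = $1;    # a new file starts here
            $want_next = 0;
        }
        elsif ($line =~ /^\s*Function\s*:\s*(.+?)\s*$/) {
            $function  = $1;    # a new function within the current file
            $want_next = 0;
        }
        elsif ($line =~ /^\s*Tag\s*:\s*(Yes|No)\s*$/i) {
            # Only the number after a Yes line is of interest
            $want_next = lc($1) eq 'yes';
        }
        elsif ($line =~ /^\s*Tag\s*:\s*(\S+)\s*$/) {
            push @rows, [ $1, $function, $file ] if $want_next;
            $want_next = 0;
        }
    }
    return @rows;
}

# Assumed sample input, copied from the first file in the scenario
my @sample = (
    'File: check1.asm', 'Function: Monks',
    'Tag: No',  'Tag: 001',
    'Tag: Yes', 'Tag: 002',
);

print join("\t", qw(Tags Function File)), "\n";
print join("\t", @$_), "\n" for parse_tags(@sample);
```

Because only the captured value is stored, the label in front of the number never reaches the output. Each row is printed tab-separated here; writing the same rows into Excel cells instead is a separate step.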
I have written the following snippet to extract the tags that are marked Yes:
use strict;
use warnings;
use Win32::OLE;
use Win32::OLE qw(in with);
use Win32::OLE::Variant;
use Win32::OLE::Const 'Microsoft Excel';
use Win32::OLE::Const 'Microsoft Word';
use Cwd;
use File::Find;
use Win32::OLE::Enum;
$Win32::OLE::Warn = 3; # die on errors...
my $out_file = 'check.xls';
open my $out_fh, '>', $out_file or die "Could not open file $out_file: $!";
my $print_next = 0;
#Globals
our $Word;
our $reviewchklists;
my @scriptfiles;
@scriptfiles=glob('*.doc');
foreach my $file (@scriptfiles)
{
my $var;
my $filename = "D\:\\";
$var = $filename."$file";
print $var ;
my $document = Win32::OLE -> GetObject("$var");
print "Extracting Text ...\n";
my @array;
my $paragraphs = $document->Paragraphs();
my $enumerate = new Win32::OLE::Enum($paragraphs);
while(my $paragraph = $enumerate->Next())
{
my $text = $paragraph->{Range}->{Text};
$text =~ s/[\n\r\t]//g;
$text =~ s/\x0B/\n/g;
$text =~ s/\x07//g;
chomp $text;
my $Data .= $text;
@array=split(/\.$/,$Data);
foreach my $line( @array)
{
if ($print_next)
{
print $out_fh $line."\n" ; # we add a "\n" ; #No need to chomp - we print the "\n"
local $\ = "<br>\n";
local $/="\n\n";
}
$print_next = ($line =~ /^Tag\sYes/);
}
}
}
The above snippet prints the following output:
ID : 002
ID : 003
ID : 008
I don't want the "ID :" label to be printed. Also, how can I extract the corresponding function and file name?
Help out monks!!!