http://www.perlmonks.org?node_id=1004237

PerlStart has asked for the wisdom of the Perl Monks concerning the following question:

Hi All, I am trying to compare an Excel file with a Word document. I have a few keywords in an Excel sheet, and when one of them matches in the Word document, I should print the whole sentence containing the keyword to another output file, also an Excel sheet. As of now I am only able to print the matched keyword to the output file. I need help extracting the sentence. Please see the code I am using below.

#!/usr/bin/perl -w
use strict;
use warnings;
use Spreadsheet::ParseExcel;
use Spreadsheet::WriteExcel;

my $FileName = "/Users/Reku/Documents/nfrFinal.xls";
my $parser = Spreadsheet::ParseExcel->new();

# Create a new Excel file
my $FileName1 = "/Users/Reku/Documents/nfrOutput2.xls";
my $workbook1 = Spreadsheet::WriteExcel->new($FileName1);

# Add a worksheet
my $worksheet1 = $workbook1->add_worksheet("NFR");
$worksheet1->write(0,0,"NFR");
$worksheet1->write(0,1,"Keyword");
$worksheet1->write(0,2,"Occurance");

my $workbook = $parser->parse($FileName);
my $srs = '/Users/Reku/Documents/SRSSample.doc';
die $parser->error(), ".\n" if ( !defined $workbook );

my $col1=0;
my $flag=0;
my $attribute=0;
my $cell2=0;

# Following block is used to Iterate through all worksheets
# in the workbook and print the worksheet content
for my $worksheet ( $workbook->worksheets() ) {
    # Find out the worksheet ranges
    my ( $row_min, $row_max ) = $worksheet->row_range();
    my ( $col_min, $col_max ) = $worksheet->col_range();
    my $col=0;
    for my $col ( $col_min..$row_max ) {
        for my $row ( $row_min .. $row_max ) {
            my $count=1;
            # Return the cell object at $row and $col
            my $cell = $worksheet->get_cell( $row, $col );
            $cell2 = $worksheet->get_cell(0,$col);
            $attribute = $cell2->value();
            $worksheet1->write(0,$col,$attribute);
            next unless $cell;
            my $value =$cell->value();
            if($value eq "") { }
            else {
                open('FILE',$srs) or die $1;
                while(<FILE>) {
                    my $String = $_;
                    my @temp=split(' ',$String);
                    foreach my $loop(@temp) {
                        $count++ if($loop =~m/$value/i)
                    }
                }
                $col1++;
                my $match=$cell2->value();
                if($count>1) {
                    print "\n$value occured $count time(s) in ur srs,so u should consider $match\n";
                    #if($match eq $attribute){
                    $worksheet1->write($count,$col,$value);
                    #}
                    #$worksheet1->write($col1,1,$value);
                    #$worksheet1->write($col1,2,$count);
                    #$col1++;
                }
                close(FILE);
            }
        }
    }
}

Re: Word and Excel
by Athanasius (Archbishop) on Nov 17, 2012 at 02:20 UTC

    Hello PerlStart, and welcome to the Monastery!

    open('FILE',$srs) or die $1; while(<FILE>)

    Let’s assume for the moment that $srs is a text file. Here are some things to consider:

    • 'FILE' is a string, but FILE is a typeglob, which is a different entity. Remove the quotes from around FILE in the open statement.
    • $1 contains the first capture from the last successful pattern match. I think you meant $!, which contains the last system error.
    • It’s better practice to use lexical filehandles and the 3-argument form of open:
      open(my $FILE, '<', $srs) or die $!;
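    Putting those three points together, the counting loop might look like the sketch below, which assumes the input really is a plain-text file. The helper name count_matches is hypothetical, the count starts at 0 rather than 1 as in the original code, and \Q...\E is added so that a keyword containing regex metacharacters is matched literally.

```perl
use strict;
use warnings;

# Count how many whitespace-separated words in a plain-text file contain
# $keyword (case-insensitive). count_matches is a hypothetical helper name.
sub count_matches {
    my ($path, $keyword) = @_;

    # Lexical filehandle, 3-argument open, and $! for the system error.
    open(my $fh, '<', $path) or die "Cannot open '$path': $!";

    my $count = 0;
    while (my $line = <$fh>) {
        for my $word (split ' ', $line) {
            # \Q...\E quotes any regex metacharacters in the keyword.
            $count++ if $word =~ /\Q$keyword\E/i;
        }
    }

    close($fh) or die "Cannot close '$path': $!";
    return $count;
}
```

    A call such as count_matches($srs, $value) could then replace the open/while/close sequence inside the cell loop.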

    But $srs is not a text file, it’s a Word file, so you can’t read it this way at all! You need a suitable CPAN module. Not my area of expertise, but a quick search turns up Text::Extract::Word, which looks promising.
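    Once the document text is available as a single string, pulling out the sentences that contain a keyword could be sketched as follows. This is only an illustration: sentences_with is a hypothetical helper, the sample text is made up, and the splitter is naive (it breaks after '.', '!' or '?' followed by whitespace, so abbreviations such as "e.g." will confuse it). The commented-out call to Text::Extract::Word is an assumption taken from that module's synopsis and is untested here.

```perl
use strict;
use warnings;

# Return every sentence in $text that contains $keyword (case-insensitive).
# sentences_with is a hypothetical helper; the sentence splitter is naive.
sub sentences_with {
    my ($text, $keyword) = @_;

    # Collapse line breaks so sentences that span lines stay whole.
    (my $flat = $text) =~ s/\s+/ /g;

    # Split after '.', '!' or '?' when followed by whitespace.
    my @sentences = split /(?<=[.!?])\s+/, $flat;

    return grep { /\Q$keyword\E/i } @sentences;
}

# Assumption: in the real script the text would come from the Word file, e.g.
#   my $text = Text::Extract::Word->new($srs)->get_text();
my $text = "The system shall respond within 2 seconds. "
         . "Backups run nightly.\nResponse time is logged for each\nrequest.";

print "$_\n" for sentences_with($text, 'respon');
```

    Each returned sentence could then be written to the output sheet with $worksheet1->write(...) in place of the bare keyword.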

    Athanasius <°(((>< contra mundum