Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid

compare two Excel spreadsheets

by juergenkemeter (Novice)
on Jul 14, 2005 at 02:34 UTC ( #474723=perlquestion: print w/replies, xml ) Need Help??
juergenkemeter has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am a beginner in Perl, just recently installed ActivePerl 5.8.7 MSI on Windows XP with MS Office 2000. I have two Excel *.xls spreadsheets. In one of them general data is written in column 'A', in the second spreadsheet I have the several search strings written in column 'A'. How do I compare these two files so that the output would tell me which of the search strings can be found in the general data column of the first spreadsheet? Thanks, Jurgen

Replies are listed 'Best First'.
Re: compare two Excel spreadsheets
by monarch (Priest) on Jul 14, 2005 at 03:29 UTC
    Use Spreadsheet::ParseExcel.

    Start by reading in the contents of your first spreadsheet. Let's assume you're using worksheet titled "gendata".

    use strict; use Spreadsheet::ParseExcel; my $oBook = Spreadsheet::ParseExcel::Workbook->Parse('first.xls'); if ( ( ! $oBook ) || ( ! defined( $oBook->{Worksheet} ) ) ) { die( "Cannot parse first spreadsheet" ); } my $sheetone = $oBook->Worksheet('gendata') || die( "No such worksheet gendata" );
    Then read in the contents of the second spreadsheet (let's assume you're using worksheet titled "gensearch"):
    my $oBook2nd=Spreadsheet::ParseExcel::Workbook->Parse('second.xls'); if ( ( ! $oBook2nd ) || ( ! defined( $oBook2nd->{Worksheet} ) ) ) { die( "Cannot parse second spreadsheet" ); } my $sheettwo = $oBook->Worksheet('gensearch') || die( "No such worksheet gensearch" );

    Next let's assume that you want to search all the strings in the first 10 rows of column A in workbook two against the same column/row in workbook one.

    for ( my $row = 0; $row < 10; $row++ ) { my $search = qr/$sheettwo->{Cells}[$row][0]->{Val}/; my $data = $sheetone->{Cells}[$row][0]->{Val}; if ( $data =~ m/$search/ ) { printf( "Column A Row %d matches\n", ( $row + 1 ) ); } }
Re: compare two Excel spreadsheets
by davidrw (Prior) on Jul 14, 2005 at 03:26 UTC
    does it have to be in perl? you could use excel's vlookup function (note you'll need to sort the general data, and pay attention to the "fuzzy match" flag)

    anyways, to do it in perl, first check out the Index of Spreadsheet FAQs node. then the general approach will be to read the general data in and probably create a hash with the strings as keys. Then you can read the search strings column, and check if each row exists in the general data hash.
Re: compare two Excel spreadsheets
by tlm (Prior) on Jul 14, 2005 at 03:34 UTC

    You may want to look into Spreadsheet::ParseExcel. It'd go something like this (this is very untested; it's meant only to give you a very rough idea of what to do):

    use Spreadsheet::ParseExcel; my $a_wks = Spreadsheet::ParseExcel->new->Parse('a.xls')->{ Worksheet +}[ 0 ]; my $b_wks = Spreadsheet::ParseExcel->new->Parse('b.xls')->{ Worksheet +}[ 0 ]; my @re = map { my $re = $b_wks->Cell( $_, $b_wks->{ MinCol } )->Val; qr/\Q$re +/ } $b_wks->{ MinRow }..$b_wks->{ MaxRow }; for my $i ( $a_wks->{ MinRow } .. $a_wks{ MaxRow } ) { my $string = $a_wks->Cell( $i, $a_wks->{ MinCol } )->Val; for my $re ( @re ) { if ( $string =~ /$re/ ) { do_the_i_found_it_dance(); last; } } }
    You may also find this article useful.

    the lowliest monk

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://474723]
Approved by Tanktalus
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (11)
As of 2017-02-28 13:14 GMT
Find Nodes?
    Voting Booth?
    Before electricity was invented, what was the Electric Eel called?

    Results (400 votes). Check out past polls.