Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot
 
PerlMonks  

excelPerl V0.08

by strat (Canon)
on Jan 11, 2007 at 20:41 UTC ( #594230=CUFP: print w/ replies, xml ) Need Help??

Update: V0.13 released which contains some bugfixes; please scroll down for download the source code

Update: V0.12 is the latest version, with added args: -begin and -end. Download from http://www.fabiani.net/ -> Perl -> Downloads -> excelPerl

Update: V0.12 is the latest version, please scroll down to see it

Have you already heard about XLSperl? That's a great tool to parse excel files as if you parsed plain text files with perl -ane.

XLSperl's disadvantage is that it can only read excel files and not modify them. So I decided to write a windows-only solution with Win32::OLE which is able to modify an excel sheet.

Here is V0.08a, as promised in returning tied array (@nobull: thank you for your help):

File: excelPerl.pl

#! /usr/bin/perl use warnings; use strict; use Win32; # automatically loaded with Activestate Perl use File::Basename; # for PrintUsage use Getopt::Long; # for parsing program arguments use FindBin; use lib $FindBin::Bin; use ExcelPerl; $|++; # no suffering from buffering my( $file, $perlCode ); my $worksheetNumber = 1; # use first worksheet as default my $headline = 0; # don't skip a headline as default my $visible = 0; # is of no use in this version my $colSeparator = "\t"; # standard col separator for $_ GetOptions( 'file=s' => \$file, 'sheet=i' => \$worksheetNumber, 'headline=i' => \$headline, 'ane=s' => \$perlCode, # 'visible' => \$visible, 'colseparator' => \$colSeparator, ) or die &PrintUsage(); defined $file or die &PrintUsage; $file = Win32::GetFullPathName( $file ); my $xlsPerl = ExcelPerl->new( excelFile => $file, worksheetNumber => $worksheetNumber, ) ->open; #if( $visible ) { $xlsPerl->excel->{Visible} = 1; #} # if # skip all lines until -headline if available $xlsPerl->currentRow( $headline ) if $headline; my @F; while( $xlsPerl->getNextRow( \@F ) ) { # prepare some other helpful variables local $. = $xlsPerl->currentRow(); local $_ = join($colSeparator, @F ) . "\n"; # try to eval the perl code eval $perlCode; die $@ if $@; @F = (); } # while # ------------------------------------------------------------ sub PrintUsage { my $bin = File::Basename::basename($0); die <<EOH; $bin -sheet 1 -headline 1 -ane "print qq~\$.: \$F[0], \$F[1]\\n~" -fil +e excelFile try perldoc $bin for more details EOH } # PrintUsage # ------------------------------------------------------------ =pod =head1 excelPerl You know about the perl parameters -ane? This script is a way to try to do the same thing to Excel files. =head1 Prerequisites =over 1 =item Win32::OLE =item Excel needs to be installed =back =head1 Description Like perl -ane with a plaintext file, excelPerl.pl loops over an excel file and automatically (-a) splits up the columns of one row into an array with the name @F. Since @F is a tied array, by changing one element of @F you change the content of the excel cells. If you need one whole line with a tailing \n, you can use the variable $_ (like perl -ne) which joins @F by the value given in the parameter -colseparator (default: \t) If you need to know the current line number (like $. for while), you can use the special variable $. (big surprise). =head1 Params: -file String: Name of excel file (full path!) -sheetnumber Integer: which sheet shell I read (1..n, default: 1 +) -headline Integer: line number of headline (used for skipping + headline) -ane String: Perl-Code to execute -colseparator String: how to join columns in $_ (default: \t) Abbreviations of params are allowed, as long as they are unique, e.g excelPerl.pl -s 2 -h 1 -ane "print qq~$.: $F[0] $F[5]\n" -f c:\test. +xls =head1 Special Variables: $. Row number @F Array containing the values of one complete row $_ Row joined by -colseparator, with \n at the end =head1 Bugs/... This code is very, very, very experimental! If you do changes, you better save the workbook manually, that's the reason why the parameter -visible always is on. =head1 SEE ALSO I heard about XLSperl from John Allen which is a great tool. But it only allows to read excel files and not to change them on the fly. Since I often need this feature, I decided to write it with Win32::OLE, and it was not difficult. But unlike XLSperl, excelPerl only runs under Windows and needs an installed version of Excel. You can find XLSPerl at L<http://perl.jonallen.info/projects/xlstools> =head1 Author Martin Fabiani L<http://www.fabiani.net/> 2007 =cut

File: ExcelPerl.pm

package ExcelPerl; use warnings; use strict; use Carp qw(croak); our $VERSION = 0.08; # use Readonly; # better not, since Readonly is no standard module yet use Win32::OLE; use Win32::OLE::Const 'Microsoft Excel'; use Win32::OLE::Variant; Win32::OLE->Option( Warn => 3); use vars qw( @ObjInterfaceMethods @F ); # which object interface methods shell be available as standard #Readonly::Array: Readonly is not yet a standard module :-( @ObjInterfaceMethods = qw( excelFile excel workBook worksheet worksheetNumber currentRow lastRow changeCount ); # install object interface methods at startup foreach my $method ( @ObjInterfaceMethods ) { no strict 'refs'; # Sub::Install is no standard module :-( *{ $method } = sub { my( $self, @values ) = @_; $self->{ $method } = $values[0] if scalar @values; return $self->{ $method }; }; } # foreach # ============================================================ sub new { my( $class, %params ) = @_; my $self = bless {}, $class; foreach my $param (keys %params) { $self->$param( $params{$param} ); } # foreach return $self; } # new # ------------------------------------------------------------ sub DESTROY { # my $self = shift; # print "DESTROY: ", ref $self->excel, "\n"; } # DESTROY # ------------------------------------------------------------ sub increaseCurrentRow { my $self = shift; $self->currentRow( 1 + $self->currentRow ); } # increaseCurrentRow # ------------------------------------------------------------ sub increaseChangeCount { my $self = shift; no warnings; $self->changeCount( 1 + $self->changeCount ); } # increaseChangeCount # ------------------------------------------------------------ sub open { my( $self ) = @_; my $filename = $self->excelFile; defined $filename or croak "Error: no filename given"; -f $filename or croak( "Error: filename '$filename' doesn't exist" +); my $excel; # try to re-use running instance of Excel eval { $excel = Win32::OLE->GetActiveObject( 'Excel.Application' ) + }; die "Error: no Excel installed\n" if $@; unless( defined $excel ) { # if not running, start excel $excel = Win32::OLE->new( 'Excel.Application', sub { $_[0]->Qu +it } ) or die "Error: can't start Excel\n"; } # unless $self->excel( $excel ); $self->workBook( $excel->Workbooks->Open( $filename ) ); $self->worksheet( $self->workBook->Worksheets( $self->worksheetNum +ber ) ); my $range = $self->worksheet->UsedRange->{Value}; $self->lastRow( $#$range + 1 ); $self->currentRow( 0 ); return $self; } # open # ------------------------------------------------------------ sub getNextRow { my( $self, $F ) = @_; $self->increaseCurrentRow; my $row = $self->currentRow; my $sheet = $self->worksheet; return if $row > $self->lastRow; my $rowData = $sheet->Range("A$row:IV$row")->{Value}; # dirty my $itemsFound = 0; tie( @$F, 'ExcelPerl::RowArray', $self ); foreach my $value ( reverse @{ $rowData->[0] } ) { if( defined $value ) { unshift( @$F, $value ); $itemsFound++; +} else { unshift( @$F, '' ) if scalar @$F; } } # foreach return 1; } # getNextRow # ------------------------------------------------------------ # ============================================================ package ExcelPerl::RowArray; # ------------------------------------------------------------ use Tie::Array; use vars qw( @ISA ); @ISA = qw( Tie::Array ); # ------------------------------------------------------------ sub TIEARRAY { my( $class, $excelObj ) = @_; my $self = bless( { data => [] }, $class ); $self->{excelObj} = $excelObj; $self->{row} = $excelObj->currentRow; return $self; } # TIEARRAY # ------------------------------------------------------------ sub FETCH { my( $self, $index) = @_; return $self->{data}->[ $index ]; } # FETCH # ------------------------------------------------------------ sub STORE { my( $self, $index, $value ) = @_; my $excelObj = $self->{excelObj}; my $sheet = $excelObj->worksheet; my $row = $excelObj->currentRow; my $col = $index + 1; if( $self->{data}->[$index] ne $value ) { $excelObj->increaseChangeCount; $sheet ->Cells($row, $col)->{'Value' } = $value; $self ->{'data'} ->[ $index ] = $value; } # if } # STORE # ------------------------------------------------------------ sub FETCHSIZE { my $self = shift; return $#{ $self->{data} }; } # FETCHSIZE # ------------------------------------------------------------ sub STORESIZE { my( $self, $newLength ) = @_; $#{ $self->{data} } = $newLength; } # STORESIZE # ------------------------------------------------------------ sub UNSHIFT { my $self = shift; unshift( @{ $self->{data} }, @_ ); } # UNSHIFT # ------------------------------------------------------------ sub SHIFT { my $self = shift; return shift @{ $self->{data} }; } # SHIFT # ------------------------------------------------------------ sub POP { my $self = shift; return pop @{ $self->{data} }; } # POP # ------------------------------------------------------------ sub PUSH { my $self = shift; push( @{ $self->{data} }, @_ ); } # PUSH # ------------------------------------------------------------ sub CLEAR { my( $self ) = @_; # TODO: update excel $self->{data} = []; } # CLEAR # ------------------------------------------------------------ sub DESTROY { # my $self = shift; } # DESTROY # ------------------------------------------------------------ # ... # ------------------------------------------------------------ 1; # modules have to return a true value

Update: (@pKai: thanks for the hints)

  • backslash before $. and \n in PrintUsage
  • Test if file is defined before ...Win32::GetFullPathName...
  • >

Best regards,
perl -e "s>>*F>e=>y)\*martinF)stronat)=>print,print v8.8.8.32.11.32"

Comment on excelPerl V0.08
Select or Download Code
Re: excelPerl V0.08
by pKai (Priest) on Jan 12, 2007 at 14:43 UTC

    Proposed cleanup for the dirt mentioned in ExcelPerl.pm/getNextRow:

    my $rowData = $sheet->Range("A$row:IV$row")->{Value}; # dirty

    Idea: A range has a method address to stringify itself, giving the string one needs in the constructor la range("A1:X1").

    1. Expand open by not only saving the lastRow, but also the maxCol:
      my $bottomRight = (split (/:/, $self->worksheet->UsedRange->addres +s(0,0)))[-1]; my ($colNameMax, $rowNumMax) = $bottomRight =~ /^([A-Z]+)(\d+)/; $self->lastRow( $rowNumMax ); $self->maxCol( $colNameMax );
      Also add maxCol to @ObjInterfaceMethod to initially create it.
    2. Now you can do the following in getNextRow:
      my $rowData = $sheet->Range("A$row:" . $self->maxCol . $row)->{Value};
      But we also have to change the construction of the list to iterate over in the foreach following:
      foreach my $value (ref $rowData ? reverse @{ $rowData->[0] } : $rowDat +a) {
      because $rowdata is only a scalar when maxCol eq 'A'.

    We can take an additional step (now in excelPerl.pl) and compile (eval) the $perCode only once:

    my $fn = eval "sub { my \@F = \@_; no warnings 'uninitialized'; $perlCode }"; die $@ if $@;

    Maybe we want the no warnings as shown in that tight scope, since our first change made it more likely that an index out of range for @F will be tried to access in the $perlCode.

    The former eval part in the while - getNextRow changes to

    $fn->(@F);

    with that.

      @pKai: thank you for these good ideas; I added them in V0.09

      Best regards,
      perl -e "s>>*F>e=>y)\*martinF)stronat)=>print,print v8.8.8.32.11.32"

excelPerl V0.11 released
by strat (Canon) on Jun 08, 2007 at 09:30 UTC

    Now Version0.11 released (thanks to pkai for the good hints) which is much more powerful and a little bit faster:

excelPerl V0.12 released
by strat (Canon) on Sep 02, 2007 at 10:05 UTC

    Version 0.12 of excelPerl released

    Added: flags -begin and -end to be easily able to execute code before and after the loop

      Strat,

      I created a simple xls document with "Hello" in A1 and "World" in B1. I used the following command line that you sent me in a previous post

      perl excelPerl.pl -begin "use Text::CSV_XS; $csv = Text::CSV_XS->new( +{ binary => 1 } )" -ane "$csv->combine(@F); print $csv->string, $/" - +f helloworld.xls > helloworld.csv
      But all I got was an output file with CRLF. I'm working with Active State Perl 5.8.8 on WinXP and Text::CSV_XS version 0.3.

      Any ideas?

        Any error messages?

        Best regards,
        perl -e "s>>*F>e=>y)\*martinF)stronat)=>print,print v8.8.8.32.11.32"

Re: excelPerl V0.13 released
by strat (Canon) on Oct 25, 2007 at 08:44 UTC

    This is V0.13 with two bugfixes. Now it should work if only the field B1 is filled

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: CUFP [id://594230]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (5)
As of 2014-07-10 05:39 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    When choosing user names for websites, I prefer to use:








    Results (199 votes), past polls