Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

excelPerl V0.08

by strat (Canon)
on Jan 11, 2007 at 20:41 UTC ( #594230=CUFP: print w/ replies, xml ) Need Help??

Update: V0.13 released which contains some bugfixes; please scroll down for download the source code

Update: V0.12 is the latest version, with added args: -begin and -end. Download from http://www.fabiani.net/ -> Perl -> Downloads -> excelPerl

Update: V0.12 is the latest version, please scroll down to see it

Have you already heard about XLSperl? That's a great tool to parse excel files as if you parsed plain text files with perl -ane.

XLSperl's disadvantage is that it can only read excel files and not modify them. So I decided to write a windows-only solution with Win32::OLE which is able to modify an excel sheet.

Here is V0.08a, as promised in returning tied array (@nobull: thank you for your help):

File: excelPerl.pl

#! /usr/bin/perl use warnings; use strict; use Win32; # automatically loaded with Activestate Perl use File::Basename; # for PrintUsage use Getopt::Long; # for parsing program arguments use FindBin; use lib $FindBin::Bin; use ExcelPerl; $|++; # no suffering from buffering my( $file, $perlCode ); my $worksheetNumber = 1; # use first worksheet as default my $headline = 0; # don't skip a headline as default my $visible = 0; # is of no use in this version my $colSeparator = "\t"; # standard col separator for $_ GetOptions( 'file=s' => \$file, 'sheet=i' => \$worksheetNumber, 'headline=i' => \$headline, 'ane=s' => \$perlCode, # 'visible' => \$visible, 'colseparator' => \$colSeparator, ) or die &PrintUsage(); defined $file or die &PrintUsage; $file = Win32::GetFullPathName( $file ); my $xlsPerl = ExcelPerl->new( excelFile => $file, worksheetNumber => $worksheetNumber, ) ->open; #if( $visible ) { $xlsPerl->excel->{Visible} = 1; #} # if # skip all lines until -headline if available $xlsPerl->currentRow( $headline ) if $headline; my @F; while( $xlsPerl->getNextRow( \@F ) ) { # prepare some other helpful variables local $. = $xlsPerl->currentRow(); local $_ = join($colSeparator, @F ) . "\n"; # try to eval the perl code eval $perlCode; die $@ if $@; @F = (); } # while # ------------------------------------------------------------ sub PrintUsage { my $bin = File::Basename::basename($0); die <<EOH; $bin -sheet 1 -headline 1 -ane "print qq~\$.: \$F[0], \$F[1]\\n~" -fil +e excelFile try perldoc $bin for more details EOH } # PrintUsage # ------------------------------------------------------------ =pod =head1 excelPerl You know about the perl parameters -ane? This script is a way to try to do the same thing to Excel files. =head1 Prerequisites =over 1 =item Win32::OLE =item Excel needs to be installed =back =head1 Description Like perl -ane with a plaintext file, excelPerl.pl loops over an excel file and automatically (-a) splits up the columns of one row into an array with the name @F. Since @F is a tied array, by changing one element of @F you change the content of the excel cells. If you need one whole line with a tailing \n, you can use the variable $_ (like perl -ne) which joins @F by the value given in the parameter -colseparator (default: \t) If you need to know the current line number (like $. for while), you can use the special variable $. (big surprise). =head1 Params: -file String: Name of excel file (full path!) -sheetnumber Integer: which sheet shell I read (1..n, default: 1 +) -headline Integer: line number of headline (used for skipping + headline) -ane String: Perl-Code to execute -colseparator String: how to join columns in $_ (default: \t) Abbreviations of params are allowed, as long as they are unique, e.g excelPerl.pl -s 2 -h 1 -ane "print qq~$.: $F[0] $F[5]\n" -f c:\test. +xls =head1 Special Variables: $. Row number @F Array containing the values of one complete row $_ Row joined by -colseparator, with \n at the end =head1 Bugs/... This code is very, very, very experimental! If you do changes, you better save the workbook manually, that's the reason why the parameter -visible always is on. =head1 SEE ALSO I heard about XLSperl from John Allen which is a great tool. But it only allows to read excel files and not to change them on the fly. Since I often need this feature, I decided to write it with Win32::OLE, and it was not difficult. But unlike XLSperl, excelPerl only runs under Windows and needs an installed version of Excel. You can find XLSPerl at L<http://perl.jonallen.info/projects/xlstools> =head1 Author Martin Fabiani L<http://www.fabiani.net/> 2007 =cut

File: ExcelPerl.pm

package ExcelPerl; use warnings; use strict; use Carp qw(croak); our $VERSION = 0.08; # use Readonly; # better not, since Readonly is no standard module yet use Win32::OLE; use Win32::OLE::Const 'Microsoft Excel'; use Win32::OLE::Variant; Win32::OLE->Option( Warn => 3); use vars qw( @ObjInterfaceMethods @F ); # which object interface methods shell be available as standard #Readonly::Array: Readonly is not yet a standard module :-( @ObjInterfaceMethods = qw( excelFile excel workBook worksheet worksheetNumber currentRow lastRow changeCount ); # install object interface methods at startup foreach my $method ( @ObjInterfaceMethods ) { no strict 'refs'; # Sub::Install is no standard module :-( *{ $method } = sub { my( $self, @values ) = @_; $self->{ $method } = $values[0] if scalar @values; return $self->{ $method }; }; } # foreach # ============================================================ sub new { my( $class, %params ) = @_; my $self = bless {}, $class; foreach my $param (keys %params) { $self->$param( $params{$param} ); } # foreach return $self; } # new # ------------------------------------------------------------ sub DESTROY { # my $self = shift; # print "DESTROY: ", ref $self->excel, "\n"; } # DESTROY # ------------------------------------------------------------ sub increaseCurrentRow { my $self = shift; $self->currentRow( 1 + $self->currentRow ); } # increaseCurrentRow # ------------------------------------------------------------ sub increaseChangeCount { my $self = shift; no warnings; $self->changeCount( 1 + $self->changeCount ); } # increaseChangeCount # ------------------------------------------------------------ sub open { my( $self ) = @_; my $filename = $self->excelFile; defined $filename or croak "Error: no filename given"; -f $filename or croak( "Error: filename '$filename' doesn't exist" +); my $excel; # try to re-use running instance of Excel eval { $excel = Win32::OLE->GetActiveObject( 'Excel.Application' ) + }; die "Error: no Excel installed\n" if $@; unless( defined $excel ) { # if not running, start excel $excel = Win32::OLE->new( 'Excel.Application', sub { $_[0]->Qu +it } ) or die "Error: can't start Excel\n"; } # unless $self->excel( $excel ); $self->workBook( $excel->Workbooks->Open( $filename ) ); $self->worksheet( $self->workBook->Worksheets( $self->worksheetNum +ber ) ); my $range = $self->worksheet->UsedRange->{Value}; $self->lastRow( $#$range + 1 ); $self->currentRow( 0 ); return $self; } # open # ------------------------------------------------------------ sub getNextRow { my( $self, $F ) = @_; $self->increaseCurrentRow; my $row = $self->currentRow; my $sheet = $self->worksheet; return if $row > $self->lastRow; my $rowData = $sheet->Range("A$row:IV$row")->{Value}; # dirty my $itemsFound = 0; tie( @$F, 'ExcelPerl::RowArray', $self ); foreach my $value ( reverse @{ $rowData->[0] } ) { if( defined $value ) { unshift( @$F, $value ); $itemsFound++; +} else { unshift( @$F, '' ) if scalar @$F; } } # foreach return 1; } # getNextRow # ------------------------------------------------------------ # ============================================================ package ExcelPerl::RowArray; # ------------------------------------------------------------ use Tie::Array; use vars qw( @ISA ); @ISA = qw( Tie::Array ); # ------------------------------------------------------------ sub TIEARRAY { my( $class, $excelObj ) = @_; my $self = bless( { data => [] }, $class ); $self->{excelObj} = $excelObj; $self->{row} = $excelObj->currentRow; return $self; } # TIEARRAY # ------------------------------------------------------------ sub FETCH { my( $self, $index) = @_; return $self->{data}->[ $index ]; } # FETCH # ------------------------------------------------------------ sub STORE { my( $self, $index, $value ) = @_; my $excelObj = $self->{excelObj}; my $sheet = $excelObj->worksheet; my $row = $excelObj->currentRow; my $col = $index + 1; if( $self->{data}->[$index] ne $value ) { $excelObj->increaseChangeCount; $sheet ->Cells($row, $col)->{'Value' } = $value; $self ->{'data'} ->[ $index ] = $value; } # if } # STORE # ------------------------------------------------------------ sub FETCHSIZE { my $self = shift; return $#{ $self->{data} }; } # FETCHSIZE # ------------------------------------------------------------ sub STORESIZE { my( $self, $newLength ) = @_; $#{ $self->{data} } = $newLength; } # STORESIZE # ------------------------------------------------------------ sub UNSHIFT { my $self = shift; unshift( @{ $self->{data} }, @_ ); } # UNSHIFT # ------------------------------------------------------------ sub SHIFT { my $self = shift; return shift @{ $self->{data} }; } # SHIFT # ------------------------------------------------------------ sub POP { my $self = shift; return pop @{ $self->{data} }; } # POP # ------------------------------------------------------------ sub PUSH { my $self = shift; push( @{ $self->{data} }, @_ ); } # PUSH # ------------------------------------------------------------ sub CLEAR { my( $self ) = @_; # TODO: update excel $self->{data} = []; } # CLEAR # ------------------------------------------------------------ sub DESTROY { # my $self = shift; } # DESTROY # ------------------------------------------------------------ # ... # ------------------------------------------------------------ 1; # modules have to return a true value

Update: (@pKai: thanks for the hints)

  • backslash before $. and \n in PrintUsage
  • Test if file is defined before ...Win32::GetFullPathName...
  • >

Best regards,
perl -e "s>>*F>e=>y)\*martinF)stronat)=>print,print v8.8.8.32.11.32"

Comment on excelPerl V0.08
Select or Download Code
Re: excelPerl V0.08
by pKai (Priest) on Jan 12, 2007 at 14:43 UTC

    Proposed cleanup for the dirt mentioned in ExcelPerl.pm/getNextRow:

    my $rowData = $sheet->Range("A$row:IV$row")->{Value}; # dirty

    Idea: A range has a method address to stringify itself, giving the string one needs in the constructor la range("A1:X1").

    1. Expand open by not only saving the lastRow, but also the maxCol:
      my $bottomRight = (split (/:/, $self->worksheet->UsedRange->addres +s(0,0)))[-1]; my ($colNameMax, $rowNumMax) = $bottomRight =~ /^([A-Z]+)(\d+)/; $self->lastRow( $rowNumMax ); $self->maxCol( $colNameMax );
      Also add maxCol to @ObjInterfaceMethod to initially create it.
    2. Now you can do the following in getNextRow:
      my $rowData = $sheet->Range("A$row:" . $self->maxCol . $row)->{Value};
      But we also have to change the construction of the list to iterate over in the foreach following:
      foreach my $value (ref $rowData ? reverse @{ $rowData->[0] } : $rowDat +a) {
      because $rowdata is only a scalar when maxCol eq 'A'.

    We can take an additional step (now in excelPerl.pl) and compile (eval) the $perCode only once:

    my $fn = eval "sub { my \@F = \@_; no warnings 'uninitialized'; $perlCode }"; die $@ if $@;

    Maybe we want the no warnings as shown in that tight scope, since our first change made it more likely that an index out of range for @F will be tried to access in the $perlCode.

    The former eval part in the while - getNextRow changes to

    $fn->(@F);

    with that.

      @pKai: thank you for these good ideas; I added them in V0.09

      Best regards,
      perl -e "s>>*F>e=>y)\*martinF)stronat)=>print,print v8.8.8.32.11.32"

excelPerl V0.11 released
by strat (Canon) on Jun 08, 2007 at 09:30 UTC

    Now Version0.11 released (thanks to pkai for the good hints) which is much more powerful and a little bit faster:

excelPerl V0.12 released
by strat (Canon) on Sep 02, 2007 at 10:05 UTC

    Version 0.12 of excelPerl released

    Added: flags -begin and -end to be easily able to execute code before and after the loop

      Strat,

      I created a simple xls document with "Hello" in A1 and "World" in B1. I used the following command line that you sent me in a previous post

      perl excelPerl.pl -begin "use Text::CSV_XS; $csv = Text::CSV_XS->new( +{ binary => 1 } )" -ane "$csv->combine(@F); print $csv->string, $/" - +f helloworld.xls > helloworld.csv
      But all I got was an output file with CRLF. I'm working with Active State Perl 5.8.8 on WinXP and Text::CSV_XS version 0.3.

      Any ideas?

        Any error messages?

        Best regards,
        perl -e "s>>*F>e=>y)\*martinF)stronat)=>print,print v8.8.8.32.11.32"

Re: excelPerl V0.13 released
by strat (Canon) on Oct 25, 2007 at 08:44 UTC

    This is V0.13 with two bugfixes. Now it should work if only the field B1 is filled

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: CUFP [id://594230]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others examining the Monastery: (13)
As of 2014-09-16 20:31 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (47 votes), past polls