Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

(code) text2xls.pl (was csv2xls.pl)

by ybiC (Prior)
on Apr 17, 2001 at 02:44 UTC ( [id://72985]=CUFP: print w/replies, xml ) Need Help??

Convert text file in delimiter-separated format to Excel binary format

Required switches:

  • -i or --infile csv_infilename
  • -o or --output xls_outfilename

Optional switches:

  • -d or --delim alt_delimiter
    default delimiter is ','
  • -h or --header
    Accepts no arguments.
    Bolds and background colors contents of topmost row.

Example:
text2xls.pl -i file1.csv -o file2.xls -d * -h

Update 2: Renamed to avoid conflict with existing csv2xls.pl, and added Todos re jmcnamara's feedback below.   Hmmm... this started out as a true Snippet, but I wonder if it should be moved to Code Catacombs?

Update 1: And if I'd looked just a wee bit further than the pod, I'd have found /usr/local/lib/site_perl/Spreadsheet/WriteExcel/examples/csv2xls.pl which does this exact same thing.   Hmmm... I wasn't reinventing a wheel, I was um, learning - yeah, that's it, I was learning!   ;^D

#!/usr/bin/perl -w # text2xls.pl # pod at tail use strict; use Spreadsheet::WriteExcel; use Getopt::Long; # Crank it up print("\nStarting $0\n"); # Get command line switches and arguments my %option = ('defaultdelim' => ','); GetOptions( 'infile=s' => \$option{csvin}, 'outfile=s' => \$option{xlsout}, 'delim=s' => \$option{delim}, 'header!' => \$option{header}, ); unless (defined ($option{csvin} && $option{xlsout})) { &USAGE(); exit; } unless (defined $option{delim}) { $option{delim} = $option{defaultdelim}; } unless (defined $option{header}) { $option{header} = 0; } # Do the dirty work open (CSVFILE, $option{csvin}) or die "Error opening $option{csvin} +: $!"; my $workbook = Spreadsheet::WriteExcel -> new($option{xlsout}); my $worksheet = $workbook -> addworksheet(); my $row = 0; while (<CSVFILE>) { chomp; my @field = split("$option{delim}", $_); my $column = 0; foreach my $token (@field) { $worksheet -> write($row, $column, $token); $column++; } $row++; } # Prettify row as header if($option{header} == 1) { my $header = $workbook -> addformat(); $header -> set_bold(); $header -> set_align('center'); $header -> set_bg_color('tan'); $header -> set_border(); $worksheet -> set_row(0, undef, $header); } # Optional, unless doing some external action against xls file $workbook -> close() or die "Error closing $workbook: $!"; # Go tell it on the mountain print(" infile = $option{csvin}\n"); print(" outfile = $option{xlsout}\n"); print("Finished $0\n\n"); ######################################################## sub USAGE { print <<EOF Required switches: -i or --infile csv_infilename -o or --output xls_outfilename Optional switches: -d or --delim alt_delimiter default delimiter is ',' -h or --header Accepts no arguments. Bolds and background colors contents of topmost row. Example: csv2xls.pl -i file1.csv -o file2.xls -d * -h EOF ; } ######################################################## =head1 Name text2xls.pl =head1 Description Convert text file in csv format to Excel binary format =head1 Usage Required switches: -i or --infile csv_infilename -o or --output xls_outfilename Optional switches: -d or --delim alt_delimiter default delimiter is ',' -h or --header Accepts no arguments. Bolds and background colors contents of topmost row. Example: text2xls.pl -i file1.csv -o file2.xls -d * -h =head1 Tested Spreadsheet::WriteExcel 0.31 Getopt::Long 2.19 Perl 5.00503 Debian 2.2r2 "Espy" Excel 9.0.3821 SR-1 97 SR-2 Win32 5.0.2195 SP-1 4.0 SP-6a =head1 Updated 2001-04-17 10:45 Renamed as text2xls - avoid conflict with existin +g csv2xls. Tweaked todos re feedback from jmcnamara (S::WE a +uthor). 2001-04-16 Initial working code & posted to PerlMonks =head1 Todos Use Text::xSV by tilly or Text::CSV_XS for parsing text infile Review and fix "prettify row header" which somewhat works, but not qu +ite right. Support for "freeze pane" anticipated in future S::WE Test with different delimiters. Use Pod::Usage to instead of &USAGE(). But appears to only be in Perl + 5.6+ 8^( Use @headerparms and foreach loop in headers section to reduce duplic +ation. Accept multiple infiles. Save as "file.xls" (provided infile is "file") with -r switch. Add support for "page delimiter" character for multiple worksheets. Add support for $VERSION. =head1 Author ybiC =head1 Credits Thanks to jmcnamara, Petruchio and zdog for suggestions, and to jmcnamara for excellent pod in Spreadsheet::WriteExcel, and vroom, of course for PM. =cut

Replies are listed 'Best First'.
Re: csv2xls.pl
by jmcnamara (Monsignor) on Apr 17, 2001 at 14:36 UTC


    And if I'd looked just a wee bit further than the pod, I'd have found ~/Spreadsheet/WriteExcel/examples/csv2xls.pl which does this exact same thing.

    Yes, but yours is nicer. ;-) In fact if you don't mind I'll include it in the Spreadsheet::WriteExcel distro. Since your version deals with generic separators you could re-target it and call it text2xls.

    Just a few notes:

    Parsing CSV is never as easy as it seems, therefore, for more general applications it is worth considering Text::CSV_XS or Text::CSV. If the stated purpose is to split on a single character then your method is fine.

    # Prettify row as header.

    Prettifying the first row is a good idea. Unfortunately, it won't work like this because set_row() defines the default format for cells written by Excel and not for cells written by WriteExcel. This is easy to fix.

    # Unfortunately no support for "freeze pane"

    Freeze panes will be in the next version. It is a feature that a *lot* of people have requested.

    USAGE:

    For anything more than a single line of usage the POD::Usage module is worth considering. This keeps your POD and your usage synchronised. It is also one of the core modules. I tend to use it in any program where I use Getopt::Long. It is a useful module and a clever idea.

    Todo: binmode(FH) on xls outfile for cross-platform support.

    This isn't required. Spreadsheet::WriteExcel will take care of binmode() for you. It only applies if you are creating a new file based on a filehandle as opposed to a filename.

    A useful feature would be an option to split the input file into pages based on ^L, or some other delimiter, and put each page on a new worksheet. This is something that I wanted to add it to the csv2xls.pl example but I didn't have time.

    John.
    --

      Text::CSV's API is unfortunately incompatible with handling embedded returns in a CSV file. Text::CSV_XS can, or you can try (now new on CPAN) Text::xSV for a pure Perl solution.
      I've modified the file co use Text::CSV_XS in order to make it work even when there are \n characters inside a field. I've added a default parameter --binary so the program works properly in that cases. Oscar
      #!/usr/bin/perl -w # text2xls.pl # pod at tail # # Modified by Oscar Fernandez Sierra to support LF inside fields # (in that case, a row requires several lines from .csv file) # Changed default delimiter from "," to ";" use strict; use Spreadsheet::WriteExcel; use Getopt::Long; use Text::CSV_XS; use IO::File; use Data::Dumper; # Crank it up print("\nStarting $0\n"); # Get command line switches and arguments my %option = ('defaultdelim' => ';', 'defaultbinary' => 0); GetOptions( 'infile=s' => \$option{csvin}, 'outfile=s' => \$option{xlsout}, 'delim=s' => \$option{delim}, 'header!' => \$option{header}, 'binary!' => \$option{binary}, ); unless (defined ($option{csvin} && $option{xlsout})) { &USAGE(); exit; } unless (defined $option{delim}) { $option{delim} = $option{defaultdelim}; } unless (defined $option{header}) { $option{header} = 0; } unless (defined $option{binary}) { $option{binary} = 1; } # Do the dirty work my $fh = new IO::File; if (! $fh->open("< " . $option{csvin})) { die "Error opening $option{csvin}"; } # "binary" is necessary to process my $csv = Text::CSV_XS->new({"sep_char" => $option{delim}, "binary" => + $option{binary}}); my $workbook = Spreadsheet::WriteExcel -> new($option{xlsout}); my $worksheet = $workbook -> addworksheet(); my $row = 0; my $refarray_datos; # $csv->getline($fh) returns a reference to array that results to sepa +rate # in fields the row read from file. If any field in the row contains \ +n, and # the binary option is on, $csv->getline($fh) will read the right numb +er of # lines from the .csv file my $status; # to check errors my $error_input; # to check errors while ( ! $fh->eof() ) { if ($refarray_datos = $csv->getline($fh)) { #$status = $csv->status(); #$error_input = $csv->error_input(); #print Dumper($refarray_datos, $status, $error_input); my @field = @{$refarray_datos}; my $column = 0; foreach my $token (@field) { $worksheet -> write($row, $column, $token); $column++; } $row++; } else { $status = $csv->status(); $error_input = $csv->error_input(); die "Fatal error !!! Incorrect line (delimited by []):\n[$erro +r_input]\n(If any field contains LF (\\n), check if --binary can solv +e the error)\n\n"; } } # Prettify row as header if($option{header} == 1) { my $header = $workbook -> addformat(); $header -> set_bold(); $header -> set_align('center'); $header -> set_bg_color('tan'); $header -> set_border(); $worksheet -> set_row(0, undef, $header); } # Optional, unless doing some external action against xls file $workbook -> close() or die "Error closing $workbook: $!"; # Go tell it on the mountain print(" infile = $option{csvin}\n"); print(" outfile = $option{xlsout}\n"); print("Finished $0\n\n"); ######################################################## sub USAGE { print <<EOF Required switches: -i or --infile csv_infilename -o or --output xls_outfilename Optional switches: -d or --delim alt_delimiter default delimiter is ';' -h or --header Accepts no arguments. Bolds and background colors contents of topmost row. -b or --binary default mode is binary (permits LF inside a field). Use --nobinary do disable binary mode Example: csv2xls.pl -i file1.csv -o file2.xls -d [--[no]binary]] * -h EOF ; } ######################################################## =head1 Name text2xls.pl =head1 Description Convert text file in csv format to Excel binary format =head1 Usage Required switches: -i or --infile csv_infilename -o or --output xls_outfilename Optional switches: -d or --delim alt_delimiter default delimiter is ',' -h or --header Accepts no arguments. Bolds and background colors contents of topmost row. -b or --binary default mode is binary (permits LF inside a field). Use --nobinary do disable binary mode Example: csv2xls.pl -i file1.csv -o file2.xls -d [--[no]binary]] * -h =head1 Tested Spreadsheet::WriteExcel 0.31 Getopt::Long 2.19 Perl 5.00503 Debian 2.2r2 "Espy" Excel 9.0.3821 SR-1 97 SR-2 Win32 5.0.2195 SP-1 4.0 SP-6a =head1 Updated 2001-04-17 10:45 Renamed as text2xls - avoid conflict with existin +g csv2xls. Tweaked todos re feedback from jmcnamara (S::WE a +uthor). 2001-04-16 Initial working code & posted to PerlMonks =head1 Todos Use Text::xSV by tilly or Text::CSV_XS for parsing text infile Review and fix "prettify row header" which somewhat works, but not qu +ite right. Support for "freeze pane" anticipated in future S::WE Test with different delimiters. Use Pod::Usage to instead of &USAGE(). But appears to only be in Perl + 5.6+ 8^( Use @headerparms and foreach loop in headers section to reduce duplic +ation. Accept multiple infiles. Save as "file.xls" (provided infile is "file") with -r switch. Add support for "page delimiter" character for multiple worksheets. Add support for $VERSION. =head1 Author ybiC =head1 Credits Thanks to jmcnamara, Petruchio and zdog for suggestions, and to jmcnamara for excellent pod in Spreadsheet::WriteExcel, and vroom, of course for PM. =cut
Re: (code) text2xls.pl (was csv2xls.pl)
by Anonymous Monk on Nov 12, 2002 at 10:44 UTC
    Hi,

    I found your script very useful. I'm using it now in our
    production system. In fact I added some formatting code to
    fit my needs. Why I write is one problem I had with the
    originally code, that was your use of split. I cannot say
    what is less useful to split a comma separated list as
    split, I changed that and use the Module Text::ParseWords
    and the quotewords() function now.

    while (<CSVFILE>) { chomp; my @field = quotewords("$option{delim}", 0, $_); my $column = 0; ...

    This does a perfect job for me, so even if your tokens are
    including commas it gets parsed correctly.

    Regards,

    Gerrit
    --
    ...forgotten my perlmonks login...

      Found it now again, I'm perlmonk 'Siebenschlaefer'. Ciao, Gerrit
        This work very good, but not have support encoding for input text. I made some changes:
        #open (CSVFILE, $option{csvin}) or die "Error opening $option{csvin}
        open CSVFILE, '<:encoding(cp1251)', $option{csvin} or die "Error opening $option{csvin} +: $!";
        May be Siebenschlaefer can add this feature as swith "-e encoding" ?

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://72985]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others exploiting the Monastery: (3)
As of 2024-03-19 05:15 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found