Beefy Boxes and Bandwidth Generously Provided by pair Networks Frank
Problems? Is your data what you think it is?
 
PerlMonks  

(code) text2xls.pl (was csv2xls.pl)

by ybiC (Prior)
on Apr 17, 2001 at 02:44 UTC ( #72985=snippet: print w/ replies, xml ) Need Help??

Description: Convert text file in delimiter-separated format to Excel binary format

Required switches:

  • -i or --infile csv_infilename
  • -o or --output xls_outfilename

Optional switches:

  • -d or --delim alt_delimiter
    default delimiter is ','
  • -h or --header
    Accepts no arguments.
    Bolds and background colors contents of topmost row.

Example:
text2xls.pl -i file1.csv -o file2.xls -d * -h

Update 2: Renamed to avoid conflict with existing csv2xls.pl, and added Todos re jmcnamara's feedback below.   Hmmm... this started out as a true Snippet, but I wonder if it should be moved to Code Catacombs?

Update 1: And if I'd looked just a wee bit further than the pod, I'd have found /usr/local/lib/site_perl/Spreadsheet/WriteExcel/examples/csv2xls.pl which does this exact same thing.   Hmmm... I wasn't reinventing a wheel, I was um, learning - yeah, that's it, I was learning!   ;^D

#!/usr/bin/perl -w

# text2xls.pl
# pod at tail

use strict;
use Spreadsheet::WriteExcel;
use Getopt::Long;

# Crank it up
print("\nStarting $0\n");

# Get command line switches and arguments
my %option = ('defaultdelim' => ',');
GetOptions(
    'infile=s'  => \$option{csvin},
    'outfile=s' => \$option{xlsout},
    'delim=s'   => \$option{delim},
    'header!'   => \$option{header},
    );
unless (defined ($option{csvin} && $option{xlsout})) {
    &USAGE();
    exit;
    }
unless (defined $option{delim}) {
    $option{delim} = $option{defaultdelim};
    }
unless (defined $option{header}) {
    $option{header} = 0;
    }

# Do the dirty work
open (CSVFILE, $option{csvin})    or die "Error opening $option{csvin}
+: $!";
my $workbook  = Spreadsheet::WriteExcel -> new($option{xlsout});
my $worksheet = $workbook -> addworksheet();
my $row       = 0;
while (<CSVFILE>) {
    chomp;
    my @field = split("$option{delim}", $_);
    my $column = 0;
    foreach my $token (@field) {
        $worksheet -> write($row, $column, $token);
        $column++;
        }
    $row++;
    }

# Prettify row as header
if($option{header} == 1) {
    my $header = $workbook -> addformat();
        $header -> set_bold();
        $header -> set_align('center');
        $header -> set_bg_color('tan');
        $header -> set_border();
    $worksheet -> set_row(0, undef, $header);
    }

# Optional, unless doing some external action against xls file
$workbook  -> close()         or die "Error closing $workbook: $!";

# Go tell it on the mountain
print("  infile  = $option{csvin}\n");
print("  outfile = $option{xlsout}\n");
print("Finished $0\n\n");

########################################################
sub USAGE {
print <<EOF

Required switches:
  -i or --infile csv_infilename
  -o or --output xls_outfilename

Optional switches:
-d or --delim alt_delimiter
  default delimiter is ','
-h or --header
  Accepts no arguments.
  Bolds and background colors contents of topmost row.

Example:
  csv2xls.pl -i file1.csv -o file2.xls -d * -h

EOF
;
}
########################################################

=head1 Name

 text2xls.pl

=head1 Description

 Convert text file in csv format to Excel binary format

=head1 Usage

 Required switches:
   -i or --infile csv_infilename
   -o or --output xls_outfilename

 Optional switches:
 -d or --delim alt_delimiter
   default delimiter is ','
 -h or --header
   Accepts no arguments.
   Bolds and background colors contents of topmost row.

 Example:
   text2xls.pl -i file1.csv -o file2.xls -d * -h

=head1 Tested

 Spreadsheet::WriteExcel 0.31
 Getopt::Long            2.19
 Perl                    5.00503
 Debian                  2.2r2 "Espy"
 Excel                   9.0.3821 SR-1
                         97 SR-2
 Win32                   5.0.2195 SP-1
                         4.0 SP-6a

=head1 Updated

 2001-04-17   10:45  Renamed as text2xls - avoid conflict with existin
+g csv2xls.
                     Tweaked todos re feedback from jmcnamara (S::WE a
+uthor).
 2001-04-16          Initial working code & posted to PerlMonks

=head1 Todos

 Use Text::xSV by tilly or Text::CSV_XS for parsing text infile
 Review and fix "prettify row header" which somewhat works, but not qu
+ite right.
   Support for "freeze pane" anticipated in future S::WE
 Test with different delimiters.
 Use Pod::Usage to instead of &USAGE(). But appears to only be in Perl
+ 5.6+ 8^(
 Use @headerparms and foreach loop in headers section to reduce duplic
+ation.
 Accept multiple infiles.
 Save as "file.xls" (provided infile is "file") with -r switch.
 Add support for "page delimiter" character for multiple worksheets.
 Add support for $VERSION.

=head1 Author

 ybiC

=head1 Credits

 Thanks to jmcnamara, Petruchio and zdog for suggestions,
 and to jmcnamara for excellent pod in Spreadsheet::WriteExcel,
 and vroom, of course for PM.

=cut

Comment on (code) text2xls.pl (was csv2xls.pl)
Download Code
Re: csv2xls.pl
by jmcnamara (Monsignor) on Apr 17, 2001 at 14:36 UTC


    And if I'd looked just a wee bit further than the pod, I'd have found ~/Spreadsheet/WriteExcel/examples/csv2xls.pl which does this exact same thing.

    Yes, but yours is nicer. ;-) In fact if you don't mind I'll include it in the Spreadsheet::WriteExcel distro. Since your version deals with generic separators you could re-target it and call it text2xls.

    Just a few notes:

    Parsing CSV is never as easy as it seems, therefore, for more general applications it is worth considering Text::CSV_XS or Text::CSV. If the stated purpose is to split on a single character then your method is fine.

    # Prettify row as header.

    Prettifying the first row is a good idea. Unfortunately, it won't work like this because set_row() defines the default format for cells written by Excel and not for cells written by WriteExcel. This is easy to fix.

    # Unfortunately no support for "freeze pane"

    Freeze panes will be in the next version. It is a feature that a *lot* of people have requested.

    USAGE:

    For anything more than a single line of usage the POD::Usage module is worth considering. This keeps your POD and your usage synchronised. It is also one of the core modules. I tend to use it in any program where I use Getopt::Long. It is a useful module and a clever idea.

    Todo: binmode(FH) on xls outfile for cross-platform support.

    This isn't required. Spreadsheet::WriteExcel will take care of binmode() for you. It only applies if you are creating a new file based on a filehandle as opposed to a filename.

    A useful feature would be an option to split the input file into pages based on ^L, or some other delimiter, and put each page on a new worksheet. This is something that I wanted to add it to the csv2xls.pl example but I didn't have time.

    John.
    --

      Text::CSV's API is unfortunately incompatible with handling embedded returns in a CSV file. Text::CSV_XS can, or you can try (now new on CPAN) Text::xSV for a pure Perl solution.
      I've modified the file co use Text::CSV_XS in order to make it work even when there are \n characters inside a field. I've added a default parameter --binary so the program works properly in that cases. Oscar
      #!/usr/bin/perl -w # text2xls.pl # pod at tail # # Modified by Oscar Fernandez Sierra to support LF inside fields # (in that case, a row requires several lines from .csv file) # Changed default delimiter from "," to ";" use strict; use Spreadsheet::WriteExcel; use Getopt::Long; use Text::CSV_XS; use IO::File; use Data::Dumper; # Crank it up print("\nStarting $0\n"); # Get command line switches and arguments my %option = ('defaultdelim' => ';', 'defaultbinary' => 0); GetOptions( 'infile=s' => \$option{csvin}, 'outfile=s' => \$option{xlsout}, 'delim=s' => \$option{delim}, 'header!' => \$option{header}, 'binary!' => \$option{binary}, ); unless (defined ($option{csvin} && $option{xlsout})) { &USAGE(); exit; } unless (defined $option{delim}) { $option{delim} = $option{defaultdelim}; } unless (defined $option{header}) { $option{header} = 0; } unless (defined $option{binary}) { $option{binary} = 1; } # Do the dirty work my $fh = new IO::File; if (! $fh->open("< " . $option{csvin})) { die "Error opening $option{csvin}"; } # "binary" is necessary to process my $csv = Text::CSV_XS->new({"sep_char" => $option{delim}, "binary" => + $option{binary}}); my $workbook = Spreadsheet::WriteExcel -> new($option{xlsout}); my $worksheet = $workbook -> addworksheet(); my $row = 0; my $refarray_datos; # $csv->getline($fh) returns a reference to array that results to sepa +rate # in fields the row read from file. If any field in the row contains \ +n, and # the binary option is on, $csv->getline($fh) will read the right numb +er of # lines from the .csv file my $status; # to check errors my $error_input; # to check errors while ( ! $fh->eof() ) { if ($refarray_datos = $csv->getline($fh)) { #$status = $csv->status(); #$error_input = $csv->error_input(); #print Dumper($refarray_datos, $status, $error_input); my @field = @{$refarray_datos}; my $column = 0; foreach my $token (@field) { $worksheet -> write($row, $column, $token); $column++; } $row++; } else { $status = $csv->status(); $error_input = $csv->error_input(); die "Fatal error !!! Incorrect line (delimited by []):\n[$erro +r_input]\n(If any field contains LF (\\n), check if --binary can solv +e the error)\n\n"; } } # Prettify row as header if($option{header} == 1) { my $header = $workbook -> addformat(); $header -> set_bold(); $header -> set_align('center'); $header -> set_bg_color('tan'); $header -> set_border(); $worksheet -> set_row(0, undef, $header); } # Optional, unless doing some external action against xls file $workbook -> close() or die "Error closing $workbook: $!"; # Go tell it on the mountain print(" infile = $option{csvin}\n"); print(" outfile = $option{xlsout}\n"); print("Finished $0\n\n"); ######################################################## sub USAGE { print <<EOF Required switches: -i or --infile csv_infilename -o or --output xls_outfilename Optional switches: -d or --delim alt_delimiter default delimiter is ';' -h or --header Accepts no arguments. Bolds and background colors contents of topmost row. -b or --binary default mode is binary (permits LF inside a field). Use --nobinary do disable binary mode Example: csv2xls.pl -i file1.csv -o file2.xls -d [--[no]binary]] * -h EOF ; } ######################################################## =head1 Name text2xls.pl =head1 Description Convert text file in csv format to Excel binary format =head1 Usage Required switches: -i or --infile csv_infilename -o or --output xls_outfilename Optional switches: -d or --delim alt_delimiter default delimiter is ',' -h or --header Accepts no arguments. Bolds and background colors contents of topmost row. -b or --binary default mode is binary (permits LF inside a field). Use --nobinary do disable binary mode Example: csv2xls.pl -i file1.csv -o file2.xls -d [--[no]binary]] * -h =head1 Tested Spreadsheet::WriteExcel 0.31 Getopt::Long 2.19 Perl 5.00503 Debian 2.2r2 "Espy" Excel 9.0.3821 SR-1 97 SR-2 Win32 5.0.2195 SP-1 4.0 SP-6a =head1 Updated 2001-04-17 10:45 Renamed as text2xls - avoid conflict with existin +g csv2xls. Tweaked todos re feedback from jmcnamara (S::WE a +uthor). 2001-04-16 Initial working code & posted to PerlMonks =head1 Todos Use Text::xSV by tilly or Text::CSV_XS for parsing text infile Review and fix "prettify row header" which somewhat works, but not qu +ite right. Support for "freeze pane" anticipated in future S::WE Test with different delimiters. Use Pod::Usage to instead of &USAGE(). But appears to only be in Perl + 5.6+ 8^( Use @headerparms and foreach loop in headers section to reduce duplic +ation. Accept multiple infiles. Save as "file.xls" (provided infile is "file") with -r switch. Add support for "page delimiter" character for multiple worksheets. Add support for $VERSION. =head1 Author ybiC =head1 Credits Thanks to jmcnamara, Petruchio and zdog for suggestions, and to jmcnamara for excellent pod in Spreadsheet::WriteExcel, and vroom, of course for PM. =cut
Re: (code) text2xls.pl (was csv2xls.pl)
by Anonymous Monk on Nov 12, 2002 at 10:44 UTC
    Hi,

    I found your script very useful. I'm using it now in our
    production system. In fact I added some formatting code to
    fit my needs. Why I write is one problem I had with the
    originally code, that was your use of split. I cannot say
    what is less useful to split a comma separated list as
    split, I changed that and use the Module Text::ParseWords
    and the quotewords() function now.

    while (<CSVFILE>) { chomp; my @field = quotewords("$option{delim}", 0, $_); my $column = 0; ...

    This does a perfect job for me, so even if your tokens are
    including commas it gets parsed correctly.

    Regards,

    Gerrit
    --
    ...forgotten my perlmonks login...

      Found it now again, I'm perlmonk 'Siebenschlaefer'. Ciao, Gerrit
        This work very good, but not have support encoding for input text. I made some changes:
        #open (CSVFILE, $option{csvin}) or die "Error opening $option{csvin}
        open CSVFILE, '<:encoding(cp1251)', $option{csvin} or die "Error opening $option{csvin} +: $!";
        May be Siebenschlaefer can add this feature as swith "-e encoding" ?

Back to Snippets Section

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: snippet [id://72985]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (4)
As of 2014-04-19 14:07 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (481 votes), past polls