Convert text file in delimiter-separated format to Excel binary format
Required switches:
- -i or --infile csv_infilename
- -o or --output xls_outfilename
Optional switches:
- -d or --delim alt_delimiter
default delimiter is ','
- -h or --header
Accepts no arguments.
Bolds and background colors contents of topmost row.
Example:
text2xls.pl -i file1.csv -o file2.xls -d * -h
Update 2: Renamed to avoid conflict with existing csv2xls.pl,
and added Todos re jmcnamara's feedback below.
Hmmm... this started out as a true Snippet, but I wonder if it should be moved to Code Catacombs?
Update 1: And if I'd looked just a wee bit further than the pod, I'd have found /usr/local/lib/site_perl/Spreadsheet/WriteExcel/examples/csv2xls.pl which does this exact same thing. Hmmm... I wasn't reinventing a wheel, I was um, learning - yeah, that's it, I was learning! ;^D
#!/usr/bin/perl -w
# text2xls.pl
# pod at tail
use strict;
use Spreadsheet::WriteExcel;
use Getopt::Long;
# Crank it up
print("\nStarting $0\n");
# Get command line switches and arguments
my %option = ('defaultdelim' => ',');
GetOptions(
'infile=s' => \$option{csvin},
'outfile=s' => \$option{xlsout},
'delim=s' => \$option{delim},
'header!' => \$option{header},
);
unless (defined ($option{csvin} && $option{xlsout})) {
&USAGE();
exit;
}
unless (defined $option{delim}) {
$option{delim} = $option{defaultdelim};
}
unless (defined $option{header}) {
$option{header} = 0;
}
# Do the dirty work
open (CSVFILE, $option{csvin}) or die "Error opening $option{csvin}
+: $!";
my $workbook = Spreadsheet::WriteExcel -> new($option{xlsout});
my $worksheet = $workbook -> addworksheet();
my $row = 0;
while (<CSVFILE>) {
chomp;
my @field = split("$option{delim}", $_);
my $column = 0;
foreach my $token (@field) {
$worksheet -> write($row, $column, $token);
$column++;
}
$row++;
}
# Prettify row as header
if($option{header} == 1) {
my $header = $workbook -> addformat();
$header -> set_bold();
$header -> set_align('center');
$header -> set_bg_color('tan');
$header -> set_border();
$worksheet -> set_row(0, undef, $header);
}
# Optional, unless doing some external action against xls file
$workbook -> close() or die "Error closing $workbook: $!";
# Go tell it on the mountain
print(" infile = $option{csvin}\n");
print(" outfile = $option{xlsout}\n");
print("Finished $0\n\n");
########################################################
sub USAGE {
print <<EOF
Required switches:
-i or --infile csv_infilename
-o or --output xls_outfilename
Optional switches:
-d or --delim alt_delimiter
default delimiter is ','
-h or --header
Accepts no arguments.
Bolds and background colors contents of topmost row.
Example:
csv2xls.pl -i file1.csv -o file2.xls -d * -h
EOF
;
}
########################################################
=head1 Name
text2xls.pl
=head1 Description
Convert text file in csv format to Excel binary format
=head1 Usage
Required switches:
-i or --infile csv_infilename
-o or --output xls_outfilename
Optional switches:
-d or --delim alt_delimiter
default delimiter is ','
-h or --header
Accepts no arguments.
Bolds and background colors contents of topmost row.
Example:
text2xls.pl -i file1.csv -o file2.xls -d * -h
=head1 Tested
Spreadsheet::WriteExcel 0.31
Getopt::Long 2.19
Perl 5.00503
Debian 2.2r2 "Espy"
Excel 9.0.3821 SR-1
97 SR-2
Win32 5.0.2195 SP-1
4.0 SP-6a
=head1 Updated
2001-04-17 10:45 Renamed as text2xls - avoid conflict with existin
+g csv2xls.
Tweaked todos re feedback from jmcnamara (S::WE a
+uthor).
2001-04-16 Initial working code & posted to PerlMonks
=head1 Todos
Use Text::xSV by tilly or Text::CSV_XS for parsing text infile
Review and fix "prettify row header" which somewhat works, but not qu
+ite right.
Support for "freeze pane" anticipated in future S::WE
Test with different delimiters.
Use Pod::Usage to instead of &USAGE(). But appears to only be in Perl
+ 5.6+ 8^(
Use @headerparms and foreach loop in headers section to reduce duplic
+ation.
Accept multiple infiles.
Save as "file.xls" (provided infile is "file") with -r switch.
Add support for "page delimiter" character for multiple worksheets.
Add support for $VERSION.
=head1 Author
ybiC
=head1 Credits
Thanks to jmcnamara, Petruchio and zdog for suggestions,
and to jmcnamara for excellent pod in Spreadsheet::WriteExcel,
and vroom, of course for PM.
=cut
Re: csv2xls.pl
by jmcnamara (Monsignor) on Apr 17, 2001 at 14:36 UTC
|
And if I'd looked just a wee bit further than the pod, I'd have found ~/Spreadsheet/WriteExcel/examples/csv2xls.pl which does this exact same thing.
Yes, but yours is nicer. ;-) In fact if you don't mind I'll include it in the Spreadsheet::WriteExcel distro. Since your version deals with generic separators you could re-target it and call it text2xls.
Just a few notes:
Parsing CSV is never as easy as it seems, therefore, for more general applications it is worth considering Text::CSV_XS or Text::CSV. If the stated purpose is to split on a single character then your method is fine.
# Prettify row as header.
Prettifying the first row is a good idea. Unfortunately, it won't work like this because set_row() defines the default format for cells written by Excel and not for cells written by WriteExcel. This is easy to fix.
# Unfortunately no support for "freeze pane"
Freeze panes will be in the next version. It is a feature that a *lot* of people have requested.
USAGE:
For anything more than a single line of usage the POD::Usage module is worth considering. This keeps your POD and your usage synchronised. It is also one of the core modules. I tend to use it in any program where I use Getopt::Long. It is a useful module and a clever idea.
Todo: binmode(FH) on xls outfile for cross-platform support.
This isn't required. Spreadsheet::WriteExcel will take care of binmode() for you. It only applies if you are creating a new file based on a filehandle as opposed to a filename.
A useful feature would be an option to split the input file into pages based on ^L, or some other delimiter, and put each page on a new worksheet. This is something that I wanted to add it to the csv2xls.pl example but I didn't have time.
John.
--
| [reply] |
|
Text::CSV's API is unfortunately incompatible
with handling embedded returns in a CSV file.
Text::CSV_XS can, or you can try
(now new on CPAN) Text::xSV for a pure Perl
solution.
| [reply] |
|
I've modified the file co use Text::CSV_XS in order to make it work even when there are \n characters inside a field. I've added a default parameter --binary so the program works properly in that cases.
Oscar
#!/usr/bin/perl -w
# text2xls.pl
# pod at tail
#
# Modified by Oscar Fernandez Sierra to support LF inside fields
# (in that case, a row requires several lines from .csv file)
# Changed default delimiter from "," to ";"
use strict;
use Spreadsheet::WriteExcel;
use Getopt::Long;
use Text::CSV_XS;
use IO::File;
use Data::Dumper;
# Crank it up
print("\nStarting $0\n");
# Get command line switches and arguments
my %option = ('defaultdelim' => ';', 'defaultbinary' => 0);
GetOptions(
'infile=s' => \$option{csvin},
'outfile=s' => \$option{xlsout},
'delim=s' => \$option{delim},
'header!' => \$option{header},
'binary!' => \$option{binary},
);
unless (defined ($option{csvin} && $option{xlsout})) {
&USAGE();
exit;
}
unless (defined $option{delim}) {
$option{delim} = $option{defaultdelim};
}
unless (defined $option{header}) {
$option{header} = 0;
}
unless (defined $option{binary}) {
$option{binary} = 1;
}
# Do the dirty work
my $fh = new IO::File;
if (! $fh->open("< " . $option{csvin})) {
die "Error opening $option{csvin}";
}
# "binary" is necessary to process
my $csv = Text::CSV_XS->new({"sep_char" => $option{delim}, "binary" =>
+ $option{binary}});
my $workbook = Spreadsheet::WriteExcel -> new($option{xlsout});
my $worksheet = $workbook -> addworksheet();
my $row = 0;
my $refarray_datos;
# $csv->getline($fh) returns a reference to array that results to sepa
+rate
# in fields the row read from file. If any field in the row contains \
+n, and
# the binary option is on, $csv->getline($fh) will read the right numb
+er of
# lines from the .csv file
my $status; # to check errors
my $error_input; # to check errors
while ( ! $fh->eof() )
{
if ($refarray_datos = $csv->getline($fh))
{
#$status = $csv->status();
#$error_input = $csv->error_input();
#print Dumper($refarray_datos, $status, $error_input);
my @field = @{$refarray_datos};
my $column = 0;
foreach my $token (@field)
{
$worksheet -> write($row, $column, $token);
$column++;
}
$row++;
}
else
{
$status = $csv->status();
$error_input = $csv->error_input();
die "Fatal error !!! Incorrect line (delimited by []):\n[$erro
+r_input]\n(If any field contains LF (\\n), check if --binary can solv
+e the error)\n\n";
}
}
# Prettify row as header
if($option{header} == 1)
{
my $header = $workbook -> addformat();
$header -> set_bold();
$header -> set_align('center');
$header -> set_bg_color('tan');
$header -> set_border();
$worksheet -> set_row(0, undef, $header);
}
# Optional, unless doing some external action against xls file
$workbook -> close() or die "Error closing $workbook: $!";
# Go tell it on the mountain
print(" infile = $option{csvin}\n");
print(" outfile = $option{xlsout}\n");
print("Finished $0\n\n");
########################################################
sub USAGE {
print <<EOF
Required switches:
-i or --infile csv_infilename
-o or --output xls_outfilename
Optional switches:
-d or --delim alt_delimiter
default delimiter is ';'
-h or --header
Accepts no arguments.
Bolds and background colors contents of topmost row.
-b or --binary
default mode is binary (permits LF inside a field).
Use --nobinary do disable binary mode
Example:
csv2xls.pl -i file1.csv -o file2.xls -d [--[no]binary]] * -h
EOF
;
}
########################################################
=head1 Name
text2xls.pl
=head1 Description
Convert text file in csv format to Excel binary format
=head1 Usage
Required switches:
-i or --infile csv_infilename
-o or --output xls_outfilename
Optional switches:
-d or --delim alt_delimiter
default delimiter is ','
-h or --header
Accepts no arguments.
Bolds and background colors contents of topmost row.
-b or --binary
default mode is binary (permits LF inside a field).
Use --nobinary do disable binary mode
Example:
csv2xls.pl -i file1.csv -o file2.xls -d [--[no]binary]] * -h
=head1 Tested
Spreadsheet::WriteExcel 0.31
Getopt::Long 2.19
Perl 5.00503
Debian 2.2r2 "Espy"
Excel 9.0.3821 SR-1
97 SR-2
Win32 5.0.2195 SP-1
4.0 SP-6a
=head1 Updated
2001-04-17 10:45 Renamed as text2xls - avoid conflict with existin
+g csv2xls.
Tweaked todos re feedback from jmcnamara (S::WE a
+uthor).
2001-04-16 Initial working code & posted to PerlMonks
=head1 Todos
Use Text::xSV by tilly or Text::CSV_XS for parsing text infile
Review and fix "prettify row header" which somewhat works, but not qu
+ite right.
Support for "freeze pane" anticipated in future S::WE
Test with different delimiters.
Use Pod::Usage to instead of &USAGE(). But appears to only be in Perl
+ 5.6+ 8^(
Use @headerparms and foreach loop in headers section to reduce duplic
+ation.
Accept multiple infiles.
Save as "file.xls" (provided infile is "file") with -r switch.
Add support for "page delimiter" character for multiple worksheets.
Add support for $VERSION.
=head1 Author
ybiC
=head1 Credits
Thanks to jmcnamara, Petruchio and zdog for suggestions,
and to jmcnamara for excellent pod in Spreadsheet::WriteExcel,
and vroom, of course for PM.
=cut
| [reply] [d/l] |
Re: (code) text2xls.pl (was csv2xls.pl)
by Anonymous Monk on Nov 12, 2002 at 10:44 UTC
|
Hi,
I found your script very useful. I'm using it now in our
production system. In fact I added some formatting code to
fit my needs. Why I write is one problem I had with the
originally code, that was your use of split. I cannot say
what is less useful to split a comma separated list as
split, I changed that and use the Module Text::ParseWords
and the quotewords() function now.
while (<CSVFILE>) {
chomp;
my @field = quotewords("$option{delim}", 0, $_);
my $column = 0;
...
This does a perfect job for me, so even if your tokens are
including commas it gets parsed correctly.
Regards,
Gerrit
--
...forgotten my perlmonks login...
| [reply] [d/l] |
|
Found it now again, I'm perlmonk 'Siebenschlaefer'.
Ciao,
Gerrit
| [reply] |
|
This work very good, but not have support encoding for input text. I made some changes:
#open (CSVFILE, $option{csvin}) or die "Error opening $option{csvin}
open CSVFILE, '<:encoding(cp1251)', $option{csvin} or die "Error opening $option{csvin} +: $!";
May be Siebenschlaefer can add this feature as swith "-e encoding" ?
| [reply] |
|
|
|
|