joec_ has asked for the wisdom of the Perl Monks concerning the following question:
Hi,
I have a csv file, that potentially could contain 400 "columns" and anything up to 1000 rows. What i would like to end up with is this
CSV at the moment:
123,"text",66,"more text"
124,"text1",67,"more text1"
125,"text2",68,"more text2"
Seperate Strings (eventually going into oracle as clobs):
123|124|125
"text"|"text1"|"text2"
66|67|68
"more text"|"more text1"|"more text2"
I wont know before hand how many rows/cols the csv has as it depends on the dataset from another app. What i have tried so far is : while (<CSV>){
@fields = split /,/,$_;
push @s, $fields[0];
$structid=join "|",@s;
}
which deals with just the first column. Following this logic, i would have to create an array for each column...there must be a better way. Any ideas appreciated Thanks-Joe
----- Eschew obfuscation, espouse eludication!
Re: CSV manipulation with perl
by Tux (Canon) on Mar 09, 2009 at 14:14 UTC
|
use DBI;
use Text::CSV_XS;
my $dbh = DBI->connect (...);
my $sth = $dbh->prepare ("insert into table values (?, ?, ?, ...)");
my $csv = Text::CSV_XS->new ({ binary => 1 });
open my $dta, "<", "data.csv" or die "data: $!\n";
while (my $row = $csv->getline ($dta)) {
$sth->execute (@$row);
}
$csv->eof or $csv->error_diag;
$sth->finish;
$dbh->commit;
Enjoy, Have FUN! H.Merijn
| [reply] [d/l] |
|
| [reply] |
|
| [reply] |
|
Correct, but if you only install Text::CSV, it doesn't automatically install Text::CSV_XS, so you don't get the speed, so I keep promotong the latter
Talk is here. Opera browser might be required, as it depends on <link rel="next" type="text/html" href="..." /> in the <head> section, which Opera supports with FastForward. To skip directly to the first slide, click here.
Enjoy, Have FUN! H.Merijn
| [reply] [d/l] [select] |
|
| [reply] |
|
No, this was just an example of how to mix CSV and DBI
Your problem could be translated to something like
my $csv = Text::CSV_XS->new ({ binary => 1 });
open my $dta, "<", "data.csv" or die "data: $!\n";
my @rows;
while (my $row = $csv->getline ($dta)) {
push @rows, $row;
}
$csv->eof or $csv->error_diag;
close $dta;
# pivot
my @data = map { my $col = $_; join "|", map { $_->[$col] } @rows } 0.
+.$#{$rows[0]};
Which will convert
123,foo,2,FOO
234,bar,2,BAR
345,zap,3,ZAP
456,bok,5,BOK
into
( "123|234|345|456",
"foo|bar|zap|bok",
"2|2|3|5",
"FOO|BAR|ZAP|BOK")
Assuming all records have equal length
Enjoy, Have FUN! H.Merijn
| [reply] [d/l] [select] |
Re: CSV manipulation with perl
by olus (Curate) on Mar 09, 2009 at 15:06 UTC
|
Following your approach, I came with the following code that creates one array with all the strings that result from the concatenation of all the columns. Note that I'm not taking into account eventual memory problems that may arise from the amount of columns/lines your csv may have.
use strict;
use warnings;
use Data::Dumper;
my @parsed = ();
while (<DATA>) {
my $in = $_;
chomp($in);
my @line = split(/,/, $in);
for my $i(0..$#line) {
$parsed[$i] .= $line[$i].'|';
}
}
map {chop} @parsed;
print Dumper(\@parsed);
__DATA__
123,"text",66,"more text"
124,"text1",67,"more text1"
125,"text2",68,"more text2"
outputs
$VAR1 = [
'123|124|125',
'"text"|"text1"|"text2"',
'66|67|68',
'"more text"|"more text1"|"more text2"'
];
| [reply] [d/l] [select] |
Re: CSV manipulation with perl
by clueless newbie (Curate) on Mar 09, 2009 at 15:43 UTC
|
Ignoring the problem of parsing CSV, is this what you're looking for?
#!/perl/bin/perl
use Smart::Comments;
use strict;
use warnings;
my $_raa;
my $i=0;
while (<DATA>) {
chomp;
my @a=split(',');
for my $j (0..$#a) { $_raa->[$j][$i]=$a[$j] };
$i++;
};
### $_raa
__DATA__
123,"text",66,"more text"
124,"text1",67,"more text1"
125,"text2",68,"more text2"
which yields ### $_raa: [
### [
### '123',
### '124',
### '125'
### ],
### [
### '"text"',
### '"text1"',
### '"text2"'
### ],
### [
### '66',
### '67',
### '68'
### ],
### [
### '"more text"',
### '"more text1"',
### '"more text2"'
### ]
### ]
| [reply] [d/l] [select] |
|
my @_raa;
while (<DATA>) {
chomp;
my @a=split(','); # use Text::CSV_XS here.
push @{$_raa[$_]}, $a[$_] for 0..$#a;
};
And then, if you do use Text::CSV_XS to extract the fields, use it to merge the fields back together, since the extraction will remove quotes and escapes, so you'll need to push it back with quotes and escapes (which Text::CSV_XS does for you automatically). The difference between my code and yours is that if some rows have more columns than others, we'll do different things. Yours will have some undef's in those places, while mine will silently ignore them, possibly shifting things around incorrectly. What to do in this scenario wasn't spec'd by the OP, so it's hard to tell which one is right. | [reply] [d/l] |
Re: CSV manipulation with perl
by Bloodnok (Vicar) on Mar 09, 2009 at 14:19 UTC
|
| [reply] |
|
| [reply] |
|
Thanks,yes it should, but that doesnt help my immediate dilemma... :) Do you have any ideas? thanks
----- Eschew obfuscation, espouse elucidation!
| [reply] |
|
|