arunhorne has asked for the wisdom of the Perl Monks concerning the following question:
Monks
I have a data file in the form:
key1,valueA
key2,valueB
key2,valueC
key3,valueD
I want to translate it to:
key1,valueA
key2,valueB,valueC
key3,valueD
I'm sure this is easy in Perl, but I can't seem to do it easily. It also seems like a prime candidate for awk, but I hit the same problem there. Can anyone help?
Thanks
Re: Reformat Text File
by merlyn (Sage) on Oct 05, 2004 at 10:16 UTC
my %result;
while (<>) {
    chomp;
    my ($k, $v) = split /,/;
    push @{$result{$k}}, $v;
}
for (sort keys %result) {
    print join(",", $_, @{$result{$_}}), "\n";
}
This assumes that the keys are in sorted order to begin with and that it's easy to re-sort them. Instead, you might want to preserve the original order as much as possible.
my %result;
my @keys;
while (<>) {
    chomp;
    my ($k, $v) = split /,/;
    push @keys, $k if (!exists $result{$k});
    push @{$result{$k}}, $v;
}
for (@keys) {
    print join(",", $_, @{$result{$_}}), "\n";
}
> This assumes that the keys are in sorted order
What makes you say that?
Re: Reformat Text File
by dragonchild (Archbishop) on Oct 05, 2004 at 10:19 UTC
open( my $fh, '<', $infile ) or die "Cannot open '$infile' for reading: $!\n";
my %data;
while ( defined( $_ = <$fh> ) ) {
    chomp;
    # split takes the pattern first, then the string, then the limit
    my @line = split( /,/, $_, 2 );
    push @{$data{$line[0]}}, $line[1];
}
close( $fh );
open( $fh, '>', $outfile ) or die "Cannot open '$outfile' for writing: $!\n";
foreach my $k ( sort keys %data ) {
    print $fh join( ',', $k, @{$data{$k}} ), $/;
}
close( $fh );
Now, if this is homework, I wouldn't turn that in - it'll be obvious you got help. If it's not homework, take the time to figure out what I did. The key is push @{$data{$line[0]}}, $line[1];.
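To see why that one line is enough, here is a minimal sketch of the autovivification it relies on. The sub name group_pairs and the sample data are made up for illustration; they are not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: groups "key,value" strings into a hash of arrays.
sub group_pairs {
    my @lines = @_;
    my %data;
    for my $line (@lines) {
        my ( $key, $value ) = split /,/, $line, 2;
        # The first time a key is seen, $data{$key} does not exist.
        # Dereferencing it as an array makes Perl create an empty
        # array reference there (autovivification) before the push.
        push @{ $data{$key} }, $value;
    }
    return \%data;
}

my $grouped = group_pairs( 'key1,valueA', 'key2,valueB', 'key2,valueC' );
print join( ',', $_, @{ $grouped->{$_} } ), "\n" for sort keys %$grouped;
```

No "if the key is new, create an empty list" branch is needed, which is why the hash-of-arrays answers in this thread are so short.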
Being right, does not endow the right to be rude; politeness costs nothing. Being unknowing, is not the same as being stupid. Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence. Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.
Re: Reformat Text File
by Happy-the-monk (Canon) on Oct 05, 2004 at 10:21 UTC
You would use a hash of arrays, as shown in perldsc:
my %hash;
while ( <FILE> ) {
    chomp;
    my ( $key, $value ) = split /,/, $_;
    # treats hashvalue as reference to an array:
    push @{ $hash{ $key } } , $value;
}
foreach my $key ( keys %hash ) {
    # same dereferencing as above.
    print "$key," , join( "," => @{ $hash{ $key } } ) , "\n";
}
Cheers, Sören
Re: Reformat Text File
by ikegami (Patriarch) on Oct 05, 2004 at 10:30 UTC
{
    my $last;
    my @list;
    local $, = ',';
    local $\ = $/;
    while (<>) {
        chomp;
        my ($key, $val) = split($,, $_, 2);
        print $last, splice(@list) if ($.!=1 && $key ne $last);
        $last = $key;
        push(@list, $val);
    }
    print $last, @list if (@list);
}
I like the symmetry of $/ and $, being used for both input and output.
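For anyone unfamiliar with those two variables, here is a small sketch of what they do to print. The sub name print_joined is hypothetical and exists only so the behaviour can be shown in isolation.

```perl
use strict;
use warnings;

# Hypothetical helper: prints a list to $out the way the block above does.
sub print_joined {
    my ( $out, @fields ) = @_;
    local $, = ',';    # output field separator: placed between print's arguments
    local $\ = $/;     # output record separator: appended after each print
    print {$out} @fields;
    return;
}

print_joined( \*STDOUT, 'key2', 'valueB', 'valueC' );
```

With both set, a plain print of a list already produces a comma-separated line with a trailing newline, so no join is needed.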
Re: Reformat Text File
by Jasper (Chaplain) on Oct 05, 2004 at 10:34 UTC
I'm slightly bored today, and I've made a few assumptions about the format of your text.
1 while s/^(\w+),(.*)\n\1,(\w+)$/$1,$2,$3/m;
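A sketch of how that substitution might be driven, under assumptions the post does not spell out: the whole file is held in one string, duplicate keys sit on adjacent lines, and keys and values consist only of \w characters. The sub name merge_adjacent and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution above.
sub merge_adjacent {
    my ($text) = @_;
    # Each pass folds one "key,value" line into the preceding line
    # with the same key; repeat until nothing is left to fold.
    1 while $text =~ s/^(\w+),(.*)\n\1,(\w+)$/$1,$2,$3/m;
    return $text;
}

my $sample = "key1,valueA\nkey2,valueB\nkey2,valueC\nkey3,valueD\n";
print merge_adjacent($sample);
```

Because the pattern spans a newline, it cannot work line by line; the input has to be slurped first.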
Re: Reformat Text File
by tmoertel (Chaplain) on Oct 05, 2004 at 11:11 UTC
Your example data file suggests that the input will always be sorted
by key. If this is true, you have a more efficient option than the
one suggested by most who answered your question.
Rather than reading the entire input file into memory (e.g., as a
hash of lists) and then emitting your output, you can emit
output in passing, as soon as each output line is determined.
This approach has the advantage of requiring very little memory, which
is important if your input files can be large.
Here's one possible implementation, which uses autosplitting and
other handy command-line flags (see
perlrun):
#!/usr/bin/perl -lanF,

# if the current key (in $F[0]) is not the same as the
# last key we saw, print out the merged output line for
# the last key and then start a new merged output
# line for the current key
if ($last_key ne $F[0]) {
    print $merged if $merged;
    $merged = $_;
    $last_key = $F[0];
}
# otherwise, the current key is the same as the last,
# and so we can merge this line's value portion (in
# $F[1]) with the previous
else {
    $merged .= ",$F[1]";
}

# when we reach the end of the file, we must print
# the final merged output line
END { print $merged if $merged }
Because of the command-line switches we used, the body of the code
will be run for each line of input, and the following variables will
be set up for us automatically:
$_    = the entire input line, with linefeed stripped
$F[0] = the key portion of the line
$F[1] = the value portion of the line
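For readers who prefer to see the loop spelled out, here is a rough sketch of the same streaming idea without the command-line switches. The sub name merge_sorted, its filehandle arguments, and the in-memory sample are assumptions made for illustration; like the script above, it relies on the input already being sorted by key.

```perl
use strict;
use warnings;

# Hypothetical sub: reads "key,value" lines (sorted by key) from $in
# and writes one merged line per key to $out, holding only one
# output line in memory at a time.
sub merge_sorted {
    my ( $in, $out ) = @_;
    my ( $last_key, $merged );
    while ( my $line = <$in> ) {       # what -n supplies
        chomp $line;                   # what -l does on input
        my @F = split /,/, $line;      # what -a with -F, does
        if ( !defined $last_key || $last_key ne $F[0] ) {
            # new key: flush the previous merged line, start a new one
            print {$out} "$merged\n" if defined $merged;
            $merged   = $line;
            $last_key = $F[0];
        }
        else {
            # same key as before: append this line's value
            $merged .= ",$F[1]";
        }
    }
    # end of input: flush the final merged line (the END block above)
    print {$out} "$merged\n" if defined $merged;
    return;
}

my $sample = "key1,valueA\nkey2,valueB\nkey2,valueC\nkey3,valueD\n";
open my $in, '<', \$sample or die "Cannot open in-memory input: $!";
merge_sorted( $in, \*STDOUT );
```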
Hope this helps.
Cheers, Tom