isha has asked for the wisdom of the Perl Monks concerning the following question:
I have a CSV file with the following data:
Name,Comment
"Isha","""Hello!!"""
"Malav" ,"""koni comments nakhu ? tari?"""
"Mihir","""Dont know what to write :)"""
"Mukesh","""Kya comment add karun"""
"Tanmay Anjaria","""I - Intelligent
S - Smart
H - Highly thoughtful
A - Antonyms should be taken for all of the above to know Isha :-))
Just Kidding... Keep Smiling, dear…"""
The first row, Name and Comment, contains the column names.
Now I want to read this file into a hash so that I can access the data using the row count.
For example, if I print $hash{"Name"}[0] it should print 'Isha', and if I print $hash{"Comment"}[0] it should print 'Hello!!'.
How can I do this with Perl?
Re: Read the csv file to a hash....
by tirwhan (Abbot) on Jul 06, 2007 at 12:11 UTC
Take a look at Text::CSV_XS. Things should be fairly obvious from that point on (hint: to add an element to an array reference you can use
push @{$hash{"Name"}}, $field[0];
)
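A minimal sketch of that hint, combining Text::CSV_XS with the push-onto-an-array-ref idiom (the in-memory filehandle and sample data stand in for the real file):

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Sample data in place of the real file (simplified: no embedded quotes)
my $data = <<'CSV';
Name,Comment
"Isha","Hello!!"
"Malav","koni comments nakhu ? tari?"
CSV
open my $fh, '<', \$data or die $!;

my $csv = Text::CSV_XS->new({ binary => 1 });
my %hash;

my @headers = @{ $csv->getline($fh) };   # first record: the column names
while (my $row = $csv->getline($fh)) {
    # tirwhan's hint: autovivify an array ref per column and push onto it
    push @{ $hash{ $headers[$_] } }, $row->[$_] for 0 .. $#headers;
}

print $hash{Name}[0], "\n";      # Isha
print $hash{Comment}[0], "\n";   # Hello!!
```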
Re: Read the csv file to a hash....
by GrandFather (Saint) on Jul 06, 2007 at 12:16 UTC
Re: Read the csv file to a hash....
by citromatik (Curate) on Jul 06, 2007 at 12:45 UTC
So, you want to store all the 'Names' in one array and all the 'Comments' in another, and put both in a hash, right?
So you are thinking of something like:
use strict;
use warnings;

my @names;
my @comments;
my %hrec;

while (<DATA>) {
    chomp;
    my ($name, $comment) = split ",";
    $name    =~ s/"//g;
    $comment =~ s/"//g;
    push @names, $name;
    push @comments, $comment;
}

$hrec{'Names'} = @names;
$hrec{'Comments'} = @comments;
But this doesn't work, because you cannot put an array into a hash entry, only a scalar (each entry above would just get the array's length, since the assignment happens in scalar context), so in the hash you will have to store a reference to an array:
# $hrec{'Names'} = @names;
# $hrec{'Comments'} = @comments;
$hrec{'Comments'} = \@names;
$hrec{'Names'} = \@comments;
Because the hash holds references to arrays rather than the arrays themselves, the syntax to access their elements is a bit different:
$hrec{'Names'}->[0];    ## Isha
$hrec{'Comments'}->[0]; ## Hello!!
Another solution would be to store the pairs "Name / Comment" in a simple hash, like:
my %hrec;
while (<DATA>) {
    chomp;
    my ($name, $comment) = split ",";
    $name    =~ s/"//g;
    $comment =~ s/"//g;
    $hrec{$name} = $comment;
}
See perlreftut and perlref for more info about references.
Having said all that, I agree with tirwhan and GrandFather: the best solution would be to use Text::CSV_XS.
$hrec{'Comments'} = \@names;
$hrec{'Names'} = \@comments;
Any reason to switch the keys and values? :-)
Another solution would be to store the pairs "Name / Comment" in a simple hash
... unless the OP really needs to preserve the order. While I don't see why it matters, it may have to be that way, for whatever reason the person who gave the assignment (if it is one) had in mind.
Update
the best solution would be using Text::CSV_XS
I'd say that the core problem is arranging the data structure, not parsing the source file. While that module may be the best among its competitors, it only solves half the problem.
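For what it's worth, Text::CSV_XS can handle the structure half as well: its column_names/getline_hr pair returns each record as a hash ref keyed by the header row. A sketch, with an in-memory filehandle and simplified sample data standing in for the real file:

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Simplified sample data in place of the real file
my $data = <<'CSV';
Name,Comment
"Isha","Hello!!"
"Malav","koni comments nakhu ? tari?"
CSV
open my $fh, '<', \$data or die $!;

my $csv = Text::CSV_XS->new({ binary => 1 });
$csv->column_names(@{ $csv->getline($fh) });   # use the header row as hash keys

my @rows;                                      # row count -> record hash ref
while (my $hr = $csv->getline_hr($fh)) {
    push @rows, $hr;
}

print $rows[0]{Name}, "\n";      # Isha
print $rows[1]{Comment}, "\n";   # koni comments nakhu ? tari?
```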
Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!
Re: Read the csv file to a hash....
by RMGir (Prior) on Jul 06, 2007 at 12:40 UTC
Hmm, this sounds like fun homework. :)
My solution doesn't deal with "ragged" .csv's -- if there's a line with more entries than the others, things will get thrown off. That's easy to fix; I'll leave it to you.
use strict;
use warnings;
use Data::Dumper qw(Dumper);

open(INPUT, "foo.csv") or die "Can't open foo.csv, $!";

my @columns;
my %hash;
while (<INPUT>) {
    chomp;
    my $index = 0;
    if (/"/) {
        push @{$columns[$index++]}, $2 while /("+)(.*?)\1/g;
    }
    else {
        push @{$columns[$index++]}, $1 while /([^,]+)/g;
    }
}

my @headings = map { shift @$_ } @columns;
print "Headings: @headings\n";

@hash{@headings} = @columns;
print "Hash contents:\n";
print Dumper(\%hash);
Re: Read the csv file to a hash....
by snopal (Pilgrim) on Jul 06, 2007 at 15:58 UTC
Is there any reason that the following structure would be inappropriate?
#!/usr/bin/perl -w
use strict;
use Text::CSV_XS;

my $fh;
unless (open $fh, "filename") {
    die "No file available\n";
}

my $csv = Text::CSV_XS->new({binary => 1});
my @people;

<$fh>; # Drop column names on the floor
while (my $line = <$fh>) {
    my $status  = $csv->parse($line);
    my @columns = $csv->fields();
    my %h;
    @h{('Name', 'Comment')} = @columns;
    push @people, \%h;
}
close $fh;

print "Name: " . $people[2]->{'Name'}
    . ", Comment: " . $people[2]->{'Comment'}
    . "\n";
It seems to me that keeping the associated data together is a better solution to this problem. Also, looping by key field is more flexible than looping by dedicated array name.
That's reasonable, but I'd think that instead of
<$fh>; # Drop column names on the floor
the script should read that line and parse it, so it gets the column names from the .csv file rather than hardcoding Name and Comment.
Apart from that, it works very nicely, and Text::CSV_XS is much more robust than my //g approach above.
I agree wholeheartedly that a production system should use all available data. Your point is extremely valid. I have almost always applied column header preservation in my production environments.
I made a design decision here to highlight the storage strategy over the complexity of the implementation (while showing off a bit with the hash-slice assignment).
<$fh>; # Drop column names on the floor
Somehow, it hurts me to see this in void context :-)
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords;

# WARNING: this code may generate some warnings
# but checking is omitted intentionally

my @headers;
{
    local $_ = <DATA>;
    chomp;
    @headers = parse_line(',', 0, $_);
}

my @people;
while (<DATA>) {
    chomp;
    my @fields = parse_line(',', 0, $_);
    push @people, { map { $headers[$_], $fields[$_] } 0 .. $#headers };
}

printf qq(%s says "%s"\n), @{$people[0]}{@headers};
# Isha says "Hello!!"
__DATA__
Name,Comment
"Isha","""Hello!!"""
"Malav" ,"""koni comments nakhu ? tari?"""
"Mihir","""Dont know what to write :)"""
"Mukesh","""Kya comment add karun"""
"Tanmay Anjaria","""I - Intelligent S - Smart H - Highly thoughtful A - Antonyms should be taken for all of the above to know Isha :-)) Just Kidding... Keep Smiling, dear…"""
Or, following original requirement exactly:
my %people;
while (<DATA>) {
    chomp;
    my @fields = parse_line(',', 0, $_);
    for (0 .. $#headers) {
        push @{$people{$headers[$_]}}, $fields[$_];
    }
}

printf qq(%s says "%s"\n), map { $people{$_}[0] } @headers;
# Isha says "Hello!!"
{
local $_ = <DATA>;
chomp;
..
}
Is wicked. SO MANY TIMES I've re-invented the wheel and created little stupid variable names.. yadda yadda. Never again. I will localize the scope of $_ and use it as a temporary string-manipulation variable. Thanks for doing that guy's homework so well! :-)
Kurt
PS: I really have to echo PBP on this: Use Text::CSV_XS to extract complex variable-width fields. - DCFB
Re: Read the csv file to a hash....
by wind (Priest) on Jul 06, 2007 at 19:52 UTC
Simply parse to create an array for each column, then translate them into a hash at the very end.
use Text::CSV_XS;
use strict;
use warnings;

my $csvfile = shift or die "No filename specified";
my $csv = Text::CSV_XS->new();

my @columns;
open(FILE, $csvfile) or die "Can't open $csvfile: $!";
while (<FILE>) {
    $csv->parse($_) or die "parse() failed: " . $csv->error_input();
    my @data = $csv->fields();
    for my $i (0 .. $#data) {
        push @{$columns[$i]}, $data[$i];
    }
}
close(FILE);

my %hash = map { shift @$_ => $_ } @columns;

use Data::Dumper;
print Dumper(\%hash);
- Miller
Re: Read the csv file to a hash....
by generator (Pilgrim) on Aug 31, 2015 at 21:53 UTC
The OP implies a desire to preserve order:
"I want to read this file to a hash such that i can access that data using the row count."
If that is the case, a hash is not an appropriate repository; I would think an array would be more appropriate.
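A sketch of that idea as an array of hashes, so the array index is the row count and each row keeps its fields together (using core Text::ParseWords as elsewhere in the thread; the in-memory filehandle and simplified data are stand-ins):

```perl
use strict;
use warnings;
use Text::ParseWords;   # core module; fine for simple one-line records

# Simplified sample data in place of the real file
my $data = <<'CSV';
Name,Comment
"Isha","Hello!!"
"Malav","koni comments nakhu ? tari?"
CSV
open my $fh, '<', \$data or die $!;

my @headers;
my @rows;   # index = row count, as the OP wants
while (<$fh>) {
    chomp;
    my @fields = parse_line(',', 0, $_);
    if (!@headers) { @headers = @fields; next }   # first record: column names
    push @rows, { map { $headers[$_] => $fields[$_] } 0 .. $#headers };
}

print $rows[0]{Name}, "\n";      # Isha
print $rows[1]{Comment}, "\n";   # koni comments nakhu ? tari?
```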
Re: Read the csv file to a hash....
by Tux (Canon) on Jul 09, 2007 at 11:49 UTC
None of the answers addresses the problem of using <> to read CSV lines that might have embedded newlines:
use strict;
use warnings;
use Text::CSV_XS;

my %hash;
my $csv = Text::CSV_XS->new({binary => 1, eol => "\n"});

open my $io, "<", "file.csv" or die "file.csv: $!";
my @fld = @{$csv->getline($io)};
while (my $row = $csv->getline($io)) {
    push @{$hash{$fld[$_]}}, $row->[$_] for 0 .. $#fld;
}
close $io;
Enjoy, Have FUN! H.Merijn