Accessing cells in CSV files

by newbie1991 (Acolyte)
on May 13, 2013 at 14:48 UTC
newbie1991 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks :) I'm working with a fairly large CSV file (~1100 rows, ~60 columns). I want to split the list of ~1100 rows using the information in one of the columns. To do this, I've copied the contents into a multi-dimensional array. However, when I try to access $array[0][0], there is no output, while $array[0] shows me the first row of the array. I want to approach the array column-wise, and I'm not sure what changes to make. This is the code I am using to read the CSV file in question:

#!/usr/bin/perl -w
my @list = <*.csv>;
my $elem;
foreach $elem (@list) {
    open (FILE, $elem) or die "Cannot find $elem!\n";
    my @array = <FILE>;
    close FILE;
}

Here are a couple of rows from the file:

organism O2_REQUIREMENT Domain Classification
a.acidocaldarius Aerobe BACTERIA FIRMICUTES
a.actinomycetemcomitans Facultative BACTERIA PROTEO

To clarify: $array[0][0] should give "organism", but I get no output. $array[0] gives "organism O2_REQUIREMENT Domain Classification".

Re: Accessing cells in CSV files
by hdb (Prior) on May 13, 2013 at 14:57 UTC

    my @array = <FILE>; stores the lines of your file in the array. They still need to be split into columns:

    use strict;
    use warnings;
    use Data::Dumper;

    my @array = <DATA>;    # now contains the rows
    print Dumper(\@array);
    $array[$_] = [ split /\s+/, $array[$_] ] for 0..$#array;    # split lines into columns
    print Dumper(\@array);
    print $array[0][0];

    # does not look like CSV...
    __DATA__
    organism O2_REQUIREMENT Domain Classification
    a.acidocaldarius Aerobe BACTERIA FIRMICUTES
    a.actinomycetemcomitans Facultative BACTERIA PROTEO

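    Once each line is split into an array reference, going at the data column-wise and dividing the rows by the value in one column is mostly a matter of indexing. Here is a minimal sketch of that idea, assuming whitespace-separated fields with no embedded spaces; the helper names parse_rows, column and group_by_column are made up for illustration and are not part of any module:

```perl
use strict;
use warnings;

# Turn raw lines into a list of rows, each row an array ref of fields.
sub parse_rows {
    my @lines = @_;
    my @rows;
    for my $line (@lines) {
        next unless $line =~ /\S/;          # skip blank lines
        push @rows, [ split ' ', $line ];   # split on runs of whitespace
    }
    return @rows;
}

# Return one column (by zero-based index) as a flat list.
sub column {
    my ($rows, $idx) = @_;
    return map { $_->[$idx] } @$rows;
}

# Group rows by the value found in one column; returns a hash ref
# mapping each distinct value to an array ref of matching rows.
sub group_by_column {
    my ($rows, $idx) = @_;
    my %groups;
    push @{ $groups{ $_->[$idx] } }, $_ for @$rows;
    return \%groups;
}

my @rows = parse_rows(
    "organism O2_REQUIREMENT Domain Classification\n",
    "a.acidocaldarius Aerobe BACTERIA FIRMICUTES\n",
    "a.actinomycetemcomitans Facultative BACTERIA PROTEO\n",
);
my $header = shift @rows;                   # first row holds the column names
my $groups = group_by_column(\@rows, 1);    # column 1 is O2_REQUIREMENT

print join(", ", column(\@rows, 0)), "\n";
print "$_: ", scalar @{ $groups->{$_} }, " row(s)\n" for sort keys %$groups;
```

    With real files you would feed parse_rows the lines read from each file handle instead of the literal strings used here.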
      Data::Dumper worked perfectly, thank you :) I usually work with txt files, and they're delimited properly so the splitting never occurred to me! As for the input file, it was handed to me in that particular format. I figured it wasn't comma-separated, but it isn't creating any issues right now, so I suppose it's alright (for now).
      Text::CSV_XS or Text::CSV will also handle tab separated values (TSV) like wot we appear to have 'ere.

      One of the reasons for using libraries is that the raw method you proposed will handle most cases but won't handle escaped/quoted forms if they exist in the data.
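      To make the quoting point concrete, here is a small sketch comparing a plain split with Text::CSV configured for tabs. The sample line is invented for illustration (it is not from the original file), and the sketch assumes Text::CSV is installed from CPAN:

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical tab-separated line: the second field is quoted and
# contains a space.
my $line = qq{a.example\t"Facultative anaerobe"\tBACTERIA};

# Naive approach: splitting on whitespace also cuts inside the quoted field
# and leaves the quote characters in place.
my @naive = split /\s+/, $line;

# Text::CSV with a tab separator keeps the quoted field together.
my $csv = Text::CSV->new({ binary => 1, sep_char => "\t", auto_diag => 1 })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;
$csv->parse($line) or die "Could not parse line";
my @fields = $csv->fields;

print scalar(@naive),  " fields from split\n";
print scalar(@fields), " fields from Text::CSV\n";
```

      The split version yields four pieces, while Text::CSV yields the three intended fields with the quotes removed. If the data never contains quoted or escaped fields, the plain split is fine.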

      If you spot any bugs in my solutions, it's because I've deliberately left them in as an exercise for the reader! :-)
Re: Accessing cells in CSV files
by Tux (Monsignor) on May 13, 2013 at 14:58 UTC

    Consider using Text::CSV_XS (or the slower cousin Text::CSV). If that interface is too complicated, consider DBD::CSV (a DBI interface) or Spreadsheet::Read (which needs Text::CSV_XS).

    BTW 1100 x 60 isn't really large. Normal computers can read that without problem (unless each field is 600Mb).

    Your data-example doesn't show CSV btw, but tab or space separated data.


    Enjoy, Have FUN! H.Merijn
