PerlMonks  

indefinite number of columns

by torres09 (Acolyte)
on Jul 15, 2013 at 13:29 UTC ( #1044374=perlquestion )
torres09 has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to extract columns from a CSV file. The problem is that the CSV can have an arbitrary number of columns; the number of columns to extract will be defined by the user and given as input to the script.

My question is: how do I define the arrays when I don't know, while coding, how many columns will be extracted? That is, if I want the column 1 entries to go in one array, the column 2 entries in another, and so on, how can I define my arrays?

Re: indefinite number of columns
by marto (Bishop) on Jul 15, 2013 at 13:35 UTC

    Why not just parse each worksheet of the workbook using Spreadsheet::ParseExcel? The first example shows how to do this. Modifying this example to work on a per-column basis (rather than by row) would be trivial.

    Update: My mistake, you're working with a csv file.

    book.csv:

    1,2 ,3 1,2, 3 1,2,43 1 12,1 apple, 1, 3 orange 7,onion, 8, 9

    Sample code using Text::CSV:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::CSV;
    use Data::Dumper;

    my $csv  = Text::CSV->new({ sep_char => ',' });
    my $file = "book.csv";
    open(my $csvdata, '<', $file) or die "Could not open '$file' $!\n";

    while (my $line = <$csvdata>) {
        chomp $line;
        if ($csv->parse($line)) {
            my @fields = $csv->fields();
            print Dumper \@fields;
        }
    }

    What exactly are you trying to achieve?

Re: indefinite number of columns
by hippo (Curate) on Jul 15, 2013 at 13:53 UTC
    if I want column 1 entries to go in one array, second column entries in another, and so on. How can I define my array?
    use strict;
    use warnings;
    use Data::Dumper;

    my $ncols = 5;    # or obtained from user
    my @bigfatarray = ();
    push @bigfatarray, [] for 1 .. $ncols;    # one empty array per column

    while (<DATA>) {
        chomp;    # keep the trailing newline out of the last column
        my @cols = split (/,/, $_);
        for my $i (0 .. $ncols - 1) {
            push @{$bigfatarray[$i]}, $cols[$i];
        }
    }
    print Dumper (\@bigfatarray);
    exit;

    __DATA__
    1,2,3,4,5
    a,b,c,d,e

    Obviously, you'll need to do more than just split your inputs but this should serve as an illustration.
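As a rough illustration of why a plain split is not enough: a quoted field that contains a comma gets cut in two. The sketch below contrasts split with quotewords from the core Text::ParseWords module. That module is only a stand-in here so the example runs without CPAN modules (it is not a full CSV parser), and the sample line is made up.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Made-up sample line: the first field contains a comma inside quotes.
my $line = '"Smith, John",42,guitar';

# A naive split cuts the quoted field in two.
my @naive = split /,/, $line;

# quotewords honours the quotes (second argument 0 = drop the quote marks).
my @parsed = quotewords(',', 0, $line);

print scalar(@naive),  " fields from split\n";
print scalar(@parsed), " fields from quotewords\n";
```

For real data, Text::CSV, as suggested elsewhere in this thread, is likely the safer choice.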

Re: indefinite number of columns
by zork42 (Monk) on Jul 15, 2013 at 14:05 UTC
    if I want column 1 entries to go in one array, second column entries in another, and so on. How can I define my array?
    The easiest solution is to have an array of arrays, AoA for short. These are 2D arrays. See:
    To understand AoAs you'll need to learn about references:
    You can then store each field in $AoA[row number][column number].
    It's unlikely you'll need to ask the user how many columns there are.
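A minimal sketch of the AoA idea, assuming simple comma-separated lines with no quoted fields. The sample rows and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Made-up sample rows standing in for lines read from a CSV file.
my @lines = ("1,2,3", "a,b,c", "x,y,z");

my @AoA;    # $AoA[row number][column number]
for my $line (@lines) {
    push @AoA, [ split /,/, $line ];    # one anonymous array per row
}

# A single cell: row 1, column 2 (both zero-based).
print "cell: $AoA[1][2]\n";

# A whole column, however many rows there are.
my $col    = 1;
my @column = map { $_->[$col] } @AoA;
print "column $col: @column\n";
```

Because each row is just a reference to an array of whatever length split returned, nothing here needs to know the number of columns in advance.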

    Update: doh! Cross posted with marto and hippo, though luckily I think together our posts explain different levels and aspects of the solution.

    Text::CSV is definitely the way to go!
Re: indefinite number of columns
by hdb (Prior) on Jul 15, 2013 at 14:21 UTC

    If you have a header line in your data, you might want to consider using a hash of arrays. Each array would contain one of your columns and could be addressed by the name given in the header.

    use strict;
    use warnings;
    use Data::Dumper;

    my $n = 4;       # number of columns, given as user input
    my %hcolumns;    # this is the hash of the columns you are looking for

    my @header = split /,/, <DATA>;    # pls use Text::CSV for any real work
    chomp @header;

    while (<DATA>) {
        chomp;
        my @record = split /,/;        # pls use Text::CSV for any real work
        for (0..$n-1) {
            push @{ $hcolumns{$header[$_]} }, $record[$_];
        }
    }

    print Dumper \%hcolumns;    # hash of columns

    __DATA__
    Name,FirstName,Instrument
    Mangelsdorff,Albert,Trombone
    Parker,Charlie,Trumpet
    Coltrane,John,Tenor
    Evans,Bill,Piano

    The number of columns can still be provided by the user. In this example, I have chosen a number larger than the number of available columns, resulting in an empty column (and some warnings when running the script).
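If the empty column and the warnings are unwanted, one possible variation is to cap the requested count at the number of header fields. The sketch below assumes simple comma-separated data; the name $requested and the inline sample lines are made up for illustration, and min comes from the core List::Util module.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Made-up sample data; the first line is the header.
my @lines = (
    "Name,FirstName,Instrument",
    "Mangelsdorff,Albert,Trombone",
    "Evans,Bill,Piano",
);

my $requested = 4;    # hypothetical user input, larger than the 3 real columns
my @header    = split /,/, shift @lines;
my $n         = min($requested, scalar @header);    # never go past the header

my %hcolumns;         # column name => array of that column's values
for my $line (@lines) {
    my @record = split /,/, $line;
    push @{ $hcolumns{ $header[$_] } }, $record[$_] for 0 .. $n - 1;
}

printf "%s: %s\n", $_, join(' ', @{ $hcolumns{$_} }) for sort keys %hcolumns;
```

With the cap in place, only the columns that actually exist in the header end up as keys in the hash.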

Re: indefinite number of columns
by mtmcc (Hermit) on Jul 15, 2013 at 14:30 UTC
    Or an array of arrays, but this would use more memory than using hashes as suggested above, which might be important if your files are very large:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $fileName        = $ARGV[0];
    my $numberOfColumns = 0;
    my @line            = ();
    my $lineNumber      = 0;
    my $x               = 0;
    my $p               = 0;
    my @arrayOfArrays   = ();
    my $firstTime       = 0;

    open (FILE, "<", $fileName) or die "Could not open '$fileName': $!\n";

    while (<FILE>) {
        chomp;
        @line = split (",", $_);

        # take the number of columns from the first line
        if ($firstTime == 0) {
            $numberOfColumns = @line;
            $firstTime = 1;
        }

        for ($x = 0; $x < $numberOfColumns; $x += 1) {
            $arrayOfArrays[$x][$lineNumber] = $line[$x];
        }
        $lineNumber += 1;
    }

    # to access your values: column one: $p = 0, column 2: $p = 1 etc.
    for ($p = 0; $p < $numberOfColumns; $p += 1) {
        for ($x = 0; $x < $lineNumber; $x += 1) {
            print STDERR "$arrayOfArrays[$p][$x]\n";
        }
    }

    -Michael
      Or an array of arrays, but this would use more memory than using hashes as suggested above, which might be important if your files are very large:
      I don't understand this, please could you explain:

      1. Why would an array of arrays use more memory than a hash of arrays?
        I'd expect it to be the other way round.
      2. Would the "of arrays" data occupy the same amount of memory in both AoA and HoA in this example?
        (This seems reasonable as this is just the total data of all the rows.)
      3. From (2) follows: Why does an array use more memory than a hash?

      Thanks :)
        Touché!

        My mistake. I guess the hash of arrays would take up slightly more memory, because instead of using an array index to label individual arrays, it uses the column headers to label them. I imagine that if the number of columns is low, there wouldn't be a large difference?

        Thanks for pointing that out!

        -Michael
