http://www.perlmonks.org?node_id=977528


in reply to Re^4: Parse .csv file from FTP using Perl
in thread Parse .csv file from FTP using Perl

So the CSV file would look like this?

a,b,c 1,2,3

How does any of the fields in line 2 correspond to any of the fields in line 1? I'm afraid I don't follow :)

Edit: Wait, I think I get it now. What you want is a way to tell Perl, "hey, listen. What is the number of the field in the first line that has the value 'c'? Got that? Okay, now give me the value of the field with that number in the second line." Is that what you mean? There are actually several approaches, and I'm coming up with something momentarily. Hold on.

Edit-edit: Yes, got it now. Okay, gimme a couple minutes.

Replies are listed 'Best First'.
Re^6: Parse .csv file from FTP using Perl
by Sherlock Perl (Novice) on Jun 21, 2012 at 01:11 UTC
    What I want the code to be doing is: Return the value in Row 2 which is in the same column as "b" is in Row 1. This means that the code would have to locate where "b" is first I guess. I assume I need to use this operator ~= and some looping or have all the values in the first row stored in an array and the check each element maybe..? And when I find the location of b just return the value in the second row which is in the same column as b...again I assume some kind of loop maybe? Thank you again I really appreciate your help!

    Edit: Yes this is exactly what I want. Thank you very much for the help :) I owe you big time for all this help:)

      Alright, here we go. For these examples I assume you have a file some_csv.csv, with the following contents:

      Name,Type,Age Lily,Cat,4 Buster,Dog,5 Tweety,Bird,2

      The first approach is the 'naive' one. It does exactly what you want as specified, but it looks a bit messy.

      use strict; use warnings; use Text::CSV; use subs qw(hlookup); print "The animal in line 2 is ", hlookup("some_csv.csv", "Age", 2), " + years old.\n"; print "The critter in line 4 is called ", hlookup("some_csv.csv", "Nam +e", 4), ".\n"; sub hlookup { my $csv_file = shift; my $field_name = shift; my $row_num = shift; my $csv = Text::CSV->new ( { binary => 1 } ) # should set binary + attribute. or die "Cannot use CSV: ".Text::CSV->error_diag (); open my $fh, "<:encoding(utf8)", $csv_file or die "test.csv: $!"; my $field_num; my $first_row = $csv->getline( $fh ); my @result; for (my $idx = 0; $idx <= @$first_row; $idx++) { if ($first_row->[$idx] eq $field_name) { $field_num = $idx; push @result, $first_row->[$field_num]; last; } } die "Field name '$field_name' not found on the first row of $csv_ +file" if not defined $field_num; while ( my $row = $csv->getline( $fh )) { push @result, $row->[$field_num]; } $csv->eof or $csv->error_diag(); close $fh; return $result[$row_num - 1]; # -1 since array indexes are 0-bas +ed, # and we count our lines 1-based }

      The second approach used Text::CSV's ability to give fields names. What I did here was read the first row, tell Text::CSV that the values of the fields are actually the names for that field, and then I told Text::CSV to get me the field of the row I'm interested in. It's a little bit clearer, and most of the looping is delegated to Text::CSV. I like it when my modules do my heavy lifting for me.

      use strict; use warnings; use Text::CSV; use subs qw(hlookup); print "The animal in line 2 is ", hlookup("some_csv.csv", "Age", 2), " + years old.\n"; print "The critter in line 4 is called ", hlookup("some_csv.csv", "Nam +e", 4), ".\n"; sub hlookup { my $csv_file = shift; my $field_name = shift; my $row_num = shift; my $csv = Text::CSV->new ( { binary => 1 } ) # should set binary + attribute. or die "Cannot use CSV: ".Text::CSV->error_diag (); open my $fh, "<:encoding(utf8)", $csv_file or die "test.csv: $!"; my $field_num; my $first_row = $csv->getline( $fh ); $csv->column_names(@$first_row); my $all = $csv->getline_hr_all($fh, $row_num - 2, 1); # - 2 since array indexes are 0-based, # and we count our lines 1-based, # and because we've already read the first # line. return $all->[0]->{$field_name} }

      Both examples give me the output I will list below. I've also tried dabbling with DBD::CSV, because I thought it'd be nice if we could extract our information with SQL! But that's a module that allows you to set up entire relational database in CSV files, which is pretty much overkill for what we're trying to do here.

      Anyway, I'm sure these aren't the only approaches to go at things, but it surely should get you going. And you owe me nothing, it was my pleasure. But if you insist, I can send you my paypal account through a private message ;)

      The animal in line 2 is 4 years old. The critter in line 4 is called Tweety.
        Thank you so much. I will try both of the methods and choose whichever I find easier to understand. Do you think it would be easy to alter the code if for example the two rows with values start not from the beginning but from some other cell. For example they are in the range "d4:n5" or any other similar range? If you are in nyc your beer or whatever you drink is on me :)

      muba already helped you with the basics, which were quite well explained. muba++. Text::CSV is a wrapper over Text::CSV_XS and Text::CSV_PP. Whatever is developed in Text::CSV_XS is copied into Text::CSV_PP making them behave exacly the same, but the XS version is up to 100 times as fast. Both act on streams of data, wheather that is a file or a pipe. That means that these modules do not allow to go "back" into the stream to look at previous lines. As muba already mentioned, getline_hr_all can be very handy here.

      If you need to do many lookups, like in a spreadsheet or a database, consider using DBD::CSV (when you are acquainted with DBI) or Spreadsheet::Read, which enables you to look at CSV file with the eye of a spreadsheet and enables you to direct access any field in a CSV data structure.

      One final remark about your original code. The "loop" could easily be shortened a lot:

      my @ftpFiles =$ftp->ls (); + my $i = 0; while ($i < $#ftpFiles) { + if ($ftpFiles[$i] =~ m/andv/) { $ftp->get ($ftpFiles[$i]); $myfile = $ftp; } $i++; } $ftp->quit (); => foreach my $file (grep m/andv/ => $ftp->ls ()) { $ftp->get ($file); $myfile = $file; # <= your original code probably does not do what + you want } $ftp->quit ();

      update: corrected mumba to muba. sorry.


      Enjoy, Have FUN! H.Merijn
        Thank you for the clarifications. But why do you think my original code doens't do the job it is supposed to do?