Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

Re^4: Parse .csv file from FTP using Perl

by Sherlock Perl (Novice)
on Jun 21, 2012 at 00:55 UTC ( #977526=note: print w/ replies, xml ) Need Help??


in reply to Re^3: Parse .csv file from FTP using Perl
in thread Parse .csv file from FTP using Perl

Yes I perfectly understand your logic. I have two rows in the .csv file and lets say the .csv looks like this: A1 = a B1 = b C1 = c .... etc etc A2 = 1 B2 = 2 C2 = 3 .... etc etc. So I want to find the value in Row 2 (A2, B2, C2,...) that corresponds to lets say value "c" in Row1. It is basically like HLOOKUP function in excel. I am not sure if I am clear enough in explaining what I want to do?


Comment on Re^4: Parse .csv file from FTP using Perl
Re^5: Parse .csv file from FTP using Perl
by muba (Priest) on Jun 21, 2012 at 01:03 UTC

    So the CSV file would look like this?

    a,b,c 1,2,3

    How does any of the fields in line 2 correspond to any of the fields in line 1? I'm afraid I don't follow :)

    Edit: Wait, I think I get it now. What you want is a way to tell Perl, "hey, listen. What is the number of the field in the first line that has the value 'c'? Got that? Okay, now give me the value of the field with that number in the second line." Is that what you mean? There are actually several approaches, and I'm coming up with something momentarily. Hold on.

    Edit-edit: Yes, got it now. Okay, gimme a couple minutes.

      What I want the code to be doing is: Return the value in Row 2 which is in the same column as "b" is in Row 1. This means that the code would have to locate where "b" is first I guess. I assume I need to use this operator ~= and some looping or have all the values in the first row stored in an array and the check each element maybe..? And when I find the location of b just return the value in the second row which is in the same column as b...again I assume some kind of loop maybe? Thank you again I really appreciate your help!

      Edit: Yes this is exactly what I want. Thank you very much for the help :) I owe you big time for all this help:)

        Alright, here we go. For these examples I assume you have a file some_csv.csv, with the following contents:

        Name,Type,Age Lily,Cat,4 Buster,Dog,5 Tweety,Bird,2

        The first approach is the 'naive' one. It does exactly what you want as specified, but it looks a bit messy.

        use strict; use warnings; use Text::CSV; use subs qw(hlookup); print "The animal in line 2 is ", hlookup("some_csv.csv", "Age", 2), " + years old.\n"; print "The critter in line 4 is called ", hlookup("some_csv.csv", "Nam +e", 4), ".\n"; sub hlookup { my $csv_file = shift; my $field_name = shift; my $row_num = shift; my $csv = Text::CSV->new ( { binary => 1 } ) # should set binary + attribute. or die "Cannot use CSV: ".Text::CSV->error_diag (); open my $fh, "<:encoding(utf8)", $csv_file or die "test.csv: $!"; my $field_num; my $first_row = $csv->getline( $fh ); my @result; for (my $idx = 0; $idx <= @$first_row; $idx++) { if ($first_row->[$idx] eq $field_name) { $field_num = $idx; push @result, $first_row->[$field_num]; last; } } die "Field name '$field_name' not found on the first row of $csv_ +file" if not defined $field_num; while ( my $row = $csv->getline( $fh )) { push @result, $row->[$field_num]; } $csv->eof or $csv->error_diag(); close $fh; return $result[$row_num - 1]; # -1 since array indexes are 0-bas +ed, # and we count our lines 1-based }

        The second approach used Text::CSV's ability to give fields names. What I did here was read the first row, tell Text::CSV that the values of the fields are actually the names for that field, and then I told Text::CSV to get me the field of the row I'm interested in. It's a little bit clearer, and most of the looping is delegated to Text::CSV. I like it when my modules do my heavy lifting for me.

        use strict; use warnings; use Text::CSV; use subs qw(hlookup); print "The animal in line 2 is ", hlookup("some_csv.csv", "Age", 2), " + years old.\n"; print "The critter in line 4 is called ", hlookup("some_csv.csv", "Nam +e", 4), ".\n"; sub hlookup { my $csv_file = shift; my $field_name = shift; my $row_num = shift; my $csv = Text::CSV->new ( { binary => 1 } ) # should set binary + attribute. or die "Cannot use CSV: ".Text::CSV->error_diag (); open my $fh, "<:encoding(utf8)", $csv_file or die "test.csv: $!"; my $field_num; my $first_row = $csv->getline( $fh ); $csv->column_names(@$first_row); my $all = $csv->getline_hr_all($fh, $row_num - 2, 1); # - 2 since array indexes are 0-based, # and we count our lines 1-based, # and because we've already read the first # line. return $all->[0]->{$field_name} }

        Both examples give me the output I will list below. I've also tried dabbling with DBD::CSV, because I thought it'd be nice if we could extract our information with SQL! But that's a module that allows you to set up entire relational database in CSV files, which is pretty much overkill for what we're trying to do here.

        Anyway, I'm sure these aren't the only approaches to go at things, but it surely should get you going. And you owe me nothing, it was my pleasure. But if you insist, I can send you my paypal account through a private message ;)

        The animal in line 2 is 4 years old. The critter in line 4 is called Tweety.

        muba already helped you with the basics, which were quite well explained. muba++. Text::CSV is a wrapper over Text::CSV_XS and Text::CSV_PP. Whatever is developed in Text::CSV_XS is copied into Text::CSV_PP making them behave exacly the same, but the XS version is up to 100 times as fast. Both act on streams of data, wheather that is a file or a pipe. That means that these modules do not allow to go "back" into the stream to look at previous lines. As muba already mentioned, getline_hr_all can be very handy here.

        If you need to do many lookups, like in a spreadsheet or a database, consider using DBD::CSV (when you are acquainted with DBI) or Spreadsheet::Read, which enables you to look at CSV file with the eye of a spreadsheet and enables you to direct access any field in a CSV data structure.

        One final remark about your original code. The "loop" could easily be shortened a lot:

        my @ftpFiles =$ftp->ls (); + my $i = 0; while ($i < $#ftpFiles) { + if ($ftpFiles[$i] =~ m/andv/) { $ftp->get ($ftpFiles[$i]); $myfile = $ftp; } $i++; } $ftp->quit (); => foreach my $file (grep m/andv/ => $ftp->ls ()) { $ftp->get ($file); $myfile = $file; # <= your original code probably does not do what + you want } $ftp->quit ();

        update: corrected mumba to muba. sorry.


        Enjoy, Have FUN! H.Merijn

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://977526]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (8)
As of 2014-12-21 13:12 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (105 votes), past polls