looping through a csv file

by pbassnote (Acolyte)
on Feb 03, 2014 at 20:38 UTC

pbassnote has asked for the wisdom of the Perl Monks concerning the following question:

First, the standard disclaimer: I am new to programming, and even newer to Perl programming. I have a large CSV text file that I'll call folks.txt, in which each line is organized into 5 columns of data: the first column is a number, and the next four columns are e-mail addrs. The first thing I do with folks.txt is:

    open(FOLKS, "folks.txt") || die "can't open folks: $!";
    while (<FOLKS>) {
        chomp;
        ($number, $addr1, $addr2, $addr3, $addr4) = (split /,/)[0,1,2,3,4];

What I wish to do with this data is to: 1) add four labels to associate with each of the four addrs, then 2) output each line formatted so that each corresponding label-addr pair is printed out with its corresponding number. So, if I were to define each of the labels like so:

    $label1 = "fee";
    $label2 = "fi";
    $label3 = "fo";
    $label4 = "fum";

somewhere, likely above my "open(...)" line, that should work. But here I finally get to the problem. I want to be able to iterate through folks.txt four complete times in order to get the four labels associated with each of the four addrs. If I try to do this:

    print "$label1,$number,$addr1\n";
    print "$label2,$number,$addr2\n";
    print "$label3,$number,$addr3\n";
    print "$label4,$number,$addr4\n";

It produces a csv file, but not the way I need, as this will loop through the four print statements in the consecutive order just as shown above. What I need is for the first print statement to iterate through the entire folks.txt file, then the second print statement to iterate through the entire folks.txt file, and so on through the other two print statements. I believe that this can be accomplished with a looping construct, but since I'm kind of new to programming, I don't have a good command of loops. Suggestions, please? TIA, Dave

Replies are listed 'Best First'.
Re: looping through a csv file
by kennethk (Abbot) on Feb 03, 2014 at 20:51 UTC

    Welcome to the monastery.

    First, rather than iterating over the file 4 times (possible, but inefficient -- see seek), I would suggest pulling in all the data in one loop, and then processing it in a second loop. In my experience, the first step to understanding how to deal with data is knowing what kind of data structure to use. You could store the lines of the file in an array, or you could use an array of arrays and then traverse that data structure later. Since you have your labels, maybe something like:

        use strict;
        use warnings;

        my @labels = ('fee', 'fi', 'fo', 'fum');

        open(my $fh, '<', "folks.txt") || die "can't open folks: $!";
        my @list;
        while (<$fh>) {
            chomp;
            push @list, [split /,/];
        }

        for my $element (@list) {
            print "$labels[0],$element->[0],$element->[1]\n";
        }
        for my $element (@list) {
            print "$labels[1],$element->[0],$element->[2]\n";
        }
        for my $element (@list) {
            print "$labels[2],$element->[0],$element->[3]\n";
        }
        for my $element (@list) {
            print "$labels[3],$element->[0],$element->[4]\n";
        }

    or, a little better:

        use strict;
        use warnings;

        my @labels = ('fee', 'fi', 'fo', 'fum');

        open(my $fh, '<', "folks.txt") || die "can't open folks: $!";
        my @list;
        while (<$fh>) {
            chomp;
            push @list, [split /,/];
        }

        for my $i (0 .. 3) {
            for my $element (@list) {
                print "$labels[$i],$element->[0],$element->[$i+1]\n";
            }
        }

    or maybe even

        use strict;
        use warnings;

        my @labels = ('fee', 'fi', 'fo', 'fum');

        open(my $fh, '<', "folks.txt") || die "can't open folks: $!";
        my @list;
        while (<$fh>) {
            chomp;
            my @line = split /,/;
            my %hash = (number => shift @line);
            for my $i (0 .. $#labels) {
                $hash{$labels[$i]} = $line[$i];
            }
            push @list, \%hash;
        }

        for my $label (@labels) {
            for my $element (@list) {
                print "$label,$element->{number},$element->{$label}\n";
            }
        }
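
    For comparison, the "iterate over the file 4 times" route mentioned at the top of this reply could be sketched roughly as follows. This is only an illustration: label_passes is a hypothetical helper name, and the in-memory sample data (with made-up addresses) stands in for folks.txt so the snippet runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: makes one full pass over the filehandle per label
# and returns the output lines instead of printing them.
sub label_passes {
    my ($fh, @labels) = @_;
    my @out;
    for my $i (0 .. $#labels) {
        # Rewind to the start of the file before each pass.
        seek($fh, 0, 0) or die "can't rewind: $!";
        while (my $line = <$fh>) {
            chomp $line;
            my @fields = split /,/, $line;
            push @out, "$labels[$i],$fields[0],$fields[$i + 1]";
        }
    }
    return @out;
}

# In-memory sample data standing in for folks.txt (made-up addresses).
my $data = <<'END';
1,a@example.com,b@example.com,c@example.com,d@example.com
2,e@example.com,f@example.com,g@example.com,h@example.com
END

open(my $fh, '<', \$data) or die "can't open in-memory file: $!";
print "$_\n" for label_passes($fh, 'fee', 'fi', 'fo', 'fum');
close $fh;
```

    This re-reads and re-splits every line once per label, which is why reading the data into memory once is usually the better choice unless the file is too large to hold.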

    Second, please note I added some use statements at the top of the script. Read Use strict warnings and diagnostics or die to learn why. I also swapped to a 3-argument open with an indirect file handle. These are good habits to develop, especially if you are just learning.

    Third, rather than rolling your own, try using CPAN. In particular, for dealing with CSV, try Text::CSV. Did you know CSV files contain escaping sometimes? Text::CSV does, and it's tested.
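
    A minimal sketch of what the Text::CSV route might look like, assuming the module is installed from CPAN. The helper names read_rows and labelled_lines are hypothetical, and the sample row (including the quoted field with a comma in it) is made up for illustration.

```perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, not part of core Perl

my @labels = ('fee', 'fi', 'fo', 'fum');

# Hypothetical helper: read every row from a filehandle via Text::CSV.
sub read_rows {
    my ($csv, $fh) = @_;
    my @rows;
    while (my $row = $csv->getline($fh)) {
        push @rows, $row;    # each $row is an array reference of fields
    }
    return @rows;
}

# Hypothetical helper: build one output line per label/row pair,
# letting Text::CSV re-quote any field that needs it.
sub labelled_lines {
    my ($csv, $labels, @rows) = @_;
    my @out;
    for my $i (0 .. $#$labels) {
        for my $row (@rows) {
            $csv->combine($labels->[$i], $row->[0], $row->[$i + 1])
                or die "can't combine fields";
            push @out, $csv->string;
        }
    }
    return @out;
}

# In-memory sample data standing in for folks.txt (made-up content).
my $data = <<'END';
1,a@example.com,"Smith, Jo <jo@example.com>",c@example.com,d@example.com
END
open(my $fh, '<', \$data) or die "can't open in-memory file: $!";

my $csv  = Text::CSV->new({ binary => 1, auto_diag => 1 });
my @rows = read_rows($csv, $fh);
close $fh;

print "$_\n" for labelled_lines($csv, \@labels, @rows);
```

    The quoted field comes back from getline as a single value, comma included, which a plain split on commas would not manage.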


    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      You guys are awesome. It'll take me some time to figure out your coding suggestions, especially the CPAN one, but it'll be time well spent. Thanks!

Re: looping through a csv file
by Kenosis (Priest) on Feb 03, 2014 at 20:51 UTC

    If you would, please provide a line of your data (even though you've described it--and redacted, if necessary), and then show a finished line the way you want it.

    Also, unless you absolutely know that there are no commas in the email addresses (commas are allowed in the local part of email addresses if enclosed by double-quotes), then you should use a csv parsing module, such as Text::CSV_XS, instead of splitting the line on commas.
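
    To see why this matters, here is a small illustration with a made-up line in which one field is quoted because it contains a comma. Nothing here comes from the original data.

```perl
use strict;
use warnings;

# Made-up line: the third column is quoted because it contains a comma.
my $line = '1,a@example.com,"Smith, Jo <jo@example.com>",c@example.com,d@example.com';

# A plain split knows nothing about quoting, so it cuts inside the quotes.
my @fields = split /,/, $line;

# Five columns were intended, but split reports six pieces.
printf "%d fields\n", scalar @fields;
print "field $_: $fields[$_]\n" for 0 .. $#fields;
```

    Every column after the quoted one ends up shifted by one position, so the label-addr pairing silently goes wrong for that line.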

Re: looping through a csv file
by Laurent_R (Canon) on Feb 03, 2014 at 21:32 UTC
    Just a small side note. If you insist on using the split function to process your CSV file despite the previous advice, or if you have to use it in a different context, rather than:
    ($number, $addr1, $addr2, $addr3, $addr4) = (split /,/)[0,1,2,3,4];
    you can simply write:
    my ($number, $addr1, $addr2, $addr3, $addr4) = split /,/;
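
    One related detail when the split result is stored in an array (as in the array-based examples earlier in the thread): with no limit argument, split drops trailing empty fields, so a line whose last columns are empty yields a shorter array. A small sketch with a made-up line:

```perl
use strict;
use warnings;

# Made-up line whose last two address columns are empty.
my $line = '42,a@example.com,b@example.com,,';

my @default = split /,/, $line;        # trailing empty fields are dropped
my @keep    = split /,/, $line, -1;    # a negative limit keeps them

printf "default: %d fields, with -1: %d fields\n",
    scalar @default, scalar @keep;
```

    With the default form, indexing the missing columns gives undef, which triggers "uninitialized value" warnings under use warnings when printed.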
