PerlMonks  

Problems With Tab Delimited Files

by tom10animal (Monk)
on Mar 26, 2008 at 18:20 UTC (#676493, perlquestion)

tom10animal has asked for the wisdom of the Perl Monks concerning the following question:

Monks, I'm trying to extract information from a tab delimited file that may contain an unlimited number of columns. The first part of the OPENPYC routine opens the file; the second part linearly interpolates data for days that are not given in the original file. The print lines inside the subroutine are for testing purposes. The hashes that get created are used later to extract the data for use in a rather complex fish growth model, which I have not shown here. For some reason, when the *.pyc file has zeros in it, as in the example Muskie.pyc, my script does not recognize the datum. I've adapted this script from an earlier script that by definition had only two possible columns (a *.tem file, shown below), which I've just discovered also has a problem handling zero in the linear interpolation portion of the script. Where have I gone wrong? I'm fairly new to Perl, and this is my first post at PerlMonks. I hope this post is up to par, and thank you in advance for any help.
Contents of tab delimited files, $PYCFILE:

Muskie.pyc
day    invertebrates    fish
1      1                0
183    0                1
365    0                1

Northern Pike.pyc
day    invertebrates    fish
1      3000             4000
365    3000             4000

#!/usr/bin/perl
#Working With *.pyc files
sub PYCFILENAMEASSIGNMENT{
$PYCFILE = "$FishSpecies".".pyc";
}
sub OPENPYC{
@dayvalues = ();
open PYCFILE, "/Applications/Bioenergetics/User\ Input\ Data\ Files/$PYCFILE" or die "Cannot open $PYCFILE: $!";
@dailypyc=<PYCFILE>;
shift(@dailypyc);
###I believe the problem is in here.
foreach $line (@dailypyc){
print "\n";
chomp($line);
@pyclinevalues=split("\t",$line);
$day = $pyclinevalues[0];
chomp($day);
print "Day $day\t";
push (@dayvalues,$day);
shift(@pyclinevalues);
$totalnumberofpycpreyitems = scalar (@pyclinevalues);
$scalarcounter = 0;
while($pyclinevalues[0]){
$pyc = $pyclinevalues[0];
chomp($pyc);
print "Endens $pyc\t";
push (@{pycvalues."$scalarcounter"},$pyc);
$scalarcounter = $scalarcounter + 1;
shift(@pyclinevalues);
}
}
close (PYCFILE);
###
$scalarcounter = 0;
$PYCLININTERPLOOP = 1;
until($PYCLININTERPLOOP > $totalnumberofpycpreyitems){
@projectedpycvalues = ();
if(${pycvalues."$scalarcounter"}[1]){
$projecteddeltay = ${pycvalues."$scalarcounter"}[1] - ${pycvalues."$scalarcounter"}[0];
chomp ($projecteddeltay);
$projecteddeltax = $dayvalues[1] - $dayvalues[0];
chomp ($projecteddeltax);
$newvalue = ${pycvalues."$scalarcounter"}[0];
chomp ($newvalue);
$projectedslope = $projecteddeltay/$projecteddeltax;
chomp ($projectedslope);
push (@projectedpycvalues,$newvalue);
$checker = $dayvalues[0];
chomp ($checker);
$checker = $checker + 1;
if($checker < $dayvalues[1]){
until($checker == $dayvalues[1]){
$newvalue = $newvalue + $projectedslope;
push (@projectedpycvalues,$newvalue);
$checker = $checker + 1;
}
}
if($checker == $dayvalues[1]){
$newvalue = ${pycvalues."$scalarcounter"}[1];
push (@projectedpycvalues, $newvalue);
$checker = $checker + 1;
}
}
$PYCLOOPCOUNTER = 2;
$totalnumberofpycdatapoints = scalar (@pycpvalues);
until($PYCLOOPCOUNTER > $totalnumberofpycdatapoints){
if(${pycvalues."$scalarcounter"}[2]){
$projecteddeltay = ${pycvalues."$scalarcounter"}[2] - ${pycvalues."$scalarcounter"}[1];
chomp ($projecteddeltay);
$projecteddeltax = $dayvalues[2] - $dayvalues[1];
chomp ($projecteddeltax);
$newvalue = ${pycvalues."$scalarcounter"}[1];
chomp ($newvalue);
$projectedslope = $projecteddeltay/$projecteddeltax;
chomp ($projectedslope);
if($checker < $dayvalues[2]){
until($checker == $dayvalues[2]){
$newvalue = $newvalue + $projectedslope;
push (@projectedpycvalues,$newvalue);
$checker = $checker + 1;
}
}
if($checker == $dayvalues[2]){
$newvalue = ${pycvalues."$scalarcounter"}[2];
push (@projectedpycvalues, $newvalue);
$checker = $checker + 1;
}
}
shift(@dayvalues);
shift(@pycvalues);
$PYCLOOPCOUNTER = $PYCLOOPCOUNTER +1;
}
print "@projectedpycvalues\n\n";
$totalnumberofdays = scalar (@projectedpycvalues);
$day = 1;
foreach $value (@projectedpycvalues){
chomp($value);
$dailypyctable{$day} = $value;
$day = $day + 1;
}
$scalarcounter = $scalarcounter + 1;
%{dailypyctable."$scalarcounter"} = %dailypyctable;
$PYCLININTERPLOOP = $PYCLININTERPLOOP + 1;
}
}
print"Species?\n";
chomp($FishSpecies = <STDIN>);
&PYCFILENAMEASSIGNMENT;
&OPENPYC;

Contents of Tab Delimited Files, $TEMFILE:

Some Fish.tem
day    temperature
1      19
30     25
61     25
92     25
122    19.3
153    10
180    5
214    4
304    5
334    10.5
365    18.3

as opposed to this *.tem file, which contains a zero:

day    temperature
1      15
30     20
61     20
92     20
122    15
153    7
180    2
214    0
304    2
334    8
365    15

#!/usr/bin/perl
sub TEMFILENAMEASSIGNMENT{
$TEMFILE = "$FishSpecies".".tem";
}
sub OPENTEM{
@dayvalues = ();
open TEMP, "/Applications/Bioenergetics/User\ Input\ Data\ Files/$TEMFILE" or die "Cannot open $TEMFILE: $!";
@dailytemp=<TEMP>;
shift(@dailytemp);
foreach $line (@dailytemp){
chomp($line);
($day,$temp)=split("\t",$line);
chomp($day);
chomp($temp);
print "$day, $temp\n";
push (@dayvalues,$day);
push (@tempvalues,$temp);
}
close (TEMP);
if($tempvalues[1]){
$projecteddeltay = $tempvalues[1] - $tempvalues[0];
chomp ($projecteddeltay);
$projecteddeltax = $dayvalues[1] - $dayvalues[0];
chomp ($projecteddeltax);
$newvalue = $tempvalues[0];
chomp ($newvalue);
$projectedslope = $projecteddeltay/$projecteddeltax;
chomp ($projectedslope);
push (@projectedtempvalues,$newvalue);
$checker = $dayvalues[0];
chomp ($checker);
$checker = $checker + 1;
if($checker < $dayvalues[1]){
until($checker == $dayvalues[1]){
$newvalue = $newvalue + $projectedslope;
push (@projectedtempvalues,$newvalue);
$checker = $checker + 1;
}
}
if($checker == $dayvalues[1]){
$newvalue = $tempvalues[1];
push (@projectedtempvalues, $newvalue);
$checker = $checker + 1;
}
}
$TEMLOOPCOUNTER = 2;
$totalnumberoftemdatapoints = scalar (@tempvalues);
until($TEMLOOPCOUNTER > $totalnumberoftemdatapoints){
if($tempvalues[2]){
$projecteddeltay = $tempvalues[2] - $tempvalues[1];
chomp ($projecteddeltay);
$projecteddeltax = $dayvalues[2] - $dayvalues[1];
chomp ($projecteddeltax);
$newvalue = $tempvalues[1];
chomp ($newvalue);
$projectedslope = $projecteddeltay/$projecteddeltax;
chomp ($projectedslope);
if($checker < $dayvalues[2]){
until($checker == $dayvalues[2]){
$newvalue = $newvalue + $projectedslope;
push (@projectedtempvalues,$newvalue);
$checker = $checker + 1;
}
}
if($checker == $dayvalues[2]){
$newvalue = $tempvalues[2];
push (@projectedtempvalues, $newvalue);
$checker = $checker + 1;
}
}
shift(@dayvalues);
shift(@tempvalues);
$TEMLOOPCOUNTER = $TEMLOOPCOUNTER +1;
}
print "@projectedtempvalues";
$totalnumberofdays = scalar (@projectedtempvalues);
$day = 1;
foreach $value (@projectedtempvalues){
chomp($value);
$dailytemptable{$day} = $value;
$day = $day + 1;
}
}
print"Species?\n";
chomp($FishSpecies = <STDIN>);
&TEMFILENAMEASSIGNMENT;
&OPENTEM;

Replies are listed 'Best First'.
Re: Problems With Tab Delimited Files
by ikegami (Pope) on Mar 26, 2008 at 18:42 UTC

    It would help if each file was a separate code block. It would help even more if you had been able to narrow down the problem some first. And obviously, the total lack of indentation is an assault on our eyes.

    Fortunately, I happened to spot the problem while glancing at the code. The following code loops until the array is empty (i.e. the first remaining element is undef) or until the first remaining element of the array is false (zero is false):

    while ($pyclinevalues[0]) {
        $pyc = $pyclinevalues[0];
        ...
        shift(@pyclinevalues);
    }

    You should be looping while the number of elements in the array is non-zero, or true:

    while (@pyclinevalues) {
        $pyc = $pyclinevalues[0];
        ...
        shift(@pyclinevalues);
    }

    But why use a while loop at all?

    for my $pyc (@pyclinevalues) {
        ...
    }
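
To see the difference concretely, here is a minimal sketch that runs both loop conditions over the same row. The array contents are a hypothetical stand-in for the Muskie.pyc row `1 1 0` after the day column has been removed (the last row with a leading zero would behave the same way), and the variable names are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical row values once the day column is gone: a zero comes first.
my @values = (0, 1);

# Original condition: tests the first element, and 0 is false in Perl.
my @seen_by_truth_test;
my @copy = @values;
while ($copy[0]) {
    push @seen_by_truth_test, shift @copy;
}

# Suggested condition: an array in boolean context is true while it
# still holds at least one element, whatever that element's value is.
my @seen_by_count_test;
@copy = @values;
while (@copy) {
    push @seen_by_count_test, shift @copy;
}

print "truth test saw: @seen_by_truth_test\n";   # nothing after the colon
print "count test saw: @seen_by_count_test\n";   # 0 1
```

The first loop never executes its body because the leading zero fails the test, which matches the symptom described in the question.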

    By the way, Text::CSV handles tab-separated files without problem.
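
As a rough sketch of that suggestion, the following reads tab-separated data with Text::CSV, assuming the module is installed from CPAN. The sample data is a hypothetical copy of Muskie.pyc held in memory so that the example does not depend on a file path; `sep_char`, `binary`, and `getline` are part of the module's documented interface.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical in-memory copy of Muskie.pyc (tab-separated).
my $data = "day\tinvertebrates\tfish\n1\t1\t0\n183\t0\t1\n365\t0\t1\n";
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";

# Tell the parser that fields are separated by tabs rather than commas.
my $csv = Text::CSV->new({ sep_char => "\t", binary => 1 })
    or die "Cannot create parser: " . Text::CSV->error_diag;

my $header = $csv->getline($fh);    # array ref of column names
my @rows;
while (my $row = $csv->getline($fh)) {
    push @rows, $row;               # each row is an array ref of fields
}
close $fh;

print "columns: @$header\n";
print "row: @$_\n" for @rows;
```

Because each row arrives as an array reference, zeros are ordinary field values and need no special handling.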

Re: Problems With Tab Delimited Files
by wade (Pilgrim) on Mar 26, 2008 at 20:24 UTC

    I noticed a couple other "tidy code" things you might want to consider.

    • use strict; and use warnings; will help you find various problems (though not this bug). Once you enable them, you'll have to start declaring your variables (e.g., my $day; or my @dayvalues;).
    • the open statement could benefit from being the 3-parameter version:
      open PYCFILE, "<", $filename or die ...
    • It might be easier to read the file one line at a time rather than slurp the whole thing up into an array and parse the array one line at a time:
      while (my $line=<PYCFILE>) ...
    • You only need chomp to remove the newline at the end of a line. You don't have to worry about doing it for each token read from shift or each value pulled from an array.
    HTH
    --
    Wade
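
The suggestions above can be combined into one short sketch. It is not the original poster's code: the sample data is a hypothetical in-memory copy of Muskie.pyc, and the array of arrays `@columns` is an assumption swapped in for the `@{pycvalues."$scalarcounter"}` symbolic references, which `use strict` would reject.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for Muskie.pyc.
my $sample = "day\tinvertebrates\tfish\n1\t1\t0\n183\t0\t1\n365\t0\t1\n";

# Three-argument open with a lexical filehandle (here on a string).
open my $fh, '<', \$sample or die "Cannot open sample data: $!";

my $header = <$fh>;          # read and discard the column-name line
my (@dayvalues, @columns);   # @columns holds one array ref per prey column

# Read one line at a time instead of slurping the whole file.
while (my $line = <$fh>) {
    chomp $line;             # a single chomp per line is enough
    my ($day, @values) = split /\t/, $line;
    push @dayvalues, $day;
    for my $i (0 .. $#values) {          # loop by index, so zeros are kept
        push @{ $columns[$i] }, $values[$i];
    }
}
close $fh;

print "days: @dayvalues\n";                                   # days: 1 183 365
print "column $_: @{ $columns[$_] }\n" for 0 .. $#columns;
```

With this layout the number of prey columns is simply `scalar @columns`, so it does not need to be tracked in a separate counter.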
      Thank you for the help. I would like to clarify a few things though. First, generally I do use warnings; and use strict;. In my larger script that I copied those smaller chunks of code from, everything was declared. Thank you for pointing it out though, as had I not known that I should be declaring variables, I would have started right away after your comment. Second, I realize (and have been taught) that indentation is better for the eyes and I apologize for my laziness. As far as the overuse of chomp, I was just trying to be careful, but I now see that it was not necessary. Thank you for your suggestions.
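
The same kind of truth test also guards the interpolation steps in the question (`if($tempvalues[1])` and `if($tempvalues[2])`), so a segment that ends on a zero is skipped there as well. Below is a minimal sketch of linear interpolation that loops over indices and never tests a data value for truth. The subroutine name `interpolate_daily` and the sample points are hypothetical, and the sketch assumes the day numbers are strictly increasing integers.

```perl
use strict;
use warnings;

# Linearly interpolate between known (day, value) points, returning one
# value per day from the first known day through the last.
# Assumes strictly increasing integer days, so $span is never zero.
sub interpolate_daily {
    my ($days, $values) = @_;    # two array refs of equal length
    my @daily = ($values->[0]);
    for my $i (1 .. $#{$days}) {
        my $span  = $days->[$i] - $days->[$i - 1];
        my $slope = ($values->[$i] - $values->[$i - 1]) / $span;
        for my $step (1 .. $span - 1) {
            push @daily, $values->[$i - 1] + $slope * $step;
        }
        push @daily, $values->[$i];    # exact known value at each given day
    }
    return @daily;
}

# Hypothetical points in the spirit of the *.tem example, including a zero.
my @days  = (1, 3, 5);
my @temps = (2, 0, 4);
my @daily = interpolate_daily(\@days, \@temps);
print "@daily\n";    # 2 1 0 2 4
```

Looping over `1 .. $#{$days}` decides how many segments to process from the number of points alone, so a zero is treated like any other value.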
