PerlMonks  

Unexpected behavior when using PDL::IO::Misc::rcols with $PDL::undefval

by kevbot (Chaplain)
on Jul 30, 2012 at 03:42 UTC (#984338)
kevbot has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I am using PDL in some modules, and I want to import numerical data from a text file using the rcols function found in PDL::IO::Misc. The rcols function will import the text file into piddles that correspond to the columns of the text file.

I stumbled across an issue when I was trying to import data from a tab-delimited text file. Some positions in my input data file contain blank entries, and it seems that the handling of blank entries is inconsistent. If a blank entry is in the last column of the file, then $PDL::undefval is used in the piddle. If a blank appears elsewhere, a value of 0 appears to be used instead.

Here is an example.

I have two data files. The data.txt file is tab-delimited and does not contain blank entries.
    1    6    11
    2    7    12
    3    8    13
    4    9    14
    5    10   15
The data_missing.txt file is tab-delimited but contains some blank entries.
    1    6    11
    2    7
    3    8    13
    4         14
    5    10   15
I use the following script to test the contents of the pdls created by rcols:
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use PDL;
    use PDL::IO::Misc;

    my $file_name = shift;
    die 'No file given.' unless defined($file_name);
    open(my $fh, '<', $file_name) or die "Can not open file: $!";
    my @pdls = rcols $fh, { COLSEP => "\t" };
    foreach (@pdls) {
        print "$_\n";
    }
    exit;
The output for <data.txt> is:
    [1 2 3 4 5]
    [6 7 8 9 10]
    [11 12 13 14 15]
The output for <data_missing.txt> is:
    [1 2 3 4 5]
    [6 7 8 0 10]
    [11 0 13 14 15]
So far, so good. However, if I change the value for $PDL::undefval, I get a strange result. First, the default value of $PDL::undefval is zero.
    perl -MPDL -E 'say $PDL::undefval'
    0
Here is the code with $PDL::undefval set to -999.
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use PDL;
    use PDL::IO::Misc;

    my $file_name = shift;
    die 'No file given.' unless defined($file_name);
    open(my $fh, '<', $file_name) or die "Can not open file: $!";
    local $PDL::undefval = -999;
    my @pdls = rcols $fh, { COLSEP => "\t" };
    foreach (@pdls) {
        print "$_\n";
    }
    exit;
The output for <data.txt> is:
    [1 2 3 4 5]
    [6 7 8 9 10]
    [11 12 13 14 15]
The output for <data_missing.txt> is:
    [1 2 3 4 5]
    [6 7 8 0 10]
    [11 -999 13 14 15]
The value of $PDL::undefval is used in one case (where the 12 was deleted at the end of a row in the input file), but a zero is used in the other (where the 9 was deleted in the middle of a row in the input file).

This looks like a bug to me. Does anyone else have experience using this feature of PDL?
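
One plausible explanation for the difference, offered as a guess and not confirmed anywhere in this thread or checked against the rcols source: if each line is split on the column separator without a LIMIT argument, Perl's split drops trailing empty fields. A blank in the middle of a row then survives as an empty string (which becomes 0 in numeric context), while a blank at the end of a row leaves the field missing, that is undef, which is the kind of value $PDL::undefval is meant to replace. The sketch below uses core Perl only; describe_fields is a hypothetical helper written for illustration and is not part of PDL.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL): split one tab-delimited line
# with a plain split /\t/ and report, for each of the $ncols expected
# fields, whether it came back as 'undef', 'empty', or a real value.
sub describe_fields {
    my ($line, $ncols) = @_;
    chomp $line;
    my @fields = split /\t/, $line;    # no LIMIT: trailing empty fields are dropped
    my @report;
    for my $i (0 .. $ncols - 1) {
        my $v = $fields[$i];
        push @report, !defined $v ? 'undef'
                    : $v eq ''    ? 'empty'
                    :               $v;
    }
    return @report;
}

# Blank in the middle of a row (the 9 was deleted): an empty string survives.
print join(',', describe_fields("4\t\t14\n", 3)), "\n";

# Blank at the end of a row (the 12 was deleted): the field is missing.
print join(',', describe_fields("2\t7\t\n", 3)), "\n";
```

This prints 4,empty,14 for the first row and 2,7,undef for the second, which matches the pattern of 0 in the middle and $PDL::undefval at the end seen in the output above. Whether rcols actually parses lines this way is an assumption.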

Re: Unexpected behavior when using PDL::IO::Misc::rcols with $PDL::undefval
by kevbot (Chaplain) on Mar 24, 2013 at 17:35 UTC
    Back when I made this post, I never got around to reporting the bug. I installed the new PDL 2.006 yesterday and the problem persisted, so I reported the bug (and surprisingly less than 24 hours later) the fix is in the git repository.

    Just thought I would note this here in the event that someone stumbles upon this node.

    See this blog post to see what's new in PDL 2.006.
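
For anyone stuck on a PDL release without the fix, one possible workaround is to parse the file by hand and substitute the sentinel before building piddles. This is a minimal sketch under stated assumptions, not the fix that went into PDL: read_columns is a hypothetical helper, the column count is passed in explicitly, and the sample data is the data_missing.txt content from the question held in an in-memory filehandle. It uses core Perl only; each returned array ref could then presumably be handed to pdl().

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL): read tab-delimited rows from a
# filehandle and return one array ref per column, replacing every blank
# or missing entry with $missing.
sub read_columns {
    my ($fh, $ncols, $missing) = @_;
    my @cols;
    push @cols, [] for 1 .. $ncols;
    while (my $line = <$fh>) {
        chomp $line;
        # A LIMIT of -1 keeps trailing empty fields instead of dropping them.
        my @fields = split /\t/, $line, -1;
        for my $i (0 .. $ncols - 1) {
            my $v = $fields[$i];
            push @{ $cols[$i] }, (defined $v && $v ne '') ? $v : $missing;
        }
    }
    return @cols;
}

# Same content as data_missing.txt, read through an in-memory filehandle.
my $data = "1\t6\t11\n2\t7\t\n3\t8\t13\n4\t\t14\n5\t10\t15\n";
open(my $fh, '<', \$data) or die "Can not open in-memory file: $!";
my @cols = read_columns($fh, 3, -999);
print "[@$_]\n" for @cols;
```

With the sample data this prints [1 2 3 4 5], [6 7 8 -999 10] and [11 -999 13 14 15], so both blanks are treated the same way regardless of their position in the row.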

      I reported the bug (and surprisingly less than 24 hours later) the fix is in the git repository

      You shouldn't be surprised by that - there are a few guys over there who are typically very responsive.
      Unfortunately, they're not usually hanging around the monastery.

      Cheers,
      Rob
