http://www.perlmonks.org?node_id=981961


in reply to Re: Processing values of a piddle (PDL) speedup using 'at' vs. 'index'
in thread Processing values of a piddle (PDL) speedup using 'at' vs. 'index'

Thanks for the reply.

I think your comment,

    My (very limited) experience of working with PDL suggests that if you need to manipulate the values in a piddle individually, rather than applying each operation to the entire piddle as a whole, then you should export the piddle, en-masse, to a perl array first. It saves huge amounts of time,

and this excerpt from the PDL Book,

    Be careful with at, as you almost never want to use it - it is tedious for anything nontrivial, and extremely slow! Particularly if you find yourself placing an at call inside a for loop, you should probably stop and think about how to use threading for your problem - see below.

are getting at the same idea. That is, one should avoid processing values in a piddle individually and take advantage of PDL's various commands for manipulating whole vectors or matrices.

The code example that I gave is a bit simplified compared to my actual use case. In my case, I iterate through a few thousand objects (these objects have attributes that are 1-dimensional pdls). The values from most (or all) of these pdls need to be exported into my text file. I take values from these pdls, check them to see if they are a special value, change the value if needed, and then put them into a perl array. The perl array is eventually written to a text file. Your comments have me thinking that another possibility might be to do something like this:

It's possible that this type of approach might be faster; however, the current approach using at is working plenty fast for me at the moment (other parts of my code are now the bottleneck).

Re^3: Processing values of a piddle (PDL) speedup using 'at' vs. 'index'
by BrowserUk (Patriarch) on Jul 17, 2012 at 14:20 UTC
    I take values from these pdls, check them to see if they are a special value, change the value if needed, and then put them into a perl array.

    I would say that is entirely the wrong way to do it.

    You need to export the piddle in order to print it. So export it first; then search that for your special values; and only access the piddle elements individually if you find the special value in the exported array -- just to update it with the new value.



      I agree (but my actual use case is a little more complex than the code examples that I have given here). I added an additional subroutine to my benchmark code to see how much faster your suggested approach would be. In the 'perl_array' sub, I export the entire pdl to a perl array using the PDL 'list' command. Then, I search the perl array for the "special value" and replace each occurrence with an empty string. This approach is faster.

                      Rate pdl_values perl_values perl_array
      pdl_values    1.52/s         --        -97%       -99%
      perl_values   52.3/s      3345%          --       -73%
      perl_array     197/s     12865%        276%         --
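      The benchmark code itself is not shown in this post. As a rough sketch only, the 'perl_array' idea described above (one bulk export with list, then a pass over the Perl array) might look something like this. The helper name export_with_blanks and the sample data are made up for illustration; only pdl() and list() are actual PDL calls.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical helper (not from the original benchmark): export a 1-D piddle
# to a Perl array in a single call, then blank out the special value in Perl.
sub export_with_blanks {
    my ($pdl, $special) = @_;
    my @values = $pdl->list;    # one bulk export, no per-element at() calls
    return map { $_ == $special ? '' : $_ } @values;
}

# Made-up sample data; 7 plays the role of the "special value".
my $data = pdl(5, 9, 7, 3, 7);
my @row  = export_with_blanks($data, 7);
print join(',', @row), "\n";
```

      The point of the sketch is that the only PDL call is the single list(); the per-element test then runs on ordinary Perl scalars.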
Re^3: Processing values of a piddle (PDL) speedup using 'at' vs. 'index'
by chm (Novice) on Mar 24, 2013 at 22:23 UTC

    First off: the best place to ask questions about PDL is on the perldl mailing list and the central site for info on all things PDL is http://pdl.perl.org.

    Second, you are using the right strategy here. The key thing to remember is that calculations with PDL objects (called "piddles") are performed with special C code and are very fast. If your work can be done on the piddle data directly, you will almost always see the best performance.

    In this case, I would suggest using PDL operations to find all the "special values" in the piddle and mark them as BAD. The list() method (or the newer unpdl() method) will then convert the piddle back to a perl list or list-of-lists structure, with the special values all now having the value 'BAD'.

    A map can be used to substitute undef if that is needed for your algorithm. NOTE: if you don't need the special value elements at all, it is easy to leave them out of the list() output via PDL operations.
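    As a script rather than a shell session, the two options above might be sketched roughly as follows. The helper names export_with_undef and export_without_special are hypothetical, and the sketch assumes a PDL build with bad-value support so that list() returns the string 'BAD' for bad elements.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical helper: mark the special value BAD inside PDL, export once,
# then map the 'BAD' strings to undef on the Perl side.
sub export_with_undef {
    my ($pdl, $special) = @_;
    my @values = $pdl->setvaltobad($special)->list;
    return map { $_ eq 'BAD' ? undef : $_ } @values;
}

# Hypothetical helper: drop the special values inside PDL before exporting.
sub export_without_special {
    my ($pdl, $special) = @_;
    return $pdl->where($pdl != $special)->list;
}

# Made-up sample data; 7 plays the role of the "special value".
my $data = pdl(5, 9, 7, 3, 7);

my @with_undef = export_with_undef($data, 7);
print join(' ', map { defined($_) ? $_ : 'undef' } @with_undef), "\n";

my @ordinary = export_without_special($data, 7);
print "@ordinary\n";
```

    Note that setvaltobad() returns a new piddle here, so $data itself is left unchanged and can still be compared against the special value afterwards.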

    Here is a short session with the PDL shell (pdl2) showing some calculations along these lines:

    pdl> apropos bad   # PDL shells have online help
    PDL::Bad        Module: PDL does process bad values
    PDL::BadValues  Manual: Discussion of bad value support
    badflag         getter/setter for the bad data flag
    badinfo         information on the bad-value support
    ...many more...

    pdl> help isbad
    Module PDL::Bad
      isbad
        Signature: (a(); int [o]b())

        Returns a binary mask indicating which values of the input are bad values

        Returns a 1 if the value is bad, 0 otherwise. Similar to isfinite.

         $a = pdl(1,2,3);
         $a->badflag(1);
         set($a,1,$a->badvalue);
         $b = isbad($a);
         print $b, "\n";
         [0 1 0]

        This method works with input piddles that are bad. The output piddle
        will never contain bad values, but its bad value flag will be the
        same as the input piddle's flag.

    pdl> $data = rint(10*random(10))
    pdl> p $data
    [5 9 8 3 5 6 7 7 6 10]
    pdl> $special = 7
    pdl> p $data->setvaltobad($special)
    [5 9 8 3 5 6 BAD BAD 6 10]
    pdl> p $data->setvaltobad($special)->list
    5 9 8 3 5 6 BAD BAD 6 10
    pdl> @pdata = $data->setvaltobad($special)->list
    pdl> p "@pdata"
    5 9 8 3 5 6 BAD BAD 6 10
    pdl> foreach (@pdata) { $_ = undef if $_ eq 'BAD' }
    pdl> p "@pdata"
    Use of uninitialized value $pdata[6] ...
    Use of uninitialized value $pdata[7] ...
    5 9 8 3 5 6   6 10
    pdl> p which $data==$special   # calc indices of "special vals"
    [6 7]
    pdl> @ordinary = $data->where($data != $special)
    pdl> p "@ordinary"   # or output just ordinary values
    [5 9 8 3 5 6 6 10]