Re^2: Performance problem with Clone Method

Replies are listed 'Best First'.
Re^3: Performance problem with Clone Method by dcmertens (Scribe) on Jul 27, 2011 at 12:47 UTC
BrowserUK, you don't think this is actually a problem, do you? PDL is a very mature Perl extension that handles high dimensional data sets very nicely. In fact (surprise!) there's more than one way to do it. At the moment, I believe the best documentation for beginners is the Matlab or Scilab migration guides. They say that PDL::QuickStart is a better place to start, but I humbly disagree. See http://pdl.perl.org/?docs=Tutorials&title=PDL::Tutorials for the (short) list of tutorials available. To answer your specific question, suppose you have a 2x4 piddle, perhaps created with the sequence command. A snippet of your code might look like this: `my $pdl = sequence(2,4);` [download] If you wanted to modify the (0,3) element of the array (first column, last row), you would use NiceSlice and the .= notation like so: `$pdl(0,3) .= -4;` [download] A full working example would look like this: `use strict; use warnings; use PDL; use PDL::NiceSlice; my $pdl = sequence(2,4); print "$pdl\n"; $pdl(0,3) .= -4; print "$pdl\n";` [download] The output looks like this: `[ [0 1] [2 3] [4 5] [6 7] ] [ [ 0 1] [ 2 3] [ 4 5] [-4 7] ]` [download] Edit: revised the opening paragraph to be more useful Edit2: re-revised the opening paragraph to be even more useful Edit3: used code tags instead of pre tags for example code and ouput	[reply] [d/l] [select]
Re^4: Performance problem with Clone Method by BrowserUk (Patriarch) on Jul 27, 2011 at 17:46 UTC
I think that if an algorithm requires access to individual elements of the piddles, rather than being able to apply single operations to whole piddles at a time, there is little to be gained. What you might gain from more efficient duplication, you lose by having to call one or two functions per element during calculation or conditional testing. From my very limited understanding of the OPs problem, much of the effort involved in the algorithm involves: interchanging whole rows & whole columns in 2D arrays. I don't think that PDL is particularly efficient at performing these operations, especially the latter. performing "bit-wise" boolean operations and 'counting-the-set-bits', on pairs of rows of zeros and ones. If the rows of 0s and 1s were encoded as simple bit-vectors, then not only can standard Perl can perform both these operations more efficiently than PDL, the storage requirements are 8x less. I'd be very happy to be wrong here, but I just don't think PDL suits these particular types of operations. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply]
Re^5: Performance problem with Clone Method by dcmertens (Scribe) on Jul 27, 2011 at 20:41 UTC
Unfortunately I don't know anything about the OP's problem and an internet search has not proved forthcoming with a simple explanation, in matrix terms, of the Ullmann algorithm. If Commandosupremo could spell out his implementation of the algorithm, giving a list of the operations needed, I could say PDL is the right tool or not. As to your specific points, PDL may be able to handle the operations you pose, depending on what you mean. PDL supports a whole slew of operations including bitwise boolean operations, as mentioned here: http://pdl.perl.org/?docs=Ops&title=PDL::Ops. For counting the set bits, that's just a sum; you can perform that and many other row or whole-piddle operations as discussed here: http://pdl.perl.org/?docs=Ufunc&title=PDL::Ufunc. The implementations of the Ufuncs use simple accumulators, which loses precision when you're summing over thousands or millions of floating-point values, but will work just fine if you're summing a few hundred integers, as is the case (I believe) of the OP. I am note very experienced at exchanging whole rows or columns using PDL, though I know it is something that PDL manages to do with data flow. In other words, when you generate piddle $b from piddle $a with a row exchanged, you can modify piddle $a and it'll show up in piddle $b, and vice versa. Put differently, exchanges lead to different 'views' of the same data. This can be very powerful, but for the OP it may simply lead to inefficiencies in the back-end. I ~~will pose~~ posed this question to the PDL mailing list ~~shortly~~ to see if anybody can help. It would be much more useful if Commandosupremo could post the actual algorithm. At any rate, interchanging whole rows in 2D arrays, and whole columns in 2D arrays, is more difficult than I had expected, and there might be more efficient ways to do it using PDL.To exchange rows or columns, you can use the dice_axis function, like so: use warnings; use PDL; use PDL::NiceSlice; # Here is my matrix: my $data = sequence(5,5); print "Original:\n$data\n"; # In order to permute columns, I need a column vector with the # column numbers that I want. (You'd think a row vector would be # more appropriate, but that's not how it works. Alas.) my $indices = sequence(5); # Set up to exchange the second and the third column: #$indices(1:2) .= $indices(2:1); ### should work but doesn't for me $indices(1:2) .= $indices(pdl [2,1]); # Perform the exchange: my $rearranged = $data->dice_axis(0, $indices); print "Column swap:\n$rearranged\n"; # Start over and exchange the fourth and fifth rows: $indices = sequence(5); $indices(3:4) .= $indices(pdl[4,3]); my $further_rearranged = $rearranged->dice_axis(1, $indices); print "Column and row swap:\n$further_rearranged\n"; [download] The output I get from that looks like this: `Original: [ [ 0 1 2 3 4] [ 5 6 7 8 9] [10 11 12 13 14] [15 16 17 18 19] [20 21 22 23 24] ] Column swap: [ [ 0 2 1 3 4] [ 5 7 6 8 9] [10 12 11 13 14] [15 17 16 18 19] [20 22 21 23 24] ] Column and row swap: [ [ 0 2 1 3 4] [ 5 7 6 8 9] [10 12 11 13 14] [20 22 21 23 24] [15 17 16 18 19] ]` [download] Edit: updated to use dice_axis	[reply] [d/l] [select]
Re^6: Performance problem with Clone Method by BrowserUk (Patriarch) on Jul 27, 2011 at 23:55 UTC
Re^3: Performance problem with Clone Method by dwm042 (Priest) on Jul 27, 2011 at 21:40 UTC
PDL(i,j) manipulation: Look at the get and set functions in PDL::Func. To note, the difference between column and row operations are the difference between 0 transpose operations and 2 of them. That said, between PDL::Slices and PDL::NiceSlice there are so many ways to skin this cat I'm surprised people worry about the issue of column/row operations in PDL. `#! /usr/bin/perl use PDL; my $M = sequence(10,10); print $M; my $row = $M->slice(':,3'); my $col = $M->slice('4,:'); my $col2 = $M->slice('5,:'); my $deep_col = $col->copy; $col .= $col2; $col2 .= $deep_col; print $M;` [download] See also reorder, dice, range etc. Update:original code example broken. If you're doing lots of row or column manipulations, perhaps better to keep a 1d piddle of index values and manipulate those.	[reply] [d/l]
Re^4: Performance problem with Clone Method by BrowserUk (Patriarch) on Jul 27, 2011 at 23:53 UTC
there are so many ways to skin this cat I'm surprised people worry about the issue of column/row operations in PDL. The problem is not "can it be done with PDL", more "do you gain anything by using PDL to do it"? Ie. Is it more efficient? This iterates through all the row and column permutations of a 10x10 matrix in 82 seconds: #! perl -slw use strict; use Data::Dump qw[ pp ]; use Time::HiRes qw[ time ]; #$Data::Dump::WIDTH = 1000; use Algorithm::Combinatorics qw[ permutations ]; my @a = map[ 10 $_ .. 10 $_ + 9 ], 0 .. 9; ## rows my $start = time; my $perms = permutations( [ 0 .. 9 ] ); while( my $p = $perms->next ) { my @perm = @a[ @$p ]; } printf "All row permutations took %f seconds\n", time - $start; ## cols $start = time; $perms = permutations( [ 0 .. 9 ] ); while( my $p = $perms->next ) { my @perm = map[ @{$_}[ @$p ] ], @a; } printf "All column permutations took %f seconds\n", time - $start; __END__ C:\test>PermsMatrix.pl All row permutations took 17.198000 seconds All column permutations took 65.284842 seconds [download] The OP mentioned matrices of "hundreds x hundreds". As I understand the brute force algorithm for the Subgraph Isomorphism Problem, it requires performing all the row permutations for all the column permutations of the smaller of the two adjacency matrix graphs for every equal sized subgraph of the larger adjacency matrix. Ullmann trims the tree somewhat, but essentially still requires many of the iterations and all the transformations to be performed. I've no doubts that this can be done with PDL; I just wonder if you gain much performance? Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l]
Re^5: Performance problem with Clone Method by dwm042 (Priest) on Jul 28, 2011 at 03:33 UTC
I read this thread through 3-4 times before I "got it", which is that PDL is really a floating point engine that makes things like singular value decomposition a one liner. And that back in 1983, the kind of programming the OP is interested in would have been written in assembler and/or C, and so bitwise efficiency is to be optimized here. D-	[reply]
Re^6: Performance problem with Clone Method by etj (Deacon) on May 26, 2022 at 14:01 UTC


Pathologically Eclectic Rubbish Lister
	PerlMonks