Perl Monk, Perl Meditation PerlMonks

### Re^2: Fast sliding submatrix sums with PDL (inspired by PWC 248 task 2)

 on Dec 27, 2023 at 22:40 UTC Need Help??

Thanks, nice find. By the way, perhaps I've chosen imprecise word ("sliding"). Maybe in math/science, any function, applied to overlapping infixes, is said to be applied in sliding/moving/rolling manner. What I meant instead, with the word, -- "applied using computationally efficient/cheap algorithm".

Looks like conv2d does very honest (therefore not efficient/cheap) 4-nested-loops style calculation, in PP/XS/C -- OK for very small kernels. Here's overcrowded plot, but B and E cases would suffice to show they are the same breed.

```sub sms_WxH_PDL_conv2d ( \$m, \$w, \$h ) {
\$m -> conv2d( ones( \$w, \$h ))
-> slice(
[floor((\$w-1)/2),floor(-(\$w+1)/2)],
[floor((\$h-1)/2),floor(-(\$h+1)/2)] )
}

__END__

Time (s) vs. N (NxN submatrix, PDL: Double D [1500,1500] matrix)
+-----------------------------------------------------------+
1 |-+  +      +      +      +       +      +      +      +  +-|
|                         A                                 |
|                                                      E    |
|                                                           |
|                                                           |
|                                                           |
0.8 |-+                                                       +-|
|                                                      B    |
|                                                           |
|                                                           |
|                                                           |
0.6 |-+                A                                      +-|
|                                        E                  |
|                                                           |
|                                                           |
|                                                           |
0.4 |-+                                      B                +-|
|                                 E                         |
|                                                           |
|           A                                               |
|                         E       B                         |
0.2 |-+                                                       +-|
|                  E      B                                 |
|        A  D                                               |
|    D   D  E      D      D       D      D             D    |
|    E   E  C      C      C       C      C             C    |
0 |-+  B   C                                                +-|
|    +      +      +      +       +      +      +      +    |
+-----------------------------------------------------------+
2      4      6      8       10     12     14     16
sms_WxH_PDL_naive    A
sms_WxH_pdlpp_4loops    B
sms_WxH_PDL_lags    C
sms_WxH_PDL_sliding    D
sms_WxH_PDL_conv2d    E
+----+-------+-------+-------+-------+-------+
| N  | A     | B     | C     | D     | E     |
+----+-------+-------+-------+-------+-------+
| 2  | 0.061 | 0.009 | 0.030 | 0.083 | 0.020 |
| 3  | 0.120 | 0.019 | 0.013 | 0.073 | 0.042 |
| 4  | 0.252 | 0.030 | 0.039 | 0.098 | 0.066 |
| 6  | 0.566 | 0.078 | 0.028 | 0.078 | 0.141 |
| 8  | 0.963 | 0.139 | 0.033 | 0.081 | 0.237 |
| 10 |       | 0.248 | 0.031 | 0.078 | 0.366 |
| 12 |       | 0.388 | 0.033 | 0.078 | 0.531 |
| 16 |       | 0.728 | 0.041 | 0.069 | 0.928 |
+----+-------+-------+-------+-------+-------+

Replies are listed 'Best First'.
Re^3: Fast sliding submatrix sums with PDL (inspired by PWC 248 task 2)
by wlmb (Novice) on Dec 27, 2023 at 23:30 UTC

For large submatrices one can use a Fourier transforms to perform the convolution, as in

```use v5.36;
use PDL;
use PDL::FFT;
my \$matrix=pdl(shift); # Format "[[m11,m21...],[m21...]...]"
my \$width=my \$w=shift;
my \$height=my \$h=shift;
my \$small=ones(\$w%2?\$w:\$w+1, \$h%2?\$h:\$h+1);
\$small->slice(-1).=0, ++\$w unless \$w%2; # zero row and/or column for e
+ven kernels
\$small->slice([],-1).=0, ++\$h unless \$h%2;
my \$kernel=kernctr(\$matrix, \$small); #full kernel
my \$result=\$matrix->copy;
\$result->fftconvolve(\$kernel);
say "\$matrix \$width \$height -> ",
\$result->slice([floor((\$width-1)/2),floor(-(\$width+1)/2)],
[floor((\$height-1)/2),floor(-(\$height+1)/2)]);

It would be interesting to also compare it.

Interesting. FFT for this task, who would have thought a few days ago. Looks like it runs in almost constant time, regardless of kernel size; and of course slower than special solutions. I had to add a couple of checks to accommodate for (unimportant) corner test case of submatrix with dims (4,3) and example matrix.

```sub sms_WxH_PDL_fftconvolve ( \$m, \$w, \$h ) {
my ( \$W, \$H ) = \$m-> dims;

my \$small = ones( \$w%2?\$w:\$w+1, \$h%2?\$h:\$h+1 );
\$small-> slice( -1 )    .= 0 unless \$w%2; # zero row and/or column
+ for even kernels
\$small-> slice( [],-1 ) .= 0 unless \$h%2;

my \$result = \$m-> copy;
\$result = append( \$result, 0 ) if \$W == \$w and not \$W % 2;
\$result = append( \$result-> transpose, 0 )-> transpose
if \$H == \$h and not \$H % 2;

my \$kernel = kernctr( \$result, \$small ); #full kernel
\$result-> fftconvolve( \$kernel );

\$result = \$result-> slice( '0:-2','' ) if \$W == \$w and not \$W % 2;
\$result = \$result-> slice( '','0:-2' ) if \$H == \$h and not \$H % 2;

return \$result-> slice(
[floor((\$w-1)/2),floor(-(\$w+1)/2)],
[floor((\$h-1)/2),floor(-(\$h+1)/2)]
)
}

__END__

Time (s) vs. N (NxN submatrix, PDL: Double D [1500,1500] matrix)

+-----------------------------------------------------------+
|    +      +      +      +       +      +      +      +    |
|     A                                                     |
|                                                           |
|                                                           |
2 |-+                                                       +-|
|    B   B                        B  B      B               |
|    BB         B      B      B          B         B   B    |
|     B     B      B      B                     B           |
|                         C   C                             |
|                      C          C                         |
1.5 |-+   A                              C                    +-|
|                  C                                        |
|                                        C                  |
|                                                           |
|               C                                           |
1 |-+                                         C             +-|
|     A                                                     |
|           C                                   C           |
|                                                           |
|                                                           |
|                                                  C        |
0.5 |-+  A                                                    +-|
|        C                                                  |
|                                                      C    |
|    A                                                      |
|     D  D                                                  |
|    CC     D   D  D   D  D   D   D  D   D  D   D  D        |
0 |-+                                                    D  +-|
|    +      +      +      +       +      +      +      +    |
+-----------------------------------------------------------+
0     200    400    600     800    1000   1200   1400
sms_WxH_PDL_conv2d    A
sms_WxH_PDL_fftconvolve    B
sms_WxH_PDL_lags    C
sms_WxH_PDL_sliding    D

+------+-------+-------+-------+-------+
| N    | A     | B     | C     | D     |
+------+-------+-------+-------+-------+
| 2    | 0.023 | 1.792 | 0.021 |       |
| 4    | 0.065 | 1.854 | 0.036 |       |
| 8    | 0.237 | 1.896 | 0.031 |       |
| 12   | 0.534 | 1.865 | 0.049 |       |
| 16   | 0.930 | 1.740 | 0.039 |       |
| 20   | 1.432 | 1.812 | 0.057 |       |
| 25   | 2.224 | 1.716 | 0.081 | 0.099 |
| 100  |       | 1.899 | 0.380 | 0.107 |
| 200  |       | 1.779 | 0.826 | 0.075 |
| 300  |       | 1.836 | 1.143 | 0.089 |
| 400  |       | 1.760 | 1.393 | 0.065 |
| 500  |       | 1.826 | 1.560 | 0.047 |
| 600  |       | 1.740 | 1.659 | 0.036 |
| 700  |       | 1.818 | 1.620 | 0.049 |
| 800  |       | 1.919 | 1.589 | 0.031 |
| 900  |       | 1.917 | 1.484 | 0.026 |
| 1000 |       | 1.789 | 1.260 | 0.008 |
| 1100 |       | 1.880 | 1.068 | 0.010 |
| 1200 |       | 1.745 | 0.823 | 0.010 |
| 1300 |       | 1.841 | 0.562 | 0.008 |
| 1400 |       | 1.841 | 0.273 | 0.003 |
+------+-------+-------+-------+-------+

Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11156563]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others taking refuge in the Monastery: (2)
As of 2024-06-25 01:32 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?

No recent polls found

Notices?
 • erzuuli ‥ 🛈The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.