A belated note. This:
pp_def('sms_WxH_pdlpp_4loops',
Pars => 'a(m,n); [o]b(p,q);',
OtherPars => 'PDL_Indx par_sm_w; PDL_Indx par_sm_h;',
RedoDimsCode => '$SIZE(p) = $SIZE(m) - $COMP(par_sm_w) + 1;
$SIZE(q) = $SIZE(n) - $COMP(par_sm_h) + 1;',
Code => '
PDL_Indx i, j, k, l, w, h, W, H;
w = $COMP(par_sm_w);
h = $COMP(par_sm_h);
W = $SIZE(m);
H = $SIZE(n);
for (i = 0; i < H - h + 1; i++) {
for (j = 0; j < W - w + 1; j++) {
for (k = 0; k < w; k++) {
for (l = 0; l < h; l++) {
$b(p=>j, q=>i) += $a(m=>j+k, n=>i+l);
}
}
}
}
',
);
Can use PP features much better (untested, illustrative):
pp_def('sms_WxH_pdlpp_4loops',
Pars => 'a(m,n); [o]b(p,q);',
OtherPars => 'PDL_Indx par_sm_w => w; PDL_Indx par_sm_h => h;',
RedoDimsCode => '$SIZE(p) = $SIZE(m) - $SIZE(w) + 1;
$SIZE(q) = $SIZE(n) - $SIZE(h) + 1;',
Code => 'loop(q,p,w,h) %{ $b() += $a(m=>p+w, n=>q+h); %}',
);
Features used above:
OtherPars can set
$SIZE of dims not otherwise used. When something is a
$SIZE, you can
loop() over it. If an index (named dim) is indexed by a variable of the same name, it can be omitted (as for
$b() above). Obviously I used the fact that you can give several loops at once, which is very neat. Finally, your loop conditions were recalculating the expressions used to set
$SIZE(p) etc.
Looking at the loop(), I suspect you want to switch the w and h loops to do everything as much as possible on the fastest-moving dimension. This algorithm would probably benefit a lot from tiling for maximum cache-hitting, which should probably become a proper PP feature. If someone wants to open a PDL issue with suggestions as to what the syntax would look like, it will probably get implemented and used on matmult.