You are queueing pixel by pixel. But the threads can calculate (x,y) by themselves. Why not do the following:
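use strict;
use warnings;

my $width       = 1280 * 4;          # bytes per row: 1280 pixels, 4 bytes (RGBA) each
my $height      = 1024;              # number of rows
my $picturesize = $width * $height;  # total size in bytes

# Hand out the picture in chunks of 32 bytes (8 pixels).
for (my $i = 0; $i < $picturesize; $i += 32) {
    my $start = $i;
    my $end   = $i + 32;             # exclusive: first byte of the next chunk
    # Convert byte offsets back to pixel coordinates.
    my $x1 = int(($start % $width) / 4);
    my $y1 = int($start / $width);
    my $x2 = int(($end % $width) / 4);
    my $y2 = int($end / $width);
    printf("(%4d,%4d)..(%4d,%4d) %d\n", $x1, $y1, $x2, $y2, $i);
}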
This way, the main program only works with the picture size (a large integer) and steps through it in chunks of 32 units, which is 8 pixels assuming each pixel is four bytes of RGBA. Each thread gets a start and stop offset, $i .. $i+32, and can calculate the pixel coordinates (x,y) back from those numbers itself, in parallel.
startthread($from,$to,$width)
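What the thread does with ($from,$to,$width) could look roughly like the sketch below. It is only an illustration: the sub name range_to_pixels is made up, and it assumes $width is the row length in bytes, each pixel takes 4 bytes, and $to is exclusive.

```perl
use strict;
use warnings;

# Hypothetical worker body (name invented for this sketch): turn the byte
# range [$from, $to) back into pixel coordinates.
# Assumptions: $width is the row length in bytes, 4 bytes per pixel.
sub range_to_pixels {
    my ($from, $to, $width) = @_;
    my @pixels;
    for (my $offset = $from; $offset < $to; $offset += 4) {
        my $x = int(($offset % $width) / 4);   # column within the row
        my $y = int($offset / $width);         # row index
        push @pixels, [$x, $y];
    }
    return @pixels;
}

# Example: one 32-byte chunk that straddles the end of the first row.
for my $p (range_to_pixels(5104, 5136, 1280 * 4)) {
    printf("(%4d,%4d)\n", $p->[0], $p->[1]);
}
```

For that example chunk the sub yields (1276,0) through (1279,0) followed by (0,1) through (3,1), so a range that crosses a row boundary is handled without any extra work in the main program. If your perl is built with thread support, the same sub could be used as the entry point of threads->create, with the three numbers passed as its arguments.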