The ulimit information is barking up the wrong tree; the POSIX thread stack size routines are the right way to go. ulimit limits the maximum size of the stack, not the initial reserve. Limiting the maximum size places a hard upper limit on the memory footprint, but it does nothing to reduce the lower limit. To do that, you must reduce the stack reserve (as the original post says). You can set the initial reserve at link time with the ld option --stack, which defaults to 2MB reserved in GNU binutils. (The GNU ld manual documents --stack as specific to the PE-targeted ports of the linker, so check that your toolchain honours it.)
To modify this in the binary on a *nix box, you can "relink" it:
$ ld --stack 0x1000 perl -o tperl
$ nm -s perl | grep stack_reserve
00200000 A __size_of_stack_reserve__
$ nm -s tperl | grep stack_reserve
00001000 A __size_of_stack_reserve__