Because 1. a stack access is ~70 times faster than a heap access, 2. allocation is free (it's just a stack pointer adjustment), 3. you do not need to clean up the stack, and 4. stack pointers are thread safe, since every thread has its own stack.
That's why common ABIs put parameters onto the stack, and why they had better keep it properly aligned so that SIMD instructions such as MMX and SSE can be used on it.
It's not only for recursion.
Compare reading or writing at %ebp+8 with going through an arbitrary absolute pointer. It's about 5 versus 150 micro-instructions.
Stack accesses are also relative, hot, and local; heap accesses usually are not. Some heap pointers are hot and cached, but you still pay the cache overhead.
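To make the allocation and cleanup points concrete, here is a minimal C sketch of the same computation done once with a stack buffer and once with a heap buffer. The function names and the buffer size N are made up for illustration, and this only shows the difference in bookkeeping; it is not a benchmark and does not demonstrate the timing figures above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer size, chosen only for this illustration. */
enum { N = 256 };

/* Stack version: buf lives in this function's frame. It is "allocated"
   by the stack pointer adjustment on entry and disappears on return,
   so there is nothing to free. Each thread calling this gets its own buf. */
long sum_squares_stack(void) {
    int buf[N];
    long total = 0;
    for (size_t i = 0; i < N; i++) {
        buf[i] = (int)(i * i);
    }
    for (size_t i = 0; i < N; i++) {
        total += buf[i];
    }
    return total;
}

/* Heap version: same work, but the buffer costs an allocator call,
   the allocation can fail, and it must be released explicitly. */
long sum_squares_heap(void) {
    int *buf = malloc(N * sizeof *buf);
    long total = 0;
    if (buf == NULL) {
        return -1; /* allocation failure, impossible in the stack version */
    }
    for (size_t i = 0; i < N; i++) {
        buf[i] = (int)(i * i);
    }
    for (size_t i = 0; i < N; i++) {
        total += buf[i];
    }
    free(buf); /* forgetting this leaks memory */
    return total;
}
```

Both functions return the same sum; the heap one just carries the extra malloc, NULL check, and free. An optimizing compiler may keep the stack buffer entirely in registers or remove it, which is one more thing it can rarely do with heap memory.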