Can we really call that O(n)? The constant depends on the number of passes. As you increase the number of items, you eventually need long strings, and long strings require lots of passes. Indeed, with a fixed number of symbols in your alphabet, the number of things you can represent rises exponentially in the length you allow. So telling n distinct items apart takes strings of length about log(n), the number of passes grows like log(n), and the whole thing is truly n*log(n) again. :-)
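To make the pass count concrete, here is a minimal sketch of a least-significant-digit radix sort over equal-length strings. The names `lsd_radix_sort` and `min_key_length`, and the choice of a lowercase 26-letter alphabet, are my own assumptions for illustration, not anything from the earlier discussion.

```python
def lsd_radix_sort(words, width, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Sort strings that all have length `width`; return (sorted_words, passes)."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    passes = 0
    # One stable bucketing pass per character position, last position first.
    for pos in range(width - 1, -1, -1):
        buckets = [[] for _ in alphabet]
        for w in words:
            buckets[index[w[pos]]].append(w)
        words = [w for bucket in buckets for w in bucket]
        passes += 1
    return words, passes


def min_key_length(n, k):
    """Smallest length L such that an alphabet of k symbols gives k**L >= n keys."""
    length, capacity = 0, 1
    while capacity < n:
        capacity *= k
        length += 1
    return length


sorted_words, passes = lsd_radix_sort(["cab", "abc", "bca", "aab"], width=3)
print(sorted_words, passes)
print(min_key_length(10**6, 26))  # key length needed for a million distinct keys
```

Each pass is linear in n, but the number of passes equals the key length, and `min_key_length` shows that the key length has to grow like log(n) once the keys must all be distinct.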
Indeed, my explanation about Stirling's formula is still relevant. Re-read it and you can see that the fundamental issue is that a sequence of decisions with fixed branching can only account for an exponential number of possibilities. So the number of decisions needed to sort n things grows like log(n!), which is order n*log(n). However, the big win is that with radix sort you can get a far better branching factor than 2. (At least initially.)
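The counting argument can be checked numerically. This is a rough sketch, with `min_decisions` being a name I made up: it finds the smallest number of decisions d such that branching**d covers all n! possible orderings, using exact integer arithmetic.

```python
import math


def min_decisions(n, branching=2):
    """Smallest d with branching**d >= n!, i.e. ceil(log_branching(n!))."""
    orderings = math.factorial(n)
    decisions, reachable = 0, 1
    while reachable < orderings:
        reachable *= branching
        decisions += 1
    return decisions


for n in (10, 100, 1000):
    print(n,
          min_decisions(n, 2),          # binary comparisons
          round(n * math.log2(n)),      # n*log2(n) for comparison
          min_decisions(n, 256))        # 256-way branching, e.g. one byte per step
```

The binary column tracks n*log2(n) from below, while the 256-way column is smaller by a constant factor of log2(256) = 8. A bigger branching factor shrinks the constant, but not the n*log(n) growth.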
But unfortunately, radix sort cannot be made to work with arbitrary sort functions, since it does not (at least not directly) work off of binary comparisons...
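As an illustration of what "arbitrary sort function" means here, the sketch below uses a made-up comparator, `by_length_then_reverse_alpha`, that only answers "which of these two comes first?". A comparison sort can consume it as is. A radix sort would instead need each item broken into digit-like pieces to bucket on, which a bare comparator does not provide.

```python
from functools import cmp_to_key


def by_length_then_reverse_alpha(a, b):
    """Hypothetical comparator: shorter strings first, ties in reverse alphabetical order."""
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    # Positive when a < b, so the alphabetically later string sorts first.
    return (a < b) - (a > b)


words = ["pear", "fig", "apple", "kiwi", "date"]
print(sorted(words, key=cmp_to_key(by_length_then_reverse_alpha)))
# ['fig', 'pear', 'kiwi', 'date', 'apple']
```

For this particular ordering you could hand-craft a key to bucket on, which is the "not directly" part. In general, nothing guarantees that a comparison function corresponds to any such key.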