Well, I've now written what I think is basically what Paul has written in his lisp code (including stuff like discarding all but the most interesting 15 features) and tested it.
The results are (unsurprisingly to me) not as accurate as Paul describes on mixed types of messages.
The most important thing to remember about doing anything with probabilities is to not mix up your training and validation data sets. I get the feeling that Paul isn't doing that in calculating his statistics. I get zero false positives too when I validate against the training data set.
However, on the plus side, the amount of data stored by his system compared to the pure Bayesian one used in AI::Categorize is significantly smaller. So I'll probably switch over to using this one instead.
I'll post some of the code to the SpamAssassin list later today probably, in case someone wants to play with it some more.