My experiments with Bayesian filtering were a wash; after training ifile on my entire very large corpus of mail, I found that I had to continually go through the whole spam bin for false positives.

I did the same thing when I first tried Bayesian filtering, but that's not the way to get the best results out of it. The filter ends up more accurate if you simply correct its mistakes as they occur than if you preload it with an existing corpus.
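To make "correct its mistakes as they occur" concrete, here is a minimal sketch of a train-on-error loop. It is not ifile's actual algorithm or interface. The names TinyBayes and train_on_error are made up for illustration, and the classifier is a toy naive Bayes with add-one smoothing and whitespace tokenizing, all of which are assumptions.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy two-class naive Bayes filter (hypothetical, for illustration only)."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.msgs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Record the message's words under the given label.
        self.words[label].update(text.lower().split())
        self.msgs[label] += 1

    def classify(self, text):
        # Vocabulary size (+1 for unseen words) used by add-one smoothing.
        vocab = len(set(self.words["spam"]) | set(self.words["ham"])) + 1
        total_msgs = sum(self.msgs.values())
        scores = {}
        for label in ("ham", "spam"):
            # Smoothed class prior plus smoothed per-word log likelihoods.
            score = math.log((self.msgs[label] + 1) / (total_msgs + 2))
            n = sum(self.words[label].values())
            for w in text.lower().split():
                score += math.log((self.words[label][w] + 1) / (n + vocab))
            scores[label] = score
        # Ties go to "ham" so an untrained filter never hides mail.
        return "spam" if scores["spam"] > scores["ham"] else "ham"


def train_on_error(filt, stream):
    """Classify each message as it arrives; train only when the guess was wrong."""
    corrections = 0
    for text, actual in stream:
        if filt.classify(text) != actual:
            filt.train(text, actual)
            corrections += 1
    return corrections
```

In use, stream would be whatever (text, label) pairs you produce by refiling mail by hand; the filter starts empty and only learns from the messages it got wrong, rather than from a whole archive fed in up front.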

There's much more information about Bayesian filtering at Paul Graham's site.