It is exceedingly unlikely that the outcome of this thread is that someone will take it upon themselves to put a fast btree implementation on CPAN which meets all your requirements.
Seriously! I mean you'd need someone with XS expertise who also happens to know about Red/Black trees, and who reads this forum. And unless you're paying them it would need to be a weird masochist who actually enjoys XS, or maybe someone who finds systems-level work cathartic after a long multi-week stretch of web-app tedium. Or maybe if they just really love the Red/Black algorithm and have their own Red/Black implementation in C that they've been porting from project to project since college. Maybe if they had finished a big refactor of the code a few months ago but hadn't gotten to use it in a project yet, that could be enough motivation.
Congratulations, the stars aligned.
Here's your module. Tree-RB-XS-0.00_01.tar.gz
(not visible on metacpan.org yet)
You get the distinct privilege of reporting the first bugs in a recently-refactored pointer-math-heavy C library wrapped with some creative new XS ideas.
Update: I added a new KEY_TYPE_BSTR that copies the keys from Perl into plain buffers and compares them with memcmp. It's a good percentage faster, at the expense of incorrect Unicode sorting, so it's mainly useful if your strings are ASCII.
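To make the trade-off concrete, here is a minimal sketch of a byte-wise compare of the kind described above. It is not the module's source: bstr_compare and its signature are hypothetical names made up for illustration, and the only thing assumed is that keys are held as a pointer plus a length.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT the module's actual code: order two byte
 * buffers by raw unsigned byte value. When one buffer is a prefix of
 * the other, the shorter one sorts first. Returns -1, 0, or 1. */
static int bstr_compare(const char *a, size_t alen,
                        const char *b, size_t blen)
{
    size_t common = alen < blen ? alen : blen;
    int cmp = memcmp(a, b, common);   /* compares bytes as unsigned char */
    if (cmp != 0)
        return cmp < 0 ? -1 : 1;
    return (alen > blen) - (alen < blen);
}
```

With a compare like this, "Zebra" sorts before "apple" (0x5A is less than 0x61), and a UTF-8 encoded "é" (bytes 0xC3 0xA9) sorts after "z". That is byte order rather than anything language-aware, which is why it is only a safe bet for plain ASCII keys.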
Update: I finished off most of the Tree::RB API, polished up the documentation, and gave it an official release. It also now has custom XS compare functions to choose from, such as CMP_NUMSPLIT, which can handle the fairly common scenario of a mixture of numbers and strings in the key.
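For anyone wondering what a compare for mixed number/string keys might look like, here is a rough sketch in C. It is an assumption about the general idea (split each key into digit runs and non-digit runs, compare digit runs by numeric value and everything else byte by byte), not the actual CMP_NUMSPLIT implementation, whose exact rules may differ. The name numsplit_compare is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, NOT the module's CMP_NUMSPLIT code: compare two
 * NUL-terminated keys, treating runs of ASCII digits as unsigned
 * integers and all other characters as raw bytes. Returns -1, 0, or 1. */
static int numsplit_compare(const char *a, const char *b)
{
    while (*a && *b) {
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
            size_t alen = 0, blen = 0;
            int cmp;

            /* Leading zeros do not change the value, so skip them.
             * Side effect: "a007" and "a7" compare as equal. */
            while (*a == '0') a++;
            while (*b == '0') b++;

            /* Measure the remaining digit runs. */
            while (isdigit((unsigned char)a[alen])) alen++;
            while (isdigit((unsigned char)b[blen])) blen++;

            /* More significant digits means a larger number. */
            if (alen != blen)
                return alen < blen ? -1 : 1;

            /* Same digit count: byte order equals numeric order. */
            cmp = memcmp(a, b, alen);
            if (cmp != 0)
                return cmp < 0 ? -1 : 1;

            a += alen;
            b += blen;
        } else {
            if (*a != *b)
                return (unsigned char)*a < (unsigned char)*b ? -1 : 1;
            a++;
            b++;
        }
    }
    /* One key is a prefix of the other: the shorter key sorts first. */
    if (*a) return 1;
    if (*b) return -1;
    return 0;
}
```

Under these assumptions "item2" sorts before "item10", whereas a plain strcmp puts "item10" first because '1' is less than '2'. Comparing digit runs by length and then by bytes avoids converting to an integer type, so long digit strings cannot overflow.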