If you're already storing the unique URL in a database, why not use a unique key from the db for the URL portion? That would let you turn
my $string = '/h/plkr/3/www.plkr.org/rss.pl';
into
my $string = '/h/plkr/3/426933';
where 426933 would be the unique ID for retrieving that URI from the database. It has the added bonus of letting the end user tweak things like depth and output format without having to decode/re-encode the string.
If you really want to encode the string as-is, without having to store anything in a DB, you might be able to use something like Algorithm::Huffman in CPAN to generate a short bit vector for your data, and then use some other method (encode_base64 or somesuch) to make it safe for use in a URL.
You'll need to grab a good sample of the sort of strings you want to encode, and do a frequency count on them so that Algorithm::Huffman can build up a good dictionary.
Here's an untested example:
my $token_counts = {
'/' => 10000,
'.' => 3000,
'w' => 2342,
'e' => 2000,
't' => 2000,
# ... etc, for any character that might appear in the input strin
+g
};
my $string = '/h/plkr/3/www.plkr.org/rss.pl';
my $huff = Algorithm::Huffman->new($token_counts);
my $huff_encoded = $huff->encode($string);
my $printable = encode_base64($huff_encoded);
I have no idea what sort of size savings this will get you, though. Base-64 encoding costs you one character of output for every 6 bits of input, so as long as your Huffman encoding saves you more bits than Base 64 encoding wastes, you should end up with a URL-safe encoding that's shorter than the input.