PerlMonks  

RFC: Module to take a title and generate the core bit of the URL

by SilasTheMonk (Chaplain)
on Jul 15, 2010 at 10:37 UTC (#849742=perlquestion)
SilasTheMonk has asked for the wisdom of the Perl Monks concerning the following question:

I need a good Perl module to convert an article title to a nice URL string. For example:
'I am a mean perlmonk.' => 'i-am-a-mean-perlmonk',
'RFC: Module to take a title and generate the core bit of the URL' => 'rfc-module-to-take-a-title-and-generate-the-core-bit-of-the-url',
'I have deep-seated emotional problems!' => 'i-have-deep-seated-emotional-problems',
'Sacrificing minions: is there any problem it can't solve?' => 'sacrificing-minions-is-there-any-problem-it-cant-solve'
I find it hard to believe that nothing on CPAN attempts to solve this problem. The following code does a reasonable job on the easy cases:

    $title =~ s{\s}{_}g;
    $title =~ s{\/}{_}g;
    $title =~ s{\W}{}g;
    $title =~ s{_}{-}g;
    return lc($title);

However, it needs to do a lot better. It falls over on entities such as '’'. It needs to know about UTF-8 and be able to distinguish between exotic letters and exotic punctuation. In some cases it should give up and throw an exception. Does anyone know a module that already does this? I have tried searching CPAN but found nothing.
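For illustration only, here is a minimal sketch of what such a routine might look like. It is not taken from any CPAN module: the name title_to_slug is hypothetical, and the policy choices (strip accents via NFKD, drop apostrophes, die on anything that cannot be reduced to ASCII) are assumptions about the requirements above rather than settled behaviour. It uses only the core Unicode::Normalize module and expects an already-decoded character string, not raw UTF-8 bytes or HTML entities.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFKD);    # core module

# Hypothetical helper, not an existing CPAN API.
# Expects a decoded character string (not raw UTF-8 bytes).
sub title_to_slug {
    my ($title) = @_;

    my $slug = NFKD($title);                 # split accented letters into base letter + combining mark
    $slug =~ s/\p{Mn}+//g;                   # drop the combining marks ("e" + acute becomes "e")
    $slug =~ s/['\x{2018}\x{2019}]//g;       # drop straight and curly apostrophes ("can't" becomes "cant")
    $slug =~ s/[^\p{L}\p{N}]+/-/g;           # any run of non-letters/non-digits becomes one hyphen
    $slug =~ s/^-+|-+$//g;                   # trim leading and trailing hyphens
    $slug = lc $slug;

    # Assumed policy: give up rather than emit an empty or non-ASCII slug.
    die "Cannot build a slug from this title: nothing left\n"
        unless length $slug;
    die "Cannot build a slug from this title: non-ASCII letters remain\n"
        if $slug =~ /[^a-z0-9-]/;

    return $slug;
}

print title_to_slug('I am a mean perlmonk.'), "\n";
print title_to_slug("Sacrificing minions: is there any problem it can't solve?"), "\n";
print title_to_slug("Caf\x{e9} \x{2019}quoted\x{2019} title"), "\n";
```

Titles in scripts that have no ASCII decomposition (Cyrillic or Greek, say) would hit the second die; whether to transliterate them instead is a separate design decision that this sketch leaves open.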

Re: RFC: Module to take a title and generate the core bit of the URL
by Corion (Pope) on Jul 15, 2010 at 10:55 UTC

    I'm not sure how well it fits your specs (for example, it doesn't downgrade UTF-8 to its ASCII equivalents), but String::Dirify claims to come from the Movable Type codebase. I'm not aware of any other module and wrote my own ad-hoc solution, so if String::Dirify doesn't suit your needs, writing your own module may not be out of the question.

      Thanks. That was quick. It looks good enough, though I am not quite ready to try it out. I guess if I find a real issue with it I can submit it as a change request.

      Edit: I am trying it out now. So far it is doing more good than harm. ;-)
