
Re^3: Bolding search terms ... which might be URLs?

by mr_mischief (Monsignor)
on Mar 11, 2011 at 08:59 UTC (#892627)

in reply to Re^2: Bolding search terms ... which might be URLs?
in thread Bolding search terms ... which might be URLs?

Thanks for the pointers. Those still cannot deal with every case properly for arbitrary text. It's not a matter of getting the code right. It's a matter of there being too little information in the arbitrary text to be sure how to mark it up.

A valid URI can easily end with a comma, semicolon, colon, question mark, or period. That is often not the URI the writer intended, though, because people put ordinary English punctuation right up against their URIs without separating the two. In some cases the URI with the trailing character and the URI without it are both valid, and the difference between them matters.
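To make the ambiguity concrete, here is a minimal sketch in Python (the same idea carries over to a Perl regex). The pattern is deliberately loose and is my own assumption for illustration, not an RFC grammar and not what any of the modules mentioned in this thread do. The name candidate_uris and the example.com addresses are hypothetical. For each URI-like run it returns both readings, the one as written and the one with trailing punctuation stripped, because the text alone cannot say which was meant.

```python
import re

# Characters that are legal at the end of a URI but are also common
# English punctuation (the ones named in the post above).
TRAILING = ",;:?."

# Assumption: a loose "scheme://non-space-run" pattern, good enough to
# show the problem. It is not a full URI grammar.
URI_RE = re.compile(r"\b(?:https?|ftp)://\S+")


def candidate_uris(text):
    """Return a list of (as_written, stripped) pairs for each URI-like run.

    Both members of a pair can be valid URIs. The function does not
    guess which one the writer intended.
    """
    pairs = []
    for match in URI_RE.finditer(text):
        as_written = match.group(0)
        # Strip every trailing character that could be sentence punctuation.
        stripped = as_written.rstrip(TRAILING)
        pairs.append((as_written, stripped))
    return pairs


if __name__ == "__main__":
    sample = "See http://example.com/a?b=1. Or try http://example.com/search?"
    for as_written, stripped in candidate_uris(sample):
        print(repr(as_written), "or", repr(stripped))
```

Whichever member of each pair you link, you are guessing. Stripping is the usual heuristic, but it is wrong whenever the trailing character really was part of the URI.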

The manual for the first one you list punts on non-Latin characters, too. Regexp::Common::URI::ftp's docs state that there's no well-defined standard across the RFCs for an FTP URI. You can get closer and closer, but you're just not going to get 100%. The only way to be sure you've marked something up entirely properly with URIs is to visit the URI and make sure the expected content is delivered.
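For what "visit the URI" might look like in practice, here is a rough sketch, again in Python and again under assumptions of my own. The names http_responds and first_reachable are hypothetical, and the check only covers HTTP-style URIs that answer a HEAD request. It tells you that something responded, not that the expected content was delivered, so it narrows the guess rather than settling it.

```python
from urllib.error import URLError
from urllib.request import Request, urlopen


def http_responds(uri, timeout=5):
    """Return True if a HEAD request to uri gets a 2xx or 3xx status.

    Assumption: a successful status is treated as "reachable". It says
    nothing about whether the content is what the writer meant.
    """
    try:
        with urlopen(Request(uri, method="HEAD"), timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (URLError, ValueError, OSError):
        # Covers HTTP errors, unknown schemes, malformed URIs, timeouts.
        return False


def first_reachable(candidates, is_reachable=http_responds):
    """Return the first candidate that passes is_reachable, else None.

    is_reachable is injectable so the selection logic can be exercised
    without touching the network.
    """
    for uri in candidates:
        if is_reachable(uri):
            return uri
    return None
```

Note that if both readings of an ambiguous URI respond, the order of the candidates decides the outcome, which is the same guess as before moved to a different place.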

According to the RFCs, a URI such as does not necessarily even need to redirect to the resource if the owner of the site doesn't wish it to. With arbitrary text and no markup, you just can't be sure that you are introducing links correctly every time.

Re^4: Bolding search terms ... which might be URLs?
by Cody Fendant (Friar) on Mar 13, 2011 at 23:29 UTC
    Thanks very much indeed for that work, Mr Mischief. I appreciate it hugely. Sorry I haven't been back to this thread for a while. You've been really helpful. For what it's worth, my users are very unlikely to post edge-case URLs like the ones discussed here, or non-ASCII domain names.
