
Regex Query

by DanielSpaniel (Scribe)
on Aug 26, 2013 at 20:41 UTC
DanielSpaniel has asked for the wisdom of the Perl Monks concerning the following question:


I'm trying to create what I thought would be a rather simple regex, but I'm having all kinds of problems with it (due in part, maybe, to an absence from Perl for a while).

Anyway, I'm trying to identify, and then alter, URLs in given strings. The URLs and the strings vary daily, and so does the quality of their formatting. The URLs could be anything at all, but they are just plain URLs (i.e. no HTML tags).

There may be more than one URL in a string, and the strings may contain both http and/or https URLs.

The URLs might be followed by any character, so it's not easy to work out where a URL ends. The character following the URL could just as easily be a misplaced quotation mark which doesn't even belong there, or it could be a space, a newline character, etc.

For example, a string might look like any of these (among other possibilities):

a) black and white stuff" blah blah
b) rain in spain blah blah
c); test this
d) Just testing ... #goodtimes
e) Super dooper. Looks nice! /

I've played with numerous variations of this regex, but the latest incarnation, which doesn't really work very well, is below:

$string=~s#http://(.*)(\s)#<a href=$1">http://$1</a>$2#g;

As can be seen, I'm trying to wrap each URL in the string in anchor tags to create a proper link. The regex above works for very simple examples, but nothing more complex; i.e. it would work on example (d) above, but nothing else.
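A minimal sketch of why the pattern above breaks down, plus one possible alternative. The sample string and its example.com / example.org URLs are placeholders invented for illustration, and the "stop at whitespace, quotes or angle brackets, then hand back trailing punctuation" rule is an assumed heuristic, not a full URL grammar.

```perl
use strict;
use warnings;

# Hypothetical sample text; the example.com/example.org URLs are placeholders.
my $string = qq{see http://example.com/a" and https://example.org/b?x=1, then stop.\n};

# Original shape of the pattern: the greedy (.*) runs to the LAST whitespace
# on the line, so both URLs and the text between them land in one capture.
my ($greedy) = $string =~ m#http://(.*)(\s)#;

# Sketch of an alternative: match either scheme, stop at whitespace, quotes,
# or angle brackets, then let backtracking hand back trailing punctuation
# (the negative lookbehind rejects a match that ends in . , ; : ! ? or ')').
(my $linked = $string) =~ s{(https?://[^\s"'<>]+)(?<![.,;:!?)])}{<a href="$1">$1</a>}g;

print $linked;
```

With this input the stray quotation mark and the comma stay outside the generated links. The trade-off is that a URL which legitimately ends in one of the excluded characters would be cut short, so the character lists would need tuning against real data.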

Any assistance would be much appreciated!

Replies are listed 'Best First'.
Re: Regex Query
by rminner (Hermit) on Aug 26, 2013 at 23:43 UTC
    perhaps Regexp::Common is what you are looking for:
    use strict;
    use warnings;
    use Regexp::Common qw /URI/;

    my $http_and_https = qr{$RE{URI}{HTTP}{-scheme=>'https?'}};
    while (my $line = <DATA>) {
        while ($line =~ m#($http_and_https)#gc) {
            print $1 , "\n";
        }
    }
    __DATA__
    a) black and white stuff" blah blah
    b) rain in spain blah blah
    c); test this
    d) Just testing ... #goodtimes
    e) Super dooper. Looks nice! /

      Hey, thanks very much for the suggestion.

      I'd not really planned on using a module to help with it, but it makes no difference if I do, so I'll give that a shot as soon as I get a chance and report back!

      Thank you again, I'm sure I can work with your suggestion.

Node Type: perlquestion [id://1051009]
Front-paged by Corion