Re: What is missing from the beginning of this string?

by Marshall (Abbot)
on Oct 07, 2010 at 22:32 UTC (#864103)


in reply to What is missing from the beginning of this string?

Why don't you just get rid of the stuff in front of the www.foo.com part? I.e., assume it's "bad" and put "http://" in front of what remains? Or for that matter, just leave the http:// off once you've done step (1).

#!/usr/bin/perl -w
use strict;

my @urls = ('tp://www.foo.com/', '://www.foo.com',
            'http//:www.foo.com', 'www.foo.com');

foreach (@urls) {
    s/^.*?www/www/;
    print "http://$_\n";
}
__END__
prints:
http://www.foo.com/
http://www.foo.com
http://www.foo.com
http://www.foo.com

Update: well, this could get more complicated, since a valid URL does not have to start with www. It could be xyz.tv, in which case I guess you would want http://xyz.tv? It helps if you present a representative set of test cases.

It also helps if you can say something about the context of the application. Here I suppose you are trying to "guess" what the user intended by a manually entered URL, and then auto-magically "fix" it? Sometimes it is better to just try what the user entered and, if it doesn't work, present an error message explaining what is acceptable for a URL.
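A rough sketch of that "use it as entered, complain if it's wrong" approach follows. The sub name check_url and the pattern are assumptions for illustration only. This is a loose sanity check, not a full URL validator, and a real application would more likely lean on a CPAN module such as URI.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns undef if the URL looks acceptable,
# otherwise an error message suitable for showing to the user.
# The pattern is deliberately loose: http or https, "://", a dotted
# host name, then optionally a port, path, query or fragment.
sub check_url {
    my ($url) = @_;
    return if $url =~ m{^https?://[\w-]+(?:\.[\w-]+)+(?:[/:?#].*)?$}i;
    return "'$url' is not an acceptable URL; expected something like http://www.foo.com/";
}

foreach my $url ('http://www.foo.com/', 'tp://www.foo.com/', 'www.foo.com') {
    my $error = check_url($url);
    print defined $error ? "ERROR: $error\n" : "OK: $url\n";
}
```

Only the first of the three sample inputs passes. The other two produce the error message rather than being silently rewritten.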

Just another regex example... I'm sure that other monks can provide even better regexes, but specifying the problem as clearly as you can is important.

my @urls = ('tp://www.foo.com/', '://www.foo.com',
            'http//:www.foo.com', 'www.foo.com',
            'xxx.tv', 'http//:xxx.tv', 'tp:xx.tv');

foreach (@urls) {
    s/^(.*?)(\w+\.)/$2/;
    print "http://$_\n";
}
__END__
prints:
http://www.foo.com/
http://www.foo.com
http://www.foo.com
http://www.foo.com
http://xxx.tv
http://xxx.tv
http://xx.tv
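
One caveat about the substitution above: \w does not match a hyphen, so a host such as foo-bar.com would come out as bar.com. Below is a minimal sketch of a variant that allows hyphens in the first label. The sub name normalize_url is made up for illustration, and the test inputs beyond the ones already listed are assumptions. Like the original, it always emits http://, so anything entered as https:// is downgraded.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): strip whatever precedes
# the first "label." token and put a clean http:// in front.
# [\w-] lets a label contain hyphens, which plain \w would not.
sub normalize_url {
    my ($url) = @_;
    $url =~ s/^.*?([\w-]+\.)/$1/;
    return "http://$url";
}

foreach my $url ('tp://www.foo.com/', 'http//:xxx.tv', 'http://foo-bar.com') {
    print normalize_url($url), "\n";
}
```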

Re^2: What is missing from the beginning of this string?
by japhy (Canon) on Oct 08, 2010 at 13:13 UTC
    Your regex solution s/^(.*?)(\w+\.)/$2/ should work perfectly for this.

    The mad scientist in me, though, is still wondering if there's a way to do this sort of thing abstractly: to provide a prefix for a string where the prefix may be only partially present. I'll think about it later. It's Friday.


    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    Nos autem praedicamus Christum crucifixum (1 Cor. 1:23) - The Cross Reference (My Blog)
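
For what it's worth, the abstract version japhy wonders about can be sketched without a regex, under one assumption: whatever part of the prefix is already present is a trailing piece of it (as in 'tp://...' or '://...'). The sub name complete_prefix is hypothetical. It looks for the longest tail of the prefix that the string starts with and prepends only the missing head.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: complete $prefix on the front of $string, assuming
# any part of the prefix already present is a trailing piece of it.
sub complete_prefix {
    my ($prefix, $string) = @_;
    my $len = length $prefix;
    # Try the longest possible overlap first, down to no overlap at all.
    for my $k (reverse 0 .. $len) {
        my $tail = substr($prefix, $len - $k);    # last $k chars of the prefix
        return substr($prefix, 0, $len - $k) . $string
            if index($string, $tail) == 0;        # string starts with that tail
    }
    return $prefix . $string;   # not reached: $k == 0 always matches
}

foreach my $url ('tp://www.foo.com/', '://www.foo.com', 'www.foo.com') {
    print complete_prefix('http://', $url), "\n";
}
```

This leaves an already complete 'http://www.foo.com' alone, but it does not help with scrambled input such as 'http//:www.foo.com', which shares no tail with the prefix and so just gets the whole prefix prepended.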
