Re^5: Grab input from the user and Open the file

if (m/^(((ht|f)tp(s?))\://)?(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]+\.(com|edu|gov|mil|net|org|biz|info|name|museum|us|ca|uk)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&%\$#\=~_\-]+))*$/){
    my($host,$path) = ($4,$5);
    print "$host => $path\n";
}

Some thoughts:

You cannot use a m// delimiter character (e.g., / forward slash) within a regex without escaping it. Your quoted regex
m/^(((ht|f)tp(s?))\://)?...(/($|[a-z...]+))*$/
has / delimiter characters within it. I have rewritten the regex below with my favorite, balanced curlies, as delimiters.
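To illustrate the delimiter point, here is a small sketch. The sample string in $url and the simplified pattern are made up for the example, not taken from your code. The same pattern is written once with slash delimiters, where every literal / must be escaped, and once with balanced curlies, where none need escaping:

```perl
use strict;
use warnings;

# Hypothetical sample input, only for illustration.
my $url = 'http://www.example.com/index.html';

# With / as the delimiter, every literal slash must be escaped.
my $with_slashes = $url =~ m/^https?:\/\/[^\/]+\// ? 1 : 0;

# With balanced curlies as the delimiter, slashes need no escaping.
my $with_curlies = $url =~ m{^https?://[^/]+/} ? 1 : 0;

print "slashes: $with_slashes, curlies: $with_curlies\n";
```

Both forms should match the sample string; the second is simply easier to read.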
The /x regex modifier and whitespace are your friends (see Modifiers in perlre). As it stands, your regex is almost unreadable (at least to me), and would be even if it were correct. Here is the regex rewritten to be syntactically (but not necessarily semantically) correct:
```
if (m{    
    ^ (  # open capturing group 1 (was unbalanced)
        ((ht|f)tp(s?) \: //)?  # removed close paren after (s?)
        (www.|[a-zA-Z].) 
        [a-zA-Z0-9\-\.]+ \.
        (com|edu|gov|mil|net|org|biz|info|name|museum|us|ca|uk)
        (\:[0-9]+)* 
        (/($|[a-zA-Z0-9\.\,\;\?\'\\\+&amp;%\$#\=~_\-]+))*
    ) $  # (maybe?) added for balance: close capture group 1
    }x
    ) {
    my($host,$path) = ($4,$5);
    print "$host => $path\n";
    }
```
I can identify parts of this regex, but what, for instance, is (www.|[a-zA-Z].) ('www' followed by any character, or else a single alpha character followed by any character) supposed to match in a URL? (This is immediately followed by [a-zA-Z0-9\-\.]+ \. without any delimiter.)
In (com|...|museum|us|ca|uk) the us ca uk bit is suspicious. These are (I think) country codes, and should appear as part of a ccTLD like www.bbc.co.uk or www.what.ever.ac.ca and not on their own, as your regex allows. (Again, I'm sure CPAN has a module with regexes for matching URLs.)
Is (/($|[a-zA-Z0-9\.\,\;\?\'\\\+&amp;%\$#\=~_\-]+))* really supposed to contain &amp;? (This may be an artifact of PerlMonks site rendering.)
Other parts I just can't figure out. (This may simply be due to ignorance on my part.)
A nit: Non-capturing groups (see (?:pattern) in Extended Patterns in perlre) are your friends. Entirely too much stuff is captured unnecessarily in this regex for my taste.
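A rough sketch of what I mean by non-capturing groups. This is a deliberately simplified pattern of my own, not a corrected version of your regex, and the sample URL is made up. Only the host and the path are captured, so they land in capture groups 1 and 2 instead of somewhere around 4 and 5:

```perl
use strict;
use warnings;

# Hypothetical sample input, only for illustration.
my $url = 'https://www.example.org:8080/docs/index.html';

my ($host, $path);
if ($url =~ m{
        ^
        (?: (?:ht|f)tps? :// )?   # optional scheme, grouped but not captured
        ( [a-zA-Z0-9.-]+ )        # capture 1: host name
        (?: : [0-9]+ )?           # optional port, not captured
        ( / .* )?                 # capture 2: path, if any
        $
    }x) {
    # The path group is optional, so fall back to an empty string.
    ($host, $path) = ($1, $2 // '');
    print "$host => $path\n";
}
```

With the grouping-only parts written as (?:...), adding or removing an optional piece no longer renumbers the captures you actually use.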
A similar nit: Far too many characters are escaped unnecessarily; not every non-alphanumeric needs escaping.
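A quick sketch of the escaping point, using a cut-down character class of my own (not your full one) and made-up sample strings. Inside a bracketed class, characters such as . , ; ? = are already literal, and a hyphen placed last is literal too, so the two classes below should accept and reject exactly the same strings:

```perl
use strict;
use warnings;

# Over-escaped, in the style of the original character class.
my $escaped = qr/^[a-zA-Z0-9\.\,\;\?\=_\-]+$/;

# Same class without the backslashes; the hyphen is literal because it is last.
my $plain   = qr/^[a-zA-Z0-9.,;?=_-]+$/;

# Hypothetical sample strings, only for illustration.
my @samples = ('a.b,c;d?e=f_g-h', 'index.html', 'has space', 'a/b');

my $same = 1;
for my $s (@samples) {
    my $e = $s =~ $escaped ? 1 : 0;
    my $p = $s =~ $plain   ? 1 : 0;
    $same = 0 if $e != $p;
    printf "%-16s escaped=%d plain=%d\n", $s, $e, $p;
}
print $same ? "both classes agree\n" : "classes differ\n";
```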

Bottom line: I doubt this regex would do what you intend even if it compiled (and the rewritten version is at least syntactically correct). Yet again: CPAN.
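One candidate I would look at is the URI module. That is my suggestion and an assumption that it fits your case; it is not a core module, so this sketch assumes it is installed, and the sample URL is made up. It pulls out a host and a path without any hand-written regex:

```perl
use strict;
use warnings;
use URI;    # CPAN module; not in core, assumed to be installed

# Hypothetical sample input, only for illustration.
my $uri = URI->new('http://www.example.com:8080/some/path.html');

# host() and path() are available for URLs with a server part (http, https, ftp).
my ($host, $path) = ($uri->host, $uri->path);
print "$host => $path\n";
```

This does not validate the TLD the way your regex tries to, so whether it is enough depends on what you actually need to check.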

Give a man a fish: <%-{-{-{-<
