
Re^5: Grab input from the user and Open the file

by haukex (Archbishop)
on Nov 28, 2016 at 10:38 UTC


in reply to Re^4: Grab input from the user and Open the file
in thread Grab input from the user and Open the file

Hi valerydolce,

Designing your own regex is certainly very good practice. For a production system, though, I'd recommend existing modules: Regexp::Common to find URIs and URI to parse them. For example:

use warnings;
use strict;

my $str = <<'END_STR';
I am an example http://www.perlmonks.org/?parent=1176663;node_id=3333 text that contains
<https://perlmonks.pair.com/?node_id=1176651> two URIs
END_STR

use Regexp::Common qw/URI/;
use URI;

while ($str=~/$RE{URI}{-keep}/g) {
    my $uri = URI->new($1);
    print "$uri\n";
    print "  Scheme: ", $uri->scheme, "\n";
    print "  Host:   ", $uri->host,   "\n";
    print "  Path:   ", $uri->path,   "\n";
    print "  Query:  ", $uri->query,  "\n";
}

See the URI documentation for lots more ways to access the different parts of the URI. I did notice that unfortunately Regexp::Common apparently doesn't match the #fragment part of the URI, so here's an attempt at an alternate solution, using a regex based on the characters allowed in URIs from RFC 3986.

# NOTE this is based on a quick skim of RFC 3986 and may not be complete!
my $url_re = qr{
    # https://tools.ietf.org/html/rfc3986#section-2
    # URI = scheme ":" hier-part ...; hier-part = "//" ...
    # scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    [A-Za-z][A-Za-z0-9+\-.]* ://
    # gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
    # sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
    #            / "*" / "+" / "," / ";" / "="
    # unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    ( [:/?#\[\]@!\$&'()*+,;=A-Za-z0-9\-._~]
    # pct-encoded = "%" HEXDIG HEXDIG
    | %[0-9A-Fa-f]{2} )*
}x;

while ($str=~/($url_re)/g) {
    my $uri = URI->new($1);
    print "$uri\n";
}
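If you want to see the individual pieces, including the #fragment, without installing anything, RFC 3986 itself gives a component-splitting regex in Appendix B. Here is a minimal sketch using only core Perl. The helper name split_uri and the example URI are my own inventions for illustration, and this only splits a string into parts; it does not validate that the string is a well-formed URI.

```perl
use warnings;
use strict;

# Component-splitting regex taken from RFC 3986, Appendix B.
# Every group is optional, so it matches any string; it does NOT validate.
my $split_re = qr{^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?}s;

# split_uri is a hypothetical helper name, not part of any module.
# Returns a hashref; components that are absent are undef.
sub split_uri {
    my ($uri) = @_;
    my %part;
    # Capture groups $2, $4, $5, $7 and $9 hold the five components.
    @part{qw/scheme authority path query fragment/}
        = ($uri =~ $split_re)[1, 3, 4, 6, 8];
    return \%part;
}

# Made-up example URI with a query and a fragment.
my $parts = split_uri('http://www.example.com/a/b?x=1;y=2#top');
for my $name (qw/scheme authority path query fragment/) {
    printf "%-10s %s\n", "$name:", $parts->{$name} // '(none)';
}
```

For real work I'd still hand the matched string to URI->new as above, since it also takes care of things like escaping and normalization.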

Hope this helps,
-- Hauke D
