PerlMonks  

Avoid a single value in the url

by Anonymous Monk
on Jan 24, 2013 at 09:28 UTC ( [id://1015117]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

How do I modify the regex below?

Regex: https?://(www.)?(abc.com|fdsf.com|3545ab.com|tyrty.com)[^/]*/.*/.*/.*/.*/
URL: https://www.abc.com/gallery/sdasd/sdfdsf/sdfdsf/

If "gallery" comes right after the domain name and a slash, the regex should not match; any string other than "gallery" in that position should match. Please tell me how to modify the above regex.

https?://(www.)?(abc.com|fdsf.com|3545ab.com|tyrty.com)[^/]*/gallery/.*/.*/.*/ should not match
https?://(www.)?(abc.com|fdsf.com|3545ab.com|tyrty.com)[^/]*/news/.*/.*/.*/ should match

Replies are listed 'Best First'.
Re: Avoid a single value in the url
by Anonymous Monk on Jan 24, 2013 at 09:48 UTC

    Please tell how to modify the above regex.

    No thanks, see perlrequick if you're interested

    if ( isBlahGallery( $url ) ) { die "failed it"; }
    sub isBlahGallery { $_[0] =~ m{\.com/gallery}ism; }
Re: Avoid a single value in the url
by ansh batra (Friar) on Jan 24, 2013 at 09:46 UTC

    https:\/\/www\..*\.com\/gallery.*

Re: Avoid a single value in the url
by Anonymous Monk on Jan 24, 2013 at 09:49 UTC
    Is it possible to do it this way? Please correct me if not: https?://(www.)?(abc.com|fdsf.com|3545ab.com|tyrty.com)[^/]*/((?!gallery)).*/.*/.*/.*/
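A negative lookahead can do this in a single pattern. The following is only a sketch under a few assumptions that go beyond the original post: the dots in the host names are escaped, each path segment is written as [^/]+ instead of .*, the match is case-insensitive, and only a first segment that is exactly "gallery" is rejected. The names $url_re and wanted_url are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical pattern: same host list as the question, dots escaped,
# with a negative lookahead guarding the first path segment.
my $url_re = qr{
    \A https?://
    (?:www\.)?
    (?:abc\.com|fdsf\.com|3545ab\.com|tyrty\.com)
    [^/]* /
    (?!gallery/)                       # first segment must not be gallery
    [^/]+ / [^/]+ / [^/]+ / [^/]+ /    # four path segments, as in the sample URL
}xi;

# Returns 1 if the URL has the expected shape and does not start with /gallery/.
sub wanted_url {
    my ($url) = @_;
    return $url =~ $url_re ? 1 : 0;
}

for my $url (
    'https://www.abc.com/gallery/sdasd/sdfdsf/sdfdsf/',
    'https://www.abc.com/news/sdasd/sdfdsf/sdfdsf/',
) {
    printf "%-9s %s\n", ( wanted_url($url) ? 'Match:' : 'No match:' ), $url;
}
```

Because the lookahead includes the trailing slash, a first segment such as "gallery2" would still be accepted; drop the slash from the lookahead if anything beginning with "gallery" should be refused.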
Re: Avoid a single value in the url
by sen (Hermit) on Jan 24, 2013 at 09:54 UTC

    Hi

    Please try this: /https?\:\/\/(www.)?(abc.com|fdsf.com|3545ab.com|tyrty.com)[^\/]*\/[^gallery]\/.*\/.*\/.*\//

    Thanks

      sen:

      No, that won't work. The [^gallery] bit is a character class, so it's saying "... and the single character after the slash after the domain isn't a g, a, l, e, r or y". While there's plenty of magic in regexes to let you solve the problem, I don't bother trying so hard for things like this. If it were me, I'd do it in two steps, like so:

      $ cat t.pl
      use strict;
      use warnings;

      while (<DATA>) {
          if (m".*?/(.*?)/" and $1 !~ m"gallery"i) {
              print "Match: $_";
          }
          else {
              print "No match: $_";
          }
      }
      __DATA__
      foo bar/baz/boffo
      bar foo/gallary/gallery
      bim bam/gallery/blam/blim
      arg blarg/gallery/flarg

      $ perl t.pl
      Match: foo bar/baz/boffo
      Match: bar foo/gallary/gallery
      No match: bim bam/gallery/blam/blim
      No match: arg blarg/gallery/flarg

      As you can see, I first check to see if it matches the overall form of the URL and capture the contents of the first thing between slashes. The second part of the if statement checks to see whether the thing I captured matched the string I don't want to see.
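The same two-step idea could be applied to the URLs from the original question roughly as follows. This is a sketch, not a drop-in answer: the escaped dots, the [^/]+ capture, and the names $shape_re and first_segment_ok are assumptions for illustration, and the second step here rejects only a segment that is exactly "gallery" (case-insensitively) instead of any segment containing it.

```perl
use strict;
use warnings;

# Step 1: match the overall URL shape and capture the first path segment.
my $shape_re = qr{\Ahttps?://(?:www\.)?(?:abc\.com|fdsf\.com|3545ab\.com|tyrty\.com)[^/]*/([^/]+)/};

# Returns 1 when the URL has the expected shape and its first path
# segment is something other than "gallery"; returns 0 otherwise.
sub first_segment_ok {
    my ($url) = @_;
    return 0 unless $url =~ $shape_re;
    # Step 2: inspect the captured segment separately.
    return lc($1) eq 'gallery' ? 0 : 1;
}

for my $url (
    'https://www.abc.com/gallery/sdasd/sdfdsf/sdfdsf/',
    'https://www.abc.com/news/sdasd/sdfdsf/sdfdsf/',
) {
    print( ( first_segment_ok($url) ? 'Match: ' : 'No match: ' ), $url, "\n" );
}
```

Keeping the exclusion in ordinary Perl makes it easy to grow the rule later, for example by checking the captured segment against a hash of forbidden names.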

      Also, to make the regex look a little less confusing, I used the m"regex" form so I wouldn't need all the backslashes before the forward slashes.
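To illustrate the point about delimiters, here is a small sketch showing the same pattern written three ways. The sample string and variable names are made up; the only claim is that the three forms capture the same text.

```perl
use strict;
use warnings;

my $path = 'bam/gallery/blam';

# Default delimiters: every literal slash needs a backslash.
my $with_slashes = $path =~ /.*?\/(.*?)\// ? $1 : undef;

# Double quotes as delimiters, as in the script above: no escaping needed.
my $with_quotes = $path =~ m".*?/(.*?)/" ? $1 : undef;

# Braces are another common choice and nest cleanly.
my $with_braces = $path =~ m{.*?/(.*?)/} ? $1 : undef;

print "$with_slashes $with_quotes $with_braces\n";
```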

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.
