
regexp issue: Porting script from 5.6.1 to 5.005_02

by vladb (Vicar)
on Aug 10, 2004 at 01:00 UTC ( #381417=perlquestion )
vladb has asked for the wisdom of the Perl Monks concerning the following question:

Good monks, Once again I apologize for disturbing your heavenly peace ;-). I come to you with this question:

I wrote a cgi script to emulate SSI behaviour and it works fine for the latest versions of perl (5.6.1+). Now I'm having to port this script to run under perl 5.005_02. I know this one is outrageously old, but one particular server I have to run the cgi on doesn't run any better ;(.

The cgi in question is (

While porting, I ran into this one problem where a regular expression would return 1 (true) on an empty string, whereas in perl 5.6.1 it returns 0, as is expected. For example,

    . . .
    my $query = undef;
    if ($query =~ /^(([\w\d\-\_]+)[:,]|)(.*)$/) {
        # we get here!
        print "$`|$1|$2|$3|$4|$'|$query|";
    } else {
        exit;
    }
    . . .
Any idea why the regexp operation evaluates to 1? There should be absolutely no match... After all, the query variable is undefined.

Your help is much appreciated ;-)

Update: Gush... this was an obviously stupid question. Sorry for the trouble. I'll get on with porting (occasional brain farts are not pleasant, especially when exposed) ;-p

"I'm always right and I can prove it, because to the best of my knowledge, I've never been wrong."

Replies are listed 'Best First'.
Re: regexp issue: Porting script from 5.6.1 to 5.005_02
by hv (Parson) on Aug 10, 2004 at 01:37 UTC

    I'm confused: the regexp you show should match an empty string. It looks like this:

    /
        ^                    # from the start
        (                    # match
          ( [\w\d\-\_]+ )    #   one or more of [class]
          [:,]               #   followed by a delimiter
        |                    # or match
                             #   nothing
        )                    # then
        ( .* )               # match zero or more characters
        $                    # .. until we reach end of string
    /x

    This will match against any string that doesn't have newlines in it, and since undef evaluates to the empty string in a string context, it will match (taking advantage of the "or match ... nothing" option).
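To make the point above concrete, here is a minimal sketch of one way to keep an undefined or empty query out of the match altogether. The helper name `parse_query` is hypothetical and not part of the original script; the pattern is the one from the question. It avoids `use warnings` so that it stays within what a 5.005-era perl is assumed to accept.

```perl
use strict;

# Same pattern as in the question, compiled once.
my $re = qr/^(([\w\d\-\_]+)[:,]|)(.*)$/;

# Hypothetical helper: returns the three captures, or an empty list
# when the query is undefined or empty, so undef is never matched.
sub parse_query {
    my ($query) = @_;
    return () unless defined $query && length $query;
    return () unless $query =~ $re;
    return ($1, $2, $3);
}

# Show which inputs reach the match and which are skipped by the guard.
for my $q (undef, '', 'name:value', 'plain text') {
    my @parts = parse_query($q);
    my $label = defined $q ? "'$q'" : 'undef';
    print "$label => ", (@parts ? 'matched' : 'skipped'), "\n";
}
```

The guard is a design choice, not a regex fix: the pattern itself still accepts the empty string on every perl version, so the caller has to decide explicitly what an empty query means.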

    I don't have perl5.005_02 here, but I tried this:

    perl -wle 'print "ok" if undef =~ /^(([\w\d\-\_]+)[:,]|)(.*)$/'
    .. against each of 5.004, 5.004_05, 5.005_03 and 5.6.1 and it printed the expected warning and the expected "ok" in each case.

    Can you come up with a short example like mine that prints "ok" on one of those perl builds, but not on the other?


Re: regexp issue: Porting script from 5.6.1 to 5.005_02
by PodMaster (Abbot) on Aug 10, 2004 at 01:24 UTC
    my $query = undef;
    if ($query =~ /^(([\w\d\-\_]+)[:,]|)(.*)$/) {
        # we get here!
        print "$`|$1|$2|$3|$4|$'|$query|";
    } else {
        exit;
    }
    __END__
    |||||||

    C:\>perl -v
    This is perl, v5.8.4 built for MSWin32-x86-multi-thread
    nothing matches nothing or nothing :)
    C:\>perl -le"print 1 if undef =~ //"
    1
    C:\>perl -le"print 1 if undef =~ /|/"
    1
    C:\>perl -le"print 1 if undef =~ /blahblah|/"
    1
    What you say happens with 5.6.1 sounds questionable (I haven't checked the perlbug database).

    MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
    I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
    ** The third rule of perl club is a statement of fact: pod is sexy.

Re: regexp issue: Porting script from 5.6.1 to 5.005_02
by etcshadow (Priest) on Aug 10, 2004 at 01:22 UTC
    An empty string should satisfy that regex. The second half, (.*), should obviously be satisfied by an empty string... and the first half (you may have to look a little) should also be fine on an empty string... (bunch-of-stuff|). Look at that... a bunch of stuff OR nothing at all. So, /^$nothing_ok$nothing_also_ok$/. Why on earth that evaluated to FALSE on later perls is a better question!
    ------------ :Wq Not an editor command: Wq
Re: regexp issue: Porting script from 5.6.1 to 5.005_02
by jryan (Vicar) on Aug 11, 2004 at 00:29 UTC

    I know you already have your question answered, but just a small note: \d and _ are in \w, so [\w\d\-\_] is just the same thing as [\w-].
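A quick way to check that claim, as a sketch: compare the two character classes against every single-byte character and collect any that disagree. This only covers code points 0 to 255 and assumes no locale or Unicode settings are in effect.

```perl
use strict;

# Collect every single-byte character on which the two classes disagree.
my @differ = grep {
    my $c = chr $_;
    ($c =~ /^[\w\d\-\_]$/ ? 1 : 0) != ($c =~ /^[\w-]$/ ? 1 : 0);
} 0 .. 255;

print @differ ? "classes differ\n" : "classes agree\n";
```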

Node Type: perlquestion [id://381417]
Approved by etcshadow