PerlMonks  

Regexp: not what I'd expected

by greenhorn (Sexton)
on Jul 04, 2000 at 05:06 UTC ( #20967=perlquestion )

greenhorn has asked for the wisdom of the Perl Monks concerning the following question:

I thought I had some proficiency with regular expressions...until I met perl...

The goal was a substitution in which literal square brackets were to be part of the replacement text and in which certain variables would be used in the replacement as well. I had assumed that the brackets would be treated as literals. Wrong. Following are examples of the kinds of results I got:

$s = 'now is the time for all good etc.';
$bef = 'before';
$mid = 'middle';
$aft = 'after';

Regexp:                  result (was it expected?)
s/.+/$bef$mid$aft/:      beforemiddleafter (expected result, if no brackets)
s/.+/$bef[$mid$aft/:     (SYNTAX ERROR: "scalar found where operator expected"...)
s/.+/[$bef]$mid$aft/:    [before]middleafter (expected)
s/.+/$bef[$mid]$aft/:    after (huh?)
s/.+/$bef$mid[$aft]/:    before (whaa?)

So much for proficiency. I had been aware that not every possible
sort of text is taken literally on the "replace-with" side of the substitution,
but I was unprepared for the above results. I need to do some more reading.
Which of the sundry books on perl would provide the best information about
what is happening in these kinds of substitutions?
T.I.A. . . .

Replies are listed 'Best First'.
RE: Regexp: not what I'd expected
by Russ (Deacon) on Jul 04, 2000 at 05:47 UTC
    Had you run with warnings on (perl -w), perl would have given you a better clue about its interpretation. You would have seen that perl thinks $bef[$mid] is an element of an array.

    Knowing that, I think your results will now make perfect sense to you. Since $bef[$mid] is not defined, it interpolates as an empty string, so $bef[$mid]$aft becomes 'after'. Likewise, $bef$mid[$aft] becomes 'before' followed by an undefined (and therefore empty) $mid[$aft].
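
    For anyone who wants to see this for themselves, here is a minimal sketch of the two surprising cases. It is not the original poster's script: the empty arrays @bef and @mid are declared only so the code compiles under strict, and the warning handler is an illustrative addition that collects the messages perl -w would otherwise print.

```perl
use strict;
use warnings;

my ($bef, $mid, $aft) = ('before', 'middle', 'after');
my (@bef, @mid);   # empty arrays, declared only so this compiles under strict

# Collect warnings instead of letting them go straight to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# The replacement side is interpolated like a double-quoted string,
# so "$bef[" starts an array subscript rather than a literal bracket.
(my $r1 = 'now is the time') =~ s/.+/$bef[$mid]$aft/;   # element of @bef, then $aft
(my $r2 = 'now is the time') =~ s/.+/$bef$mid[$aft]/;   # $bef, then element of @mid

print "r1 = '$r1'\n";   # r1 = 'after'
print "r2 = '$r2'\n";   # r2 = 'before'
print "warning: $_" for @warnings;
```

    The collected warnings are the "better clue" mentioned above: they complain about the non-numeric subscript and the undefined array element.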

    Escape the square brackets with a backslash, and all will be well.

    I think this was mainly just an oversight. One of those things that, until it's pointed out to you, you can't see it at all, but once you see it, it's obvious. :-)

    Russ

      You're right that it's an oversight. I didn't spot $bef[$mid] as an element of an array. For shame.

      Perl--this'd be ActivePerl--had an interesting response to this script when the syntax was checked with '-w': it insisted that a right bracket was missing at line 24. Hmm. The version of the test that I'm running at the moment contains only 14 lines, of which the latter 7 are commented out.

Re: Regexp: not what I'd expected
by maverick (Curate) on Jul 04, 2000 at 05:54 UTC
    $bef[$mid]     # is an entry out of the array @bef
    $mid[$aft]     # is an entry out of the array @mid
    $bef[$mid$aft  # is a syntax error because there's no ] on the array entry
    [$bef]         # works because there's no other way to look at the [
    Look in the Perl docs about how to use arrays and this will make a lot more sense.
    The immediate solution is to put a '\' in front of each '['

    /\/\averick

      Aha, I have it. sed did it. I was thinking sed-wise. It wouldn't have cared about the un-escaped brackets in the replacement. (Trying to save face...failing...)
RE: Regexp: not what I'd expected
by jeorgen (Pilgrim) on Jul 04, 2000 at 05:52 UTC
    If you want to replace with literal brackets, you need to escape the brackets with the backslash character ("\"):

    s/.+/$bef[$mid$aft/: (SYNTAX ERROR: "scalar found where operator expected"...)

    should be written as:

    s/.+/$bef\[$mid$aft/;
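
    As a quick check of the escaped form, here is a small sketch using the variable values from the original question. The sample input string and the result variables $one and $two are illustrative only.

```perl
use strict;
use warnings;

my ($bef, $mid, $aft) = ('before', 'middle', 'after');

# A backslash before the bracket keeps Perl from reading it as a subscript.
(my $one = 'now is the time') =~ s/.+/$bef\[$mid$aft/;
(my $two = 'now is the time') =~ s/.+/$bef\[$mid\]$aft/;

print "$one\n";   # before[middleafter
print "$two\n";   # before[middle]after
```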

    There are many characters that need to be escaped in regular expressions. The bracket could be interpreted either as the beginning and end of a character class (example: [a-z]) or as a subscript for an array (example: $bef[4]). In this case it's the latter: the replacement side of s/// is interpolated like a double-quoted string, where a character class has no meaning. By putting a backslash in front of the bracket it loses its special meaning. Some other characters that have a special meaning are $, " and /.

    Then there are other perfectly normal characters that take on a special meaning when preceded by a backslash in a regular expression, e.g. \d (digit character), \s (any whitespace character), \S (anything that isn't a white space character).
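
    A short sketch of those three escapes in action; the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = 'room 42 is free';

my ($digits) = $text =~ /(\d+)/;     # \d matches a digit character
my @words    = split /\s+/, $text;   # \s matches any whitespace character
my ($first)  = $text =~ /^(\S+)/;    # \S matches anything that isn't whitespace

print "$digits\n";           # 42
print scalar(@words), "\n";  # 4
print "$first\n";            # room
```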

    The place to look for this kind of information is the perlre section of the perl man pages. If you're on Unix or similar, type "man perlre" at the command line (without the quotes). If you're on Windows, click the perlre link in the HTML documentation.

    Hope this helps, /jeorgen

Re: Regexp: not what I'd expected
by Anonymous Monk on Jul 04, 2000 at 06:51 UTC
    Everyone else has answered the "Why did this happen?" questions, so I'll go for the "What book..." question. _Mastering Regular Expressions_ by Jeffrey Friedl is still the standard, even though it's a bit dated by now. It may also be more than you need. Fortunately, perl comes with a whole boatload of documentation for free. Run perldoc perlop first to read about the m// and s/// operators and their flags, and then perldoc perlre to get hip deep in Perl regexes. Enjoy!
Re: Regexp: not what I'd expected
by Ovid (Cardinal) on Jul 04, 2000 at 21:32 UTC
    I'm rather surprised that no one else has mentioned how to constrain variable interpolation so that Perl interpolates what you actually intend. In this case, escaping the square brackets does the trick, but I feel that it is not as straightforward, since the square brackets are not meta-characters when on the right side of a substitution (on the left side, of course, they wrap a character class). Instead, wrap the variable name in curly braces {} to force the correct interpolation.
    s/.+/${bef}[${mid}]${aft}\n/;
    s/.+/${bef}${mid}[${aft}]\n/;
    I have tested the above two regexes and they work fine without escaping the square brackets. While this may be a matter of style over substance, I often wrap regex vars in curly braces because I have been bitten one too many times by what you've experienced.
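
    Here is a small sketch of the curly-brace form with the values from the original question. The sample input and the result variables are illustrative, and the trailing \n is left out of the replacement to keep the printed results easy to read.

```perl
use strict;
use warnings;

my ($bef, $mid, $aft) = ('before', 'middle', 'after');

# ${name} ends the variable name at the closing brace, so the
# following [ is taken as a literal character in the replacement.
(my $one = 'now is the time') =~ s/.+/${bef}[${mid}]${aft}/;
(my $two = 'now is the time') =~ s/.+/${bef}${mid}[${aft}]/;

print "$one\n";   # before[middle]after
print "$two\n";   # beforemiddle[after]
```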

    Cheers,
    Ovid

