PerlMonks  

Function call in a regular expression

by mdog (Pilgrim)
on Jul 31, 2003 at 21:41 UTC ( [id://279802]=perlquestion )

mdog has asked for the wisdom of the Perl Monks concerning the following question:

Brethren --

I'd like to be able to call a function in the matching portion of a substitution expression. Like this:

# converts the super ascii character known
# as cedilla (or a frenched-out "c") to a
# good old fashioned "c".
$text =~ s|chr 231|c|gise;
However, I can't figure out the correct syntax for doing this. Obviously, I could use a throwaway variable, but I'd like to know how to do it without one.

Thanks much,
Matt

Replies are listed 'Best First'.
Re: Function call in a regular expression
by Enlil (Parson) on Jul 31, 2003 at 21:49 UTC
    Perl provides the (??{}) construct (a "postponed" regular subexpression) in regular expressions for the purpose you mention. For example:
    $text =~ s|(??{chr 231})|c|gise;
    There is more information about the construct in perldoc perlre.
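To make the behaviour concrete, here is a minimal sketch of the construct with a named function in the pattern. The helper `cedilla` and the sample text are made up for illustration, and the unneeded `i`, `s` and `e` modifiers are dropped.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text the pattern should match.
sub cedilla { return chr 231 }

my $text = "gar\xe7on fa\xe7ade";

# (??{ ... }) runs the code during matching and treats the returned
# string as a pattern to match at the current position.
(my $plain = $text) =~ s/(??{ cedilla() })/c/g;

print "$plain\n";   # garcon facade
```

The code block is re-run every time the regex engine reaches that point, which is one reason the simpler alternatives in the replies below are usually preferable.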

    update: Note I am answering the question of:

    I'd like to be able to call a function in the matching portion of a substitution expression.

    (though there are better solutions that do the same thing as the OP's example below).

    -enlil

      Be aware that this is an experimental feature. One of the other ways to do it might be more appropriate for production code.

      -sauoq
      "My two cents aren't worth a dime.";
      

      Update: Oops, I misread Enlil's (??{}) as (?{}), which is very different. His original code works; it just isn't a great way to write that. Better to write it directly as \xe7.

      No, no, this is wrong. (?{}) is zero width and is always true. That's no help to the person who just wants to get character #231 into their regex. Some other nice monks noticed that you could write it in hex form as \xe7, or use BrowserUk's verbose (and slightly slower) @{[ chr 231 ]}.
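The difference between the two constructs can be seen by measuring how much text each one matches. This is only a sketch with made-up sample text; the match length is taken from the built-in `@-` and `@+` offset arrays.

```perl
use strict;
use warnings;

my $text = "fa\xe7ade";

# (?{ ... }) runs code but consumes nothing: it succeeds at the
# first position tried, so the overall match is the empty string.
my $zero_width = $text =~ /(?{ chr 231 })/ ? $+[0] - $-[0] : -1;

# (??{ ... }) uses the returned string as a pattern, so the overall
# match is the single character chr(231).
my $postponed = $text =~ /(??{ chr 231 })/ ? $+[0] - $-[0] : -1;

print "(?{}) matched $zero_width chars, (??{}) matched $postponed\n";
```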

Re: Function call in a regular expression
by BrowserUk (Patriarch) on Jul 31, 2003 at 22:11 UTC

    You can also do it this way.

    $s = "\xe7abc\xe7";
    $s =~ s[@{[chr 231]}][X]g;
    print $s;   # prints XabcX
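For readers unfamiliar with the idiom, a short sketch of why it works: `[ ... ]` builds an anonymous array from the expression and `@{ ... }` dereferences it, so the function's result is interpolated into the pattern before the regex is compiled. Plain `/` delimiters are used here only to make the nesting easier to read.

```perl
use strict;
use warnings;

my $s = "\xe7abc\xe7";

# The call to chr happens once, during interpolation, and the
# resulting character becomes part of the pattern text.
(my $out = $s) =~ s/@{[ chr 231 ]}/X/g;

print "$out\n";   # XabcX
```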


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
    If I understand your problem, I can solve it! Of course, the same can be said for you.

Re: Function call in a regular expression
by tedrek (Pilgrim) on Jul 31, 2003 at 21:51 UTC

    Because regexes are interpolated like double-quoted strings, you can use hex or octal escapes inside them, so just write that as

    $text =~ s|\xe7|c|gise;

    I'm not sure if there is a way to use decimal escapes though.
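As a sketch of the available spellings (sample text made up): the hex, braced-hex and octal escapes below all name code point 231. As far as I know Perl has no decimal escape, so the last line assumes the usual workaround of converting the decimal number with chr and interpolating the result.

```perl
use strict;
use warnings;

my $text = "fa\xe7ade";

# Hex, braced hex and octal spellings of code point 231.
my $hex   = $text =~ /\xe7/   ? 1 : 0;
my $brace = $text =~ /\x{e7}/ ? 1 : 0;
my $octal = $text =~ /\347/   ? 1 : 0;

# Decimal: convert with chr() first, then interpolate. \Q...\E
# keeps the interpolated text literal.
my $char    = chr 231;
my $decimal = $text =~ /\Q$char\E/ ? 1 : 0;

print "hex=$hex brace=$brace octal=$octal decimal=$decimal\n";
```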

Re: Function call in a regular expression
by dga (Hermit) on Jul 31, 2003 at 22:22 UTC

    As for the options, the 'i' (ignore case) option doesn't do anything for this RE, and if you use the hex character recommended, the 'e' won't be needed either (the replacement is a plain literal, not code). Also, using tr would have less overhead, since you are changing one character into one other character.

    $text =~ tr/\xe7/c/;
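A small side-by-side sketch of the two approaches, using made-up sample text. It also shows that tr returns the number of characters it matched, which can be used to count without changing anything.

```perl
use strict;
use warnings;

my $text = "gar\xe7on fa\xe7ade";

# tr/// maps characters one-for-one; no regex engine is involved.
(my $via_tr = $text) =~ tr/\xe7/c/;

# With an empty replacement list, tr/// only counts occurrences.
my $count = ($text =~ tr/\xe7//);

# The equivalent substitution needs only the /g modifier.
(my $via_s = $text) =~ s/\xe7/c/g;

print "$via_tr ($count replaced)\n";   # garcon facade (2 replaced)
```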
Re: Function call in a regular expression
by graff (Chancellor) on Aug 01, 2003 at 04:56 UTC
    If you want the output of a function call as part of the left-side match string in a regex, why not just assign the function output to a scalar, and use the scalar as a match string?
    my $match = chr 231;
    my $replc = "c";
    $text =~ s/$match/$replc/gise;
    The expression content can be as dynamic, flexible and/or complicated as you need it to be -- assign different values to $match on every iteration of some loop, use some kind of data structure (hash, AoA or HoA or whatever) to store pre-computed (or hard-coded) pairings of $match and $replc, and so on. Check out the "qr" operator in "perldoc perlop" for more ideas...
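One way the hash-of-pairings idea might look, as a sketch only: the table contents and variable names are hypothetical, and quotemeta is added as a precaution in case a key ever contains regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical table of pre-computed match => replacement pairs.
my %replace = (
    chr(231) => 'c',   # c with cedilla
    chr(233) => 'e',   # e with acute accent
    chr(241) => 'n',   # n with tilde
);

# Build one compiled pattern from the keys with qr//.
my $alternatives = join '|', map { quotemeta } sort keys %replace;
my $match = qr/($alternatives)/;

my $text = "gar\xe7on, caf\xe9, ma\xf1ana";

# $1 holds whichever key matched; look up its replacement.
(my $plain = $text) =~ s/$match/$replace{$1}/g;

print "$plain\n";   # garcon, cafe, manana
```

Compiling the pattern once with qr keeps the substitution itself short and lets the table grow without touching the regex.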

    The only reason to want function calls in the matching part would be to write more compact code, but "more compact" often means "harder to read/debug/maintain". (I guess another reason to want this would be to write "better" obfus.)

    Using executable code in the replacement part is of course a different story -- you can do lots of things with this that would be really hard to do any other way.
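For contrast, a sketch of executable code on the replacement side, where the /e modifier does belong. The sample text and the choice of output format are made up; the example rewrites each non-ASCII character as its code point.

```perl
use strict;
use warnings;

my $text = "chr(231) is \xe7";

# With /e the replacement is evaluated as Perl code for each match;
# here it formats the code point of the captured character.
(my $escaped = $text) =~ s/([^\x00-\x7f])/sprintf 'U+%04X', ord $1/ge;

print "$escaped\n";   # chr(231) is U+00E7
```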
