Re: using a hash to do substitution?

by no_slogan (Deacon)
on Jun 14, 2001 at 05:20 UTC (#88268)

in reply to using a hash to do substitution?

# create a regex that matches any one of the hash keys
$regex = join("|", map(quotemeta, keys %hash));
# substitute them
s/($regex)/$hash{$1}/eg;

Replies are listed 'Best First'.
Re: Re: using a hash to do substitution?
by Masem (Monsignor) on Jun 14, 2001 at 05:47 UTC
    I'd change this subtly (unless you only care about matching whole words) to:
    $regex = join("|", map(quotemeta, sort { length $b <=> length $a } keys %hash));
    s/($regex)/$hash{$1}/eg;
    That is, the longer keys will appear first in the alternatives list and thus they will be matched first.
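A small sketch of why the ordering matters, assuming a made-up hash in which one key is a prefix of another (the keys, the sample text, and the helper `build_regex` are illustrative, not from the original question). Perl tries the alternatives left to right at each position, so the first one that matches wins:

```perl
use strict;
use warnings;

# Hypothetical data: "cat" is a prefix of "category".
my %hash = (cat => 'dog', category => 'group');

# Hypothetical helper: build an alternation from the keys in the order given.
sub build_regex { join "|", map { quotemeta($_) } @_ }

# Shortest key first: "cat" matches inside "category" before the longer key is tried.
my $short_first = build_regex(sort { length $a <=> length $b } keys %hash);
(my $wrong = "category") =~ s/($short_first)/$hash{$1}/g;

# Longest key first, as suggested above.
my $long_first = build_regex(sort { length $b <=> length $a } keys %hash);
(my $right = "category") =~ s/($long_first)/$hash{$1}/g;

print "$wrong\n";   # dogegory
print "$right\n";   # group
```

The sort only matters when one key is a prefix of another; with whole-word matching the problem does not arise.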

    Dr. Michael K. Neylon - || "You've left the lens cap of your mind on again, Pinky" - The Brain
      # join the generated statements; map alone would yield only a count in scalar context
      $code .= join "", map { "s/\\b$_\\b/\$hash{'$_'}/eg;\n" } keys %hash;
      eval $code;
      This way, you cut the expense of alternation. If the keys are user-defined or contain metacharacters, add \Q and \E before and after, respectively, the first $_ in the substitution (instead of using quotemeta). You can print $code to check what it contains, and generating the code this way makes it easy to customize for your specific needs (though if substitution is the only thing going on, this won't be a concern).
Re: Re: using a hash to do substitution?
by nardo (Friar) on Jun 14, 2001 at 05:34 UTC
    It is likely that s/\b($regex)\b/$hash{$1}/eg; is more appropriate, so that, for example, $hash{'name'} = 'Bryan' doesn't turn 'nameserver' into 'Bryanserver'
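A minimal sketch of the difference, using the 'name' => 'Bryan' pair from the post; the sample sentence is made up for illustration:

```perl
use strict;
use warnings;

# The example from the post: 'name' should become 'Bryan'.
my %hash  = (name => 'Bryan');
my $regex = join "|", map { quotemeta($_) } keys %hash;

my $text = "my name is on the nameserver";

# Without \b, the key also matches inside a longer word.
(my $loose = $text) =~ s/($regex)/$hash{$1}/g;

# With \b on both sides, only the standalone word is replaced.
(my $bounded = $text) =~ s/\b($regex)\b/$hash{$1}/g;

print "$loose\n";     # my Bryan is on the Bryanserver
print "$bounded\n";   # my Bryan is on the nameserver
```

Note that \b marks a boundary between a word character and a non-word character, so it only behaves this way for keys that begin and end with word characters.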
Re: Re: using a hash to do substitution?
by Henri Icarus (Beadle) on Jun 14, 2001 at 05:36 UTC
    no_slogan's regex trick is very nice (++). However, I've been led to believe that lots of alternatives (the | char) in a regex can blow out the stack (or take a long time) because of the backtracking required. Can any more adept monks comment on this?

    The other way to solve the problem, more brute-force and less pretty, is:

    foreach $key (keys %hash) { $text =~ s/$key/$hash{$key}/eg; }

    -I went outside... and then I came back in!!!!

      Rather than speculate on efficiency, you can always use Benchmark. One small point though: whenever you interpolate a string into an m// regex or the first half of an s/// regex, you need to backslash the special regex metacharacters $^*()+{[\|.?

      The easiest way is to use quotemeta.

      foreach $key (keys %hash) {
          $pattern = quotemeta $key;   # escape only the pattern; look up the value by the original key
          $text =~ s/$pattern/$hash{$key}/eg;
      }

      This is *vital* for reliability. Otherwise you will get unexpected runtime failures when your data eventually contains metachars (typos, deliberate, malicious...)
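To make the risk concrete, here is a short sketch assuming a hypothetical key that contains a metacharacter (the key 'a.b' and the sample text are made up):

```perl
use strict;
use warnings;

# Hypothetical key containing a regex metacharacter.
my %hash = ('a.b' => 'X');
my $text = "a.b and axb";

# Unescaped: "." matches any character, so "axb" is replaced as well.
my $unsafe = $text;
for my $key (keys %hash) {
    $unsafe =~ s/$key/$hash{$key}/g;
}

# Escaped: the pattern matches the key literally.
my $safe = $text;
for my $key (keys %hash) {
    my $pattern = quotemeta $key;   # escape for matching, keep $key for the lookup
    $safe =~ s/$pattern/$hash{$key}/g;
}

print "$unsafe\n";   # X and X
print "$safe\n";     # X and axb
```

A key such as "(" is worse than a silent mismatch: interpolated unescaped, it makes the pattern fail to compile at run time.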

      use Benchmark;
      timethese(100000, {
          'Simple loop' => '
              $text = "This is my test string";
              %hash = qw(test text foo bar use loop the end);
              foreach $key (keys %hash) {
                  $key = quotemeta $key;
                  $text =~ s/$key/$hash{$key}/eg;
              }
          ',
          'Alternation' => '
              $text = "This is my test string";
              %hash = qw(test text foo bar use loop the end);
              $regex = join("|", map(quotemeta, keys %hash));
              $text =~ s/\b($regex)\b/$hash{$1}/eg;
          ',
      } );

      Output:

      Benchmark: timing 100000 iterations of Alternation, Simple loop...
      Alternation:  5 wallclock secs ( 4.12 usr + 0.00 sys = 4.12 CPU) @ 24271.84/s (n=100000)
      Simple loop: 10 wallclock secs ( 9.72 usr + 0.00 sys = 9.72 CPU) @ 10288.07/s (n=100000)

      So as it happens, alternation is more than twice as fast (4.12 vs. 9.72 CPU seconds). I thought your solution would be faster, but there you go!

      If in doubt - use Benchmark



      On the other hand, an alternating regex can use the /o modifier, while your version can't. Regex compilation is a costly operation, so alternation should be much faster under /o.
                     s aamecha.s a..a\u$&owag.print
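A related sketch, with the caveat that it swaps in a different technique from the one named in the post: instead of /o, it uses qr//, a standard Perl operator that compiles a pattern once into a reusable object. The hash is the one from the benchmark above; the second input line and the variable names are made up for illustration:

```perl
use strict;
use warnings;

my %hash = qw(test text foo bar use loop the end);

# Build the alternation (longest keys first) and compile it once with qr//.
my $alternation = join "|", map { quotemeta($_) }
                  sort { length $b <=> length $a } keys %hash;
my $regex = qr/\b($alternation)\b/;

my @lines = ("This is my test string", "use the foo");
for my $line (@lines) {
    # The precompiled pattern is reused for every line.
    $line =~ s/$regex/$hash{$1}/g;
}

print "$_\n" for @lines;
# This is my text string
# loop end bar
```

Whether this is measurably faster than the plain alternation depends on the Perl version and the data, so it is worth running through Benchmark like the versions above.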
