Re^2: Substitute 'bad words' with 'good words' according to lists

pg,

I don't think the original code is necessarily inefficient. I suspect the performance depends on the number of words your split returns. Here is a benchmark against the original (I added the keys call, which was missing). I have modified $txt to be 100 times the length of the original.

Again, the story could be different when you have a great many replacements and only a few words.

#!/usr/bin/perl

use strict;
use warnings;
use Benchmark qw (:all);

my $txt = "ugly anotherugly " x 100;
# print $txt,$/;

sub pg {
    my %words = (
            ugly => 'ug**',
            anotherugly => 'anot*******',
            );
    my @words = split / /, $txt; # largely simplified, you have to count ,.:; etc
    for my $i (0 .. $#words) {
        $words[$i] = $words{$words[$i]} if (exists($words{$words[$i]}));
    }
#   print join(' ', @words),$/;
}

sub orig {
    my %words = (
            ugly => 'ug**',
            anotherugly => 'anot*******',
            );
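    # Note: this modifies the file-level $txt in place, so only the first
    # call sees the uncensored text; later calls find nothing to replace.
    $txt =~ s/$_/$words{$_}/g foreach keys(%words);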
#   print $txt,$/;
}

my $test = {'pg' => \&pg, 'Original' =>\&orig,};

my $result = timethese(-10,$test );
cmpthese($result);

Output

Benchmark: running Original, pg for at least 10 CPU seconds...
  Original: 11 wallclock secs (10.86 usr +  0.00 sys = 10.86 CPU) @ 43770.26/s (n=475345)
        pg: 11 wallclock secs (10.68 usr +  0.00 sys = 10.68 CPU) @ 4328.46/s (n=46228)
            Rate       pg Original
pg        4328/s       --     -90%
Original 43770/s     911%       --

NOTE: I removed the join from your code just to show the looping differences.
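
For anyone who wants to try the opposite case (many replacements, few words), here is a minimal sketch of how it might be set up. The sizes, the generated wordN keys, and the sub names by_lookup and by_regex are all made up for illustration and are not from the thread. It also differs from the script above in three ways: each sub works on its own copy of the text, the join is put back, and the regex version anchors each key with \b and quotes it with \Q...\E. I have not included timings, since they will depend on the machine and on the sizes chosen.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sizes, picked only to illustrate "many replacements, few words".
my $n_words_in_text = 20;
my $n_replacements  = 500;

# Replacement table: word0 => 'w****', word10 => 'w*****', and so on.
my %words = map { ("word$_" => 'w' . '*' x length("ord$_")) }
            0 .. $n_replacements - 1;

# The text uses only the first few keys of the table.
my $txt = join ' ', map { "word$_" } 0 .. $n_words_in_text - 1;

# Split the text and look each word up in the hash.
sub by_lookup {
    my $copy = $txt;                  # fresh copy on every call
    my @w = split / /, $copy;
    for my $w (@w) {
        $w = $words{$w} if exists $words{$w};
    }
    return join ' ', @w;
}

# One substitution pass over the text per key in the table.
sub by_regex {
    my $copy = $txt;                  # fresh copy on every call
    $copy =~ s/\b\Q$_\E\b/$words{$_}/g for keys %words;
    return $copy;
}

# Sanity check: both approaches must produce the same censored text.
die "results differ\n" unless by_lookup() eq by_regex();

# Run each sub for at least 1 CPU second and print the comparison table.
cmpthese(-1, { lookup => \&by_lookup, regex => \&by_regex });
```

Changing the two size variables at the top is enough to move between the "many words" and "many replacements" ends of the comparison.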
