Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris

Re: One for the weekend: challenge

by karavelov (Monk)
on Jun 01, 2008 at 05:29 UTC ( #689516=note: print w/replies, xml ) Need Help??

in reply to One for the weekend: challenge

I have solved it but I have passed 3 stages in order to make it run fast. The code has been around 60 lines in all of the stages. I have decided to play on the strong sides of perl and I have used regular expressions.

So my first version took me one and a half hour to write and debug the logic. The code was rewriting regexes and it was pathetically slow. It was running on the set for 5 minutes on my home PC (Pentium 4).

Next version took me another one and an half hour to convert it to use pre-compiled regular expressions. I have used some black magic in order to make exhaustive pattern matching. The interesting line is this :

 qr{^($re1)(?{ push @match, [$-[0], $+[0]] })(*FAIL)};

where $re1 is gigantic alternative (123|2322|...). It was running on the set for 32 seconds - not bad but I was expecting more.

I have read that there is "trie optimization" of the regexes in last version of perl. It take me another 2 hours to make optimizations and to debug why the hell I am not getting trie optimized regexes for the gigantic alternative I was constructing from the dictionary. Experimentally I have found that there is a limit between 11000 and 13000 alternatives so I decided to partition the constructed regex in 10 smaller ones. And this one worked gaining much more speed. It was running on the set of numbers for 2.2 seconds. Now I an happy.

In summary I was developing the code for 5 hours. It consists of 65 lines of code that follows. It runs on my PC (Pentium 4 @ 3GHz) for 2.2 seconds and 1.6 sec on my office PC (Intel Dual Core E2160 @ 1.80GHz) and eats around 22M of ram

#!/usr/bin/perl use strict; my %dic; #load dictionary open DIC,"< dictionary.txt" or die "$!"; while (<DIC>){ chomp; my $w = uc($_); $w =~ s/"//g; if ($w =~ y/EJNQRWXDSYFTAMCIVBKULOPGHZ/01112223334455666777888999/ +){ $w =~ s/[^0-9]//; push @{$dic{$w}},$_; } } close DIC; # Partition the dictionary in 10 regular expressions. # I do not know why the trie regex optimisation is not # supported for very large alternations (> 12000 entries). # Changing ${^RE_TRIE_MAXBUFF} does not help my @rex; my @match; for my $i (0..9){ use re 'eval'; my $re1 = join ('|', grep { /^$i/ } keys %dic); $rex[$i] = qr{^($re1)(?{ push @match, [$-[0], $+[0]] })(*FAIL)}; } my $num; sub print_num { my $n = shift; $n =~ s/\ +/ /g; # remove duplicate spaces $n =~ s/^\ //; # remove leading space $n =~ s/\ $//; # remove trailing space print "$num: $n\n"; } sub process_num { my ($n,$rest,$last) = @_; if (length($rest)==0){ print_num($n); return; } my $i = 0; @match=(); for my $r (@rex){ # match all regexes $rest =~ /$r/; # the result is stored in } # @match global variable my @matches = @match; # and is copied locally for my $x (@matches){ my $match = substr($rest,$x->[0],$x->[1]-$x->[0]); my $tail = substr($rest,$x->[1]); for my $w (@{$dic{$match}}){ # we could have multiple words my $m = $n; # for one number $m =~ s/$match/ $w /; process_num($m,$tail,0); } $i++; } # permit only one digit if there is no match if (($i==0)&&($last==0)) { $rest=~s/.//; process_num($n,$rest,1); } } # Load input numbers while (<>){ chomp; my $n = $_; $n =~ s/[^0-9]//g; # we are interested only in didits if (length($n)){ # if we have number here $num = $_; # num is the original number process_num($n,$n,0); } }

I have learned a little bit more about regular expressions, their optimization and extension.

Replies are listed 'Best First'.
Re^2: One for the weekend: challenge
by ysth (Canon) on Jun 01, 2008 at 06:25 UTC
      Thanks. Yes I have figured out how to workaround this limitation. It will be great if it is fixed some day.
Re^2: One for the weekend: challenge
by BrowserUk (Pope) on Jun 01, 2008 at 16:02 UTC

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://689516]
MidLifeXis waves.
[stevieb]: man, I absolutely LOVE driving in the worst winter conditions. In the mountainous roads, there are extremely few (if any) other drivers. I love the challenge. I've been doing extreme off-road my whole life, and as I said, I'm always well...
[stevieb]: ...prepared for even the very worst
Discipulus seems a part of the globe very unfrindly with Kawasaki riders..
[thezip]: My doggies love it when it snows
[perldigious]: As long as you are a good "boy scout" stevieb, more power to you... but I like my Jeep, and don't like the idea of rolling it or crashing it in to anything. :-)
[thezip]: Me, not as much.
[perldigious]: Mine too thezip, especially my German Shepherd. He looks like he is about to die of heat exhaustion all summer, so he goes nuts when winter comes and he can play in the snow.
[stevieb]: my main off-road vehicle nowadays is a strengthened mercedes ML320. I'm very hard on it, but I've never had any real issues at all (I love how small and light it is, and it has a very, very good centre of gravity for how I roll)

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (11)
As of 2016-12-06 16:48 GMT
Find Nodes?
    Voting Booth?
    On a regular basis, I'm most likely to spy upon:

    Results (112 votes). Check out past polls.