One for the weekend: challenge

Replies are listed 'Best First'.
Re: One for the weekend: challenge by Anonymous Monk on Jun 01, 2008 at 15:27 UTC
This took about two hours to write, through three iterations to get reasonable performance (under one second on a Sempron 3000). I'm sure I'll have issues with the "maintainability" score :) #!perl use strict; my %dict; @ARGV = qw( d.dict d.input ); while(<>) # read in both files, load dict on first, match on second { chomp; (my $d = lc) =~ tr\|a-z"/-\|57630499617851881234762239\|d; $ARGV eq 'd.dict' ? push @{$dict{$d}}, $_ : match($d, 0, "$_:"); } sub match # (to match, last was digit, matches) { my ($in, $digit, @have, $cnt) = @_; $in eq '' and return print "@have\n"; # have full match for my $k (map { substr $in, 0, $_ } 2..length $in) { match(substr($in, length $k), !++$cnt, @have, $_) for @{$dict{$k}} +; } $cnt or $digit or match(substr($in, 1), 1, @have, substr $in, 0, 1); } [download]	[reply] [d/l]
Re^2: One for the weekend: challenge by karavelov (Monk) on Jun 01, 2008 at 21:30 UTC
Amazing! It runs very fast! Here it runs for 0,6 seconds on the standard set of 1000 numbers (compared to 2.2 sec my code). With growing the set of numbers they are comparable in speed - for 50000 numbers they are equal 13.7 seconds. It seems that I am hurt by the regex compile time. I think your code is better than mine because : 1. you do not relay on some experimental features and recent optimizations; 2. The difference is in how we get the possible correct continuations of the number. My code generates them with the hairy regular expression. Your code generates all possible continuations and filter only the correct ones trough hash lookup. 3. less is more Best regards	[reply]
Re^2: One for the weekend: challenge by BrowserUk (Patriarch) on Jun 01, 2008 at 15:59 UTC
Utterly, utterly amazing. As presented, I think you would have to acquire a few maintainability demerits, but I don't think that it would be that hard to clean it up, and I don't think the result would hurt your timing or LOC unduly. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply]
Re^3: One for the weekend: challenge by Anonymous Monk on Jun 01, 2008 at 16:55 UTC
Remember when considering "maintainability" that the shorter the program, the easier it is to understand :) /me ducks	[reply]
Re^2: One for the weekend: challenge by Anonymous Monk on Jun 03, 2008 at 14:09 UTC
Slightly faster :) #!perl use strict; my %dict; @ARGV = qw( d.dict d.input ); while(<>) # read in both files, load dict on first, match on second { chomp; (my $d = lc) =~ tr\|a-z"/-\|57630499617851881234762239\|d; @ARGV ? push @{$dict{$d}}, $_ : match($d, 0, "$_:"); } sub match # (to match, last was digit, matches) { my ($in, $digit, @have, $k) = @_; map match(substr($in, $k), !++$digit, @have, $_), @{$dict{substr $in, 0, $k = $_}} for 2..length $in; $in =~ s/(.)// ? $digit \|\| match($in, 1, @have, $1) : print "@have\n +"; } [download]	[reply] [d/l]
Re: One for the weekend: challenge by dragonchild (Archbishop) on May 31, 2008 at 01:01 UTC
The problem is not only suited to Lisp because of the reason Anonymonk gave, but also because the most elegant solution to this problem is one that requires more exposure to CS than most Java programmers (in my experience) tend to have. I just coded up something similar in JS for a Go-playing program I'm playing with (thanks to tye's NextPermute() implementation in Algorithm::Loops). I have no desire to actually write it, so I'll give my plan in readmore tags below. Read more... (382 Bytes) My criteria for good software: Does it work? Can someone else come in, make a change, and be reasonably certain no bugs were introduced?	[reply]
Re^2: One for the weekend: challenge by BrowserUk (Patriarch) on May 31, 2008 at 01:16 UTC
All I can say in response is a) I would love to see your approach coded; b) you will not appreciate the complexity of the problem until you have attempted to code a solution. The devil, as always, is in the detail. Performing the mappings is relatively simple, but avoiding the production of the cases that are explicitly excluded: Particularly 1078-913-5: je Bo" 9 1 da , since there are two subsequent digits in the encoding Throws most initial and theoretical solutions into a cocked hat. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply]
Re: One for the weekend: challenge by karavelov (Monk) on Jun 01, 2008 at 05:29 UTC
I have solved it but I have passed 3 stages in order to make it run fast. The code has been around 60 lines in all of the stages. I have decided to play on the strong sides of perl and I have used regular expressions. So my first version took me one and a half hour to write and debug the logic. The code was rewriting regexes and it was pathetically slow. It was running on the set for 5 minutes on my home PC (Pentium 4). Next version took me another one and an half hour to convert it to use pre-compiled regular expressions. I have used some black magic in order to make exhaustive pattern matching. The interesting line is this : `qr{^($re1)(?{ push @match, [$-[0], $+[0]] })(*FAIL)};` where $re1 is gigantic alternative (123\|2322\|...). It was running on the set for 32 seconds - not bad but I was expecting more. I have read that there is "trie optimization" of the regexes in last version of perl. It take me another 2 hours to make optimizations and to debug why the hell I am not getting trie optimized regexes for the gigantic alternative I was constructing from the dictionary. Experimentally I have found that there is a limit between 11000 and 13000 alternatives so I decided to partition the constructed regex in 10 smaller ones. And this one worked gaining much more speed. It was running on the set of numbers for 2.2 seconds. Now I an happy. In summary I was developing the code for 5 hours. It consists of 65 lines of code that follows. It runs on my PC (Pentium 4 @ 3GHz) for 2.2 seconds and 1.6 sec on my office PC (Intel Dual Core E2160 @ 1.80GHz) and eats around 22M of ram Read more... See the code (2 kB) I have learned a little bit more about regular expressions, their optimization and extension.	[reply] [d/l] [select]
Re^2: One for the weekend: challenge by ysth (Canon) on Jun 01, 2008 at 06:25 UTC
See 5.10.0 regex slowdown; in particular, Re^4: 5.10.0 regex slowdown. -- Online Fortune Cookie Search	[reply]
Re^3: One for the weekend: challenge by karavelov (Monk) on Jun 01, 2008 at 06:48 UTC
Thanks. Yes I have figured out how to workaround this limitation. It will be great if it is fixed some day.	[reply]
Re^2: One for the weekend: challenge by BrowserUk (Patriarch) on Jun 01, 2008 at 16:02 UTC
This is a correct and efficient solution. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply]
Re: One for the weekend: challenge by Anonymous Monk on May 31, 2008 at 00:31 UTC
The deck is pretty-well “stacked” in favor of LISP on this one: the nature of the problem lends itself particularly well to recursive solution and to so-called “pragmatic programming” (a LISP specialty in which programs build lists and then execute those lists). I generally don't think that it's too useful to compare “language to language” without first giving very careful consideration to what it is that you are trying to do. Languages are tools... nothing more. You need to evaluate a great many non-computer factors when deciding what language(s) to deploy against a particular business task.	[reply]
Re^2: One for the weekend: challenge by BrowserUk (Patriarch) on May 31, 2008 at 01:06 UTC
You might be surprised! But, I didn't post it for that reason. It's purely an exercise in seeing how people approach the problem in Perl, and then attempting to come up with some measure of "goodness" for those approaches. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply]
Re: One for the weekend: challenge by Limbic~Region (Chancellor) on Jun 01, 2008 at 02:56 UTC
BrowserUk, The following code should produce the correct output. I have tested it individually on a handful of the "expected output" numbers. I am going to let it run overnight to see how long it takes. Using `perl -MO=Deparse solution.pl \| wc -l` it is 44 lines of code. Read more... (3 kB) In a nutshell, it generates the unrestricted integer partitions of the number. It skips over any partitioning that has so many 1s it could not satisfy the problem constraints. It then generates all permutations of each possible partition - skipping over partitioning with two adjacent 1s. It then checks if it produces a valid solution. I am sure it could be optimized a lot, but that would add code complexity and lines of code. Update: Shortly after posting, I realized by re-arranging some logic it would be more efficient. The node has been updated to reflect that. Cheers - L~R	[reply] [d/l] [select]
Re^2: One for the weekend: challenge by karavelov (Monk) on Jun 01, 2008 at 06:54 UTC
It loops forever on long numbers. And sometimes produces false results. I have not analyzed why exactly this is wrong but for the number -810873502888/74-556227/1 (the last number of the output.txt) it gives 62 results while the number of correct results is 18.	[reply]
Re^3: One for the weekend: challenge by Limbic~Region (Chancellor) on Jun 01, 2008 at 12:55 UTC
karavelov, Thanks. I am not going to invest too much time in it as I believe the logic to be sound - the implementation is the problem. Cheers - L~R	[reply]
Re: One for the weekend: challenge by kyle (Abbot) on Jun 02, 2008 at 06:20 UTC
Here's what I came up with: Read more... (4 kB) The solution itself is absolutely correct. The I/O loop at the beginning might be broken, though. I didn't test that part. Rather, I tested the module with Test::More and wrote the I/O loop as an afterthought. Here are my tests: Read more... (2 kB) I started with the short disambiguating sample material and then moved on to the main input. I found that '2' as an input appears twice, so I had to "fix" my testing code to account for that (I hardcoded the solution for '2'). I found also that I had one last bug to fix for cases that produce words but end in the digit 0 (that's 7 of the 1000 inputs). I did this without reading any of the other solutions. I wasn't going to work on this at all until I saw the one from Anonymous Monk was so short. Seeing that, I figured it couldn't be quite as time-consuming as I was thinking. I wasn't really trying to make it short or especially efficient—just correct. Before I started I had this idea that I would profile it when I was done and make it really fast, but now I'd rather go to bed. Update: I've fixed the problem that BrowserUK pointed out, and I made a performance tweak based on profiling. I was surprised to discover that more time was spent in `new()` than in all 7000 calls to `words_to_phone()` put together. I fixed that by changing "`s/(.)/$num_for{ $1 }/g`" to "`tr/ejnqrwxdsyftamcivbkulopghz/01112223334455666777888999/;`" thereby sacrificing maintainability for speed. It's a big speed difference, however. Run time went from over 1 second to about a half a second.	[reply] [d/l] [select]
Re^2: One for the weekend: challenge by BrowserUk (Patriarch) on Jun 02, 2008 at 10:49 UTC
I had to invert the (physical source-code) positioning of the package and main code in order that `%num_for` would be set up prior to the constructor being called. Alternatively, I could separate the package into a .pm file. With either correction, this produces the correct output quite efficiently. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply] [d/l]
Re^2: One for the weekend: challenge (self-review) by kyle (Abbot) on Jun 10, 2008 at 21:51 UTC
I've had some time to ponder my own work here, and I'm writing here to share my thoughts. Let me start on the defensive: I wrote this tired and trying to finish in minimal time. I was shooting for functionality more than elegance or brevity. Probably the biggest wart is using caller as a sort of flag parameter to the main `words_for_phone`. I made a `words_only`, and it does nothing but call `words_for_phone`. There, I have a check to see if that's where the call came from: `# If this was called from 'words_only', don't produce an individua +l digit my $words_only; { no warnings 'uninitialized'; my @call = caller 1; $words_only = ( $call[3] eq __PACKAGE__ . '::words_only' ); }` [download] What I should have done instead is change the interface. The new convention would be that `words_for_phone` is called with an array reference and an optional flag second argument. It's not called many places, so this wouldn't be a difficult change. If I'd made "`my $out = []`" instead be "`my @out`", I'd have saved myself some "`@{ }`" in a few places. Really the only place `$out` appears by itself is at the return, so that could be "`return \@out`". On the other hand, making it an array reference from the beginning indicates clearly that we're returning an array reference. I think mine's the only solution that uses a package. I made it modular so I could test it more easily, though this certainly cost some lines of code to support it. I don't regret this decision at all. It was great to be able to just run the tests and see if I'd fixed what I was working on (and if I'd broken anything that worked before). This was written incrementally, and it still has that feel to me. I wrote something to handle the easiest cases first (where the phone can be entirely translated to words) and then built stuff up around it to handle more difficult cases. If this were production code and not a toy problem, it would benefit greatly from sanity checks in several places. Dictionary words should contain no spaces or digits. Don't call the one instance method as a class method and vice versa. I welcome comments from any other monks who might have read what I wrote.	[reply] [d/l] [select]
Re: One for the weekend: challenge by ysth (Canon) on Jun 01, 2008 at 06:01 UTC
Pfah! Cannot they write problems like this without ambiguities? We hates them, we do. The one that's bothering me is the "can" in: "If and only if at a particular point no word at all from the dictionary can be inserted,". Does that mean "can be inserted to successfully produce a solution"? Or "can be inserted whether or not a correct solution results"? If the former, given the single-word dictionary "aaa", "55555" has solution "5 aaa 5". If the latter, it has no solutions. -- Online Fortune Cookie Search	[reply]
Re^2: One for the weekend: challenge by BrowserUk (Patriarch) on Jun 01, 2008 at 12:13 UTC
I would suggest that all ambiguities are resolved by example. In this case, the following would seem to apply: 4824: 4 Ort, because in place of the first digit the words Torf, fort, Tor could be used I read that to say, that as 'aaa' can be applied at the first step, inserting a digit at that point is illegal. Therefore your example has no solutions. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply]
Re^3: One for the weekend: challenge by ysth (Canon) on Jun 01, 2008 at 15:07 UTC
Nope, that doesn't do it, because using Torf ,fort, or Tor 4 produces a solution. My doubt is whether a digit can be inserted where there could be a word, but not one that produces a solution. I haven't rigorously examined the sample solutions for lack of one that would establish the point, nor do I really care to. -- Online Fortune Cookie Search	[reply]
Re^4: One for the weekend: challenge by BrowserUk (Patriarch) on Jun 01, 2008 at 15:49 UTC
Re^5: One for the weekend: challenge by ysth (Canon) on Jun 01, 2008 at 16:50 UTC
Re^3: One for the weekend: challenge by Limbic~Region (Chancellor) on Jun 01, 2008 at 13:34 UTC
BrowserUk, I also read the rules incorrectly because of the following example: `04824: 0 Torf 04824: 0 fort 04824: 0 Tor 4` [download] The last solution is what lead me astray. If it had read: As you work left to right, if at any point a word can be inserted - using a digit is illegal and at no point can two digits be adjacent I would have understood much better. In any event, my solution is buggy and produces incorrect results. Since it seems like you were after approaches, I believe mine is interesting because of the unrestricted integer partitions. I have no interest at this point in making it a correct solution. Cheers - L~R	[reply] [d/l]
Re^4: One for the weekend: challenge by BrowserUk (Patriarch) on Jun 01, 2008 at 14:06 UTC
Re^2: One for the weekend: challenge by karavelov (Monk) on Jun 01, 2008 at 06:20 UTC
We are looking only for correct results, aren't we? So, I think that only the first interpretation is correct. In your example, the "5 aaa 5" case.	[reply]
Re^3: One for the weekend: challenge by ysth (Canon) on Jun 01, 2008 at 06:28 UTC
I don't understand how that is an argument that there isn't an ambiguity. -- Online Fortune Cookie Search	[reply]
Re: One for the weekend: challenge by Limbic~Region (Chancellor) on Jun 03, 2008 at 03:17 UTC
BrowserUk, Now that I understand the problem, here is a correct working solution that is relatively fast (8 seconds locally). Running it through `perl -MO=Deparse solution.pl \| wc -l` indicates it is 42 lines. It isn't very clean as I didn't have very much time to implement. If I have time, I will see about clarity and optimizations. If nothing else, I will add comments tomorrow so at least the algorithm is clear even if the code is not. Read more... (4 kB) Cheers - L~R	[reply] [d/l] [select]
Re^2: One for the weekend: challenge by BrowserUk (Patriarch) on Jun 03, 2008 at 04:14 UTC
This produces correct output quite efficiently. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply]
Re^3: One for the weekend: challenge by sparkylu (Monk) on Jun 03, 2008 at 10:50 UTC
So which solution matched up to your expectations the best? I haven't tried all of them yet... waiting for my next break.	[reply]
Re^4: One for the weekend: challenge by BrowserUk (Patriarch) on Jun 03, 2008 at 11:30 UTC
Re^3: One for the weekend: challenge by Limbic~Region (Chancellor) on Jun 03, 2008 at 18:22 UTC
BrowserUk, Here is the same code re-written to be more clear and maintainable. I still think the handful of special case scenarios could be reduced to a more general case. Unfortunately, I don't have time to try. Read more... (9 kB) Cheers - L~R	[reply] [d/l]
Re^4: One for the weekend: challenge by BrowserUk (Patriarch) on Jun 03, 2008 at 18:35 UTC
Re: One for the weekend: (provisional results) by BrowserUk (Patriarch) on Jun 04, 2008 at 14:45 UTC
I'm not antisipating any late entries, so here are provisional findings on the challenge. Final scoring will require me to write up some kind of assessment of the entries and may take a few days, if it gets done at all. (Anyone really interested?) The program that generates these will be posted below. As will my original attempt (labelled:`BrowserUk` below). (Update: Corrected score calculations!) c:\test\685390>685390.plt Testing Anonymonk1-2 Testing Anonymonk1-reworked Testing Anonymonk1 Testing BrowserUk Testing BrowserUk2 Testing BrowserUk3 Testing karavelov Testing kyle Testing kyle2 Testing LimbicRegion2 Testing LimbicRegion3 { Anonymonk1 =>{ ELAPSED=> 1.64, LOC=> 15, OUTCOUNT=> 262, SCORE=> 2 +4.6 }, Anonymonk1-2 =>{ ELAPSED=> 1.48, LOC=> 12, OUTCOUNT=> 262, SCORE=> 1 +7.76 }, Anonymonk1-rw =>{ ELAPSED=> 1.93, LOC=> 27, OUTCOUNT=> 262, SCORE=> 5 +2.11 }, BrowserUk =>{ ELAPSED=>10.57, LOC=> 36, OUTCOUNT=> 262, SCORE=> 38 +0.52 }, BrowserUk2 =>{ ELAPSED=> 2.07, LOC=> 28, OUTCOUNT=> 262, SCORE=> 5 +7.96 }, BrowserUk3 =>{ ELAPSED=> 1.67, LOC=> 49, OUTCOUNT=> 89, SCORE > 8 +1.83 }, LimbicRegion2 =>{ ELAPSED=>11.03, LOC=> 40, OUTCOUNT=> 262, SCORE=> 44 +1.2 }, LimbicRegion3 =>{ ELAPSED=>12.04, LOC=> 63, OUTCOUNT=> 262, SCORE=> 75 +8.52 }, karavelov =>{ ELAPSED=> 5.08, LOC=> 63, OUTCOUNT=> 262, SCORE=> 32 +0.04 }, kyle =>{ ELAPSED=> 4.69, LOC=> 79, OUTCOUNT=> 262, SCORE=> 37 +0.51 }, kyle2 =>{ ELAPSED=> 2.98, LOC=> 73, OUTCOUNT=> 262, SCORE=> 21 +7.54 }, } [download] ()manually compacted to comply with the pathetic wrap ilmit. Anybody would think this was the 1970s! Ignore, BrowserUk2, BrowserUk3 and Anonymonk1-reworked. BrowserUk2 is my attempt to clean up my own entry once I saw the performance of other peoples solutions. It must be discounted because it was too influenced by the work of others. BrowserUk3 is my second attempt at clean up avoiding the influence of others. A work in progress and currently non-complient. Anonymonk1-reworked is my attempt to sanitize anonymonk's first entry for 'maintanence purposes'. This may get posted as part of a further post discussing that subject. The others above are as posted in this thread. Kyle is kyle's entry prior to his updates. kyle2 post-updates. Test program: #! perl -slw use strict; use Data::Dump qw[ pp ]; use Time::HiRes qw[ time ]; our $DIFF \|\|= 0; my $perl = q[c:\Perl510\bin\perl5.10.0.exe]; my $filter = q[find /V "strict" \| find /V "warnings"]; my %results; while( <.pl> ) { my( $name ) = m[(^[^.]+).pl]; my $label = sprintf "%-20.20s", $name; my $res = $results{ $label } = {}; print "Testing $name"; my $LOC = 0+ qx[ $perl -MO=Deparse $name.pl 2>nul \| $filter \| wc - +l]; $res->{ LOC } = $LOC; my $start = time; my @output = qx[ $perl $name.pl input.txt ]; my $end = time; $res->{ ELAPSED } = sprintf "%5.2f", $end - $start; $res->{ OUTCOUNT } = @output; $res->{ SCORE } = $res->{ ELAPSED } * $res->{ LOC }; next unless $DIFF; open my $sort, "\| sort > $name.srt" or warn "Couldn't create sorted output for $name" and next; printf $sort "$_" for @output; close $sort; $results{ $label }{ DIFF } = [ qx[ diff -adU 0 output.srt $name.sr +t ] ]; } pp \%results; [download] My pre-challenge solution, warts an all. It took less than 1 hour to program a recursive solution that found all possible encodings. Another ~2 hours to hack it to fulfill the specs arbitrary limitations: No adjacent digits. No digit where a word is available (even if the use of that word does not result in a successful encoding). I was quite pleased with the LOC & timings, relative to the information available for the Lisp and Java entries.(Until I saw anonymonks entry! :). I was aware it was convoluted and would need refactoring. Hence the attempts above. #! perl -slw use strict; use List::Util qw[ reduce ]; $a = $b; use constant DICTIONARY => 'dictionary.txt'; my %mappings; reduce{ $mappings{ $_ } = $a for split '', $b; $a + 1; } 0, qw[ e jnq rwx dsy ft am civ bku lop ghz ]; my %dict = map{ $_ => [ $_ ] } 0 .. 9; keys %dict = 300000; open DICT, '<', DICTIONARY or die "${\DICTIONARY}: $!"; while( <DICT> ) { chomp; my $orig = $_; tr[A-Z"/-][a-z]d; #" push @{ $dict{ join '', @mappings{ split '', $_ } } }, $orig; } close DICT; sub encode { my $number = shift; my $start = length( $number ); --$start while $start and not exists $dict{ substr $number, 0, $st +art }; $start = 2 if $start > 1; return @{ $dict{ $number } \|\| [] }, grep !/\d \d/, map { my( $pre, $post ) = unpack "a$_ a*", $number; map{ my $head = $_; map "$head $_", encode( $post ); } @{ $dict{ $pre } \|\| [] }, } $start .. length( $number )-1; } while( <> ) { chomp; my $orig = $_; tr[/-][]d; print "$orig: $_" for encode( $_ ); } [download] Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply] [d/l] [select]


Come for the quick hacks, stay for the epiphanies.
	PerlMonks

One for the weekend: challenge

The challenge