more useful options PerlMonks

### Shortest string containing all from 0000 to 9999 challenge.

by EvdB (Deacon)
 on May 22, 2003 at 08:00 UTC Need Help??

Here is something for the monks to mull over on the weekend. It is not a perl specific thing but I fear that large amounts of perl ingenuity will be required to solve it.

Q: What is the shortest string that contains all the numbers from 0000 to 9999.

For example: '012345678' contains 0123, 1234, 2345, 3456, 4567, 5678.

Is it possible to create a string which contains all the numbers and no duplicates? To date my best achievement is:

```      Length is: '10427'.
Missing count: '0'
Duplicate count: '424'

Please feel free to use this function to test any strings produced:

```# Will return an analysis of the string.
# Usage: print analyse( \$string );
sub analyse {

# Make an array from the number and create hash.
my \$number = shift;
my @string = split //, \$number;
my %count  = ();

# Work over the number adding values to the hash.
while (1) {
my \$candidate = \$string[0].\$string[1].\$string[2].\$string[3];
\$count{\$candidate}++;

# Get rid of first element of the array.
shift @string;

# Break out at the end of array.
last if scalar @string < 4;
}

# Tally up the number of duplicates and missing numbers.
my \$duplicates = 0;
my \$missing    = 0;

for ( 0 .. 9999 ) {
my \$k = 0 x ( 4 - length \$_ ) . \$_;
my \$v = \$count{\$k};
unless ( defined \$v ) {
\$missing++;
next;
}
\$duplicates += ( \$v - 1 ) if \$v > 1;
}

# Create the summary.
my \$text;
\$text .= "      Length is: " . length(\$number) . "\n";
\$text .= "  Missing count: " . \$missing . "\n";
\$text .= "Duplicate count: " . \$duplicates . "\n\n";

return \$text;
}

My thoughts on this puzzle are that you can go about it one of two ways: add unused numbers to the end of the string (technique used for results above) or remove duplicates. Probably the best is a combination of both.

Update: Made a change to the analysis code - pesky CVS caught me out.

--tidiness is the memory loss of environmental mnemonics

Replies are listed 'Best First'.
Re: Shortest string containing all from 0000 to 9999 challenge.
by sauoq (Abbot) on May 22, 2003 at 09:51 UTC
What is the shortest string that contains all the numbers from 0000 to 9999.

Assuming the numbers must all have 4 digits, the shortest theoretical length is 10003 digits if you can get them to overlap perfectly (10000 unique numbers and 3 digits of overhead.) As it happens, you can do it perfectly. Here's one example (be careful of newlines if you copy and paste it):

The (admittedly lousy) script I used follows. It's a bit c-ish...

```-sauoq
"My two cents aren't worth a dime.";
```
Very nicely done - although I wish you had waited until after the weekend to show me up...

--tidiness is the memory loss of environmental mnemonics

Re: Shortest string containing all from 0000 to 9999 challenge.
by sauoq (Abbot) on May 22, 2003 at 10:23 UTC

By the way, you can improve your analyze function quite a bit. This piece in particular stood out to me:

```   for ( 0 .. 9999 ) {
my \$k = 0 x ( 4 - length \$_ ) . \$_;
Even if you had to use a for loop, you could have just written it as for ( '0000' .. '9999' ) instead. Isn't Perl great? You don't even need the for though. I wrote my analyze function like this:
```sub analyze {
my \$number = shift;
my \$length = length \$number;
my \$duplicates = 0;
my %seen;

while (length \$number >= 4) {
\$seen{substr(\$number, -4)}++ and \$duplicates++;
substr(\$number, -1, 1, '');
}

my \$missing = 10000 - keys %seen;
return (\$length, \$missing, \$duplicates)
}

```-sauoq
"My two cents aren't worth a dime.";
```
When I started the problem I used alot of arrays - which carried over into my analysis code.

Yours is slicker and a good demonstration of the substr function. Just to prove it here are the benchmarks:

```        Rate  EvdB saouq
EvdB  7.89/s    --  -67%
saouq 23.8/s  202%    --

--tidiness is the memory loss of environmental mnemonics

Re: Shortest string containing all from 0000 to 9999 challenge.
by hardburn (Abbot) on May 22, 2003 at 17:23 UTC

Q: What is the shortest string that contains all the numbers from 0000 to 9999.

Probably not the shortest, but likely close:

```perl -MCompress::Bzip2 -e "
print Compress::Bzip2::compress('all the numbers from 0000 to 9999')"

----
I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
-- Schemer

Note: All code is untested, unless otherwise stated

Re: Shortest string containing all from 0000 to 9999 challenge.
by BrowserUk (Pope) on May 22, 2003 at 11:07 UTC

So far, it would appear that there are at least 10000 minimal solutions. 1 for every possible set of first 4 chars. There are also multiple--number yet to be determined, my poor ol' pII is groaning:)--variations of each of these starting points.

There's probably some mathematical way of calculating the total number of unique minimum solutions, but I can't see it. tye?, tilly?, Abigail? the NSA?. Anyone?

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
A better lower bound is 10! Every permutation of the 10 digits will change a valid solution to another valid solution.

Wrong.

By the way it's not that sure that you can start with every 4 digit number. I think you can - but you need to prove it.

Update: Or is it right? Let's have two differend 4 digit strings - if they are different then at least on one of the 4 position they have different digits - the permutation will change those digits to different digits as well. It means the permutation induces a 1-1 function on the 10000 4 digit strings. There can't be any more of them so in the whole resulting string there will be exactly 10000 different 4 digit strings. QED

That was long time since I proved a theorem.

By the way it's not that sure that you can start with every 4 digit number. I think you can - but you need to prove it.

You can.

First, it is sufficient to prove that you can do it with every possible pattern of same/differing numbers. Permutations take care of the rest. (Your thinking on that was fine, by the way.) There are only twelve such patterns: AAAA AAAB AABA ABAA BAAA AABC ABAC ABCA BAAC BACA BCAA ABCD. All you have to do is choose A, B, C, and D as four different digits and find a solution for each one. I chose A=0, B=1, C=2, and D=3 and then, using the same code in my original solution, found that there was indeed a solution for each of the 12 patterns.

So, for a lower bound, we have at least 12 * 10! (or the number of starting patterns multiplied by the permutations.)

Some other interesting facts (proofs left as an exercise):

• The reverse of a valid solution is a valid solution.
• No valid solutions are palindromic.

```-sauoq
"My two cents aren't worth a dime.";
```

I meant unique in the 'these two string compare differently' sense, rather that the set theory sense.

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

Based on a naive statistical analysis, we might expect about 2.8e5659 unique shortest strings. I got this number by computing the odds of a string of 10003 digits being a solution and multiplying that by the number of 10003-digit strings. Of course, floating point won't handle that. I got around this by reducing the formula a bit and computing the log() of the value instead and dividing by log(10) to get about 5659.45 ( all from the "perl -de 0" prompt, just in case anyone thought this was off-topic q-: ).

The odds of me having made a mistake in this calculation are rather large (it was a quick hack with no double checking -- both the math and the arithmatic). I'll try to find time later to post my analysis and code used to do the calculations.

I'm also not convinced (having not spent much time thinking about it) that a naive statistical analysis would be very valid. My qualms here appear to be diminishing the more I think about it (in part because the answer is so *huge*), but I still think the worry is worth mentioning.

It certainly was an interesting problem. (:

Also note that sauoq has shown us the lexicographically first of the shortest strings.

I also have an inkling that we can eliminate the backtracking in sauoq's program (or maybe not...). But I've got real work™ to do... ):

- tye
I also have an inkling that we can eliminate the backtracking in sauoq's program (or maybe not...).

Your intuition is right. My (much nicer) second try below generates a valid solution string without backtracking. Notice it isn't the same string as my original answer though. (Nor is it a permutation.)

```#!/usr/bin/perl -w
use strict;

my \$string = '9999';  # Starting with nines simplifies the loop
my %seen = ( \$string => 1 );
my \$seen = 1;

while (\$seen < 10000) {
for (0 .. 9) {
my \$candidate = substr(\$string, -3) . \$_;
next if \$seen{\$candidate};
\$string .= \$_;
\$seen{\$candidate} ++;
\$seen++;
last;
}
}

print \$string, "\n";
```-sauoq
"My two cents aren't worth a dime.";
```
For more information, you could do a quick review of de Bruijn strings/sequences (there are various links on the page.), and I believe, though I just did a precursory glance, that this page, answers your question (it is on the original page as well.)

update:i fixed a typo in the name of the sequences, when BrowserUK alerted me of my misspelling

.-enlil

Create A New User
Node Status?
node history
Node Type: perlmeditation [id://260002]
Approved by Tanalis
Front-paged by grinder
help
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (7)
As of 2018-01-23 14:59 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
How did you see in the new year?

Results (248 votes). Check out past polls.

Notices?