Re^2: Check randomly generated numbers have not been used before
by BrowserUk (Pope) on Jun 20, 2014 at 19:28 UTC

If you had any programming skills, you'd make some attempt to verify a wild-assed guess like that:

```
#! perl -slw
use strict;
use Math::Random::MT qw[ rand ];

our $T //= 50;

my( $total, $min, $max ) = ( 0, 1e9, 0 );
for( 1 .. $T ) {
    my( %h, $r );
    while( 1 ) {
        $r = int( 1e6 + rand( 9e6 ) );
        ++$h{ $r } == 2 and last;
    }
    my $subt = scalar keys %h;
    $total += $subt;
    $min = $subt if $subt < $min;
    $max = $subt if $subt > $max;
    printf "Duplicate value($r) seen after %d iterations\n", $subt;
}

printf "Ave.%.3f Min:%d Max:%d\n", $total / $T, $min, $max;

__END__
C:\test>junk54
Duplicate value(2665199) seen after 4496 iterations
Duplicate value(6555509) seen after 5166 iterations
Duplicate value(8242500) seen after 1302 iterations
Duplicate value(8032591) seen after 1597 iterations
Duplicate value(6169187) seen after 4540 iterations
Duplicate value(3999236) seen after 6578 iterations
Duplicate value(4470150) seen after 7858 iterations
Duplicate value(4580644) seen after 3798 iterations
Duplicate value(4856238) seen after 4780 iterations
Duplicate value(9540961) seen after 1793 iterations
Duplicate value(8106058) seen after 5114 iterations
Duplicate value(8231462) seen after 1289 iterations
Duplicate value(5437248) seen after 1915 iterations
Duplicate value(6177470) seen after 2986 iterations
Duplicate value(7463793) seen after 3506 iterations
Duplicate value(3028064) seen after 5916 iterations
Duplicate value(9985723) seen after 1536 iterations
Duplicate value(4848638) seen after 1384 iterations
Duplicate value(1227666) seen after 3463 iterations
Duplicate value(8764243) seen after 4122 iterations
Duplicate value(6364225) seen after 5994 iterations
Duplicate value(5322923) seen after 506 iterations
Duplicate value(7557328) seen after 1820 iterations
Duplicate value(4731867) seen after 2791 iterations
Duplicate value(8199944) seen after 1903 iterations
Duplicate value(8481265) seen after 4763 iterations
Duplicate value(5209684) seen after 1039 iterations
Duplicate value(3202941) seen after 6198 iterations
Duplicate value(6657755) seen after 5574 iterations
Duplicate value(7013349) seen after 2369 iterations
Duplicate value(7736365) seen after 3088 iterations
Duplicate value(3171737) seen after 1359 iterations
Duplicate value(5533710) seen after 2921 iterations
Duplicate value(9403663) seen after 2459 iterations
Duplicate value(8811253) seen after 4271 iterations
Duplicate value(1649095) seen after 5702 iterations
Duplicate value(4630011) seen after 3550 iterations
Duplicate value(9044573) seen after 6169 iterations
Duplicate value(5711853) seen after 4171 iterations
Duplicate value(3313811) seen after 2539 iterations
Duplicate value(5055725) seen after 5712 iterations
Duplicate value(6910134) seen after 4571 iterations
Duplicate value(5598759) seen after 6135 iterations
Duplicate value(8065116) seen after 2466 iterations
Duplicate value(4587634) seen after 6195 iterations
Duplicate value(4192726) seen after 1373 iterations
Duplicate value(4299363) seen after 5448 iterations
Duplicate value(2418217) seen after 4396 iterations
Duplicate value(2225622) seen after 1414 iterations
Duplicate value(4347823) seen after 1962 iterations

Ave.3639.940 Min:506 Max:7858
```

Using one of the best PRNGs around, that hits a duplicate after fewer than 4000 attempts on average, and after as few as 506.
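That average is in line with what the birthday problem predicts. A minimal sketch, assuming uniform draws and using only the leading-order estimate sqrt(pi * N / 2) for the mean number of draws before the first repeat; the sub name `expected_draws_to_duplicate` is made up for illustration:

```perl
use strict;
use warnings;

# Leading-order birthday-problem estimate of the mean number of draws
# before the first repeat, for $N equally likely values.
# (Hypothetical helper; assumes a uniform generator.)
sub expected_draws_to_duplicate {
    my ( $N ) = @_;
    my $pi = 4 * atan2( 1, 1 );
    return sqrt( $pi * $N / 2 );
}

# int( 1e6 + rand( 9e6 ) ) yields 9e6 distinct values: 1_000_000 .. 9_999_999.
printf "Expected draws before a repeat: ~%.0f\n", expected_draws_to_duplicate( 9e6 );
```

For 9e6 possible values the estimate comes out at roughly 3760, so a 50-run average of about 3640 is unsurprising.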

Re^2: Check randomly generated numbers have not been used before # Math!
by LanX (Cardinal) on Jun 21, 2014 at 14:30 UTC
> The odds of an actual collision for a 7-digit random number are so astronomically small that I quite frankly would not bother to check for it.

The odds of no collision are easily calculated:

```
> perl -e ' $x=1; $x*=(1e7-$_)/1e7 for 1..3723; print $x '
0.499919268547978
```

So the odds of a collision are already greater than 50% after roughly 3723 draws!¹

Actually, there are combinatorial formulas that avoid the loop completely.
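One such loop-free form is the standard birthday-problem approximation, P(collision) ~ 1 - exp(-n(n-1)/(2N)), which puts the break-even point near sqrt(2 N ln 2) draws. A rough sketch, assuming uniform draws; the sub names `p_collision` and `draws_for_even_odds` are invented here, and the result is an approximation rather than the exact product:

```perl
use strict;
use warnings;

# Approximate chance of at least one duplicate among $n uniform draws
# from $N equally likely values (birthday-problem approximation).
sub p_collision {
    my ( $n, $N ) = @_;
    return 1 - exp( -$n * ( $n - 1 ) / ( 2 * $N ) );
}

# Approximate number of draws at which a duplicate becomes
# more likely than not.
sub draws_for_even_odds {
    my ( $N ) = @_;
    return sqrt( 2 * $N * log( 2 ) );    # log() is the natural logarithm
}

printf "P(collision) after 3724 draws from 1e7 values: %.4f\n", p_collision( 3724, 1e7 );
printf "Even odds at about %.0f draws\n", draws_for_even_odds( 1e7 );
```

For N = 1e7 the break-even estimate lands at about 3723 draws, which agrees with the looped product.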

Re^2: Check randomly generated numbers have not been used before
by marto (Cardinal) on Jun 21, 2014 at 13:45 UTC

Re^2: Check randomly generated numbers have not been used before
by Anonymous Monk on Jun 20, 2014 at 17:06 UTC
... I quite frankly would not bother to check for it.

R3search3R said the check is to avoid duplicate part numbers, and didn't say how many numbers they're generating, so that advice seems very wrong.

R3search3R: Why aren't you using a real database for this? Here's a solution based on your approach, but note that if the list is long and this is run often, it's horribly inefficient because it reads the entire file every time it's run.

```
use warnings;
use strict;
use diagnostics;

my $database = "Database.txt";
my $part_number_range = 10;

open (my $input, "<" , $database) || die "Can't open $database: $!";
my %part_numbers = map { chomp; $_=>1 } <$input>;
close $input;

use Data::Dumper; print Dumper(\%part_numbers); # DEBUG

my $new_part_number;
my $bail_out_count;
while (1) {
    $new_part_number = sprintf("%07d", int(rand($part_number_range)));
    last unless exists $part_numbers{$new_part_number};
    die "Bailing out after too many retries" if ++$bail_out_count>1000;
    print "Collision with part number '$new_part_number', retrying\n"; # DEBUG
}

print "Chose part number: '$new_part_number'\n"; # DEBUG

open (my $output, ">>" , $database) || die "Can't open $database: $!";
print $output "$new_part_number\n";
close $output;
```
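If a full SQL database is more than you want, one lightweight way to avoid rereading the whole file is an on-disk hash. The following is only a sketch, assuming the core `SDBM_File` module is acceptable as storage; the sub name `reserve_part_number` and the file name `parts` are made up for illustration, and there is no file locking, so it is not safe with concurrent writers:

```perl
use strict;
use warnings;
use Fcntl;        # O_RDWR, O_CREAT
use SDBM_File;    # ships with core Perl

# Hypothetical helper: reserve an unused 7-digit part number in an
# on-disk hash instead of loading every existing number into memory.
sub reserve_part_number {
    my ( $db_file, $max_tries ) = @_;

    tie my %seen, 'SDBM_File', $db_file, O_RDWR | O_CREAT, 0666
        or die "Can't tie $db_file: $!";

    my $chosen;
    for ( 1 .. $max_tries ) {
        my $candidate = int( 1e6 + rand( 9e6 ) );
        next if exists $seen{ $candidate };    # keyed lookup, no full scan
        $seen{ $candidate } = 1;               # record it as used
        $chosen = $candidate;
        last;
    }
    untie %seen;

    die "No free part number after $max_tries tries" unless defined $chosen;
    return $chosen;
}

print "Chose part number: ", reserve_part_number( 'parts', 1000 ), "\n";
```

SDBM stores the data in `parts.dir` and `parts.pag`; each lookup touches only the relevant page, so the cost per run no longer grows with the length of the list the way the flat-file version does.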