This is very similar to the idea I had. I found that building the offsets with DBM::Deep was extremely slow on the first run, but subsequent runs were fast. This approach also has the advantage of not requiring the dictionary file to be sorted.
#!/usr/bin/perl
use strict;
use warnings;
use DBM::Deep;

open(my $dict, '<', 'words.raw')
    or die "Unable to open 'words.raw' for reading: $!";

my $db = DBM::Deep->new("offsets.db");

# Build the offset index only on the first run, while the DB is still empty
build_db($db, $dict) if ! scalar keys %$db;

for my $char ('a' .. 'z') {
    for (1 .. 100) {
        print get_rand_word($db, $char, $dict);
    }
}

# Record the byte offset of every line, grouped by the line's first character
sub build_db {
    my ($db, $dict) = @_;
    my $pos = tell $dict;
    while ( <$dict> ) {
        my $char = substr($_, 0, 1);
        push @{$db->{$char}}, $pos;
        $pos = tell $dict;
    }
}

# Pick a random stored offset for $char, seek to it, and read that line
sub get_rand_word {
    my ($db, $char, $dict) = @_;
    my $offset = $db->{$char}[rand @{$db->{$char}}];
    seek $dict, $offset, 0;
    my $word = <$dict>;
    return $word;
}
Other options include Storable and DBD::SQLite if a real RDBMS isn't available.
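
For what it's worth, here is a rough, untimed sketch of how the Storable variant might look. It builds the offsets in an ordinary in-memory hash and caches the whole thing in one file. The helper name load_offsets and the cache file name offsets.sto are made up for illustration, and the sketch assumes the dictionary does not change after the cache is written (delete the cache file to force a rebuild).

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

# Hypothetical helper: return the cached offset index if one exists,
# otherwise build it from the open dictionary handle and cache it.
sub load_offsets {
    my ($dict, $cache) = @_;
    return retrieve($cache) if -e $cache;

    my %offsets;
    my $pos = tell $dict;
    while ( my $line = <$dict> ) {
        # Group each line's starting byte offset by its first character
        push @{ $offsets{ substr($line, 0, 1) } }, $pos;
        $pos = tell $dict;
    }
    store(\%offsets, $cache);
    return \%offsets;
}

# Same lookup as before, but against a plain hash of array refs.
# Returns nothing if no word starts with $char.
sub get_rand_word {
    my ($offsets, $char, $dict) = @_;
    my $list = $offsets->{$char} or return;
    seek $dict, $list->[rand @$list], 0;
    return scalar <$dict>;
}
```

Usage would be along the lines of: open words.raw as $dict, call my $offsets = load_offsets($dict, 'offsets.sto'), then print get_rand_word($offsets, 'a', $dict). The trade-off is that the entire index is read into memory on every run, whereas DBM::Deep reads entries from disk on demand.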

Cheers - L~R

