Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw

Re: improving the efficiency of a script

by lima1 (Curate)
on Jun 18, 2006 at 16:32 UTC ( [id://556119]=note: print w/replies, xml ) Need Help??

in reply to improving the efficiency of a script

if your dictionary is sorted, you could create a "map" of your dictionary with binary search: e.g. search for "b", "c" etc (most implementations return the position of the query in the dict., even when query is not found). it is log(n). so m * log(n) for the map and m * 100 for the word selection.

your solution would be m * n.

  • Comment on Re: improving the efficiency of a script

Replies are listed 'Best First'.
Re^2: improving the efficiency of a script
by liverpole (Monsignor) on Jun 18, 2006 at 18:35 UTC
    I like your solution, lima1.

    Here's an example of how one might implement it:

    #!/usr/bin/perl -w # Strict use strict; use warnings; # User-defined my $wordfile = "words.txt"; # Libraries use FileHandle; use File::Basename; use Data::Dumper; # Main program my $iam = basename $0; # Read words, saving file offsets my $fh = new FileHandle; open($fh, '<', $wordfile) or die "$iam: can't read $wordfile ($!)\n"; my ($word, %word_offsets_by_letter); while (1) { my $offset = tell($fh); defined($word = <$fh>) or last; chomp $word; if ($word =~ /^(.)/) { my $first_letter = lc $1; $word_offsets_by_letter{$first_letter} ||= [ ]; push @{$word_offsets_by_letter{$first_letter}}, $offset; } } # Test (this will give 100 random words beginning with 'a') foreach (1..100) { my $next = random_word_starting_with("a", $fh); print "Next word => $next\n"; } # Subroutines # # random_word_starting_with # # In: $1 ... the first letter of the word (eg. 'a', 'b', 'c', etc.) # $2 ... the open filehandle of the word file # sub random_word_starting_with { my ($first_letter, $fh) = @_; my $p = $word_offsets_by_letter{lc $first_letter}; my $offset = $p->[int rand @$p]; seek($fh, $offset, 0) or die "$iam: failed to seek ($!)\n"; my $word = <$fh>; chomp $word; return $word; }

    And if this is something you need to do a lot of, I like Zaxo's suggestion of using a database.

Re^2: improving the efficiency of a script
by ambrus (Abbot) on Jun 18, 2006 at 21:22 UTC

    As for binary searching for first letters in a sorted dictionary, I've done that in my submission to the word ladder problem which was a problem in perl quiz of the week expert edition week 22.

    However, I don't quite see how that would help in this word selection problem. Unless you somehow preprocess the dictionary, you cannot select a random word in constant time even if you know the offsets of the first letters, as just choosing a random offset favors long words. So, IMO, you have to read the whole or most of the dictionary file for this problem unless it's preprocessed in some way. Random access to the file may still slightly help after reading it completely, but not very much unless there are very long words in the "dictionary".

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://556119]
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others meditating upon the Monastery: (3)
As of 2024-06-25 09:14 GMT
Find Nodes?
    Voting Booth?

    No recent polls found

    erzuuli‥ 🛈The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.