Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling
 
PerlMonks  

Re^2: Challenge: Optimal Animals/Pangolins Strategy

by Limbic~Region (Chancellor)
on May 02, 2013 at 17:31 UTC ( [id://1031785]=note: print w/replies, xml ) Need Help??


in reply to Re: Challenge: Optimal Animals/Pangolins Strategy
in thread Challenge: Optimal Animals/Pangolins Strategy

roboticus,
Maybe - I am not sure. The idea that I am not able to explain has to do with compression. I understand Huffman coding which is where my idea came from. In my head it is different so if the example I came up with is equivalent to Huffman coding I have failed.

Update: I am now pretty sure you are right. I still can't explain why what I am thinking about is different but this may help: In addition to coding for individual characters, I want to use any nodes that only have 1 leaf entry to code for character tuples. This is done AFTER the Huffman coding.

Cheers - L~R

  • Comment on Re^2: Challenge: Optimal Animals/Pangolins Strategy

Replies are listed 'Best First'.
Re^3: Challenge: Optimal Animals/Pangolins Strategy
by roboticus (Chancellor) on May 02, 2013 at 23:58 UTC

    Limbic~Region:

    Here's kind of what I was thinking. The Huffman tree keeps the most common animals (fish, dog, cat) at the top of the tree with few questions, and the unusual critters (unicorn, pegasus, axolotl) at the bottom of the tree.

    Is this something along the lines of what you're after?

    #!/usr/bin/perl use strict; use warnings; use Data::Dumper; # Read list of animals/occurrences into @T my @T; my $ttl=0; while (<DATA>) { next if /^\s*$/; my ($animal, $occurrences) = split /\s+/, $_; $ttl += $occurrences; my $t = bless [ $animal, $occurrences ], "Animal"; insert( $t ); } print "Total occurrences: $ttl\n"; # Turn it into a huffman tree while (@T > 1) { my $L = shift @T; my $R = shift @T; my $t = bless [ [$L, $R], $L->[1]+$R->[1] ], "Question"; insert( $t ); } #print Dumper($T[0]); my $avg = compute_avg_num_guesses($T[0]) / $ttl; print "Avg: $avg\n"; sub compute_avg_num_guesses { my $t = shift; my $question_depth = shift // 0; if (ref $t eq "Animal") { my $num_questions = $t->[1] * $question_depth; print "Animal: $t->[0], depth $question_depth * occurrences $t +->[1] ==> $num_questions\n"; return $num_questions; } else { my $sum_questions = compute_avg_num_guesses($t->[0][0], $quest +ion_depth+1); $sum_questions += compute_avg_num_guesses($t->[0][1], $questio +n_depth+1); return $sum_questions; } } sub insert { # Inserting in order simplifies the conversion into a huffman tree +... my $t = shift; push @T, $t; my $idx = $#T; while ($idx) { ($T[$idx],$T[$idx-1]) = ($T[$idx-1],$T[$idx]) if $T[$idx][1] < + $T[$idx-1][1]; --$idx; } } __DATA__ fish 150 horse 62 dog 85 cat 91 frog 28 axolotl 1 gerbil 3 hamster 5 ocelot 5 platypus 3 seal 18 badger 17 walrus 15 wolf 55 wolverine 22 sheep 60 cow 75 unicorn 1 pegasus 1

    Running it gives me:

    $ perl huffman.pl Total occurrences: 697 Animal: walrus, depth 5 * occurrences 15 ==> 75 Animal: badger, depth 5 * occurrences 17 ==> 85 Animal: seal, depth 5 * occurrences 18 ==> 90 Animal: pegasus, depth 8 * occurrences 1 ==> 8 Animal: axolotl, depth 9 * occurrences 1 ==> 9 Animal: unicorn, depth 9 * occurrences 1 ==> 9 Animal: hamster, depth 7 * occurrences 5 ==> 35 Animal: ocelot, depth 7 * occurrences 5 ==> 35 Animal: gerbil, depth 8 * occurrences 3 ==> 24 Animal: platypus, depth 8 * occurrences 3 ==> 24 Animal: cow, depth 3 * occurrences 75 ==> 225 Animal: fish, depth 2 * occurrences 150 ==> 300 Animal: dog, depth 3 * occurrences 85 ==> 255 Animal: cat, depth 3 * occurrences 91 ==> 273 Animal: wolverine, depth 5 * occurrences 22 ==> 110 Animal: frog, depth 5 * occurrences 28 ==> 140 Animal: wolf, depth 4 * occurrences 55 ==> 220 Animal: sheep, depth 4 * occurrences 60 ==> 240 Animal: horse, depth 4 * occurrences 62 ==> 248 Avg: 3.45050215208034

    So if I haven't screwed up, it looks like given the stated distributions and guesses, that you'd need 3.45 guesses on average.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      roboticus,
      I don't think you screwed anything up - this is the standard 1 queue version of Huffman coding. It meets the stated goals of my objective.

      Let me try and give some more insight into this fuzzy idea I have. Huffman coding gives ideal compression when the symbol set it is coding for and the frequencies those symbols appear is known up front. Typically, those symbols are single characters and for my purposes that will be true. If the bit sequence '01101' is used but '01100' is empty/unused, I see that as an opportunity to fill in the tree with some unknown symbols. Since all the single character symbols are known and populated, I would plug these "holes" with symbols representing character tuples.

      I then realized that I should reduce the frequency of the single character symbols that were part of the newly inserted tuple because not all of them would be encoded. This could then generate a different looking tree with different holes to fill. Looks like an infinite recursion problem.

      I then thought - if I can come up with a strategy to populate a tree given a desired outcome, I could just jump to the answer. In my sleep deprived state, I thought the Animals problem I posted got me there but it really just got me back to the beginning.

      Cheers - L~R

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1031785]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others learning in the Monastery: (2)
As of 2024-04-25 06:32 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found