Re^6: Challenge: Optimal Animals/Pangolins Strategy

by Limbic~Region (Chancellor)
on May 06, 2013 at 15:28 UTC ( #1032325=note )

in reply to Re^5: Challenge: Optimal Animals/Pangolins Strategy
in thread Challenge: Optimal Animals/Pangolins Strategy

I feel like this should be an episode of Doctor Who with incongruent time lines.

At the time I responded to the clarification of what I meant by inversely proportional, I had already moved on from thinking it was an appropriate solution. I only clarified for the sake of completeness. What I should have said was something along the lines of:

Not that it matters, since I now realize it does not solve my problem, but there is a difference between having an inverse relationship and being inversely proportional. The idea that the more popular an animal is, the fewer questions it should take to identify, is an inverse relationship. When I said inversely proportional, I meant that the product of popularity and number of questions should be a constant, which would define exactly how many questions should be asked. In the end, I was wrong.

As for what I am trying to achieve - I am attempting to build on top of Huffman coding. Let's say you have a file that you have done single-byte frequency analysis on and generated a Huffman code tree. You notice that a few of the branches only have 1 leaf entry instead of 2. You decide you want to fill in those "holes" with the highest-frequency 2-byte pairs in the file. You fill in the first hole, but before moving on to the next one, you realize a problem: every occurrence you now encode as a pair no longer counts toward its two single bytes, so the single-byte frequencies need recalculating, which means rebuilding the tree, which means different holes.
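
To make the recalculation problem concrete, here is a minimal Python sketch. It is not code from this thread: the helper names `huffman_code_lengths` and `tokenize`, the greedy left-to-right pairing, and the sample data are all assumptions made for illustration. It shows how promoting one byte pair to a symbol changes the single-byte counts the original tree was built from.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a frequency table."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    tiebreak = count()  # unique id so tuples never compare the dicts
    heap = [(w, next(tiebreak), {sym: 0}) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them gets one bit deeper.
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

def tokenize(data, bigrams):
    """Greedy left-to-right split into chosen bigrams and single bytes (an assumption)."""
    tokens, i = [], 0
    while i < len(data):
        pair = data[i:i + 2]
        if pair in bigrams:
            tokens.append(pair)
            i += 2
        else:
            tokens.append(data[i:i + 1])
            i += 1
    return tokens

data = b"abababababcdcdcdeeef"  # made-up sample input
singles = Counter(tokenize(data, set()))
with_ab = Counter(tokenize(data, {b"ab"}))

print(sorted(huffman_code_lengths(singles).items()))
print(sorted(huffman_code_lengths(with_ab).items()))
```

In this sample, `a` and `b` each occur 5 times as single bytes, but once `ab` is a symbol they do not occur on their own at all, so the second tree is built from a different table than the first.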

I am not sure if that makes any more sense. I took the weekend off from thinking about it in hopes that I would have clarity today but it is still a jumbled pile of mud in my mind.

Cheers - L~R

Re^7: Challenge: Optimal Animals/Pangolins Strategy
by BrowserUk (Pope) on May 06, 2013 at 16:00 UTC

    Hm. No mention of Huffman or frequency analysis in the OP, but whatever.

    It seems to me that you might get close to what (I think) you now want, without having to do any iterative recalculating, this way:

    Calculate your single-character and bigram frequencies. Round the number of distinct single characters up to the next power of 2, and add the N most frequent bigrams, where N is the number required to bring the symbol count up to that power of two. Now when you construct your Huffman tree it will be a fully populated, balanced binary tree.
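
The padding step above might be sketched in Python as follows. This is only an illustration under assumptions: `next_power_of_two` and `pad_with_bigrams` are hypothetical names, bigrams are counted with overlap, and the sample data is made up. Note that a power-of-two symbol count only makes a perfectly balanced tree possible; the shape Huffman construction actually produces still depends on the weights.

```python
from collections import Counter

def next_power_of_two(n):
    """Smallest power of two >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()

def pad_with_bigrams(data):
    """Singles plus just enough top bigrams to reach a power-of-two symbol count."""
    singles = Counter(data[i:i + 1] for i in range(len(data)))
    # Overlapping bigram counts: an assumption, the post does not say how to count.
    bigrams = Counter(data[i:i + 2] for i in range(len(data) - 1))
    needed = next_power_of_two(len(singles)) - len(singles)
    alphabet = dict(singles)
    alphabet.update(bigrams.most_common(needed))
    return alphabet

data = b"abababababcdcdcdeeef"  # made-up sample input
alphabet = pad_with_bigrams(data)
print(len(alphabet), sorted(alphabet.items()))
```

With 6 distinct bytes in the sample, two bigrams are added to reach 8 symbols.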

    As an alternative, you might sort your singles and bigrams together by frequency and then select the top N most frequent (where N is a power of 2 that suits your requirements) from that combined set to build your fully populated, balanced binary tree.
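
A rough Python sketch of this alternative, again with hypothetical names (`top_symbols`) and overlapping bigram counts as an assumption. One caveat the sketch makes visible: any single byte that misses the top N has no code of its own, so a real encoder would need some escape mechanism, which is not shown here.

```python
from collections import Counter

def top_symbols(data, n):
    """Pick the n most frequent symbols from singles and bigrams combined."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    # Keys of length 1 and length 2 never collide, so one Counter can hold both.
    combined = Counter(data[i:i + 1] for i in range(len(data)))
    combined.update(data[i:i + 2] for i in range(len(data) - 1))
    return dict(combined.most_common(n))

data = b"abababababcdcdcdeeef"  # made-up sample input
chosen = top_symbols(data, 4)
print(sorted(chosen.items()))
```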

    And finally, you might consider using a heap rather than a tree as it has the same order of complexity for lookups (just as fast), but is a considerably more compact representation in memory, thus meeting your need for compact representation whilst potentially holding a greater number of items in the same space.
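
One way to read the heap suggestion is as the implicit array layout a binary heap uses, where the children of node i live at indices 2i+1 and 2i+2 and no pointers are stored. Under that reading (an assumption, as are the name `decode` and the sample symbols), decoding against a fully populated tree needs only the list of leaves:

```python
def decode(bits, leaves):
    """Decode bits against a complete binary tree stored implicitly, heap-style."""
    n = len(leaves)          # assumed to be a power of two, n >= 2
    first_leaf = n - 1       # internal nodes would occupy indices 0 .. n-2
    out, i = [], 0
    for bit in bits:
        i = 2 * i + 1 + bit  # left child for 0, right child for 1
        if i >= first_leaf:  # reached a leaf: emit it and restart at the root
            out.append(leaves[i - first_leaf])
            i = 0
    return out

leaves = [b"a", b"b", b"ab", b"ba"]  # made-up 4-symbol alphabet
print(decode([1, 0, 0, 0, 1, 1], leaves))
```

Only the leaf array is kept in memory; the internal nodes exist purely as index arithmetic.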

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.