http://www.perlmonks.org?node_id=1032486


in reply to Dynamically Updating Frequency Analysis

Oh, that's the XYZ problem. It has been proven to be NP-complete, but the ABC heuristic approach is likely of value.

here to help o/

Alright, alright. CB smartassery aside, this reminds me of the natural language text indexing projects I embarked upon when I was a puppy. It sounds like he's trying to do too much at once. Why not emit the n-grams (for n from 1 to 4) into a frequency distribution hash first, and only then worry about building the tree out of it?
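
Something like this is what I mean by doing it in two passes. It's a rough sketch, written in Python rather than Perl just to keep it short; the same shape maps straight onto a Perl hash. Everything in it is my own assumption and not the OP's code: the names `ngram_counts` and `build_tree`, the choice of whitespace-separated words as tokens, and a plain prefix tree as a stand-in for whatever tree the OP actually wants.

```python
from collections import Counter


def ngram_counts(tokens, max_n=4):
    """Pass 1: count every n-gram of length 1..max_n in a token list."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            # Tuples are hashable, so each n-gram can be a key directly.
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def build_tree(counts):
    """Pass 2: fold the flat counts into a nested dict (a simple prefix tree)."""
    root = {"count": 0, "children": {}}
    for gram, count in counts.items():
        node = root
        for token in gram:
            node = node["children"].setdefault(
                token, {"count": 0, "children": {}}
            )
        node["count"] = count
    return root


# Assumed sample input; tokenising on whitespace is a simplification.
tokens = "the cat sat on the mat".split()
counts = ngram_counts(tokens)
tree = build_tree(counts)

print(counts[("the",)])        # frequency of the unigram "the"
print(counts[("the", "cat")])  # frequency of the bigram "the cat"
```

The point of keeping the flat hash around is that it's trivial to inspect and sanity-check before any tree logic gets involved, and the tree can be rebuilt from it whenever the counts change.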