Re: Comparing large files
by wjw (Curate) on Feb 11, 2014 at 19:41 UTC
First, I would start the other way around: take the words that have pronunciations and look them up
in the words-only file. The assumption is that the set with pronunciations is the smaller one.
pronunciation-words -> words
Next, I think I would look for uniqueness. With a 10 MB file, it is hard to imagine that some
The other thing I might look at if this is not a one-off type thing is using a database if
Otherwise: pumping comparisons into a simple hash like
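A minimal sketch of what such a hash-based comparison might look like; this is not the original poster's code. It assumes the pronunciation file holds one "word pronunciation" pair per line, separated by whitespace, and that the words-only file holds one word per line. Neither format is stated in the thread, and the sub names `load_pronunciations` and `find_matches` are hypothetical. The smaller pronunciation set is loaded into a hash, then the larger word list is streamed past it, skipping repeats.

```perl
use strict;
use warnings;

# Assumed format: "word<whitespace>pronunciation" per line.
# Adjust the split if the real file is laid out differently.
sub load_pronunciations {
    my ($fh) = @_;
    my %pron;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $word, $pronunciation ) = split ' ', $line, 2;
        next unless defined $word && defined $pronunciation;
        $pron{ lc $word } = $pronunciation;    # a later duplicate overwrites an earlier one
    }
    return \%pron;
}

# Stream the (presumably larger) words-only file and keep each
# distinct word that has an entry in the pronunciation hash.
sub find_matches {
    my ( $fh, $pron ) = @_;
    my ( %seen, @matches );
    while ( my $word = <$fh> ) {
        chomp $word;
        $word =~ s/^\s+|\s+$//g;               # trim stray whitespace
        next if $word eq '';
        my $key = lc $word;
        next if $seen{$key}++;                 # uniqueness: handle each word once
        push @matches, $key if exists $pron->{$key};
    }
    return \@matches;
}

# In-memory demo data; real code would open the two files by name.
my $pron_data  = "cat  K AE T\ndog  D AO G\n";
my $words_data = "dog\nbird\nCat\ndog\n";
open my $pron_fh,  '<', \$pron_data  or die "Cannot open in-memory file: $!";
open my $words_fh, '<', \$words_data or die "Cannot open in-memory file: $!";

my $pron    = load_pronunciations($pron_fh);
my $matches = find_matches( $words_fh, $pron );
print "$_\t$pron->{$_}\n" for @$matches;
```

Only the pronunciation set has to fit in memory; the words-only file is read line by line, so its size matters much less. Lower-casing the keys is a design choice made here for the sketch and should be dropped if case is significant in the data.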
Hope that is somewhat helpful.
...the majority is always wrong, and always the last to know about it...
Insanity: Doing the same thing over and over again and expecting different results.