in reply to Reaped: a large text file into hash
Did you try my suggestion of using Search::Dict?
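For reference, Search::Dict is a core Perl module that does a binary search on a sorted text file, so you can find a key without loading the file into memory. A minimal sketch, using a throwaway temp file purely for illustration (your real file would already exist, and must be sorted):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Search::Dict;
use File::Temp qw(tempfile);

# Build a small sorted file for illustration.
my ($fh, $fname) = tempfile();
print {$fh} "$_\n" for qw(apple banana cherry date);

# look() binary-searches the sorted file and positions the handle at
# the first line greater than or equal to the key.
look $fh, 'cherry', 0, 0;

my $line = <$fh>;
chomp $line;
print "$line\n";    # should print "cherry"
```

The point is that each lookup costs O(log n) seeks instead of O(n) memory, which is exactly what you want when the file is too big to slurp into a hash.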
I assume that you want it in a hash because you are planning on doing further processing on it. If that is the case, then I strongly recommend that you think about your processing in terms of the map-reduce paradigm that I suggested, because your data volume is high enough that you really will benefit from it.
It takes practice to realize that, for instance, you can join two data sets by mapping each record to a key/value pair, where the key is the thing you are joining on and the value is the original record plus a tag saying which data set it came from. Then sort the output. After that it is easy to pass through the sorted data and do the join.
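The tag-sort-join idea above can be sketched like this (the data, tag names, and in-memory sort are all illustrative; at your data volume the map output would go to files and be sorted with sort(1)):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: join users and orders on a shared id.
my @users  = ([1, 'alice'], [2, 'bob']);
my @orders = ([1, 'book'], [1, 'pen'], [2, 'lamp']);

# Map: emit [key, tag, value], tagging each record with its source.
my @mapped;
push @mapped, [$_->[0], 'user',  $_->[1]] for @users;
push @mapped, [$_->[0], 'order', $_->[1]] for @orders;

# Sort by the join key (on disk you would use sort(1) instead).
@mapped = sort { $a->[0] <=> $b->[0] } @mapped;

# Reduce: walk the sorted records, collect each key's group, and when
# the key changes emit the cross product of user and order values.
my (@joined, %by_tag, $cur);
for my $rec (@mapped, [undef]) {    # sentinel flushes the last group
    my ($key, $tag, $val) = @$rec;
    if (!defined $cur or !defined $key or $key != $cur) {
        for my $u (@{ $by_tag{user}  || [] }) {
            for my $o (@{ $by_tag{order} || [] }) {
                push @joined, [$cur, $u, $o];
            }
        }
        %by_tag = ();
        $cur = $key;
    }
    push @{ $by_tag{$tag} }, $val if defined $key;
}

print "@$_\n" for @joined;
```

Because only one key's group is held in memory at a time, this pass works no matter how large the sorted input is.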
It takes practice to use this toolkit effectively, but it can handle any kind of problem you need it to, and your solutions will scale just fine to the data volume that you have.
Re^2: a large text file into hash
by perl_lover_always (Acolyte) on Jan 28, 2011 at 10:37 UTC
by tilly (Archbishop) on Jan 28, 2011 at 16:55 UTC