Very interesting. Sereal deserves some attention too. I'll read through that.
Thanks,
Matt.
There are two data structures that remain the same: the first describes bit-fields within a 64-bit register; the second describes some meta-attributes of the register.
The min-to-max range will be 0 to 2^64 - 1.
So given that these data structures do not vary, it sounds like pack templates might be the way to go. Perhaps there will be a challenge in that the 1st data structure is an array with varying numbers of elements, although the structure will always be the same.
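As a rough sketch of what a pack template for such fixed-shape records might look like: the field layout below (an (lsb, width) pair per bit-field, plus a 64-bit reset value, with a count prefix for the variable number of fields) is invented for illustration, not taken from the actual data.

```perl
use strict;
use warnings;

# Invented register description: a 64-bit reset value plus a variable
# number of (lsb, width) bit-field pairs.
my %reg = (
    reset  => 0xDEADBEEF,
    fields => [ [ 0, 8 ], [ 8, 4 ], [ 12, 20 ] ],
);

# Fixed shape per element, variable element count: prefix with a count.
my $packed = pack 'Q< C', $reg{reset}, scalar @{ $reg{fields} };
$packed .= pack 'C C', @$_ for @{ $reg{fields} };

# Unpack: read the fixed header, then the counted (lsb, width) pairs.
my ( $reset, $n, @pairs ) = unpack 'Q< C (C C)*', $packed;
my @fields = map { [ @pairs[ 2 * $_, 2 * $_ + 1 ] ] } 0 .. $n - 1;
```

The `Q<` item (little-endian 64-bit unsigned) covers the full 0 to 2^64 - 1 range, and needs a perl built with 64-bit integer support.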
Thanks for pointing out Storable and BZip2. That is more food for thought along the way.
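For the Storable-plus-bzip2 route, a minimal in-memory round trip looks like the sketch below; the register keys are invented for illustration.

```perl
use strict;
use warnings;
use Storable                qw(nfreeze thaw);
use IO::Compress::Bzip2     qw(bzip2   $Bzip2Error);
use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error);

# Hypothetical register description (keys are invented).
my %reg = ( name => 'CTRL', fields => [ [ 0, 8 ], [ 8, 4 ] ] );

my $frozen = nfreeze( \%reg );    # portable binary image of the structure
bzip2( \$frozen => \my $bz )   or die "bzip2 failed: $Bzip2Error";

bunzip2( \$bz => \my $thawed ) or die "bunzip2 failed: $Bunzip2Error";
my $reg2 = thaw($thawed);
```

Both `Storable` and the `IO::Compress` modules ship with modern perls, so this needs no extra installation.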
Thanks,
Matt.
There are two data structures that remain the same: the first describes bit-fields within a 64-bit register; the second describes some meta-attributes of the register.
min to max will be 0 to 2^64 - 1.
So given that these data structures do not vary, it sounds like pack templates might be the way to go. Perhaps there will be a challenge in that the 1st data structure is an array with varying numbers of elements, although the structure will always be the same.
The OP shows two hashes, one of which is a hash of arrays. Above you say "the 1st data structure is an array with varying numbers of elements"? The OP mentions "many 10's of MB of computer generated data files" and shows two small data structures. My point is that you are not giving us clear information. If you want actual help rather than speculative possibilities, you need to be clearer and more accurate in your specification of the problem.
I.e. is this two files, each containing a huge version of one of the OP data structures? Or are there myriad files, one for each type of data structure? Or myriad files each containing both of the OP data structures?
- How many MBs?
- Spread across how many files?
- Are the sub data structures fixed or variable in length?
Note: if the top-level entity in a file has a variable length, that's easily accommodated; but if the sub-structures vary in length, that's harder. I.e. if the hash of arrays contains a variable number of hash elements but the values are fixed-length arrays, that's easily handled; but if the arrays themselves vary in length, that's much harder.
- Does the application need to load all of the "10s of MBs" at once for every run, or does it only use a small subset for each run?
- There are many more questions to answer before I would choose an approach to solving your problem.
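For the harder case above, where the arrays inside the hash vary in length, one common workaround is to write an explicit count before each variable-length piece so the unpacker knows how much to read. A sketch with invented data, using pack's counted-string (`n/A*`) item for the keys and an explicit count for the 64-bit values:

```perl
use strict;
use warnings;

# Invented hash-of-arrays with variable-length value arrays.
my %h = ( a => [ 1, 2, 3 ], b => [ 10, 20 ] );

my $packed = '';
for my $k ( sort keys %h ) {
    my @v = @{ $h{$k} };
    # Count-prefixed key, then a 32-bit element count, then the values.
    $packed .= pack 'n/A* N', $k, scalar @v;
    $packed .= pack 'Q<*', @v;
}

# Unpack by walking the buffer, consuming one record at a time.
my %out;
my $pos = 0;
while ( $pos < length $packed ) {
    my ($k) = unpack "\@$pos n/A*", $packed;
    $pos += 2 + length $k;
    my ($n) = unpack "\@$pos N", $packed;
    $pos += 4;
    my @v   = unpack "\@$pos Q<$n", $packed;
    $pos += 8 * $n;
    $out{$k} = \@v;
}
```

The `@` template item seeks to an absolute offset, which keeps the walk simple; the cost of the count prefixes is a few bytes per record.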
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
In the absence of evidence, opinion is indistinguishable from prejudice.