PerlMonks  

Re^2: Reducing application footprint: large text files

by Anonymous Monk
on Mar 01, 2018 at 00:58 UTC ( [id://1210114] )


in reply to Re: Reducing application footprint: large text files
in thread Reducing application footprint: large text files

There are two data structures that remain the same: the first describes bit-fields within a 64-bit register; the second describes some meta-attributes of the register.

min to max will be 0 to 2^64 - 1.

So given that these data structures do not vary, it sounds like pack templates might be the way to go. One challenge may be that the first data structure is an array with a varying number of elements, although the structure of each element is always the same.
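A varying element count is easy to accommodate with pack by storing an explicit count ahead of the values. The sketch below is illustrative only; the record layout, field names, and sub names are invented, not from the original post (note that Q requires a perl built with 64-bit integer support):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical record: one fixed 64-bit meta value (e.g. a reset value),
# then a count-prefixed list of 64-bit bit-field descriptors.

sub pack_register {
    my ($reset, @fields) = @_;
    # Q< = unsigned 64-bit little-endian, N = unsigned 32-bit count
    return pack( 'Q< N', $reset, scalar @fields )
         . pack( 'Q<*', @fields );
}

sub unpack_register {
    my ($buf) = @_;
    my ($reset, $n) = unpack 'Q< N', $buf;
    my @fields = unpack "x12 Q<$n", $buf;   # skip the 12-byte header
    return ($reset, @fields);
}

my $blob = pack_register( 42, 1, 2, 3 );
my ($reset, @fields) = unpack_register($blob);   # 42, then (1, 2, 3)
```

Because the count travels with the data, a file of such records can be read back without knowing the element counts in advance.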

Thanks for pointing out Storable and BZip2. That is more food for thought along the way.
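For what it's worth, Storable and bzip2 compose nicely in memory before anything touches disk. A hedged sketch, with a made-up register hash standing in for the real data:

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);
use IO::Compress::Bzip2     qw(bzip2   $Bzip2Error);
use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error);

# Invented stand-in for one register's worth of data.
my %register = (
    fields => [ [ 0, 7 ], [ 8, 15 ] ],   # bit-field [lsb, msb] pairs
    meta   => { width => 64 },
);

my $frozen = nfreeze( \%register );      # portable binary image
bzip2( \$frozen => \my $compressed )
    or die "bzip2 failed: $Bzip2Error";

# ...later, reverse the process...
bunzip2( \$compressed => \my $thawed_buf )
    or die "bunzip2 failed: $Bunzip2Error";
my $copy = thaw($thawed_buf);
```

nfreeze (rather than freeze) keeps the image portable across machines with different byte orders, which matters if the files are generated on one box and consumed on another.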

Thanks, Matt.


Replies are listed 'Best First'.
Re^3: Reducing application footprint: large text files
by BrowserUk (Patriarch) on Mar 01, 2018 at 11:12 UTC
    There are two data structures that remain the same: the first describes bit-fields within a 64-bit register; the second describes some meta-attributes of the register. min to max will be 0 to 2^64 - 1. So given that these data structures do not vary, it sounds like pack templates might be the way to go. One challenge may be that the first data structure is an array with a varying number of elements, although the structure of each element is always the same.

    The OP shows two hashes, one of which is a hash of arrays; above you say "the 1st data structure is an array with varying numbers of elements"? The OP mentions "many 10's of MB of computer generated data files" but shows two small data structures. My point is that you are not giving us clear information. If you want actual help rather than speculative possibilities, you need to be more clear and accurate in specifying the problem.

    I.e. is this two files, each containing a huge version of one of the OP's data structures? Or myriad files for each type of data structure? Or myriad files, each containing both of the OP's data structures?

    • How many MBs?
    • Spread across how many files?
    • Are the sub data structures fixed or variable in length?

      Note: If the top-level entity in a file has a variable length, that's easily accommodated; but if the sub-structures vary in length, that's harder. I.e. if the hash of arrays contains a variable number of hash elements but the values are fixed-length arrays, that's easily handled; but if the arrays themselves vary in length, that's much harder.

    • Does the application need to load all of the "10s of MBs" at once for every run, or does it only use a small subset for each run?
    • So many more questions, before I would choose an approach to solving your problem.
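    The subset question matters because fixed-length records permit random access: record $i lives at byte $i * REC_SIZE, so the application never needs the whole file in memory. A hedged sketch; the record layout, file name, and sub name are invented for illustration:

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

use constant REC_SIZE => 12;    # hypothetical: Q< value (8) + N count (4)

sub read_record {
    my ($fh, $i) = @_;
    sysseek( $fh, $i * REC_SIZE, SEEK_SET ) or die "seek: $!";
    sysread( $fh, my $buf, REC_SIZE ) == REC_SIZE or die "short read";
    return unpack 'Q< N', $buf;
}

# Write five fixed-size records, then fetch only record 2.
open my $fh, '+>', 'regs.bin' or die $!;
binmode $fh;
syswrite $fh, pack( 'Q< N', $_ * 100, $_ ) for 0 .. 4;

my ($val, $n) = read_record( $fh, 2 );   # (200, 2)
```

    If the records vary in length, this breaks down, and you need either an index file or a scan; hence the question above.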

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
    In the absence of evidence, opinion is indistinguishable from prejudice.
