
Re: How to improve this data structure?

by sundialsvc4 (Abbot)
on May 21, 2013 at 15:38 UTC


in reply to How to improve this data structure?

++ on using a database

SQLite files are wonderful for this sort of thing ... there is no “server.” There are useful utilities for importing text files and so forth. The only gotcha, when you finally get down to programming that updates things, is a rather important one: use transactions. Unless a transaction is in progress, SQLite treats every single statement as a transaction of its own and, by default, waits for the data to be physically committed to disk before moving on, every single time. (Inside an explicit transaction, it buffers the data much more sensibly and commits once at the end.) But, with that being said, it is an excellent and robust tool ... and free. (It is actually in the public domain.)
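To make the transaction advice concrete, here is a minimal sketch using Python's standard sqlite3 module (chosen only because it needs nothing installed; the same pattern should carry over to Perl's DBI with DBD::SQLite). The table name, columns, and generated rows are hypothetical and only there for illustration.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")  # pass a file path for a real database
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT, score REAL)"
)

rows = [(i, f"item{i}", i * 0.5) for i in range(10000)]

# "with conn" wraps the whole batch in a single transaction:
# it commits once on success and rolls back if an exception is raised.
with conn:
    conn.executemany(
        "INSERT INTO records (id, name, score) VALUES (?, ?, ?)", rows
    )

count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
print(count)
```

The point is that all 10,000 inserts share one commit instead of paying for one commit per row, which is where the time goes when updates are done outside a transaction.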

You’ve got millions of records to deal with, and you can’t be writing a Perl program every single time ... You might find that you need to get at this data in all sorts of ways: reports, spreadsheets, who knows. SQLite can take you there.

Re^2: How to improve this data structure?
by karlgoethebier (Abbot) on May 21, 2013 at 18:30 UTC

    I assume you wanted to say something like:

    Regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

      Yeah, more or less.

      “A million records” is a volume that is “vaguely interesting” to SQLite ... it will take a few minutes of one-time cost to import the data (and might not require a program). A couple of minutes more to add some indexes. From that point on, you can use any tool or combination of tools that has a DB interface (including Perl, of course ...) to get the job done, and now the query engine is the one that’s doing all the heavy lifting. So long as the “transactions” caveat is carefully adhered to, especially when doing updates, it’s really quite a remarkable piece of software engineering. (It’s rare that a piece of software genuinely surprises me and blows me away. Perl/CPAN did it. So did this.) I suspect that most of the things that the O.P. is right now “writing programs to do” can probably be reduced to a query (and perhaps a now-trivial program to digest the results). Furthermore, a huge bonus is that you can put results into a different table, which of course is a self-describing data structure. (A single SQLite file can contain any number of tables.)
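As a rough sketch of “reduce it to a query and put the results into a different table,” the following again uses Python's standard sqlite3 module. The records table, its columns, the index name, and the category_totals table are all assumptions made up for the example, not anything from the O.P.'s data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # pass a file path for a real database
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, category TEXT, value INTEGER)"
)

# Load a few hypothetical rows inside one transaction.
with conn:
    conn.executemany(
        "INSERT INTO records (category, value) VALUES (?, ?)",
        [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)],
    )

with conn:
    # One-time cost: an index on the column the queries filter or group by.
    conn.execute("CREATE INDEX idx_records_category ON records (category)")
    # Let the query engine do the work and keep the result as its own table.
    conn.execute(
        "CREATE TABLE category_totals AS "
        "SELECT category, COUNT(*) AS n, SUM(value) AS total "
        "FROM records GROUP BY category"
    )

totals = conn.execute(
    "SELECT category, n, total FROM category_totals ORDER BY category"
).fetchall()
print(totals)
```

The summary table lives in the same database as the source data, so any other tool with a DB interface can read it without re-running the aggregation.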
