http://www.perlmonks.org?node_id=1064970


in reply to Creating SELECT-like look-up tables as hashes

I have data from a database which I have read into an array of hashrefs. I want to perform SELECT-like operations on the data.

If the data is already in a database, why would you read all the data from the db into Perl and then try to recreate the basic db SELECT operation -- for which the DB is designed and highly optimised -- using Perl?


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
  • Comment on Re: Creating SELECT-like look-up tables as hashes

Replies are listed 'Best First'.
Re^2: Creating SELECT-like look-up tables as hashes
by loris (Hermit) on Dec 02, 2013 at 16:34 UTC

    I am not actually reading all the data into Perl, just the smaller of two tables in an SQLite DB (one has 40000 records, the other 26 million). Profiling with Devel::NYTProf indicated that around 50% of the time was spent doing SELECTs on the database, so that looked like a good starting point for optimisation. As I have access to a cluster with available memory varying between 18-90 GB per node, I thought it might be worthwhile investigating whether some of the operations could be performed in memory, especially as I currently only require two such look-up tables.

    The idea is that with the look-up tables I can create them once and thereby effectively store the results of multiple SELECTS, which can then be accessed via the hash key. This seemed like it might be faster in Perl, but as you point out, a single SELECT is what the DB is optimised to do. So I guess my mileage may vary.

    Cheers,

    loris

      I'd suggest creating temporary copies of one or both of your tables in an SQLite In-memory DB.

      If you really need more performance than that gives you; and if your select queries are sufficiently predictable that you could pre-index the smaller table, that might be worth thinking about.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        I did try the in-memory approach, but this didn't seem any faster than reading from the file system, presumably because the entire database file, which is only 1.4 GB, is being cached anyway.

        But I shall look into indexing.

        Thanks,

        loris