PerlMonks  

Re: Re: Question about properly laying out a database

by perrin (Chancellor)
on Dec 12, 2001 at 04:08 UTC ( [id://131120]=note )


in reply to Re: Question about properly laying out a database
in thread Question about properly laying out a database

No, I don't think Storable would be a better option for this. With Storable, you would have to load the entire database into memory just to search it. DB_File creates fast, indexed access to records without actually loading them into RAM.

However, your suggestion of getting an array of matches for each criterion and then finding the overlapping set is a good one. You can make one DB_File database for each criterion (make.db, model.db, etc.), where each key is a value of that criterion and each record is a list of car IDs (like unique object IDs). You then use those IDs to look up the car data in a separate content database (also a DB_File, with the data in each record serialized using Storable).
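A minimal sketch of that layout, under some assumptions: the sample cars, the field names, and the %index and %content_db variables are all made up for illustration, and plain hashes stand in for the DBM files so the code runs without Berkeley DB installed. With DB_File, each hash would instead be tied to its own file, as the comment shows.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical sample data; field names and IDs are invented.
my @cars = (
    { id => 1, make => 'volvo', model => '240', year => 1990 },
    { id => 2, make => 'volvo', model => 'v70', year => 2001 },
    { id => 3, make => 'saab',  model => '900', year => 1990 },
);

# Plain hashes stand in for DBM files. With DB_File (plus "use Fcntl;")
# each one would be tied to a file, for example:
#   tie my %make_db, 'DB_File', 'make.db', O_RDWR|O_CREAT, 0644, $DB_HASH;
my %content_db;   # car ID => frozen car record
my %index;        # criterion name => { criterion value => frozen ID list }

for my $car (@cars) {
    # DBM values are flat strings, so each record is serialized first.
    $content_db{ $car->{id} } = freeze($car);

    for my $criterion (qw(make model year)) {
        my $db  = $index{$criterion} ||= {};
        my $key = $car->{$criterion};

        # Read the existing ID list (if any), append, and write it back.
        my $ids = exists $db->{$key} ? thaw($db->{$key}) : [];
        push @$ids, $car->{id};
        $db->{$key} = freeze($ids);
    }
}

my $volvo_ids = thaw($index{make}{volvo});   # array ref holding 1 and 2
```

Each index maps one criterion value to the IDs that carry it, so a lookup touches only the records for that value rather than the whole data set.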

That said, it would be much easier to just use MySQL.

Re: Re: Re: Question about properly laying out a database
by joealba (Hermit) on Dec 12, 2001 at 09:31 UTC
    But if he's going to search all the records to match on some set of criteria, won't he have to load the whole thing in memory anyway?

    In this application, it's not very often that you call up one record by its key id. Searches are more common. So, you'll have to evaluate just how much data each record will hold, how much RAM will be used up, and how quickly you can load all that data into memory.
      But if he's going to search all the records to match on some set of criteria, won't he have to load the whole thing in memory anyway?

      No. All he has to do to find the cars with make = 'volvo' (or some normalized key like 'make_7') is say my $cars = thaw($make_db{'volvo'}) and get back a ref to an array of car IDs (the value is stored serialized by Storable, so it has to be thawed first). Then he does the same for each of the other criteria, and finds the overlapping set.
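The lookup-and-overlap step might look roughly like this. It is only a sketch: intersect_ids is a hypothetical helper, the keys and IDs are invented, and plain hashes again stand in for the tied DBM hashes.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Stand-ins for tied DBM hashes; values are Storable-frozen lists of car IDs.
# All keys and IDs below are invented for illustration.
my %make_db = ( volvo => freeze([1, 2, 5]), saab => freeze([3, 4]) );
my %year_db = ( 1990  => freeze([1, 3, 5]), 2001 => freeze([2, 4]) );

# Return the IDs present in every list, in the order of the first list.
sub intersect_ids {
    my @lists = @_;
    return () unless @lists;

    my %count;
    for my $list (@lists) {
        my %seen_here;
        # Count each ID at most once per list.
        $count{$_}++ for grep { !$seen_here{$_}++ } @$list;
    }

    my %emitted;
    return grep { $count{$_} == @lists && !$emitted{$_}++ } @{ $lists[0] };
}

# One keyed fetch per criterion, then the overlap.
my $by_make = thaw($make_db{volvo});
my $by_year = thaw($year_db{1990});
my @matches = intersect_ids($by_make, $by_year);   # (1, 5)
```

Only the ID lists for the requested values are read; the car records themselves are fetched afterwards, by ID, for the matches alone.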

      You do access records by ID, because you make multiple indexes (dbm files) which are each using a different criterion of the search as a key. It's kind of a roll-your-own MySQL.

        You do access records by ID, because you make multiple indexes (dbm files) which are each using a different criterion of the search as a key. It's kind of a roll-your-own MySQL.

        Isn't it also overkill?

        Given the claimed upper bound of 500 records (which means be generous and assume an upper bound of 1,000), it's still going to take fewer reads (and fewer disk head movements) to suck in an entire flat file of search criteria than it would to search using multiple DBM files.

        There's a point at which the multiple DBM approach is a win, but I suspect this application is far below that point.
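For comparison, the flat-file scan argued for above could be sketched as follows. The tab-separated layout, the field names, and scan_flat_file are assumptions made for illustration, and an in-memory filehandle replaces the real file so the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical flat file: one car per line, tab-separated id, make, model, year.
my $flat_file = join "\n",
    "1\tvolvo\t240\t1990",
    "2\tvolvo\tv70\t2001",
    "3\tsaab\t900\t1990";

# Read every line once and return the IDs of the cars matching all criteria.
sub scan_flat_file {
    my ($fh, %want) = @_;
    my @fields = qw(id make model year);
    my @hits;

    while (my $line = <$fh>) {
        chomp $line;
        my %car;
        @car{@fields} = split /\t/, $line;

        # Skip the record if any requested criterion fails to match.
        next if grep { $car{$_} ne $want{$_} } keys %want;
        push @hits, $car{id};
    }
    return @hits;
}

# A real application would open the file on disk instead.
open my $fh, '<', \$flat_file or die "cannot open in-memory file: $!";
my @ids = scan_flat_file($fh, make => 'volvo', year => 1990);   # (1)
close $fh;
```

With a few hundred short lines this is a single sequential read, which is the trade-off the post describes against opening several DBM files.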
