Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

Version that I'm working on is 8.3.5 - no autocacuum, though I run it manually basically before (batch of) tests and now playing with vacuuming them from within Perl code. My test data is small - I test with 10.000 to 100.000 records so sizes are small.

Real system has ~5 new records per second + all the queue tables for each of those (and it's the queue tables that are being updated/deleted from) and the idea is that they are "small" or preferably as close to 0 records as possible. It's relatively recent 8.x version running on a dedicated server double Xeon something with 2 (or perhaps 4) GB of RAM. At most data that is kept there is for last 3 days - those queue tables are vacuumed every couple of hours (manually) since each of them gets ~ 500K records added and deleted each day. Main data table being partitioned per day (so ~ 500K records that are added and from time to time updated - per partition - 3 partitions) each partition deleted after 3 days - so no vacuum is running on them.

I know that full vacuum is locking the whole table - the fact that there are only two processes working with any given queue table (previous state process, and the one processing that queue/state) make it acceptable.

IMHO the problem is that DB's aren't primary meant to be queues where in one day you get 5 million records added/deleted (estimate for 10 queue tables). But at the same time you need transactions to be sure where each and every record is in the processing. OTOH all servers are having up-time of up to 3 years - so it might be OK to do everything in RAM and occasionally just update the DB. Another alternative I'm looking at is the DBM::Deep since it sales pitch is quite impressive:

A unique flat-file database module, written in pure perl. True multi-level hash/array support (unlike MLDBM, which is faked), hybrid OO / tie() interface, cross-platform FTPable files, ACID transactions, and is quite fast. Can handle millions of keys and unlimited levels without significant slow-down.
In this system everything is called by it's Key/ID (anyone mentioned hashes?), or in FIFO sequence (arrays?). With only some reports needing real SQL. Of course using DBM::Deep would give the PHP guys that are working on administration interface (web) quite a headache :) So I'm guessing I'll try to just shift those queues to DBD::Deep and see what happens.

Have you tried freelancing/outsourcing? Check out Scriptlance - I work there since 2003. For more info about Scriptlance and freelancing in general check out my home node.

In reply to Re^2: Working with large number of small constantly updated records by techcode
in thread Working with large number of small constantly updated records by techcode

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others scrutinizing the Monastery: (5)
As of 2024-04-19 23:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found