Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid

mysql cluster

by InfiniteLoop (Hermit)
on Aug 03, 2005 at 19:23 UTC ( #480609=perlquestion: print w/replies, xml ) Need Help??

InfiniteLoop has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Monks,
In my latest project, we are planning to use a cluster of mysql, in a single master/multiple slave mode. Basically, we plan to have to do all writes (insert and update) to the master database, and all reads (select) happen on the slave database(s).
In all my model classes I want all database access to happen seamlessly, i.e the model class need not know which database it should connect to.
I plan to write my own db access module, which base classes DBI and over-ride execute() for this.
Do you know of any easier ways ? Are there any existing perl module for this ?

Replies are listed 'Best First'.
Re: mysql cluster
by gmax (Abbot) on Aug 03, 2005 at 20:29 UTC
    my own db access module, which base classes DBI and over-ride execute() for this

    I advise you against doing this.

    Even if you succeed in changing the host overriding the execute() method (perhaps you should aim at prepare()), you would undermine any transaction that could be under way.

    The MySQL replication schema may be unconvenient for its division of write-on-master and read-from-slaves, but it's rather solid. In my experience, using it with transactional tables (InnoDB) guarantees replication of transactions without a glitch.

    The solution you are proposing is filter-based, and it's similar to the one offered from some commercial clusters, (emic networks, for example), which lose their transaction capabilities in the bargain.

    Remember that, when using transactions, you should direct to the master the whole transaction, including any reading instruction within it. If you don't, you may get completely wrong results.

    Instead of filtering, you could create an application wrapper, to send all modifying statements and transactions to the master, and rotate the slaves who should receive a simple read statement.

    I don't know of any (public) Perl modules that deal with this problem. I may only recommend reading Jeremy Zawodny's High Performance MySQL (there is a sample chapter about replication).

    If you are tempted to solve your problem by looking at MySQL Cluster, be aware that it's a completely different database system. Every node is at the same time master and slave, but you pay this luxury with a huge amount of RAM. According to the docs, in the current implementation, you need to have enough RAM to cover twice the size of your data, plus 20%. Therefore, if your data is 10 GB, your cluster must have 24 GB of RAM available (divided among all data nodes). There are plans of changing the in-memory architecture to something more resources-friendly, but don't hold your breath.

     _  _ _  _  
    (_|| | |(_|><
Re: mysql cluster
by Joost (Canon) on Aug 03, 2005 at 20:45 UTC
Re: mysql cluster
by aufflick (Deacon) on Aug 04, 2005 at 03:09 UTC
    Is your main aim load sharing or redundancy?
      The aim is to share the load of accessing data, hence many "read" databases. For now I plan to go with gmax's suggestion of having an application wrapper.
        Ok. If it was mainly redundency I was going to suggest you look at DRBD but it's *not* going to help your performance!

        DRBD is a block device which is designed to build high availability clusters. This is done by mirroring a whole block device via (a dedicated) network. You could see it as a network raid-1.

Re: mysql cluster
by dragonchild (Archbishop) on Aug 04, 2005 at 18:27 UTC
    Similar to gmax's solution, I have a wrapper (that I've written myself) that provides, among other things, a very simple do_select() and do_write(). (It also builds the SQL and does a lot of other book-keeping for me.) So, when you create this wrapper, you'd specify a set of read_dbs and a write_db. do_select() would pick the correct read_db and do_write() would go against the write_db.

    Now, it sounds like you're building some sort of distributed app, like a web application. It's not going to be simple to do this round-robin'ing for the read_db's from within an Apache child process. And, frankly, I don't think you need to until you start hitting the 100 request/second mark. That is, not with a good schema and well-tuned server parameters. And, the master/slave relationship may actually SLOW you down when you're trying to speed up.

    Frankly, I'd build a basic DB server in some 2/3-tier layout and see if that's good enough. If it is, you're good. If it's not, then you have a reason to spend more money. Often, it's cheaper to build a 4-CPU Xeon w/8G of RAM than it is to develop some sort of round-robin'ing DBI wrapper. YMMV

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://480609]
Approved by ww
Front-paged by Arunbear
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (4)
As of 2021-01-23 15:04 GMT
Find Nodes?
    Voting Booth?