Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Re^2: Comparing tables over multiple servers

by rnahi (Curate)
on Oct 27, 2005 at 19:48 UTC ( #503454=note: print w/ replies, xml ) Need Help??


in reply to Re: Comparing tables over multiple servers
in thread Comparing tables over multiple servers

Would you still recommend this solution if the table size were, say, 100 GB?

Think about locking all seven tables until copies are delivered. (If you don't, the table you pass to the slaves may be different from the one that is being modified in the master copy.)


Comment on Re^2: Comparing tables over multiple servers
Re^3: Comparing tables over multiple servers
by BrowserUk (Pope) on Oct 27, 2005 at 20:18 UTC

    Hmm. If the tables are that large, then you have a problem anyway. No matter how you arrange to do the comparison, it still requires that at least one copy of the table be tranmitted between each pair of machines in order for the comparison to take place.

    Having transmitted that data, a blanket replacement will always be quicker than a compare. The time spent locked whilst transmitting remains a constant and the time spent locked replacing will pale into insignificance relative to the time spent comparing.

    If the tables are this large and dynamic (from what the OP said, his seemed to be relatively small & static), then you would obviously need to use some kind of dynamic replication or true distributed updates, but that is quite a different situation and a considerably more involved undertaking than sync'ing a few, small, relatively static reference tables, which was my interpretation of the OP situation.

    Maintaining multiple copies of large dynamic tables in a distributed DB environment is the current holy grail of RDMBS development, still a proprietary black art for the most part, and hardly the sane subject of a "quick perl script" to undertake.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      it still requires that at least one copy of the table be tranmitted between each pair of machines in order for the comparison to take place

      Not necessarily. As this solution suggests, you can calculate a global CRC locally on each server, and then compare the results.

        Okay. You got me there. Though I do have to wonder about the time spent calculating the CRCs (using SQL) relative to transmitting the data. If the machines are connected by a high-speed (10Mbits/s or greater) network connection, I would think it would be a close run thing.

        It would be quicker if you could CRC the underlying filesystem entities in which the data is stored, but of course, any difference in the transaction histories, database configurations or versions, or even the disk drives would thrown the comparison out.

        And, once you have discovered that there is a difference, you have still to corrected it, and you're right back where you started from needing to transmit the data.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://503454]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chilling in the Monastery: (14)
As of 2014-09-30 14:26 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (373 votes), past polls