
You would? Not me.

Did you take a good look at all the data conversions and substrings and stuff going on in that SQL? SQL can be pretty optimal at performing comparisons; that's its bread-and-butter work. But those types of data manipulations and conversions are not its strong suit.

I attempted to verify my suspicions, but about half of the syntax in that article doesn't seem to be valid with the only SQL database I have available. Still, I'm betting (a coffee:) that it ain't quick on any platform.

I would hazard that dumping the table using the export facility and running a dedicated binary digest(or) program over the dump would be considerably faster.

Either way, once the determination of difference is made, you still have to correct it, and that means transmitting the data. Easier, surer and possibly quicker to just zip up the dumped table and send it, I think.

Unless the data involved is already compressed binary (jpgs or similar), the 100GB would probably reduce to 25% or so, and transmitting 25GB at 100Mb/s will take about 34 minutes, assuming no contention.
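
For anyone who wants to check that figure, here is a rough sketch of the arithmetic. The function name is made up for illustration, and it assumes decimal units (1GB = 10^9 bytes, 1Mb = 10^6 bits), an uncontended link and no protocol overhead. The 25% ratio is only my guess from above, not a measurement.

```python
def transfer_minutes(size_gb, ratio, link_mbit_per_s):
    """Minutes to send size_gb of data once compressed to `ratio` of its size.

    Assumes decimal units, no contention and no protocol overhead.
    """
    compressed_bits = size_gb * 1e9 * ratio * 8      # bytes -> bits
    seconds = compressed_bits / (link_mbit_per_s * 1e6)
    return seconds / 60


# 100GB squeezed to 25%, sent over a 100Mb/s link
print(round(transfer_minutes(100, 0.25, 100), 1))
```

That works out to 2000 seconds, a shade over 33 minutes, which is where the roughly 34 minutes comes from once you allow a little slack.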

Running a dedicated md5 executable on 1GB takes around 20 seconds, so around 1/2 hour for 100GB, but that is calculating a single hash from a contiguous datastream.
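
To make "a single hash from a contiguous datastream" concrete, here is a minimal sketch using Python's standard hashlib rather than the dedicated executable I timed. The function names and the 1MB chunk size are my own choices for illustration, and the 20 seconds per GB is just the rough figure quoted above, so treat the estimate as a guide only.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read front to back as one stream."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def estimated_hash_minutes(size_gb, seconds_per_gb=20):
    """Scale the observed per-GB timing up to the full dump size."""
    return size_gb * seconds_per_gb / 60
```

Run against the dumped table on each server, only the resulting digests need to be compared. At 20 seconds per GB, 100GB comes to about 33 minutes.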

You're suggesting calculating 2 hashes for every piece of data, retrieved in itty-bitty chunks, and doing all the math in SQL?

In the absence of evidence to the contrary, my money would be on the transmission finishing long before the checksumming.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^7: Comparing tables over multiple servers by BrowserUk
in thread Comparing tables over multiple servers by mnlight
