PerlMonks  

Re^4: locales, encodings, collations, charsets... how can I match a given MySQL collation?

by xaprb (Scribe)
on Apr 02, 2007 at 14:12 UTC


in reply to Re^3: locales, encodings, collations, charsets... how can I match a given MySQL collation?
in thread locales, encodings, collations, charsets... how can I match a given MySQL collation?

Thanks! This is *exactly* what I'm doing currently. The only trouble is getting the lt, le, gt, ge, cmp, etc. operators to agree with MySQL's idea of whether one row is lt/gt/eq another row.

Imagine the left-hand table has "éclair" and "ecstatic", and the right-hand table has only "ecstatic". MySQL sorts é before other 'e' values, so the merge algorithm sees "éclair" on the left and "ecstatic" on the right after fetching the first row from each table. Perl thinks "ecstatic" should come first though -- at least that's how it works with just the default collations. So the merge algorithm concludes "ecstatic" doesn't exist in the left-hand table. After this, it runs out of rows in the right-hand table and decides "éclair" doesn't exist there, then fetches the next row from the left-hand table -- and decides "ecstatic" doesn't exist in the right-hand table.

It's quite a train wreck unless I get Perl's sorting to exactly match MySQL's!
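
To make the failure concrete, here is a minimal sketch of the merge walk in Python. The names `merge_diff` and `fold_key` are made up for illustration, and `fold_key` (strip accents, ignore case) is only an assumed stand-in for a collation; it is not MySQL's actual rule for any particular collation.

```python
import unicodedata

def fold_key(s):
    """Hypothetical stand-in for the database collation: drop accents, ignore case."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).casefold()

def merge_diff(left, right, key=lambda s: s):
    """Walk two lists the database has already sorted and report rows that
    appear on only one side. `key` is the ordering the merge assumes."""
    only_left, only_right = [], []
    i = j = 0
    while i < len(left) and j < len(right):
        kl, kr = key(left[i]), key(right[j])
        if kl == kr:            # same row on both sides
            i += 1
            j += 1
        elif kl < kr:           # left row sorts first, so right lacks it
            only_left.append(left[i])
            i += 1
        else:                   # right row sorts first, so left lacks it
            only_right.append(right[j])
            j += 1
    only_left.extend(left[i:])   # whatever remains has no partner
    only_right.extend(right[j:])
    return only_left, only_right

# Rows in the order the database returned them.
left = ["éclair", "ecstatic"]
right = ["ecstatic"]

naive = merge_diff(left, right)                  # plain code point comparison
matched = merge_diff(left, right, key=fold_key)  # ordering agrees with the data
print(naive)
print(matched)
```

With the plain comparison, "ecstatic" is reported as missing from both sides; once the comparison agrees with the order the rows arrive in, only "éclair" is reported as missing from the right-hand table.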


Re^5: locales, encodings, collations, charsets... how can I match a given MySQL collation?
by roboticus (Canon) on Apr 02, 2007 at 20:37 UTC
    xaprb:

    Okay ... then perhaps MySQL can help you in a different way. I don't know if this is suitable or not, but you might create a temporary table for the columns you're comparing, put an index on it, and insert the key fields of both tables. Then you can select from all three tables, joining on equality, so the only comparison left on the Perl side is eq (which I (perhaps falsely) presume should work well in Perl).

    So the temp table would hold the ordering, and you simply select which of the other two tables holds the match for the record...the first, the second, or both.

    It may not be a good solution for you, if speed and/or database space is limited, but it's the only thing I can think of. (I know diddly-squat about fancy collation sequences....)

    ...roboticus
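
The temp-table idea above could be sketched roughly as follows. This uses Python's built-in `sqlite3` as a stand-in for MySQL purely so the example runs anywhere; the table names `left_t`, `right_t`, and `all_keys` are made up, and `INSERT OR IGNORE` is SQLite syntax (MySQL spells it `INSERT IGNORE`). In MySQL the joins and the `ORDER BY` would use the column's own collation, which is the point: the database does every comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_t  (k TEXT PRIMARY KEY);
    CREATE TABLE right_t (k TEXT PRIMARY KEY);
    INSERT INTO left_t  VALUES ('éclair'), ('ecstatic');
    INSERT INTO right_t VALUES ('ecstatic');

    -- Temp table holding every key from either side; the primary key is the index.
    CREATE TEMP TABLE all_keys (k TEXT PRIMARY KEY);
    INSERT OR IGNORE INTO all_keys SELECT k FROM left_t;
    INSERT OR IGNORE INTO all_keys SELECT k FROM right_t;
""")

# Let the database decide equality and ordering; the client only reads flags.
rows = conn.execute("""
    SELECT a.k, l.k IS NOT NULL, r.k IS NOT NULL
    FROM all_keys a
    LEFT JOIN left_t  l ON l.k = a.k
    LEFT JOIN right_t r ON r.k = a.k
    ORDER BY a.k
""").fetchall()

status = {
    k: "both" if in_left and in_right else "left only" if in_left else "right only"
    for k, in_left, in_right in rows
}
print(status)
```

The client never has to order two strings itself, at the cost of the extra table and index space mentioned above.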
