Another try at explaining this:
When dealing with 7-bit ASCII, the uppercase letters begin at 65 and the lowercase at 97 -- 32 higher.   Since 32 is a power of two, represented by bit 5 of the character, if this bit is set the letter is lc; if unset, Uc.
$ perl -lwe'$,=$\;print unpack("B*","A"), unpack("B*","a"), unpack("B*","A"^"a")'
01000001 <- "A": 64 + 1
01100001 <- "a": 64 + 32 + 1
00100000 <- result of XORing
In the XOR of the original with its lowercased self, bit 5 will be set only if the original was uppercase.   Since XORing something with itself is always 0, that is the only bit which can be set.   The lc of the replacement will have that bit set, because that's what makes it lc, with the other bits set to determine which letter.
So, bit 5 is set in the XOR of the original with its lc self only if the original is Uc (the opposite of the bit's meaning!), and it is always set in the lc replacement.   If both are set, XORing them clears the bit in the result: hence Uc; if only the replacement's bit is set, the XOR leaves it set: lc.
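For anyone who would rather see it run than read it, here is a minimal sketch of the trick described above. The name transfer_case is made up for this illustration (it is not from the thread), and the sketch assumes both strings are the same length and contain only ASCII letters.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the letter case of $orig onto $repl,
# character by character, using the bit-5 trick.
# Assumes equal-length strings made of ASCII letters only.
sub transfer_case {
    my ($orig, $repl) = @_;

    # String XOR of the original with its lowercased self:
    # chr(32) wherever $orig is uppercase, chr(0) everywhere else.
    my $mask = $orig ^ lc $orig;

    # lc($repl) has bit 5 set in every letter; XORing with the mask
    # clears that bit exactly where $orig was uppercase.
    return lc($repl) ^ $mask;
}

print transfer_case("Hello", "world"), "\n";   # World
print transfer_case("HeLLo", "WORLD"), "\n";   # WoRLd
```

Both operands of ^ have to be strings for this to work; if either one is a number, Perl does a numeric XOR instead.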
I think at this point I should exclaim "QED" and run.   It seemed clear enough before I started trying to explain it in this little box!
But note that jryan's answer above will work with any locale!
It has also been pointed out (and I should've checked) that capitalizing by resetting bit 5 also works for the 8-bit characters in the standard ISO 8859-1 ("latin-1") character set.
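A quick way to check the latin-1 claim for one accented letter, as a sketch only. flip_bit5 is a made-up helper name, and the code points are the ISO 8859-1 values for E-acute (0xC9 uppercase, 0xE9 lowercase). One caveat worth keeping in mind: a few latin-1 positions that differ by 0x20 are not case pairs, such as the multiplication sign (0xD7) and the division sign (0xF7).

```perl
use strict;
use warnings;

# Hypothetical helper: toggle bit 5 (value 0x20) of a single character.
sub flip_bit5 {
    my ($char) = @_;
    return chr( ord($char) ^ 0x20 );
}

# 7-bit ASCII: "A" (0x41) becomes "a" (0x61).
printf "%02X -> %02X\n", ord("A"), ord( flip_bit5("A") );             # 41 -> 61

# ISO 8859-1: uppercase E-acute (0xC9) becomes lowercase e-acute (0xE9).
printf "%02X -> %02X\n", 0xC9, ord( flip_bit5( chr(0xC9) ) );         # C9 -> E9
```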