PerlMonks  

Re: unicode lc uc fc wtf

by Your Mother (Archbishop)
on Sep 17, 2018 at 01:57 UTC ( [id://1222487]=note )


in reply to unicode lc uc fc wtf

On a related note, which I consider required reading:

  • Code that assumes roundtrip equality on casefolding, like lc(uc($s)) eq $s or uc(lc($s)) eq $s, is completely broken and wrong. Consider that uc("σ") and uc("ς") are both "Σ", but lc("Σ") cannot possibly return both of those.
  • Code that assumes every lowercase code point has a distinct uppercase one, or vice versa, is broken. For example, "ª" is a lowercase letter with no uppercase; whereas both "ᵃ" and "ᴬ" are letters, but they are not lowercase letters; however, they are both lowercase code points without corresponding uppercase versions. Got that? They are not \p{Lowercase_Letter}, despite being both \p{Letter} and \p{Lowercase}.
  • Code that assumes changing the case doesn’t change the length of the string is broken.
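
For anyone who wants to poke at the three points above, here is a minimal sketch. It assumes Perl 5.16 or later (for fc and the unicode_strings feature); the helper names roundtrips, same_caseless and lowercase_but_not_ll are made up for illustration and are not from any module. Code points are written as escapes so the source file's encoding doesn't matter.

```perl
use v5.16;      # turns on say, fc and unicode_strings
use warnings;

my $sigma       = "\x{3C3}";    # GREEK SMALL LETTER SIGMA
my $final_sigma = "\x{3C2}";    # GREEK SMALL LETTER FINAL SIGMA
my $mod_small_a = "\x{1D43}";   # MODIFIER LETTER SMALL A
my $mod_cap_a   = "\x{1D2C}";   # MODIFIER LETTER CAPITAL A
my $sharp_s     = "\x{DF}";     # LATIN SMALL LETTER SHARP S

# Hypothetical helper: does lc(uc($s)) give back $s?
sub roundtrips { my ($s) = @_; return lc(uc($s)) eq $s ? 1 : 0 }

# Hypothetical helper: caseless comparison via fc rather than lc or uc.
sub same_caseless { my ($x, $y) = @_; return fc($x) eq fc($y) ? 1 : 0 }

# Hypothetical helper: has the Lowercase property but is not a Lowercase_Letter.
sub lowercase_but_not_ll {
    my ($c) = @_;
    return ($c =~ /\A\p{Lowercase}\z/ && $c !~ /\A\p{Lowercase_Letter}\z/) ? 1 : 0;
}

# 1. Round trips: both sigmas uppercase to the same capital, so one must lose.
say "sigma round-trips:         ", roundtrips($sigma);                    # 1
say "final sigma round-trips:   ", roundtrips($final_sigma);              # 0
say "sigmas equal under fc:     ", same_caseless($sigma, $final_sigma);   # 1

# 2. Lowercase code points that are not lowercase letters and have no uppercase.
say "small a, Lowercase not Ll: ", lowercase_but_not_ll($mod_small_a);    # 1
say "cap a, Lowercase not Ll:   ", lowercase_but_not_ll($mod_cap_a);      # 1
say "uc leaves small a alone:   ", (uc($mod_small_a) eq $mod_small_a ? 1 : 0);  # 1

# 3. Length: sharp s is one character, its uppercase is two.
say "length before uc:          ", length($sharp_s);                      # 1
say "length after uc:           ", length(uc($sharp_s));                  # 2
```

If the goal is a case-insensitive comparison, fc($x) eq fc($y) is the form to reach for; comparing lc or uc results runs into exactly the sigma problem shown in the first group.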

Re^2: unicode lc uc fc wtf
by jeffenstein (Hermit) on Sep 17, 2018 at 14:03 UTC
    There is also Tom Christiansen's Unicode Recipes which could save your life one day, or leave you a quivering mess hiding under your desk at the mere mention of Unicode.