PerlMonks  


I don't know what you mean by "unicode encoding" (are there encodings that map to non-Unicode chars?), but in the Perl context it's worth mentioning that iso-8859-1 strings don't follow Unicode semantics by default; they need to be encoded like any other string.

It is a Unicode encoding, in that after you've decoded the character number, the number maps one-to-one to the Unicode space. Don't forget that UTF-8 is just a way of encoding a sequence of *numbers*.
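To make the "sequence of numbers" point concrete, here is a minimal sketch using the core Encode module. The sample character (é, U+00E9) and the variable names are my own picks for illustration, not something from the thread.

```perl
use strict;
use warnings;
use Encode qw(decode);

# "é" (U+00E9) encoded as UTF-8 is the two bytes 0xC3 0xA9.
my $bytes = "\xC3\xA9";
my $chars = decode('UTF-8', $bytes);

printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
printf "character number: U+%04X\n", ord($chars);

# ISO-8859-1 encodes the same character number as the single byte 0xE9.
my $latin1 = decode('ISO-8859-1', "\xE9");
print "both encodings decode to the same character\n" if $latin1 eq $chars;
```

Either way, what you have after decoding is the number 0xE9; the encoding only decided how that number was written down as bytes.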

That non-SvUTF8-flagged strings get ASCII semantics in some places is indeed by design, but that wasn't sufficiently thought through IMO. Note that these strings may get Unicode semantics in some circumstances, and ASCII semantics in others. The ASCII semantics apply to character classes and upper-/lowercase operations.
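A small sketch of the difference, assuming a perl where nothing switches on Unicode semantics for the unflagged string (no `use feature 'unicode_strings'`, no `use locale`, no `/u` on the regex). The sample character is again my own choice; `utf8::upgrade` is the core function that changes only the internal representation.

```perl
use strict;
use warnings;

my $s = "\xE9";        # é, stored without the SvUTF8 flag
my $t = $s;
utf8::upgrade($t);     # same character, now stored with the flag

# The two strings are equal as far as eq is concerned ...
print "equal as strings\n" if $s eq $t;

# ... but character classes and case mapping can treat them differently.
printf "unflagged matches \\w: %s\n", ($s =~ /\w/ ? "yes" : "no");
printf "flagged matches \\w:   %s\n", ($t =~ /\w/ ? "yes" : "no");
printf "uc unflagged: U+%04X\n", ord(uc $s);
printf "uc flagged:   U+%04X\n", ord(uc $t);
```

Under those assumptions I'd expect the unflagged copy to fail `\w` and come back from `uc` unchanged, while the flagged copy matches `\w` and uppercases to U+00C9, even though the two strings compare equal.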

I consider this a bug in Perl. See also Unicode::Semantics, and expect the bug to be fixed in 5.12.

And I really like the Perl 6 spec which allows string operations on byte, codepoint and grapheme level ;-)

Just realise that Unicode strings don't have a byte level :)
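For the reversal question this thread started with, that means: decode first, reverse the characters, and encode again only on output. A rough sketch, with a made-up two-character input and my own variable names:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $bytes = "h\xC3\xA9";                 # "hé" as UTF-8 bytes
my $chars = decode('UTF-8', $bytes);     # two characters: h, é

# Reverse the characters, then encode for output.
my $rev_chars = encode('UTF-8', scalar reverse $chars);

# Reversing the encoded bytes instead splits the two-byte sequence for é.
my $rev_bytes = scalar reverse $bytes;

# utf8::decode returns false when its argument is not valid UTF-8.
my $copy = $rev_bytes;
my $still_valid = utf8::decode($copy);

printf "character reverse: %vX\n", $rev_chars;
printf "byte reverse:      %vX\n", $rev_bytes;
printf "byte reverse is valid UTF-8: %s\n", ($still_valid ? "yes" : "no");
```

This only works at the codepoint level; it says nothing about graphemes built from several codepoints.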


In reply to Re^3: How to reverse a (Unicode) string by Juerd
in thread How to reverse a (Unicode) string by moritz
