
Re: Whether 'use utf8;' is good style

by SuicideJunkie (Priest)
on Dec 18, 2012 at 15:12 UTC


in reply to Whether 'use utf8;' is good style

Are you actually using utf8 in your source code? If so, you must have it. If not, you don't need it.

I don't see why you'd want to include something that doesn't help. If some future maintainer finds they want to use utf8, they can easily add use utf8 to the top.


Re^2: Whether 'use utf8;' is good style
by McA (Deacon) on Dec 18, 2012 at 16:56 UTC

    Hi,

    thank you for your comment. After your answer I've seen that the context of my question was not precise enough.

    Take the following example:

    die ("Was für ein Müll!");

    This is a German string meaning "What rubbish!". There are umlauts in that string. When I store this source code file Latin-1 encoded there is one byte per German umlaut. The string is interpreted as a byte string. And this byte string gets interpreted correctly as Perl assumes Latin-1 encoding. But when you run in a UTF-8 environment, you see a square instead of an 'ü' when the program dies. If you use ONLY ASCII characters, it doesn't matter and you never become aware of this subtle difference.
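A minimal sketch of why the square appears, assuming the file stores "ü" as the single Latin-1 byte 0xFC (written here as an escape so the example does not depend on how it is saved). The variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "Müll" as it sits in a Latin-1 encoded source file: ü is the byte 0xFC.
my $latin1_bytes = "M\xFCll";

# Without 'use utf8' Perl keeps the bytes as they are.
printf "length: %d\n", length $latin1_bytes;            # length: 4

# A UTF-8 terminal tries to read these bytes as UTF-8, and a lone 0xFC
# is not valid there. decode() with a CHECK argument may modify its
# input, so work on a copy.
my $copy = $latin1_bytes;
my $ok   = eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 };
print "valid UTF-8? ", ($ok ? "yes" : "no"), "\n";      # valid UTF-8? no

# Converting explicitly yields the two-byte UTF-8 form of ü (C3 BC).
my $utf8_bytes = encode('UTF-8', decode('ISO-8859-1', $latin1_bytes));
print "as UTF-8: ", unpack('H*', $utf8_bytes), "\n";    # as UTF-8: 4dc3bc6c6c
```

The invalid byte is what the terminal renders as a square or replacement glyph.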

    So with use utf8; and a correctly encoded source file I would force character semantics on this string, which would give the thrown string subtly different semantics.
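The difference in semantics can be sketched as follows, assuming this file itself is saved as UTF-8. The pragma is lexically scoped, so the same literal can be seen both ways in one file.

```perl
use strict;
use warnings;

my ($chars, $bytes);

{
    use utf8;     # literals in this block are decoded from UTF-8
    $chars = length "Müll";
}

{
    no utf8;      # literals in this block stay as the raw bytes in the file
    $bytes = length "Müll";
}

print "with use utf8:    $chars\n";   # with use utf8:    4
print "without use utf8: $bytes\n";   # without use utf8: 5
```

With the pragma the string holds four characters; without it, it holds the five bytes of the UTF-8 encoding.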

    And I want to know whether there are pitfalls, when someone is using a module with that pragma probably expecting the good old byte string world.

    Best regards
    McA

      And I want to know whether there are pitfalls, when someone is using a module with that pragma probably expecting the good old byte string world.

      What would you expect to happen in that case? (I can't imagine things working correctly except by accident.)

        Hi chromatic,

        does that mean you suggest avoiding use utf8; in public packages?

        Best regards
        McA

      Incidentally, if your original question is not precise enough you can edit it or add more information. It's traditional to highlight the fact that you've put more information in by adding the word Update to show what you've added.

      A Monk aims to give answers to those who have none, and to learn from those who know more.

      There are umlauts in that string. When I store this source code file Latin-1 encoded there is one byte per German umlaut. The string is interpreted as a byte string. And this byte string gets interpreted correctly as Perl assumes Latin-1 encoding.

      I don't think it's entirely accurate to say "Perl assumes Latin-1". It's probably better to say that Perl assumes binary or byte semantics: the bytes from your source code will be output unmodified. If the characters in the source code were in ISO 8859-5 Cyrillic, it would also "work" in the way you describe.
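A small sketch of the byte-semantics point, using the Encode module: the same byte names different characters depending on which encoding the reader assumes, while Perl itself just carries the byte along. The byte 0xE4 is chosen only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# One byte, as it might appear in a source file without 'use utf8'.
my $byte = "\xE4";

# Perl stores it as a single byte whatever it is "meant" to be.
printf "length: %d\n", length $byte;                     # length: 1

# Only an explicit decode attaches a character to it.
my $as_latin1   = decode('ISO-8859-1', $byte);
my $as_cyrillic = decode('ISO-8859-5', $byte);

printf "as Latin-1:    U+%04X\n", ord $as_latin1;        # as Latin-1:    U+00E4
printf "as ISO 8859-5: U+%04X\n", ord $as_cyrillic;      # as ISO 8859-5: U+0444
```

U+00E4 is "ä" and U+0444 is the Cyrillic "ф"; which one shows up depends on the terminal, not on Perl.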

      However, I would recommend that if you have non-ASCII characters in your source code, you should save the source file as UTF-8 and add the use utf8; pragma.
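A hedged sketch of that recommendation, assuming this file is saved as UTF-8. The output layer on STDOUT is an addition not mentioned in the thread; it is one common way to encode character strings on the way out.

```perl
use strict;
use warnings;
use utf8;                               # this source file is UTF-8
use Encode qw(encode);

# Encode character strings as UTF-8 when they leave the program.
binmode(STDOUT, ':encoding(UTF-8)');

my $message = "Was für ein Müll!";

# Inside the program the string is counted in characters.
my $octets = encode('UTF-8', $message);
printf "%d characters, %d bytes when encoded\n",
    length $message, length $octets;    # 17 characters, 19 bytes when encoded

print "$message\n";
```

A message passed to die goes to STDERR, so that handle would need the same treatment.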
