http://www.perlmonks.org?node_id=989377

anakin30 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I need your help. I want to write my output file in Unicode format. How can I do that? Thank you in advance.

Re: how to output file in Unicode
by moritz (Cardinal) on Aug 23, 2012 at 20:51 UTC

    Unicode is not a file format.

    Unicode is a rather abstract concept: a collection of characters, and a mapping from those characters to numbers.

    But to write a file, one step is missing: a character encoding, which transforms those numbers into bytes. Popular ones are UTF-8, UTF-16LE, UTF-16BE and UTF-32.
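To make the "numbers into bytes" step concrete, here is a minimal sketch using the core Encode module. The euro sign and the helper name hex_bytes are only chosen for illustration; they are not from the thread.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One character: U+20AC EURO SIGN (picked only as an example).
my $char = "\x{20AC}";

# The Unicode "number" (code point) for that character.
my $code_point = sprintf "U+%04X", ord($char);

# Hypothetical helper: the bytes an encoding produces, as space-separated hex.
sub hex_bytes {
    my ($encoding, $string) = @_;
    my $bytes = encode($encoding, $string);
    return join ' ', map { sprintf "%02X", ord($_) } split //, $bytes;
}

# Same character, same code point, different bytes per encoding.
my %bytes = map { $_ => hex_bytes($_, $char) } qw(UTF-8 UTF-16LE UTF-16BE UTF-32BE);

print "$code_point\n";
print "$_: $bytes{$_}\n" for sort keys %bytes;
```

The code point stays the same in every case; only the byte sequence changes, which is why "Unicode" alone does not say what ends up in the file.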

    Some programs on Windows write "Unicode" when they mean one of the UTF-16 encodings, but that is still wrong.

    So, what encoding do you want?

    Also please read Encodings and Unicode.

      My apologies, I missed a step here. I want to write the output file as Unicode encoded in UTF-8. Thanks for your reply.

        Hello anakin30,

        How do you get the string you are going to print?
        From a file? From a database? Or is it the result of some module?

        And do you know the encoding of the string?
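Why the source encoding matters can be sketched like this, assuming purely for illustration that the incoming data is ISO-8859-1 (Latin-1) bytes. If your data arrives in a different encoding, that name would need to change.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption for this sketch: the input is Latin-1 bytes.
# "cafe" with an e-acute is 4 bytes in Latin-1.
my $latin1_bytes = "caf\xE9";

# Decode: bytes -> Perl character string, using the encoding the data is in.
my $text = decode('ISO-8859-1', $latin1_bytes);

# Encode: characters -> UTF-8 bytes, ready to be written out.
my $utf8_bytes = encode('UTF-8', $text);

printf "characters: %d, UTF-8 bytes: %d\n", length($text), length($utf8_bytes);
```

If the string is decoded with the wrong encoding name, the UTF-8 output will be well-formed but contain the wrong characters, so it is worth finding out where the data comes from first.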

Re: how to output file in Unicode
by philiprbrenan (Monk) on Aug 23, 2012 at 21:20 UTC
    use feature 'say';

    # Slurp a UTF-8 file into a character string.
    sub readUnicode($) {
        my ($f) = @_;
        open(my $F, "<:encoding(UTF-8)", $f) or die "Cannot open $f for unicode input";
        local $/ = undef;
        <$F>;
    }

    # Write $s to file $f as UTF-8.
    sub writeUnicode($$) {
        my ($f, $s) = @_;
        if ($f =~ /\A(.+[\\\/])/) {
            my $d = $1;
            makePath($d);    # poster's own helper, not shown
        }
        open(my $F, ">:encoding(UTF-8)", $f) or die "Cannot open $f";
        say {$F} $s;
    }

      I would appreciate it if you could explain the lines below and what they do.

      {my ($f, $s) = @_; Can you explain what this line does?

      {my $d = $1; Can you explain what this line does?

      say {$F} $s; Can you explain what this line does?

        sub writeUnicode($$) {my ($f, $s) = @_;

        Assigns the first parameter passed to the subroutine (from @_) to $f and the second parameter to $s. One could also write my $f = $_[0]; my $s = $_[1];
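The two ways of unpacking @_ can be compared side by side in a small sketch. The subroutine names args_by_list and args_by_index are made up for this illustration.

```perl
use strict;
use warnings;

# List assignment: copy both arguments out of @_ in one statement.
sub args_by_list {
    my ($f, $s) = @_;
    return "$f|$s";
}

# Element by element: the same result using indexes into @_.
sub args_by_index {
    my $f = $_[0];
    my $s = $_[1];
    return "$f|$s";
}

print args_by_list('out.txt', 'hello'), "\n";
print args_by_index('out.txt', 'hello'), "\n";
```

Both calls print the same line; the list form is simply shorter when there are several parameters.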

        if ($f =~ /\A(.+[\\\/])/) {my $d = $1; makePath($d); }

        Extracts the path component of the file name and places it in the variable $d so that we can make a directory for the output file. If you know that the output directory exists, there is no need for this code. The assignment to $d is somewhat verbose; one could call makePath($1) to use the text captured by the regular expression in the if statement directly. The regular expression captures everything up to and including the last \ or / in the file name and uses that as the path component.
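Since makePath itself is not shown in the thread, here is a minimal sketch of the same idea that swaps in make_path from the core File::Path module. The helper name dir_part and the file name are hypothetical, and the demo writes only inside a temporary directory.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical helper: everything up to and including the last \ or /
# in a file name, or undef when the name has no directory part.
sub dir_part {
    my ($file) = @_;
    return $file =~ /\A(.+[\\\/])/ ? $1 : undef;
}

my $base = tempdir(CLEANUP => 1);           # scratch area, removed at exit
my $file = "$base/reports/2012/out.txt";    # hypothetical output file

my $dir = dir_part($file);
make_path($dir) if defined $dir;            # also creates missing parents

print "created $dir\n" if -d $dir;
```

Because .+ is greedy, the capture runs to the last separator, so nested directories such as reports/2012/ are picked up in one go.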

        say {$F} $s;

        Writes the contents of $s, followed by a newline, to the file whose handle is in $F. The file is closed automatically when $F goes out of scope at the end of the block containing my $F. This statement is an alternative to $F->say($s);
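The two spellings and the automatic close can be tried out with a small sketch like the following. The scratch file name is made up, and IO::Handle is loaded explicitly so that the method form also works on older perls.

```perl
use strict;
use warnings;
use feature 'say';
use IO::Handle;                     # provides the $F->say method
use File::Temp qw(tempdir);

my $file = tempdir(CLEANUP => 1) . "/demo.txt";   # hypothetical scratch file

{
    open(my $F, ">:encoding(UTF-8)", $file) or die "Cannot open $file: $!";
    say {$F} "first line";          # block form: handle in braces, then the data
    $F->say("second line");         # method form: same effect
}   # $F goes out of scope here, so perl closes the file

# Read the file back to confirm both lines were flushed and written.
open(my $IN, "<:encoding(UTF-8)", $file) or die "Cannot open $file: $!";
my @lines = <$IN>;
close $IN;

print @lines;
```

Relying on scope to close the handle is convenient, though an explicit close with an error check is arguably safer when a failed write must not go unnoticed.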

Re: how to output file in Unicode
by zentara (Archbishop) on Aug 24, 2012 at 09:33 UTC