http://www.perlmonks.org?node_id=1051286

roho has asked for the wisdom of the Perl Monks concerning the following question:

In the following code, pack is mangling UTF-8 output (the en dash, U+2013) compared to simply printing it.

#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use open ':encoding(utf8)';

my $ofile = 'utf_issue.txt';
open(my $fh,'>', $ofile) or die "Error opening $ofile: $!\n";
binmode($fh, ":utf8");

my $x = 'FREIGHT – INTRASTATE';
print $fh $x, "\n";
print $fh pack('A20',$x), "\n";

Is there a work-around for this? This is messing up my output big time!
Thanks.

UPDATE

Here is my work-around, since I am only using pack to pad with blanks:

###################################################################
# This subroutine pads 'txt' on the right to length 'len'.
###################################################################
sub mypack {
    my $len = shift;
    my $txt = shift;
    $len =~ s/^A//i;   # Remove 'A' that system "pack" function uses.
    return sprintf("%s%s",$txt, ' 'x(($len-length($txt))));
}
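The work-around above pads but never truncates, whereas pack with an 'A20' template does both. A minimal sketch of a character-based variant follows, assuming sprintf counts characters rather than bytes for a decoded string (true on modern perls; not verified here on 5.8.9). The name pad_right is hypothetical and not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: pad with blanks or truncate $txt to exactly $len
# characters, similar to what pack("A$len", $txt) yields on perl 5.10+.
sub pad_right {
    my ($len, $txt) = @_;
    # %-*.*s = left-justify, minimum width $len, maximum width $len.
    return sprintf('%-*.*s', $len, $len, $txt);
}

# \x{2013} is the en dash from the original example (20 characters total).
my $x      = "FREIGHT \x{2013} INTRASTATE";
my $padded = pad_right(25, $x);   # original text plus five trailing blanks
my $cut    = pad_right(10, $x);   # first ten characters only
```

Because the width and precision are counted in characters, the en dash is never split in the middle of its three-byte UTF-8 encoding.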

"It's not how hard you work, it's how much you get done."

Re: pack mangles utf output
by daxim (Curate) on Aug 28, 2013 at 14:47 UTC
    pack works for me.
    $ use Devel::Peek
    
    $ Dump pack 'A20', "FREIGHT – INTRASTATE"
    SV = PV(0x4a5bce0) at 0x4a29660
      REFCNT = 1
      FLAGS = (PADTMP,POK,pPOK,UTF8)
      PV = 0x4644800 "FREIGHT \342\200\223 INTRASTATE"\0 [UTF8 "FREIGHT \x{2013} INTRASTATE"]
      CUR = 22
      LEN = 40
    []
    $ open my $fh, '>:encoding(UTF-8)', '/tmp/foobar'
    1
    $ $fh->print(pack 'A20', "FREIGHT – INTRASTATE")
    1
    
    /tmp$ hex /tmp/foobar
    0000  46 52 45 49 47 48 54 20  e2 80 93 20 49 4e 54 52  FREIGHT – INTR
    0010  41 53 54 41 54 45                                 ASTATE
    
      Hmmm ... I converted the UTF-8 character to hex in both output lines and here's what I got:
      without pack: e28093
      with pack: c3a2c280c293

      Could be my version of Perl (5.8.9) or my platform (Windows 7), but it is not working for me.
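The second hex string looks like the first one encoded as UTF-8 a second time: each of the bytes e2, 80, 93 is treated as a Latin-1 character and encoded again. A minimal sketch reproducing both values with the core Encode module; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $dash  = "\x{2013}";                 # EN DASH as a single character
my $once  = encode('UTF-8', $dash);     # three bytes: e2 80 93

# Encoding the byte string again treats each byte as one character
# (U+00E2, U+0080, U+0093), which is what an output encoding layer
# does to a string that carries no UTF8 flag.
my $twice = encode('UTF-8', $once);     # six bytes: c3 a2 c2 80 c2 93

my $hex_once  = unpack('H*', $once);
my $hex_twice = unpack('H*', $twice);
print "without pack: $hex_once\n";      # e28093
print "with pack:    $hex_twice\n";     # c3a2c280c293
```

This matches a pack that returns raw bytes without the UTF8 flag, which the output layer then encodes once more.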


        Yep, perl-5.8.9 under Linux is affected too.
        $ perlbrew exec --with perl-5.8.9 perl -MDevel::Peek -e 'use utf8; Dump pack "A20", "FREIGHT – INTRASTATE"'
        perl-5.8.9
        ==========
        SV = PV(0x1d14100) at 0x1d11350
          REFCNT = 1
          FLAGS = (PADTMP,POK,pPOK)
          PV = 0x1d69870 "FREIGHT \342\200\223 INTRASTA"\0
          CUR = 20
          LEN = 32
        
        
        $ perlbrew exec --with perl-5.10.0 perl -MDevel::Peek -e 'use utf8; Dump pack "A20", "FREIGHT – INTRASTATE"'
        perl-5.10.0
        ==========
        SV = PV(0x1773098) at 0x178e288
          REFCNT = 1
          FLAGS = (PADTMP,POK,pPOK,UTF8)
          PV = 0x1798ca0 "FREIGHT \342\200\223 INTRASTATE"\0 [UTF8 "FREIGHT \x{2013} INTRASTATE"]
          CUR = 22
          LEN = 32