PerlMonks  

pack mangles utf output

by roho (Monsignor)
on Aug 28, 2013 at 13:58 UTC (#1051286)
roho has asked for the wisdom of the Perl Monks concerning the following question:

In the following code, pack mangles the UTF-8 output (the en dash, U+2013) compared to simply printing the string.

#!/usr/bin/perl
use strict;
use warnings;
use utf8;                     # the source contains a literal en dash (U+2013)
use open ':encoding(utf8)';

my $ofile = 'utf_issue.txt';
open(my $fh, '>', $ofile) or die "Error opening $ofile: $!\n";
binmode($fh, ":utf8");

my $x = 'FREIGHT – INTRASTATE';
print $fh $x, "\n";                 # dash is written correctly
print $fh pack('A20', $x), "\n";    # dash comes out mangled

Is there a work-around for this? This is messing up my output big time!
Thanks.

UPDATE

Here is my work-around, since I am only using pack to pad with blanks:

###################################################################
# This subroutine pads 'txt' on the right to length 'len'.
###################################################################
sub mypack {
    my $len = shift;
    my $txt = shift;
    $len =~ s/^A//i;   # Remove 'A' that system "pack" function uses.
    return sprintf("%s%s", $txt, ' ' x ($len - length($txt)));
}
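
Since only blank padding is needed, sprintf can probably do the whole job in one call. This is a minimal sketch, assuming the text is a decoded character string (as it is under use utf8); pad_right is a made-up name and not part of the code above. Unlike mypack, it also truncates strings longer than the requested length, which is what pack's A format does. It has not been checked on perl 5.8.9.

```perl
use strict;
use warnings;

# Hypothetical helper: pad $txt with blanks on the right to $len characters,
# truncating it first if it is longer than $len.
sub pad_right {
    my ($len, $txt) = @_;
    # %-*.*s : left-justify, minimum width $len, maximum width $len
    return sprintf('%-*.*s', $len, $len, $txt);
}

my $x = "FREIGHT \x{2013} INTRASTATE";   # 20 characters, 22 bytes as UTF-8
my $padded = pad_right(25, $x);
print length($padded), "\n";             # 25
```

Called as pad_right(20, $x), it takes a plain number, so the s/^A//i step in mypack is not needed.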

"Its not how hard you work, its how much you get done."

Re: pack mangles utf output
by daxim (Chaplain) on Aug 28, 2013 at 14:47 UTC
    pack works for me.
    $ use Devel::Peek
    
    $ Dump pack 'A20', "FREIGHT – INTRASTATE"
    SV = PV(0x4a5bce0) at 0x4a29660
      REFCNT = 1
      FLAGS = (PADTMP,POK,pPOK,UTF8)
      PV = 0x4644800 "FREIGHT \342\200\223 INTRASTATE"\0 [UTF8 "FREIGHT \x{2013} INTRASTATE"]
      CUR = 22
      LEN = 40
    []
    $ open my $fh, '>:encoding(UTF-8)', '/tmp/foobar'
    1
    $ $fh->print(pack 'A20', "FREIGHT – INTRASTATE")
    1
    
    /tmp$ hex /tmp/foobar
    0000  46 52 45 49 47 48 54 20  e2 80 93 20 49 4e 54 52  FREIGHT – INTR
    0010  41 53 54 41 54 45                                 ASTATE
    
      Hmmm ... I converted the utf character to hex in both output lines and here's what I got:
      without pack: e28093
      with pack: c3a2c280c293
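
Those two hex strings fit a double-encoding pattern: if the three UTF-8 bytes e2 80 93 are treated as three separate Latin-1 characters and encoded to UTF-8 a second time, the result is c3a2 c280 c293. Below is a small sketch that reproduces the numbers with the core Encode module (hex_of is a name made up for this example). It only demonstrates the pattern; it does not show where pack loses track of the encoding.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: hex dump of a byte string.
sub hex_of { return unpack('H*', $_[0]) }

my $dash  = "\x{2013}";                # EN DASH as a single character
my $once  = encode('UTF-8', $dash);    # the correct UTF-8 bytes
my $twice = encode('UTF-8', $once);    # each byte re-encoded as if it were Latin-1

print hex_of($once),  "\n";            # e28093
print hex_of($twice), "\n";            # c3a2c280c293
```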

      Could be my version of Perl (5.8.9) or my platform (Windows 7), but it is not working for me.

      "Its not how hard you work, its how much you get done."

        Yep, perl-5.8.9 under Linux is affected as well; perl-5.10.0 is not.
        $ perlbrew exec --with perl-5.8.9 perl -MDevel::Peek -e 'use utf8; Dump pack "A20", "FREIGHT – INTRASTATE"'
        perl-5.8.9
        ==========
        SV = PV(0x1d14100) at 0x1d11350
          REFCNT = 1
          FLAGS = (PADTMP,POK,pPOK)
          PV = 0x1d69870 "FREIGHT \342\200\223 INTRASTA"\0
          CUR = 20
          LEN = 32
        
        
        $ perlbrew exec --with perl-5.10.0 perl -MDevel::Peek -e 'use utf8; Dump pack "A20", "FREIGHT – INTRASTATE"'
        perl-5.10.0
        ==========
        SV = PV(0x1773098) at 0x178e288
          REFCNT = 1
          FLAGS = (PADTMP,POK,pPOK,UTF8)
          PV = 0x1798ca0 "FREIGHT \342\200\223 INTRASTATE"\0 [UTF8 "FREIGHT \x{2013} INTRASTATE"]
          CUR = 22
          LEN = 32
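
The two dumps differ in the UTF8 flag and in CUR (20 bytes versus 22), which suggests that on 5.8.9 pack counts bytes and returns a byte string, while 5.10.0 keeps a character string. Here is a quick way to check a given perl without Devel::Peek. It is only a sketch, and pack_keeps_utf8 is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical check: does pack 'A' hand back a character string
# (UTF8 flag on) when given one?
sub pack_keeps_utf8 {
    my ($str) = @_;
    return utf8::is_utf8(pack('A20', $str)) ? 1 : 0;
}

my $x = "FREIGHT \x{2013} INTRASTATE";   # 20 characters
printf "perl %vd: UTF8 flag %s\n", $^V,
    pack_keeps_utf8($x) ? 'kept' : 'lost';
```

Going by the dumps above, this should report "kept" on 5.10.0 and "lost" on 5.8.9.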
        
