PerlMonks  

pp module messes encoding

by palkia (Monk)
on May 13, 2011 at 23:07 UTC (#904757=perlquestion)
palkia has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone

I used pp (from PAR::Packer) to convert my Perl code to an .exe.

I wrote a simple script to extract some text from an online page with WWW::Mechanize.
(The page contains Hebrew characters.)
My code works gr8 as .pl, but after it's converted to .exe with pp,
it gets strangely encoded (I think) text that I can neither understand nor convert to its standard form.
Another weird thing is that the Out.txt created by the .exe is about 4 times longer than the .pl's.

(I tried and failed to convert it with some functions from the utf8 pragma.)

Suggestions ?
thx

My code (at least its relevant part):
    use feature 'state';
    use WWW::Mechanize 'new';

    my $mech = WWW::Mechanize->new();
    #$mech->get('http://www.google.co.il');
    $mech->get('http://tv.walla.co.il/?w=/2//200//2011-05-14/1');
    my $content = $mech->content( format => 'text' );
    writeFile('Out.txt',$content);
    print "got\n";
    print $content;
    print "\nand thats it\a\n";
    sleep 10;

    sub writeFile #form: writeFile(path,content)
    {
        my $filepath = shift;
        my $nwntnt = join("",@_);
        $nwntnt = "\x{feff}".$nwntnt; #utf8 char starter
        open(txt, ">:utf8",$filepath);
        print txt $nwntnt;
        close txt;
    }

Re: pp module messes encoding
by GrandFather (Sage) on May 13, 2011 at 23:41 UTC

    What does "(tried & failed to convert with some functions from the utf8 pragma )" mean?

    How about writing a small standalone script that demonstrates the issue? That may mean embedding the problematic text in the script. See "I know what I mean. Why don't you?".

    True laziness is hard work
      It means that I tried to convert the text result with
      utf8::encode, utf8::decode, utf8::upgrade, and utf8::downgrade,
      and none of them converted the text extracted by the .exe version into what the .pl version extracted.
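
For reference, here is a minimal sketch of what those four utf8:: functions do to a string, which may help when comparing the two builds. The sample text is the Hebrew phrase used elsewhere in this thread, written as \x{...} escapes so the script itself stays pure ASCII, and the variable names are illustrative only. Double encoding is shown as one common way text ends up garbled and longer. Nothing in the thread establishes that this is what happens inside the pp-built executable.

```perl
use strict;
use warnings;

# Sample text (assumption: the phrase used elsewhere in this thread),
# 7 Hebrew letters plus one space = 8 characters.
my $chars = "\x{5e2}\x{5d5}\x{5d3} \x{5dc}\x{5d4}\x{5d9}\x{5d8}";

# utf8::encode: characters -> UTF-8 octets, in place.
my $bytes = $chars;
utf8::encode($bytes);          # 7 letters * 2 octets + 1 space = 15 octets

# utf8::decode: UTF-8 octets -> characters, in place (the inverse).
my $round = $bytes;
utf8::decode($round);

# Encoding already-encoded octets again: every octet >= 0x80 doubles.
my $twice = $bytes;
utf8::encode($twice);          # 14 * 2 + 1 = 29 octets

# utf8::upgrade / utf8::downgrade change only Perl's internal storage;
# the string's characters and length stay the same.
my $latin = "caf\x{e9}";
my $up    = $latin;
utf8::upgrade($up);

printf "chars=%d bytes=%d twice=%d roundtrip=%s upgrade_same=%s\n",
    length($chars), length($bytes), length($twice),
    ($round eq $chars ? 'yes' : 'no'),
    ($up eq $latin    ? 'yes' : 'no');
```

If the .exe output only looks right after an extra utf8::decode, that would point towards text that was encoded once too often somewhere along the way.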

      umm.. that was the script.
      I can't demonstrate the problem with a shorter script, because the Hebrew characters can't be shown right without my writeFile function, since Padre (or perl, not sure) doesn't support utf8 chars (like Hebrew chars) by default.
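
One way to demonstrate the problem without needing a console or editor that can display Hebrew is to print the code points as plain ASCII. The sketch below assumes a hypothetical helper, describe_string, which is not part of the original script. Running it on the text from both the .pl and the .exe would show whether they hold characters or raw octets.

```perl
use strict;
use warnings;

# Hypothetical helper: describe a string using ASCII output only, so the
# result is readable in any console, whatever its encoding support.
sub describe_string {
    my ($str) = @_;
    my $flag  = utf8::is_utf8($str) ? 'on' : 'off';
    my $codes = join ' ', map { sprintf 'U+%04X', ord } split //, $str;
    return sprintf 'length=%d utf8_flag=%s codes=%s',
        length($str), $flag, $codes;
}

my $decoded = "\x{5e2}\x{5d5}\x{5d3}";   # three Hebrew characters
my $octets  = $decoded;
utf8::encode($octets);                    # the same text as raw UTF-8 octets

print describe_string($decoded), "\n";    # 3 code points in the Hebrew block
print describe_string($octets),  "\n";    # 6 code points, all below U+0100
```

The UTF-8 flag is only an internal detail, so the list of code points is the part to compare between the two runs.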

        The following works fine for me in a utf8-savvy console (Komodo's output window, actually):

        use strict;
        use warnings;
        use utf8;

        my $str = 'עוד להיט';
        binmode STDOUT, ":encoding(utf8)";
        print $str;

        It's important to note that whatever is rendering your script output must be capable of handling whatever Unicode characters you throw at it. If this script doesn't work right in the context where you are having trouble, then the fault is not with Perl but with that context.
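
To take the rendering context out of the picture entirely, the output can be written to a file and the raw octets inspected afterwards. This is only a sketch under a few assumptions: write_utf8_file is a hypothetical variant of the writeFile sub from the question (lexical filehandle, error checks, no byte order mark), and out_check.txt is an arbitrary file name.

```perl
use strict;
use warnings;

# Hypothetical variant of the question's writeFile sub.
sub write_utf8_file {
    my ($path, $text) = @_;
    open(my $fh, '>:encoding(UTF-8)', $path)
        or die "Cannot open $path for writing: $!";
    print {$fh} $text;
    close($fh) or die "Cannot close $path: $!";
    return;
}

# The Hebrew phrase from this thread, as escapes (8 characters).
my $str  = "\x{5e2}\x{5d5}\x{5d3} \x{5dc}\x{5d4}\x{5d9}\x{5d8}";
my $path = 'out_check.txt';               # arbitrary name for this sketch
write_utf8_file($path, $str);

# Read the file back as raw octets, bypassing any decoding layer.
open(my $in, '<:raw', $path) or die "Cannot open $path for reading: $!";
my $octets = do { local $/; <$in> };
close($in);

# Print the octets as hex so any console can show them.
printf "%d octets: %s\n", length($octets),
    join ' ', map { sprintf '%02X', ord } split //, $octets;
```

Correctly encoded Hebrew letters take two octets each in UTF-8, so comparing this hex dump between the .pl and the .exe run shows whether the file contents differ, independent of what the console can display.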

        Update: Note that PerlMonks has munted the unicode characters :-(. The original script used Unicode characters for the string, not entities. The following also works and uses HTML entities in place of the unicode characters:

        use strict;
        use warnings;
        use utf8;
        use HTML::Entities;

        my $str = '&#1506;&#1493;&#1491; &#1500;&#1492;&#1497;&#1496;';
        binmode STDOUT, ":encoding(utf8)";
        print HTML::Entities::decode ($str);
        True laziness is hard work
        You forgot binmode STDOUT, ":encoding(utf8)";
        umm.. that was the script.

        But you're still not showing how you're invoking pp

Re: pp module messes encoding
by ZJ.Mike.2009 (Scribe) on May 14, 2011 at 10:39 UTC

    "My code works gr8 as .pl but after it's converted to .exe with the pp module, it gets a strange encoded (I think) text, that I can't understand nor convert to it's standard form. Another weird thing is that the .exe created version of out.txt is about 4 times longer than the .pl's."

    I did a quick test. I can confirm that I experienced the same problems (garbled characters and large file) as you described. For a quick solution, maybe you can try PerlApp. I did a test with PerlApp using your demo script. The executable produced by PerlApp works as expected.
      Thx for testing for yourself.
      It sounds promising, but I'm not sure what PerlApp is (module / program ..). I've encountered many results when googling for it.
      Can you please add a link to it?
      thx.
        PerlApp is a utility included in the Perl Dev Kit (PDK); a commercial version or a 21-day free trial can be downloaded at www.activestate.com/perl-dev-kit. I've also tried Cava Packager, whose author provides a fully working, free-of-charge non-commercial licence, and the resulting executable also produces a readable text file. Cava Packager can be downloaded at www.cava.co.uk/download.html.

        For your reference, my testing environment is as follows:
          Windows XP SP3
          ActiveState Perl 5.10.0
          PAR::Packer 1.002
          PerlApp 8.1
          Cava Packager 2.0.48.443

      I did a quick test

      But what did you do?

        The OP says "My code works gr8 as .pl but after it's converted to .exe with the pp module, it gets a strange encoded (I think) text." Notice that the script works when invoked by the perl interpreter. The suspect is the pp utility. I packed the demo script that the OP posted into an executable using the pp utility, as the OP did, and ran the executable to see if the problem is reproducible on my system. Isn't this the first step to solving the problem?
Re: pp module messes encoding
by Anonymous Monk on May 14, 2011 at 14:10 UTC
