
convert a Unicode file to Ansi, using filehandle..

by nico7nibor (Novice)
on Nov 06, 2009 at 06:47 UTC ( #805414=perlquestion )
nico7nibor has asked for the wisdom of the Perl Monks concerning the following question:


I have a registry file encoded in Unicode. I need to search for a value in that file and replace it, but I keep getting the original file contents instead of the result I am aiming for.

    $Profile="C:\\registry.reg"; # unicode file
    $Param="Name";
    open (FILE, "<:encoding(UTF-16)", $Profile);
    @Lines = <FILE>;
    open FILE, ">$Profile" or die $!;
    @Lines = <FILE>;
    close(FILE);
    foreach $Lines (@Lines)
    {
        if ($Lines =~ m/HKEY_LOCAL_MACHINE\\SOFTWARE\\$Param/g)
        {
            @LinesSub = split(/\\/,$Lines);
            $Lines =~ s/$Temp/$Param/g;
        }
        print FILE "$Lines";
    }
    close(FILE);

When the file is converted to Ansi manually, the value of the parameter is changed, but I need to make the script do that. I can't seem to change the original (Unicode) file after updating the value with the search and replace in my script. It keeps giving me garbage strings.

I appreciate your help, thanks in advance...

Replies are listed 'Best First'.
Re: convert a Unicode file to Ansi, using filehandle..
by CountZero (Bishop) on Nov 06, 2009 at 07:29 UTC
    Why do you think your file is converted to "ANSI"? Unless you are using an old Perl (pre 5.6.0), Perl can handle Unicode strings.

    From the docs (perluniintro - Perl Unicode introduction):

    The principle is that Perl tries to keep its data as eight-bit bytes for as long as possible, but as soon as Unicodeness cannot be avoided, the data is transparently upgraded to Unicode.
    To make sure that your output file is encoded the same way as the input file, make sure that you do
    open FILEHANDLE, '>:encoding(UTF-16)', 'my/output/file';
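
A minimal sketch of that advice, assuming the file starts with a byte order mark. The helper name rewrite_utf16 and its callback argument are made up for illustration. Note that Encode's "UTF-16" writes a BOM followed by big-endian data, so the original byte order is not necessarily preserved, and on Windows the CRLF layer needs the extra care described in a later reply.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite a UTF-16 text file in place.
# $edit is a code ref that takes one decoded line and returns the new line.
sub rewrite_utf16 {
    my ($path, $edit) = @_;

    # Decode on input; encoding(UTF-16) uses the BOM to pick the byte order.
    open(my $in, '<:encoding(UTF-16)', $path) or die "Can't read $path: $!";
    my @lines = map { $edit->($_) } <$in>;
    close($in);

    # Encode on output with the same layer, so the file stays UTF-16.
    open(my $out, '>:encoding(UTF-16)', $path) or die "Can't write $path: $!";
    print {$out} @lines;
    close($out) or die "Can't close $path: $!";

    return scalar @lines;
}
```

The callback receives text that is already decoded, so an ordinary substitution such as s/Foo/Bar/ works on it without any knowledge of the file's encoding.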


    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: convert a Unicode file to Ansi, using filehandle..
by markuhs (Scribe) on Nov 06, 2009 at 07:08 UTC
    Hi, welcome to perlmonks!

    First of all: use strict; and use warnings; are great!
    They would tell you that "Global symbol "$Temp" requires explicit package name".
    Second: use autodie; or add another or die $!; to line 4.

    Why do you open the file twice? You should at least close the filehandle before opening it again.
    Closing it before writing to it does not work either!
    Oh, now I understand!!! You use the same filehandle for input and output! That will never work, use two different ones...
        open (FILE, "<:encoding(UTF-8)", $Profile) or die $!;
        my @Lines = <FILE>;
        close(FILE);
        open (FILE2, ">", $Profile) or die $!;
    and at the end...
            print FILE2 "$Lines";
        }
        close(FILE2);
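
To illustrate why the order of operations matters, here is a small sketch on a scratch file (the contents and variable names are made up). Opening with '>' truncates the file immediately, and reading from a handle opened only for output returns nothing, which is why the second @Lines = <FILE> in the question throws away the lines that were read first.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file to experiment on.
my ($tmp_fh, $path) = tempfile();
print {$tmp_fh} "one\ntwo\n";
close($tmp_fh) or die "Can't close $path: $!";

# Read phase: slurp everything, then close before reopening.
open(my $in, '<', $path) or die "Can't read $path: $!";
my @lines = <$in>;
close($in);

# Opening with '>' empties the file right away, so from here on
# the data exists only in @lines.
open(my $out, '>', $path) or die "Can't write $path: $!";
my $size_after_open = (stat $path)[7];

# Edit in memory, then write through the separate output handle.
s/two/2/ for @lines;
print {$out} @lines;
close($out) or die "Can't close $path: $!";
```

Lexical handles ($in, $out) are used here instead of the bareword FILE and FILE2 so that each handle has a clear scope; that is a style choice, not something the reply requires.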
Re: convert a Unicode file to Ansi, using filehandle..
by ikegami (Pope) on Nov 06, 2009 at 16:24 UTC

    use strict; use warnings;! It'll find numerous errors including the fact that you use $Temp without ever assigning a value to it.

    m//g in scalar context is almost always an error. It is here.

    Your pattern will accidentally match keys "Named", "Names", etc.
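
The last two points can be seen in a short sketch. The sample line and the helper is_key are made up for illustration. With /g in scalar context a successful match leaves pos() set on the string, so a second /g match on the same string starts from there and fails; and without something after the key name, "Name" also matches "Named".

```perl
use strict;
use warnings;

my $param = 'Name';    # key name, as in the question
my $line  = '[HKEY_LOCAL_MACHINE\SOFTWARE\Name]';

# Same pattern twice: the first match succeeds and sets pos($line),
# the second resumes after the first match and finds nothing.
my $first  = $line =~ m/HKEY_LOCAL_MACHINE\\SOFTWARE\\$param/g ? 1 : 0;
my $second = $line =~ m/HKEY_LOCAL_MACHINE\\SOFTWARE\\$param/g ? 1 : 0;

# Hypothetical helper: true only for the key itself or one of its subkeys.
# \Q...\E keeps any regex metacharacters in $name literal.
sub is_key {
    my ($text, $name) = @_;
    return $text =~ m/ ^ \[ HKEY_LOCAL_MACHINE \\ SOFTWARE \\ \Q$name\E (?: \\ .* )? \] /x
        ? 1 : 0;
}
```

Dropping the /g flag is enough to fix the first problem when only a yes/no answer is needed.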

        use strict;
        use warnings;

        my $Profile = "C:\\registry.reg"; # unicode file
        my $old_name = 'Foo';
        my $new_name = 'Bar';

        my @reg_file;
        {
            open(my $fh, '<:raw:perlio:encoding(UTF-16):crlf', $Profile)
                or die("Can't open file $Profile: $!\n");
            @reg_file = <$fh>;
        }

        for (@reg_file) {
            s/
                ^
                ( \[ HKEY_LOCAL_MACHINE\\SOFTWARE\\ )
                $old_name
                ( (?:\\.*)? \] )
            /$1$new_name$2/x;
        }

        {
            open(my $fh, '>:raw:perlio:encoding(UTF-16le):crlf', $Profile)
                or die("Can't overwrite file $Profile: $!\n");
            print $fh "\x{FEFF}", @reg_file;
        }

    The weird stuff on the open lines is to handle CRLF properly with UTF-16.

    Decoding with UTF-16 instead of the real encoding (UTF-16le) removes the byte order mark, if I remember correctly. That's the odd character ("\x{FEFF}") I print at the start of the output in the last block.

    Ok, fine, I think the file is actually encoded using UCS-2le, not UTF-16le, but it's a subset of UTF-16le, so all's good.
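
The byte order mark behaviour can be checked directly with the Encode module, without touching a file. This is only a sketch on a two-character sample string; the variable names are made up.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as a little-endian UTF-16 file would store "Hi":
# the BOM (FF FE) followed by the text.
my $bytes = "\xFF\xFE" . encode('UTF-16LE', 'Hi');

# "UTF-16" reads the BOM to pick the byte order and drops it from the result.
my $via_utf16 = decode('UTF-16', $bytes);

# "UTF-16LE" has a fixed byte order, so the BOM survives as the character U+FEFF.
my $via_le = decode('UTF-16LE', $bytes);

# Encoding as UTF-16LE adds no BOM, so it has to be put back explicitly.
my $out = encode('UTF-16LE', "\x{FEFF}" . $via_utf16);
```

Here $out ends up byte-for-byte identical to $bytes, which is the round trip the script above relies on.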

Re: convert a Unicode file to Ansi, using filehandle..
by nico7nibor (Novice) on Nov 06, 2009 at 08:52 UTC

    Thanks for the reply guys.

    I am using Perl 5.8.9. I tried to create a separate script based on your example, making two different variables for input and output and setting the encoding on both, but I am getting these errors:

    utf8 "\xFF" does not map to Unicode at line 7.
    utf8 "\xFE" does not map to Unicode at line 7.

    Bear with me, my apologies... and thanks...

      That could be a BOM (byte order mark) at the beginning of your file. In these cases I've used File::BOM with success. Something along the lines of
          use File::BOM qw( :all );
          # ...
          open_bom(my $fh, $file, q{:utf8})
              or die qq{cant open ->$file<- to read: $!\n};
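
The two messages are consistent with a UTF-8 layer meeting the bytes FF and FE, which form the UTF-16LE byte order mark and are never valid in UTF-8. As a rough sketch using only core Perl, the first bytes of the file can be inspected in raw mode to see which encoding to ask for. The helper name sniff_bom is hypothetical, and UTF-32 marks are deliberately ignored here.

```perl
use strict;
use warnings;

# Hypothetical helper: guess an encoding name from the first bytes of a file.
# Returns undef when no byte order mark is recognised.
sub sniff_bom {
    my ($path) = @_;

    # :raw so that no layer decodes or translates the bytes.
    open(my $fh, '<:raw', $path) or die "Can't read $path: $!";
    my $head = '';
    read($fh, $head, 3);
    close($fh);

    return 'UTF-8'    if $head =~ /^\xEF\xBB\xBF/;
    return 'UTF-16LE' if $head =~ /^\xFF\xFE/;
    return 'UTF-16BE' if $head =~ /^\xFE\xFF/;
    return;
}
```

If this reports UTF-16LE for the registry file, opening it with a UTF-8 layer will produce exactly the kind of "does not map to Unicode" complaints shown above.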
Re: convert a Unicode file to Ansi, using filehandle..
by nico7nibor (Novice) on Nov 11, 2009 at 02:59 UTC

    Hi guys,

    It works now though there is a small issue. Thank you for helping me.

    Gb...Thanks again...
