in reply to Re: Text File Encoding under Windows
in thread Text File Encoding under Windows
Thanks, almut -
This solved part of the problem. The file is now read in OK and the expected regex matches occur. However, the output is still causing problems because the non-ASCII characters in it are not represented correctly. I have tried the following two approaches:
1) Printing to the DOS console and redirecting the output from there into a file. The result looks fine, except that the special characters get represented as EF, FC etc.
2) Printing to a UTF-16 encoded file with the following code:

    #! /usr/bin/perl -w
    use strict;
    use locale;
    open INPUT, "<:encoding(UTF-16LE)", $ARGV[0];
    open OUTPUT, ">:encoding(UTF-16LE)", "./Output_UTF-16";
    while ( <INPUT> ) {
        # long list of regex-based replacements
        print OUTPUT $_;
    }

The result was an output file which represented all special characters correctly but contained a line of empty boxes in every second line.
Can you please advise what I need to do to fix both output variants?
Thanks again for your help!
Pat
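For reference, here is a minimal sketch of one layer ordering that is often suggested for UTF-16 files under Windows. It rests on an assumption that the post does not confirm: that the empty boxes come from the platform's default byte-level :crlf layer inserting a single 0x0D byte in front of the two-byte newline, which would shift the 16-bit alignment of the following line. The helper name open_utf16le is hypothetical and used only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: open a file as UTF-16LE with newline translation
# done on characters rather than on raw bytes.
#   :raw                 removes the default (byte-level) :crlf layer
#   :encoding(UTF-16LE)  converts between characters and 16-bit code units
#   :crlf                translates "\n" <-> "\r\n" above the encoding layer
sub open_utf16le {
    my ($mode, $path) = @_;    # $mode is '<' or '>'
    open my $fh, "$mode:raw:encoding(UTF-16LE):crlf", $path
        or die "Cannot open $path: $!";
    return $fh;
}

# Usage: script.pl INPUT OUTPUT (both paths are supplied by the caller)
my ($in_path, $out_path) = @ARGV;
if (defined $in_path and defined $out_path) {
    my $in  = open_utf16le('<', $in_path);
    my $out = open_utf16le('>', $out_path);
    while (my $line = <$in>) {
        # regex-based replacements would go here
        print {$out} $line;
    }
    close $out or die "Cannot close $out_path: $!";
    close $in;
}
```

For the console variant, a similar explicit layer on STDOUT (set with binmode) may help, but which encoding name is appropriate depends on what the console or the program reading the redirected file expects, so that part is left open here.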
Re^3: Text File Encoding under Windows
by almut (Canon) on Mar 18, 2010 at 09:35 UTC
by pat_mc (Pilgrim) on Mar 18, 2010 at 14:49 UTC