PerlMonks  

How can I create an UTF8 encoded txt file contains strings like "aaaa"?

by sinbao (Initiate)
on Sep 04, 2007 at 10:01 UTC ( #636883=perlquestion )

sinbao has asked for the wisdom of the Perl Monks concerning the following question:

I tried to do that with the following code, but it did not work:
__BEGIN__
open MyFile, ">:encoding(utf-8 )", "a.txt";
print MyFile "aaaa";
close MyFile
__END__

The real goal is to write a Visual Studio solution file (.sln). Visual Studio does not recognize the file unless it is encoded in UTF-8, even though all the characters in it are ASCII.

Replies are listed 'Best First'.
Re: How can I create an UTF8 encoded txt file contains strings like "aaaa"?
by zby (Vicar) on Sep 04, 2007 at 10:15 UTC
    If the file will contain only ASCII characters, then you don't need to worry about the encoding - just print the characters into the file. Every ASCII character has exactly the same byte representation in UTF8 as in ASCII. In other words, UTF8 is an extension of ASCII and differs only for characters that don't belong to ASCII.
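    The point above can be checked directly. This is a minimal sketch, assuming the core Encode module is available; the variable names are only illustrative. It encodes the same ASCII-only string as UTF-8 and as ASCII and compares the resulting bytes.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A string made only of ASCII characters.
my $text = "aaaa";

# Encode it both ways and compare the resulting byte strings.
my $as_utf8  = encode('UTF-8', $text);
my $as_ascii = encode('ascii', $text);

print $as_utf8 eq $as_ascii ? "identical\n" : "different\n";

# Show the bytes in hex: each "a" is the single byte 61.
print join(' ', map { sprintf '%02x', ord } split //, $as_utf8), "\n";
```

    For this input the two byte strings should be the same, so a plain print to the file already produces valid UTF-8.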
Re: How can I create an UTF8 encoded txt file contains strings like "aaaa"?
by graff (Chancellor) on Sep 05, 2007 at 03:57 UTC
    If "visual studio" (presumably a Micro$oft product) happens to be "formally dependent" on using a "Micro$oft-Sanctioned" notion of utf8-encoded input file format, then you may need to ensure that the file begins with a "byte-order-mark" (BOM) character (U+FEFF) -- see whether this helps:
    open( OUT, ">:utf8", "a.txt" ) or die "a.txt: $!";
    print OUT "\x{feff}aaaa\n";
    close OUT;
    For some reason, M$ apps seem to have adopted the use of a file-initial BOM character to signal that a "plain-text" data file contains utf8-encoded unicode characters. If a file contains utf8 wide characters without the initial BOM, apps like wordpad, etc, will misinterpret the wide characters as something else. And maybe "visual studio" is insisting that a file be "marked as containing utf8" even when it doesn't need to include wide characters...

    (Of course, the BOM was originally intended to be of use only in UTF16-encoded unicode data files, to indicate the "endian-ness" (byte-order) of the 16-bit data, and it shouldn't really be needed at all in a utf8-encoded file, because utf8 is not affected by big-endian vs. little-endian byte-order. But a number of applications -- particularly M$ apps that are able to handle plain-text files along with their rogues-gallery of "application-specific file formats" -- have inexplicably come to depend on a utf8-encoded BOM at the start of the file, acting like a sort of "magic number" to let them know that they are looking at a utf8-encoded file.)
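    To see what the BOM looks like on disk, here is a small sketch under stated assumptions: the file name bom_demo.txt is hypothetical, and the script simply writes U+FEFF through a UTF-8 layer and then reads the first three bytes back in raw mode. Whether Visual Studio actually requires these bytes in a .sln file is not confirmed anywhere in this thread.

```perl
use strict;
use warnings;

my $path = "bom_demo.txt";   # hypothetical file name for this sketch

# Write the text through a UTF-8 layer, with U+FEFF as the first character.
open(my $out, ">:encoding(UTF-8)", $path) or die "$path: $!";
print $out "\x{FEFF}aaaa\n";
close($out) or die "$path: $!";

# Read the file back as raw bytes and look at the first three.
open(my $in, "<:raw", $path) or die "$path: $!";
read($in, my $head, 3);
close($in);

# U+FEFF encoded as UTF-8 is the byte sequence EF BB BF.
my $hex = join ' ', map { sprintf '%02X', ord } split //, $head;
print "first bytes: $hex\n";
```

    Comparing this output with the first bytes of a .sln file that Visual Studio itself has saved would be one way to check whether the BOM is what the application is looking for.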

Re: How can I create an UTF8 encoded txt file contains strings like "aaaa"?
by rdfield (Priest) on Sep 04, 2007 at 12:41 UTC
    Your code appears to work as it is. Can you be more specific as to the problem you're having?

    rdfield

Re: How can I create an UTF8 encoded txt file contains strings like "aaaa"?
by sago (Scribe) on Sep 04, 2007 at 12:32 UTC

    open(IN, "<:utf8", "D:/Test/$file") or die "D:/Test/$file: $!"; # $file is the filename
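    The open call above reads a UTF-8 file rather than writing one. If such a file begins with a BOM, the decoding layer hands it over as an ordinary U+FEFF character at the start of the first line. A minimal sketch of reading the file and dropping that character, assuming a hypothetical file name and creating the file first so the example is self-contained:

```perl
use strict;
use warnings;

my $path = "bom_read_demo.txt";   # hypothetical file name for this sketch

# Create a small BOM-prefixed file to read back.
open(my $out, ">:encoding(UTF-8)", $path) or die "$path: $!";
print $out "\x{FEFF}aaaa\n";
close($out) or die "$path: $!";

# Read the first line through a decoding layer.
open(my $in, "<:encoding(UTF-8)", $path) or die "$path: $!";
my $first = <$in>;
close($in);

# The BOM arrives as a regular character; remove it if present.
$first =~ s/\A\x{FEFF}//;
chomp $first;
print "$first\n";
```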

Node Type: perlquestion [id://636883]
Approved by GrandFather