PerlMonks  

help needed in identifying file encoding

by uva (Sexton)
on Mar 20, 2006 at 17:22 UTC

uva has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,
Please also refer to my earlier node, "help needed in file encoding".
In the code below, I tried to compare the BOM in order to identify whether the file is encoded in UTF-8 or not.
But I have run into a problem:
the program prints "its utf8" for every file, even for files that are not UTF-8 encoded.
use bytes;
open IN,"<d:\\input.txt" or print "could not open the input file";
read IN,my $text,6,0;
print "its utf8" if (chr(0xFEFF)==$text) ;
close IN;

GrandFather fixed links and bOrked br tags.

Re: help needed in identifying file encoding
by Aristotle (Chancellor) on Mar 20, 2006 at 17:32 UTC

    You are using the numeric equality operator == to compare strings. Most strings numify to 0, so you end up testing 0 == 0, which is always true. Use the string equality operator eq instead.

    That won’t work either, though, since under use bytes your chr 0xFEFF will simply produce chr 0xFF – not what you’re looking for.
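Both points in this reply can be illustrated with a minimal sketch. It assumes the goal is to detect the UTF-8 byte order mark, which is stored on disk as the three bytes EF BB BF, and it reads the file in raw mode rather than relying on use bytes. The helper names starts_with_utf8_bom and file_has_utf8_bom are hypothetical and do not come from the thread.

```perl
use strict;
use warnings;

# First point: == numifies both operands, and non-numeric strings
# numify to 0, so two unrelated strings compare as equal.
{
    no warnings 'numeric';
    print "== says equal\n" if "abc" == "xyz";
}
print "ne says different\n" if "abc" ne "xyz";

# Hypothetical helper: true if the raw bytes in $bytes begin with the
# UTF-8 encoded BOM (EF BB BF). Note the string operator eq.
sub starts_with_utf8_bom {
    my ($bytes) = @_;
    return substr($bytes, 0, 3) eq "\xEF\xBB\xBF";
}

# Hypothetical helper: reads the first three bytes of a file without any
# decoding layer and checks them against the BOM.
sub file_has_utf8_bom {
    my ($path) = @_;
    open my $in, '<:raw', $path or die "could not open $path: $!";
    my $read = read $in, my $head, 3;
    close $in;
    die "could not read $path: $!" unless defined $read;
    return starts_with_utf8_bom($head);
}
```

With these helpers, the original check could be written as print "its utf8" if file_has_utf8_bom('d:\input.txt'). Keep in mind that many UTF-8 files carry no BOM at all, so a false result only means that no BOM was found.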

    Makeshifts last the longest.

Re: help needed in identifying file encoding
by idsfa (Vicar) on Mar 20, 2006 at 19:51 UTC

    You can use File::BOM to determine the encoding of a file:

    use File::BOM qw(get_encoding_from_filehandle);
    open my $fh, '<', 'd:\input.txt' or die "Could not open file: $!";
    my $encoding = get_encoding_from_filehandle($fh);
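To show what detecting an encoding from a BOM involves, here is a rough core-only sketch. It is not File::BOM's implementation; the helper name encoding_from_bom_bytes is hypothetical, and the table covers only the common UTF-8, UTF-16 and UTF-32 marks.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::BOM: maps the leading bytes of a
# file to an encoding name, or returns '' when no known BOM is present.
sub encoding_from_bom_bytes {
    my ($bytes) = @_;
    # Longer BOMs come first: the UTF-32LE BOM begins with the UTF-16LE BOM.
    my @boms = (
        [ 'UTF-32BE' => "\x00\x00\xFE\xFF" ],
        [ 'UTF-32LE' => "\xFF\xFE\x00\x00" ],
        [ 'UTF-8'    => "\xEF\xBB\xBF"     ],
        [ 'UTF-16BE' => "\xFE\xFF"         ],
        [ 'UTF-16LE' => "\xFF\xFE"         ],
    );
    for my $entry (@boms) {
        my ($name, $bom) = @$entry;
        # index() returns 0 only when $bytes starts with $bom.
        return $name if index($bytes, $bom) == 0;
    }
    return '';
}
```

The function expects the first four bytes of the file, read through a raw filehandle. An empty result means only that no BOM was found, not that the file is in any particular encoding.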

    The intelligent reader will judge for himself. Without examining the facts fully and fairly, there is no way of knowing whether vox populi is really vox dei, or merely vox asinorum. — Cyrus H. Gordon
