
Re^5: Weird encoding after grabing filenames

by ikegami (Pope)
on Jun 17, 2009 at 14:13 UTC (#772406)


in reply to Re^4: Weird encoding after grabing filenames
in thread Weird encoding after grabing filenames

Because some characters are encoded the same in both ISO-8859-7 and UTF-8. So even though your program was buggy, you still happened to get some correct output.


Re^6: Weird encoding after grabing filenames
by Nik on Jun 17, 2009 at 17:11 UTC
    But the data that actually got printed correctly weren't Greek characters. They were data composed of English letters, like "/home/nikos/public_html/data/text".

      Despite the "but", that doesn't contradict what I said. Those are characters that are encoded the same in both ISO-8859-7 and UTF-8.

      Specifically, U+0020..U+007E are encoded the same in ASCII, ISO-8859-* and UTF-8.
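
      To see this for yourself, here is a minimal sketch using the core Encode module. The helper name hex_bytes and the three Greek letters are just illustrative choices of mine, not something from the original program; the path is the one quoted above.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: the bytes of $text in $encoding, as uppercase hex.
sub hex_bytes {
    my ($encoding, $text) = @_;
    return uc(unpack('H*', encode($encoding, $text)));
}

# ASCII-only string (the path quoted in the thread).
my $path = '/home/nikos/public_html/data/text';

# Illustrative Greek sample: small letters alpha, beta, gamma.
my $greek = "\x{03B1}\x{03B2}\x{03B3}";

for my $sample ([ 'path', $path ], [ 'greek', $greek ]) {
    my ($label, $text) = @$sample;
    my $iso  = hex_bytes('iso-8859-7', $text);
    my $utf8 = hex_bytes('UTF-8',      $text);
    print "$label: ",
        ($iso eq $utf8 ? "same bytes in both encodings"
                       : "ISO-8859-7 = $iso, UTF-8 = $utf8"),
        "\n";
}
```

      The path should come out byte-for-byte identical under both encodings, while the Greek letters are one byte each in ISO-8859-7 and two bytes each in UTF-8. That is why only the Greek parts of the output looked wrong.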
