By the way, I found that \x{FEFF} is not the same as \x{ef}\x{bb}\x{bf}.
Yeah, "\x{ef}\x{bb}\x{bf}" is the UTF-8 encoding of the BOM / U+FEFF / "\x{FEFF}".
$ perl -MEncode -e'print encode("UTF-8", chr(0xFEFF))' | od -t x1
0000000 ef bb bf
0000003
$ perl -MEncode -e'print encode("UTF-16be", chr(0xFEFF))' | od -t x1
0000000 fe ff
0000002
$ perl -MEncode -e'print encode("UTF-16le", chr(0xFEFF))' | od -t x1
0000000 ff fe
0000002
| Hex | Meaning |
|---|---|
| FEFF | BOM |
| 2B,2F,76,38,2D | BOM encoded using UTF-7 |
| EF,BB,BF | BOM encoded using UTF-8 |
| FE,FF | BOM encoded using UTF-16be |
| FF,FE | BOM encoded using UTF-16le |
| 00,00,FE,FF | BOM encoded using UTF-32be |
| FF,FE,00,00 | BOM encoded using UTF-32le |
So you won't find the bytes FE,FF in a UTF-8 file, but you can find an encoded FEFF there (as EF,BB,BF), just as you can find one in a UTF-16be file (as FE,FF).