UTF-8 can also be guessed, but that is harder, and the result can be mistaken for some legacy encoding.

Not really. The problem is the amount of lookahead needed. With a BOM, one can be sure after reading just a few bytes.
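
To illustrate the lookahead difference, here is a minimal Python sketch. The helper names `sniff_bom` and `looks_like_utf8` are made up for this example; only the `codecs.BOM_*` constants and `bytes.decode` come from the standard library. The BOM check needs at most the first four bytes, while the guess without a BOM has to decode everything it is given and can still be fooled.

```python
import codecs

# UTF-32 marks are tested before UTF-16 because BOM_UTF32_LE begins
# with the same two bytes as BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(prefix: bytes):
    """Return a codec name if prefix starts with a known BOM, else None.

    Hypothetical helper: only the first four bytes are ever inspected.
    """
    for bom, name in _BOMS:
        if prefix.startswith(bom):
            return name
    return None


def looks_like_utf8(data: bytes) -> bool:
    """Guess by validation: True if all of data decodes as UTF-8.

    Hypothetical helper: it has to scan the whole input, and a True
    result does not rule out a legacy encoding.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Note that `looks_like_utf8(b"\xc3\xa9")` is true, yet the same two bytes are also valid Latin-1 text, which is the kind of mix-up mentioned above. Pure ASCII input passes as well and says nothing about which encoding was intended.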

