
Re: Remove BOM ?

by davido (Archbishop)
on Oct 01, 2012 at 20:49 UTC (#996744)

in reply to Remove BOM ?

This little snippet comes from Mojo::JSON:

# Remove BOM
$bytes =~ s/^(?:\357\273\277|\377\376\0\0|\0\0\376\377|\376\377|\377\376)//g;


Re^2: Remove BOM ?
by GrandFather (Sage) on Oct 02, 2012 at 20:04 UTC

    Except that /g can't be right. A BOM can only appear as the first few bytes of a data stream. If there is a further BOM, then most likely you've got a binary file rather than a text file.
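A minimal sketch of what /g does in practice here, assuming the pattern from the snippet above and a made-up input that starts with two UTF-8 BOMs. Because the pattern is anchored with ^ and there is no /m, it can only match at the start of the string, so /g should not find a second match.

```perl
use strict;
use warnings;

# Same alternation as the quoted snippet, anchored at the start of the string.
my $bom = qr/^(?:\357\273\277|\377\376\0\0|\0\0\376\377|\376\377|\377\376)/;

# Illustrative input (an assumption): two UTF-8 BOMs back to back, then text.
my $bytes = "\357\273\277\357\273\277hello";

# Without /m, ^ matches only at offset 0, so /g has nothing further to match.
(my $with_g    = $bytes) =~ s/$bom//g;
(my $without_g = $bytes) =~ s/$bom//;

# Both forms strip exactly one leading BOM; the second BOM is left in place.
print $with_g eq $without_g ? "same result\n" : "different result\n";
```

So the /g looks redundant rather than harmful: it neither removes a second BOM nor touches anything later in the data.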

    It's not clear to me what the nulls are doing in there.

    True laziness is hard work
      "It's not clear to me what the nulls are doing in there."

      They are part of the BOMs for the UTF-32 encodings: \377\376\0\0 is UTF-32LE and \0\0\376\377 is UTF-32BE.
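To make the role of the nulls concrete, here is a rough sketch that names the encoding implied by a leading BOM. The helper `bom_encoding` and the table layout are hypothetical, not taken from Mojo::JSON. The order matters: the UTF-16LE BOM (FF FE) is a prefix of the UTF-32LE BOM (FF FE 00 00), so the longer sequences are tested first, just as in the alternation above.

```perl
use strict;
use warnings;

# BOM byte sequences, four-byte ones first so that UTF-32LE is checked
# before UTF-16LE, whose BOM is a prefix of it.
my @boms = (
    [ 'UTF-32LE' => "\xFF\xFE\x00\x00" ],
    [ 'UTF-32BE' => "\x00\x00\xFE\xFF" ],
    [ 'UTF-8'    => "\xEF\xBB\xBF"     ],
    [ 'UTF-16BE' => "\xFE\xFF"         ],
    [ 'UTF-16LE' => "\xFF\xFE"         ],
);

# Hypothetical helper: returns the encoding name implied by a leading BOM,
# or undef if the byte string does not start with one.
sub bom_encoding {
    my ($bytes) = @_;
    for my $pair (@boms) {
        my ($name, $bom) = @$pair;
        return $name if substr($bytes, 0, length $bom) eq $bom;
    }
    return;
}

print bom_encoding("\xFF\xFE\x00\x00A\x00\x00\x00"), "\n";   # four-byte BOM wins
print bom_encoding("\xFF\xFEA\x00"), "\n";                   # two-byte BOM only
```

One caveat: a UTF-16LE stream whose first character is U+0000 starts with the same four bytes as a UTF-32LE BOM, so a guess like this is a heuristic, not proof.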


      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

      It depends. Unless all your text-processing tools are Unicode-aware, you can easily end up with a BOM at the beginning of any line, not just the first, and really they could end up anywhere, depending on what you're doing. Imagine using cat and paste on files with BOMs. In my experience (and I've had quite a bit), I almost always end up having to delete BOM-looking strings from the entire file, not just the beginning.
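The cat scenario can be sketched like this, assuming the data is UTF-8 bytes; the file contents are invented for illustration. Dropping the ^ anchor lets /g remove the BOM wherever it ended up.

```perl
use strict;
use warnings;

# Simulate `cat a.txt b.txt` on two UTF-8 files that each start with a BOM.
my $bom      = "\xEF\xBB\xBF";
my $combined = "${bom}first file\n" . "${bom}second file\n";

# No ^ anchor: remove the UTF-8 BOM at any offset, not just at the start.
(my $clean = $combined) =~ s/\xEF\xBB\xBF//g;

print $clean;
```

Two hedges apply. Away from the start of a stream, U+FEFF is technically a ZERO WIDTH NO-BREAK SPACE, so this also deletes any that were intentional. And for UTF-16 or UTF-32 data, matching raw bytes at arbitrary offsets can hit sequences that straddle two characters, so decoding first and removing \x{FEFF} from the decoded text is probably the safer route there.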
