
Re: Remove BOM ?

by davido (Archbishop)
on Oct 01, 2012 at 20:49 UTC (#996744)

in reply to Remove BOM ?

This little snippet comes from Mojo::JSON:

# Remove BOM
$bytes =~ s/^(?:\357\273\277|\377\376\0\0|\0\0\376\377|\376\377|\377\376)//g;


Re^2: Remove BOM ?
by GrandFather (Sage) on Oct 02, 2012 at 20:04 UTC

    Except that the /g can't be right. A BOM can only appear as the first few bytes of a data stream. If there is a further BOM, then most likely you've got a binary file rather than a text file.

    It's not clear to me what the nulls are doing in there.

    True laziness is hard work
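The point about /g can be checked directly. The following is a minimal sketch, assuming a made-up input string with a UTF-8 BOM at the start and a second one mid-stream. Because the pattern is anchored with ^ and has no /m modifier, it can only match at offset 0, so /g appears to be redundant here rather than harmful: both forms remove exactly one BOM.

```perl
use strict;
use warnings;

# The BOM pattern as quoted from Mojo::JSON in the parent node.
my $bom = qr/^(?:\357\273\277|\377\376\0\0|\0\0\376\377|\376\377|\377\376)/;

# Hypothetical input: a UTF-8 BOM at the start and a stray one mid-stream.
my $input = "\357\273\277first\n\357\273\277second\n";

(my $with_g    = $input) =~ s/$bom//g;
(my $without_g = $input) =~ s/$bom//;

# Without /m, ^ matches only at offset 0, so both forms remove one BOM.
print $with_g eq $without_g ? "same result\n" : "different result\n";
print "leading BOM removed\n" if $with_g =~ /^first/;
print "stray BOM kept\n"      if index($with_g, "\357\273\277") >= 0;
```

So the anchored substitution never touches a BOM that is not at the very start of the data, with or without /g.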
      It's not clear to me what the nulls are doing in there.

      They belong to the BOMs of the UTF-32 LE and BE encodings.


      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
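To spell that out, the five alternatives in the snippet correspond to the BOMs of UTF-8 (EF BB BF, written in octal as \357\273\277), UTF-32LE, UTF-32BE, UTF-16BE and UTF-16LE. Below is a rough sketch of a lookup built on the same byte sequences. The helper name bom_encoding is hypothetical and not part of Mojo::JSON. Note that FF FE 00 00 is ambiguous in principle (it could also be a UTF-16LE BOM followed by a NUL character); like the snippet, this sketch simply tries the longer sequence first.

```perl
use strict;
use warnings;

# BOM byte sequences, ordered so that UTF-32LE (FF FE 00 00)
# is tested before its prefix UTF-16LE (FF FE).
my @boms = (
    [ 'UTF-32LE' => "\xFF\xFE\x00\x00" ],
    [ 'UTF-32BE' => "\x00\x00\xFE\xFF" ],
    [ 'UTF-8'    => "\xEF\xBB\xBF"     ],
    [ 'UTF-16BE' => "\xFE\xFF"         ],
    [ 'UTF-16LE' => "\xFF\xFE"         ],
);

# bom_encoding is a hypothetical helper for illustration only.
# Returns the encoding name implied by a leading BOM, or undef if none.
sub bom_encoding {
    my ($bytes) = @_;
    for my $entry (@boms) {
        my ($name, $bom) = @$entry;
        return $name if substr($bytes, 0, length $bom) eq $bom;
    }
    return;
}

print bom_encoding("\xFF\xFE\x00\x00h\x00\x00\x00"), "\n";  # UTF-32LE
print bom_encoding("\xFF\xFEh\x00"), "\n";                  # UTF-16LE
my $none = bom_encoding("plain");
print defined $none ? "BOM\n" : "no BOM\n";                 # no BOM
```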

      It depends. Unless all your text processing tools are Unicode-aware, you can easily end up with a BOM at the beginning of any line, not just the first, and really they could end up anywhere, depending on what you're doing. Imagine using cat and paste on files with BOMs. In my experience (and I've had quite a bit), I almost always end up having to delete BOM-looking strings from the entire file, not just the beginning.
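The cat scenario can be sketched as follows, assuming two made-up UTF-8 "files" held in strings. Dropping the ^ anchor lets /g remove every occurrence. For UTF-8 this should be safe at the byte level, since the sequence EF BB BF only ever encodes U+FEFF; the same blanket approach on raw UTF-16 or UTF-32 bytes is riskier, and decoding first is probably the better route there.

```perl
use strict;
use warnings;

# Hypothetical data: two UTF-8 files, each starting with a BOM,
# concatenated the way `cat a.txt b.txt` would do it.
my $file_a = "\xEF\xBB\xBFalpha\n";
my $file_b = "\xEF\xBB\xBFbeta\n";
my $joined = $file_a . $file_b;

# No ^ anchor here, so /g removes every UTF-8 BOM byte sequence.
(my $clean = $joined) =~ s/\xEF\xBB\xBF//g;
print $clean;   # alpha, then beta, with no BOM bytes left

# On already-decoded text the BOM is the single character U+FEFF.
my $text = "\x{FEFF}alpha\n\x{FEFF}beta\n";
$text =~ s/\x{FEFF}//g;
print $text;
```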
