Re: Remove BOM ?

by davido (Archbishop)
on Oct 01, 2012 at 20:49 UTC


in reply to Remove BOM ?

This little snippet comes from Mojo::JSON:

    # Remove BOM
    $bytes =~ s/^(?:\357\273\277|\377\376\0\0|\0\0\376\377|\376\377|\377\376)//g;

Dave
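
For anyone who wants to try the substitution outside Mojo::JSON, here is a minimal sketch that wraps it in a sub. The name strip_bom is hypothetical and not part of Mojo::JSON; the sketch uses hex escapes instead of octal, anchors with \A, and drops /g, on the assumption that only a single leading BOM should go. It expects raw (undecoded) bytes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mojo::JSON): strip one leading BOM
# from a string of raw, undecoded bytes.
sub strip_bom {
    my ($bytes) = @_;
    # The 4-byte UTF-32LE BOM (FF FE 00 00) is listed before the 2-byte
    # UTF-16LE BOM (FF FE) so the longer sequence is tried first.
    $bytes =~ s/\A(?:\xEF\xBB\xBF|\xFF\xFE\x00\x00|\x00\x00\xFE\xFF|\xFE\xFF|\xFF\xFE)//;
    return $bytes;
}

my $utf8  = "\xEF\xBB\xBF" . '{"a":1}';
my $clean = strip_bom($utf8);
print length($utf8) - length($clean), " byte(s) removed\n";
```

Input without a BOM is returned unchanged, so the sub can be called unconditionally before decoding.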


Re^2: Remove BOM ?
by GrandFather (Sage) on Oct 02, 2012 at 20:04 UTC

    except that /g can't be right. A BOM can only appear as the first few bytes of a data stream. If there is a further BOM then most likely you've got a binary file rather than a text file.

    It's not clear to me what the nulls are doing in there.

    True laziness is hard work
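
A quick way to check what /g actually does here: without /m, ^ matches only at the very start of the string, so the second iteration of the global match has nowhere to succeed. The sketch below uses a made-up input with two consecutive UTF-8 BOMs to show that the substitution still removes only the first one.

```perl
use strict;
use warnings;

my $bom   = "\xEF\xBB\xBF";
my $bytes = $bom . $bom . "text";   # illustrative input: two BOMs back to back

# Same shape as the snippet above, reduced to the UTF-8 alternative.
# After the first match, the next attempt starts at offset 3, where ^
# (without /m) cannot match, so /g finds nothing more.
(my $stripped = $bytes) =~ s/^(?:\xEF\xBB\xBF)//g;

print $stripped eq $bom . "text"
    ? "only the first BOM was removed\n"
    : "more than one BOM was removed\n";
```

So with this anchor the /g is redundant rather than harmful: the result is the same with or without it.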
      It's not clear to me what the nulls are doing in there.

      BOM in UTF-32 LE and BE encodings.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
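
To make the five alternatives in the regex concrete, here is a rough sketch that maps each BOM byte sequence to the encoding it signals. The table and the sub name bom_encoding are illustrative only. The 4-byte UTF-32 entries come first because the UTF-32LE BOM begins with the same two bytes as the UTF-16LE BOM.

```perl
use strict;
use warnings;

# BOM byte sequences, longest first so UTF-32LE is tested before UTF-16LE.
my @BOMS = (
    [ 'UTF-32LE' => "\xFF\xFE\x00\x00" ],
    [ 'UTF-32BE' => "\x00\x00\xFE\xFF" ],
    [ 'UTF-8'    => "\xEF\xBB\xBF"     ],
    [ 'UTF-16LE' => "\xFF\xFE"         ],
    [ 'UTF-16BE' => "\xFE\xFF"         ],
);

# Hypothetical helper: return the encoding name implied by a leading BOM,
# or undef if the bytes do not start with any of the sequences above.
sub bom_encoding {
    my ($bytes) = @_;
    for my $entry (@BOMS) {
        my ($name, $bom) = @$entry;
        return $name if index($bytes, $bom) == 0;
    }
    return;
}

my $guess = bom_encoding("\xFF\xFE\x00\x00A\x00\x00\x00");
print defined $guess ? "$guess\n" : "no BOM\n";
```

One caveat: UTF-16LE data whose first character after the BOM is U+0000 starts with the same four bytes as a UTF-32LE BOM, so this kind of sniffing is a heuristic.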

      It depends. Unless all your text-processing tools are Unicode-aware, you can easily end up with a BOM at the beginning of any line, not just the first, and in fact one can end up anywhere, depending on what you're doing. Imagine using cat and paste on files with BOMs. In my experience (and I've had quite a bit), I almost always end up having to delete BOM-looking strings from the entire file, not just the beginning.
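
A sketch of what whole-file cleanup might look like, assuming the data is UTF-8 held as raw bytes. The sample input is made up to mimic concatenated files. Two variants are shown: one strips a BOM at the start of every line, the other strips the sequence wherever it occurs.

```perl
use strict;
use warnings;

my $bom  = "\xEF\xBB\xBF";
# Illustrative input: BOMs at the start of two lines and one mid-line.
my $data = $bom . "first\n" . $bom . "second\nthi" . $bom . "rd\n";

# Variant 1: strip a BOM at the start of every line.
# /m lets ^ match after each newline, so here /g does real work.
(my $per_line = $data) =~ s/^\xEF\xBB\xBF//mg;

# Variant 2: strip the sequence wherever it occurs.
(my $anywhere = $data) =~ s/\xEF\xBB\xBF//g;

print "per line: ", length($per_line), " bytes\n";
print "anywhere: ", length($anywhere), " bytes\n";
```

In valid UTF-8 the bytes EF BB BF can only be an encoded U+FEFF, so variant 2 will not cut into another character. That does not hold for the UTF-16 and UTF-32 patterns, where a byte-level match can straddle two code units, so those are better handled after decoding.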
