PerlMonks  

Re: A Regex for no-break space Unicode Entities

by graff (Chancellor)
on Sep 13, 2006 at 13:00 UTC ( [id://572731]=note )


in reply to A Regex for no-break space Unicode Entities

bart is right -- this is a cleaner, safer way:
#!/usr/bin/perl -w
use warnings;
use strict;

binmode(STDIN,":utf8");
binmode(STDOUT,":utf8");

while(<>) {
    # if you just want to get rid of non-breaking spaces, do this:
    tr/\xA0/ /;

    # if you really want to change every kind of whitespace and every string
    # of two or more whitespace to a single space, do this instead:
    s/\s+/ /g;   # in utf8 strings, \s matches non-breaking space
    s/ $/\n/;    # (puts back the \n at the end of the line)
    print;
}
(updated to remove incorrect use of "g" modifier on tr///)
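
For anyone who wants to try the two substitutions without setting up any input files, here is a minimal sketch that applies them to an in-memory string. It assumes the data arrives as UTF-8 bytes and decodes them with the core Encode module first; the variable names ($bytes, $text, $only_nbsp, $collapsed) are made up for illustration and are not part of the script above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# UTF-8 bytes for: "a", NO-BREAK SPACE (C2 A0), "b", two spaces, "c", newline
my $bytes = "a\xC2\xA0b  c\n";

# Decode to a character string so that U+00A0 is a single character
my $text = decode('UTF-8', $bytes);

# Option 1: only replace non-breaking spaces with ordinary spaces
(my $only_nbsp = $text) =~ tr/\xA0/ /;

# Option 2: collapse every run of whitespace (including U+00A0) to one space,
# then restore the newline that the substitution turned into a space
(my $collapsed = $text) =~ s/\s+/ /g;
$collapsed =~ s/ $/\n/;

# Print code points in hex so the result is unambiguous on any terminal
for my $result ($only_nbsp, $collapsed) {
    print join(' ', map { sprintf '%02X', ord } split //, $result), "\n";
}
```

Option 1 leaves the doubled space alone; option 2 squeezes it as well.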

Re^2: A Regex for no-break space Unicode Entities
by kettle (Beadle) on Sep 13, 2006 at 13:40 UTC
    # of two or more whitespace to a single space, do this instead:
    s/\s+/ /g; # in utf8 strings, \s matches non-breaking space

    I read this on a webpage somewhere, but for one reason or another it did not produce the desired results. The binmode utf8 approach did not work either. Although it is less predictable, and for reasons I cannot completely explain, the byte-mode solution was the only one that gave me the desired results.
      but for one reason or another, it did not produce the desired results.

      It would be neat if you could show a minimal self-contained example to demonstrate this. You may still have been missing something simple: for example, you called binmode STDOUT, ":utf8"; but then read your input from some other file handle (e.g. ARGV) instead of piping or redirecting data to the script. Seeing what the results actually were would help as well.
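
As a starting point for such a test case, here is a rough, self-contained sketch. It writes a one-line UTF-8 file to a temporary location, then reads it back twice: once through an :encoding(UTF-8) layer and once as raw bytes, applying the s/\s+/ /g substitution each time. The helper name squeeze_first_line and the sample data are hypothetical, and the script assumes default regex rules (no locale, no unicode_strings feature). The point is only that the layer has to be on the handle that is actually read.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small test file: "a", NO-BREAK SPACE as UTF-8 bytes (C2 A0), "b", newline
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;                       # write the bytes exactly as given
print {$out} "a\xC2\xA0b\n";
close $out or die "close: $!";

# Hypothetical helper: read the first line of $file through $layer and
# apply the whitespace-collapsing substitution from the post above.
sub squeeze_first_line {
    my ($file, $layer) = @_;
    open my $in, "<$layer", $file or die "open $file: $!";
    my $line = <$in>;
    close $in;
    $line =~ s/\s+/ /g;
    $line =~ s/ $/\n/;              # put the newline back
    return $line;
}

my $decoded = squeeze_first_line($path, ':encoding(UTF-8)');
my $raw     = squeeze_first_line($path, ':raw');

# Show each result as hex values so the difference is visible
for my $pair ([decoded => $decoded], [raw => $raw]) {
    my ($label, $line) = @$pair;
    printf "%-8s %s\n", $label,
        join ' ', map { sprintf '%02X', ord } split //, $line;
}
```

With the decoding layer the non-breaking space becomes an ordinary space; read as raw bytes, the C2 A0 pair passes through untouched, which is the kind of "did not produce the desired results" symptom described above.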
