PerlMonks  

Regex to convert non-printable to printable char

by nvivek (Priest)
on Aug 18, 2012 at 04:41 UTC (#988145=perlquestion)
nvivek has asked for the wisdom of the Perl Monks concerning the following question:

Dear All, I need to convert non-printable characters to printable characters in my Perl program. I am able to find the non-printable characters with the following regular expression.
if ($string =~ /[\x00-\x1F]+/) {
    # here, I want to replace the non-printable characters by printable characters
    # which shouldn't cause any problem to XML::Writer emptyTag function.
}

Re: Regex to convert non-printable to printable char
by Anonymous Monk on Aug 18, 2012 at 06:29 UTC
Re: Regex to convert non-printable to printable char
by CountZero (Chancellor) on Aug 18, 2012 at 06:49 UTC
    To replace all these unprintable characters with the same printable character (in this case a full stop), use the substitution operator:
    $string=~s/[\x00-\x1F]+/./g
    Note the g at the end: it makes the substitution apply to every match in $string, not just the first. Because of the +, each run of consecutive unprintable characters becomes a single full stop; drop the + if you want one full stop per character.
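A minimal sketch of how the g flag and the + quantifier interact in the substitution above. The sample string and variable names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample: two adjacent control characters, then a lone one.
my $string = "ab\x01\x02cd\x03ef";

# With the + quantifier, each run of control characters becomes one dot.
(my $collapsed = $string) =~ s/[\x00-\x1F]+/./g;

# Without the +, every control character gets its own dot.
(my $one_for_one = $string) =~ s/[\x00-\x1F]/./g;

print "$collapsed\n";     # ab.cd.ef
print "$one_for_one\n";   # ab..cd.ef
```

Copying into a new variable first keeps the original $string intact, which is handy if you need to compare before and after.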

    Added the s. Thanks to AnomalousMonk's keen eyes.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: Regex to convert non-printable to printable char
by tobyink (Abbot) on Aug 18, 2012 at 06:52 UTC

    You know \x09 is tab, \x0A is line feed, and \x0D is carriage return. You probably don't want to replace those.
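One way to act on this, as a sketch only: narrow the character class so that tab, line feed and carriage return are left alone. These three are also the only characters below \x20 that XML 1.0 permits, which matters for the XML::Writer use in the question. The sample string and the choice of ? as the replacement are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical input containing a tab, a stray \x01, and a newline.
my $string = "name\tvalue\x01\n";

# Control characters below \x20, minus \x09, \x0A and \x0D.
(my $clean = $string) =~ s/[\x00-\x08\x0B\x0C\x0E-\x1F]/?/g;

# The tab and newline survive; only \x01 was replaced.
print $clean eq "name\tvalue?\n" ? "ok\n" : "not ok\n";   # ok
```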

    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: Regex to convert non-printable to printable char
by BillKSmith (Hermit) on Aug 18, 2012 at 15:09 UTC

    This seems to be just what the tr operator is for.
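A minimal sketch of the tr approach, assuming a full stop as the replacement character and a made-up sample string:

```perl
use strict;
use warnings;

my $string = "ab\x01\x02cd";   # hypothetical sample

# tr/// maps characters one for one. A replacement list shorter than the
# search list is padded with its last character, so every character in
# the range becomes a full stop.
(my $dotted = $string) =~ tr/\x00-\x1F/./;

# With an empty replacement list (and no /d), tr only counts matches
# and leaves the string unchanged.
my $count = ($string =~ tr/\x00-\x1F//);

print "$dotted ($count replaced)\n";   # ab..cd (2 replaced)
```

Unlike s///, tr does no regex matching at all, so it cannot collapse a run of characters into one unless you add the /s modifier.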

    Bill
Re: Regex to convert non-printable to printable char
by GlitchMr (Sexton) on Aug 22, 2012 at 10:04 UTC

    Just a note: \x7F is also an unprintable character. In one of my scripts, I use the following translation to filter non-printable characters.

    tr[\0-\x1F\x7F] [\x{2400}-\x{241F}\x{2421}]

    In case you noticed the gap in the range: \x{2420} is the symbol for space, which is skipped because space itself is printable and not in the search list. If you don't want to replace the characters with Unicode control picture characters, you could replace \x{2400}-\x{241F}\x{2421} with ? or \x{FFFD} (the Unicode replacement character).

    Also, in most cases you wouldn't want to match \x09 (tab), \x0A (line feed) or \x0D (carriage return). If so, you could use this translation instead.

    tr[\0-\x08\x0B\x0C\x0E-\x1F\x7F] [\x{2400}-\x{2408}\x{240B}\x{240C}\x{240E}-\x{241F}\x{2421}]

    This also doesn't work on EBCDIC. If you want to match on EBCDIC, you will most likely need a different range of characters (\x00-\x3F\xFF).
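A short sketch of the first translation above in use, on an ASCII platform. The sample string and variable names are made up; the binmode call is there because the replacement characters are wide and would otherwise trigger a "Wide character" warning on print.

```perl
use strict;
use warnings;

binmode STDOUT, ':encoding(UTF-8)';   # replacements are wide characters

my $string = "a\x00b\x1Fc\x7Fd";      # hypothetical sample

# Map each control character to its Unicode control picture.
(my $visible = $string) =~ tr[\0-\x1F\x7F] [\x{2400}-\x{241F}\x{2421}];

# Same search list, but every match becomes U+FFFD instead
# (the one-character replacement list is padded to full length).
(my $replaced = $string) =~ tr[\0-\x1F\x7F] [\x{FFFD}];

print "$visible\n";
print "$replaced\n";
```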
