Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

How to replace non-printable characters within delimited field

by xaprb (Scribe)
on Jan 03, 2007 at 19:56 UTC ( [id://592818]=perlquestion: print w/replies, xml ) Need Help??

xaprb has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, another question related to innotop. Innotop displays currently running queries in MySQL to a terminal, and one user wrote me with troubles caused by gzipped data in queries:

update tbl set col='@@@@', col="@@@@" ...

The @@@@ is gzipped cuss-words that freak his terminal out. The way we'd like this to be handled is to simply replace the field's contents with three question marks:

update tbl set col='???', col="???" ...

Now for the fun part: fields can be delimited with either single or double quotes. If single, doubles can be included and vice versa. If single quotes are used, more single quotes can be embedded, preceded by backslashes. And finally, no substitution should happen unless the field has non-printable characters in it.

A friend and I tried for a while to write a regex to accomplish this, but it gets really confusing. I have a nagging feeling this is a solved problem, but I can't turn it up in web searches.

Here's where we've gotten so far:

$text =~ s/ (?<!\\)' # A quote, NOT after a backslash (?:[^'\p{IsC}]*?(?:\\')?)*? # This is hard to explain... \p{IsC} # At least one non-printable char (?:[^']*?(?:\\')?)*? # More stuff... (?<!\\)' # Another single quote /'???'/gx;

This sorta works for single-quoted fields only. But I have a feeling this is entirely the wrong approach, and I'd like to do it in one pass, not twice (one for single quotes, one for double). Among other reasons, if there's a single-quoted string embedded within double quotes or vice versa, one pass that handles both cases is much preferable.

Where's the a-ha moment? Or is there one?

Thanks in advance!

Replies are listed 'Best First'.
Re: How to replace non-printable characters within delimited field
by GrandFather (Saint) on Jan 03, 2007 at 20:04 UTC
      Indeed, I think I am :-) Thanks!

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://592818]
Approved by ikegami
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others pondering the Monastery: (5)
As of 2024-04-19 23:22 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found