Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Re^3: Masking data in a MySQL dump file. Ideas?

by clueless newbie (Curate)
on Oct 13, 2016 at 13:22 UTC ( [id://1173935]=note: print w/replies, xml ) Need Help??


in reply to Re^2: Masking data in a MySQL dump file. Ideas?
in thread Masking data in a MySQL dump file. Ideas?

Hi, Doesn't look too bad. First you want something along the lines of
my $balpar_re=qr{ ( \( # opening ( (?:[^'"()]++ # Not a ', ", ( or ) - no + backtracking # incorrect |(?:'(?:[^']|\')*?') #' a single qu +ote string # incorrect |(?:"(?:[^"]|\")*?") #" a double qu +ote string |(?:'(?:\\'|[^'])*?') #' a single quote string |(?:"(?:\\"|[^"])*?") #" a double quote string |(?-1) # must be a ( or ) so re +curse )*+ # zero or more time - no +backtracking \) # closing ) ) }x;
so you can get a "(" with its matching ")". Then you want a substitution regex that grabs the contents of each of the sets of balanced parenthesis and feeds them to a function that processes them (one set at a time). Note the above regex does deal with quoted strings.
#!/usr/bin/env perl use Data::Dumper; use strict; use warnings; my $string=q{INSERT INTO sample_table (first_name,last_name) VALUES (' +John','Doe'),('John','Smith'),('Foo','O\'Bar'),('Baz','D\'Qux (II)'), +('(and so on, ','and so forth);');}; my $balpar_re=qr{ ( \( # opening ( (?:[^'"()]++ # Not a ', ", ( or ) - no + backtracking |(?:'(?:\\'|[^'])*?') #' a single quote string |(?:"(?:\\"|[^"])*?") #" a double quote string |(?-1) # must be a ( or ) so rec +urse )*+ # zero or more time - no +backtracking \) # closing ) ) }x; $string=~ s{$balpar_re}{mung($1)}ge; exit; sub mung { warn Data::Dumper->Dump([\@_],[qw(*_)]),' '; }
gives
@_ = ( '(first_name,last_name)' ); at perlmonks.pl line 25. @_ = ( '(\'John\',\'Doe\')' ); at perlmonks.pl line 25. @_ = ( '(\'John\',\'Smith\')' ); at perlmonks.pl line 25. @_ = ( '(\'Foo\',\'O\\\'Bar\')' ); at perlmonks.pl line 25. @_ = ( '(\'Baz\',\'D\\\'Qux (II)\')' ); at perlmonks.pl line 25. @_ = ( '(\'(and so on, \',\'and so forth);\')' ); at perlmonks.pl line 25.

Replies are listed 'Best First'.
Re^4: Masking data in a MySQL dump file. Ideas?
by dliakh (Acolyte) on Oct 13, 2016 at 15:14 UTC

    Hi,

    The dump files are huge to be loaded into a variable entirely. Could you please suggest a way to load and process the content by parts of limited size?

    Your regular expression is interesting. Thank you for this example

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1173935]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others romping around the Monastery: (4)
As of 2024-03-28 14:38 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found