Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

Re^2: Masking data in a MySQL dump file. Ideas?

by dliakh (Acolyte)
on Oct 13, 2016 at 11:49 UTC ( [id://1173931]=note: print w/replies, xml ) Need Help??


in reply to Re: Masking data in a MySQL dump file. Ideas?
in thread Masking data in a MySQL dump file. Ideas?

Hi Laurent,

Only the SQL statements are delimited from each other by the new line characters. All the rows of data in the INSERT statement look like this:

INSERT INTO sample_table (first_name,last_name) VALUES ('John','Doe'),('John','Smith'),('Foo','O\'Bar'),('Baz','D\'Qux (II)'),('(and so on, ','and so forth);');

No new line character starting from the beginning of the INSERT word and until the very terminating ; character.

The values inside quotes, of course, can contain embedded special characters like apostrophe, comma, opening and closing parentheses, semicolon.

Thank you

Replies are listed 'Best First'.
Re^3: Masking data in a MySQL dump file. Ideas?
by clueless newbie (Curate) on Oct 13, 2016 at 13:22 UTC
    Hi, Doesn't look too bad. First you want something along the lines of
    my $balpar_re=qr{ ( \( # opening ( (?:[^'"()]++ # Not a ', ", ( or ) - no + backtracking # incorrect |(?:'(?:[^']|\')*?') #' a single qu +ote string # incorrect |(?:"(?:[^"]|\")*?") #" a double qu +ote string |(?:'(?:\\'|[^'])*?') #' a single quote string |(?:"(?:\\"|[^"])*?") #" a double quote string |(?-1) # must be a ( or ) so re +curse )*+ # zero or more time - no +backtracking \) # closing ) ) }x;
    so you can get a "(" with its matching ")". Then you want a substitution regex that grabs the contents of each of the sets of balanced parenthesis and feeds them to a function that processes them (one set at a time). Note the above regex does deal with quoted strings.
    #!/usr/bin/env perl use Data::Dumper; use strict; use warnings; my $string=q{INSERT INTO sample_table (first_name,last_name) VALUES (' +John','Doe'),('John','Smith'),('Foo','O\'Bar'),('Baz','D\'Qux (II)'), +('(and so on, ','and so forth);');}; my $balpar_re=qr{ ( \( # opening ( (?:[^'"()]++ # Not a ', ", ( or ) - no + backtracking |(?:'(?:\\'|[^'])*?') #' a single quote string |(?:"(?:\\"|[^"])*?") #" a double quote string |(?-1) # must be a ( or ) so rec +urse )*+ # zero or more time - no +backtracking \) # closing ) ) }x; $string=~ s{$balpar_re}{mung($1)}ge; exit; sub mung { warn Data::Dumper->Dump([\@_],[qw(*_)]),' '; }
    gives
    @_ = ( '(first_name,last_name)' ); at perlmonks.pl line 25. @_ = ( '(\'John\',\'Doe\')' ); at perlmonks.pl line 25. @_ = ( '(\'John\',\'Smith\')' ); at perlmonks.pl line 25. @_ = ( '(\'Foo\',\'O\\\'Bar\')' ); at perlmonks.pl line 25. @_ = ( '(\'Baz\',\'D\\\'Qux (II)\')' ); at perlmonks.pl line 25. @_ = ( '(\'(and so on, \',\'and so forth);\')' ); at perlmonks.pl line 25.

      Hi,

      The dump files are huge to be loaded into a variable entirely. Could you please suggest a way to load and process the content by parts of limited size?

      Your regular expression is interesting. Thank you for this example

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1173931]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others surveying the Monastery: (3)
As of 2025-07-19 11:08 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?
    erzuuliAnonymous Monks are no longer allowed to use Super Search, due to an excessive use of this resource by robots.