Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

RFC: Placeholder creation for SQL statements

by LanX (Archbishop)
on Mar 08, 2018 at 23:33 UTC ( #1210536=perlmeditation: print w/replies, xml ) Need Help??

Hi

Using placeholders are a must in SQL!

#prepare my $sth = $dbh->prepare('SELECT * FROM people WHERE lastname = ? AND f +irstname = ?'); #execute with list of bindvars $sth->execute( $lastname, $firstname );

But it's a bit cumbersome to adjust the bind values if the order changes.

It's even more work if you have to use an array of values like inside an IN ( ?, ?, ?) operation.

I started to hack something to auto-generate placeholders, for a string passed inside a code-block:

  • $scalars from the closure are replaced with a placeholder ?
  • @arrays are replaced with a list of comma separated placeholders ?,?,?
  • underscored _var_names are ignored ( placeholders can't be everywhere)
The second returned parameter is a list of var-refs in the correct order, such that the bind variables can be safely changed.

Parsing the output of B::Deparse is even more fragile than I thought, the next version will walk the OP-Tree directly. (For instance parsing multiline SQL doesn't work yet.)

I'm not yet sure how to combine this in the best way with DBI.

This is a one days job in the sense of "release often".

Comments?

update

Hmmm ... I can probably avoid the hassle of parsing the OP-tree by tying the variables ...

use strict; use warnings; use B::Deparse; use PadWalker qw/closed_over peek_sub set_closed_over/; use Data::Dump qw/pp/; # ========= Tests use Test::More; # lexicals for placeholders my $a = 'A'; my @list = qw/L I S T/; my $x = 'X'; # no placeholders for underscore vars my @_table = "any_table"; my $sql = sub { "SELECT * FROM @_table WHERE a = $a AND b IN (@list) A +ND c = $x" }; my @stm = holderplace($sql); is_deeply( \@stm, [ "SELECT * FROM any_table WHERE a = ? AND b IN (?, ?, ?, ? +) AND c = ?", [\"A", ["L", "I", "S", "T"], \"X"] ], "statement with placeholders plus bind variables" ); # change bind variables $a = 'AA'; @list = qw/LL II SS TT/; $x = 'XX'; is_deeply( \@stm, [ "SELECT * FROM any_table WHERE a = ? AND b IN (?, ?, ?, ? +) AND c = ?", [\"AA", ["LL", "II", "SS", "TT"], \"XX"] ], "statement with placeholders plus changed variables" ); done_testing(); # ========== Code sub holderplace { my ($lambda)=@_; my $h_vars = closed_over($lambda); my %new_vars; my @value_refs; for my $key ( keys %$h_vars) { my $sigil = substr $key,0,1; # exclude variables starting with _ next if $key =~ m/^\Q${sigil}\E_/; if ( '$' eq $sigil ) { $new_vars{$key} = \'?'; } elsif ( '@' eq $sigil ) { $new_vars{$key} = [ join ", ", ("?") x @{$h_vars->{$key} } ]; } else { next; # Error? } } # Create Statement with placeholders set_closed_over( $lambda, \%new_vars ); my $newstr = $lambda->(); # Variable refs in order of placeholders my @var_refs = map { $h_vars->{$_} } grep { $new_vars{$_} } @{ get_vars($lambda) }; return ("$newstr", \@var_refs ); } sub get_vars { # scans output of B::Deparse to get interpolated vars in order my ($lambda)=@_; # deparse sub body my $source = B::Deparse->new('-q')->coderef2text($lambda); # returns something like: # { # use warnings; # use strict; # 'SELECT * FROM ' . join($", @_table) . ' WHERE x = ' . $a . ' AN +D b IN (' . join($", @list) . ') ' . $x; # } # truncate {block} and use statements $source =~ s/^{\s*(use.*?;\s*)*//s; $source =~ s/;\s*}$//s; #warn $source; my %quotes = qw"[ ] ( ) < > { } / /"; $quotes{'#'}='#'; # single quotes like q(...) my $re_q = join "|", map { "q\\$_.*?\\$quotes{$_}" } keys %quotes; #warn pp my @parts = split /\s* (?: '(?:\\'|[^'])*?' | $re_q )\s*/msx, $so +urce; for my $part (@parts) { next unless $part =~ /^\..*\.?$/; if ( $part =~ /^\. join\(.*? (\@\w+)\)( \.)?$/) { $part = $1; # array } elsif ( $part =~ /^\. (\$\w+)( \.)?$/) { $part = $1; # scalar } } return \@parts; }

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Wikisyntax for the Monastery

Replies are listed 'Best First'.
Re: RFC: Placeholder creation for SQL statements
by Eily (Prior) on Mar 09, 2018 at 16:20 UTC

    I really don't like that it uses closures to be honest, because closures are a totally unrelated concept that would be needed to understand why declaring the request suddenly needs to be done with that extra sub layer (maybe a (&) prototype might help). Or that would mean using the syntax "because that's how it works", ie, enforced cargo-culting. And even if you understand closures, it's not really their normal use (potentially shared, static-like variables) but hacking around their implementation and side effects (access the variables outside their scopes, delay the interpolation), so that's magic where the implementation leaks into the calling context, the wrong kind of magic IMHO.

    If tying the variables allows you to remove the closures I think that's fine, but otherwise if using your tools requires tied variables and closures, that would be too many advanced concept to make using DBI "simpler".

      Upvoted for constructive critic! :)

      First, that's not the final interface, just a prove of concept.

      > maybe a (&) prototype might help

      That's the plan, but for testing and demonstration I needed a lambda.

      > "because that's how it works", ie, enforced cargo-culting.

      I don't understand your point here.

      There are different use cases at my job already

      • we have a lot of legacy code without placeholders.
      • colleagues are reluctant to adjust it
      • some colleagues insist they need "named" placeholders
      • I need a lot of boiler plate to prepare (?,?,...) lists
      Legacy code looks like

      local $"=","; my $sth = $dbh->prepare( "SELECT * FROM $table WHERE lastname = $lastname AND firstname IN (@firstnames) " ); $sth->execute( $lastname, @firstnames );

      I want to change it to something like:

      my $sth = $dbh->xprepare( ph { "SELECT * FROM $_table WHERE lastname = $lastname AND firstname IN (@firstnames) " }); $sth->xexecute();

      (not sure yet, but methods can't have prototypes and DBI allows to hook in with callbacks)

      > (access the variables outside their scopes, delay the interpolation), so that's magic where the implementation leaks into the calling context, the wrong kind of magic IMHO.

      I don't get your point, nothing is leaking here. I'm overwriting the pad with references of new vars. This means all effects are restricted to the code block, which is only executed once.

      My goals are:

      • DRY-ness : avoiding repetition of code and boilerplates
      • more flexibility and speed when SQL changes
      • less errors
      • better control over code quality

      Cheers Rolf
      (addicted to the Perl Programming Language and ☆☆☆☆ :)
      Wikisyntax for the Monastery

      update

      While writing this post I had to re-edit the "legacy" part twice, because I mistyped repeated variables.

      And it's still wrong. ;-)

        nothing is leaking here
        Your code is implemented with closures because they let you replace the closed over variables. That's fine. But those closures must be created by the user, so either the user understands that your code is creating a closure, and that's what I call a leak (calling the tool correctly is done by knowing how it's implemented), or the user doesn't know that and just knows that this is how the call must be written, because reasons (that's where you get into cargo cult territory).

        I think your proposed version with ph is nearly fine though, becauses it kind of hides that exposed implementation detail, and it's easier to abstract away as a magic block where interpolation becomes placeholders, unlike the version with sub where advanced features of a common keyword are used.

        You didn't answer my question about package vars though, which is one of the potential issues with using closures: if you don't know that this is what happens, you don't understand that you only have to use lexicals (assuming your code doesn't work with package vars).

        Also, what would the following do, search for Eily twice or work as intended?

        my $username = 'LanX'; my $req1 = $db->xprepare(ph { "SELECT from users WHERE nick = '$userna +me'" }); $username = 'Eily'; my $req2 = $db->xprepare(ph { "SELECT from users WHERE nick = '$userna +me'" }); $req1->xexecute(); $req2->xexecute();
        If they work as intended then the main issue might be package vars (unless your code works with them), and the fact that it looks like you have a more complicated syntax to do basically the same thing. That's again a good reason for ph { } rather than sub { }, several call to ph may work fine, but you have to be careful with sub. If it searches twice for Eily, this means that you have to understand how it works.

        If you can ask your colleagues to change how the prepare is made, maybe you could do something like that instead:

        my %vars = (rank => 'Pope', xp => 4); my $req1 = $mytool->xprepare(ph { "SELECT from users WHERE rank = '$_{ +rank}' AND xp > '$_{xp}'" } %vars ); $req1->execute(); #There's no reason your new object can't call its me +thod execute BTW
        That way it looks like a more classical command BLOCK PARAMS (except instead of setting $_ before each call, it sets %_). And then the values can be stored as members of the returned object, or closed over by a closure that will set %_ before calling &BLOCK(), ie the implementations details stay hidden.

Re: RFC: Placeholder creation for SQL statements
by roboticus (Chancellor) on Mar 09, 2018 at 22:10 UTC

    LanX:

    Pretty nifty concept. It got me thinking a bit, and I came up with a little thing like this:

    #!env perl # # ex_sql_placeholders_via_tied_variables.pl # use strict; use warnings; use PH: # my fake placeholders package # Tell PH which variables are to be converted into placeholders. tie my $c1, 'PH'; tie my $c2, 'PH'; tie my $c3, 'PH'; tie my $c4, 'PH'; my $table='a_table'; sub the_sql { "select foo from $table where x=$c2 or y in ($c3, $c4, $ +c1)" }; # Prepare the statement my $X = PH::prep(\&the_sql); # Set up some values and execute: ($c2, $c3, $c4, $c1) = ('b','3','e','f'); PH::exec($X); # Again... ($c1, $c2, $c3, $c4) = ('m','n','o','p','q'); PH::exec($X);

    When I run it, I get:

    $ perl ex_sql_placeholders_via_tied_variables.pl PREPARED: select foo from a_table where x=? or y in (?, ?, ?) EXEC(b, 3, e, f) EXEC(n, o, p, m)

    The placeholder package is:

    package PH; # Set to an empty array to indicate that we're preparing a statement my $inside_prepare = undef; sub TIESCALAR { my ($class, $var) = @_; return bless \$var, $class; } # When we're preparing a statement, we'll return '?' for a tied value, # and push a reference to the variable on the inside_prepare vector # so the values and placeholders line up sub FETCH { my $self = shift; return $$self unless defined $inside_prepare; push @$inside_prepare, $self; return "?"; } sub STORE { my ($self,$new_val) = @_; $$self = $new_val; } # Fake DBI prep-like statement sub prep { # Tell FETCH() that we're prepping a statement $inside_prepare = []; # call the sub to make perl do the string interpolation and # build the vector of variable references that align with # the placeholders my $t = shift; my $retval = [ $t->(), $inside_prepare ]; # Turn off preparation mode $inside_prepare = undef; print "PREPARED: $retval->[0]\n"; return $retval; } # Fake DBI execute-like statement sub exec { my $x = shift; print "EXEC(", join(", ", map { $$_ } @{$x->[1]}), ")\n"; } 1;

    Mine isn't as fully-developed as yours, as I don't handle array types and such. I was just trying to find a way to simplify the interface as much as I could.

    I'm not as happy with it as I could be: I couldn't find an easy way to make it defer evaluation of the string other than the ugly "pass a coderef" bit I show in the example. On the bright side, I didn't have to slog through the optree, either.

    I look forward to what you ultimately come up with, as a way to simplify moving people string whacking to placeholders would be a nice thing to have.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Thanks, I had already similar Tie::Array code but was struggling with the antics of the perltie implementation.

      But TIMTOWTDI ...

      I had the idea that blessing the variables into a package which overload stringification '""' should be even easier and more precise, because only placeholders inside strings would be caught.

      Unfortunately I couldn't make this work without having leaking effects. :(

      Good news is, that at least the tie approach works with pads.

      Cheers Rolf
      (addicted to the Perl Programming Language and ☆☆☆☆ :)
      Wikisyntax for the Monastery

      update

      ) found the bug right after posting this :)

Re: RFC: Placeholder creation for SQL statements
by LanX (Archbishop) on Mar 09, 2018 at 17:34 UTC
    > underscored _var_names are ignored ( placeholders can't be everywhere)

    probably it's better to only ignore scalar references.

    "SELECT * FROM $$table"

    is a better visual hint than

    "SELECT * FROM $_table"

    and migrating old code is probably easier then....

    A custom function

    no_placeholders($table, VARLIST)

    might help too.

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Wikisyntax for the Monastery

Re: RFC: Placeholder creation for SQL statements
by Anonymous Monk on Mar 09, 2018 at 14:15 UTC
    What you call "cumbersome" and "more work," though, might make the code more maintainable in the long run. Code changes are usually made through version-control commits which are tied to work-order or trouble tickets. The delta within that commit describes exactly what was changed, and it should be seen to be specific to the problem. Also, the change should minimize testing impact we should not have to go back and see if other, unrelated code is now broken. If you create "clever" code to do something you might create a black-hole which can't be readily touched ... and/or that can't be instantly understood. DBI already has the ability to pass an array of placeholder values to a query so that they will be substituted for the ? marks left-to-right.
      > DBI already has the ability to pass an array of placeholder values to a query so that they will be substituted for the ? marks left-to-right.

      But it only works if you know the number of the placeholders in advance, which is exactly the problem the OP tries to solve.

      ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
    A reply falls below the community's threshold of quality. You may see it by logging in.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://1210536]
Approved by marto
Front-paged by marto
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (4)
As of 2018-12-09 23:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    How many stories does it take before you've heard them all?







    Results (46 votes). Check out past polls.

    Notices?
    • (Sep 10, 2018 at 22:53 UTC) Welcome new users!