Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Help SQL::Tidy

by eric256 (Parson)
on Feb 18, 2008 at 22:12 UTC ( #668678=perlquestion: print w/ replies, xml ) Need Help??
eric256 has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to fix up SQL::Tidy to do a few more things (and working on contacting the author about patching and why its not on CPAN ;) ) but I've run into an issue I'm not sure how to approach. The code uses SQL::Tokenizer to break up the SQL. However it ends up breaking count(*) into ['count', '(','*',')']. I can't figure out how to get from that back to the proper output, unless I guess I supply a list of functions? However since I could have user defined functions I'm not really sure what to do here. Can anyone see how I could get back to the original form?

Input SQL

SELECT "CATEGORY_ID", "CATEGORY_NAME", "DESCRIPTION", "PARENT_ID", (SELECT count(*) FROM nf_posts WHERE category_id IN ( SELECT category_id FROM NF_CATEGORY START WITH category_id = cat.category_id CONNECT BY parent_id = PRIOR category_id ) AND parent_id is null) topics, (SELECT count(*) FROM nf_posts WHERE category_id IN ( SELECT category_id FROM NF_CATEGORY START WITH category_id = cat.category_id CONNECT BY parent_id = PRIOR category_id ) AND parent_id is not null) replies from "NF_CATEGORY" cat where Parent_ID = :P4_Category_ID and nf_user_can_str(:APP_USER, category_id, 'view') = 'T'

Formatting with my version gives me:

SELECT "CATEGORY_ID", "CATEGORY_NAME", "DESCRIPTION", "PARENT_ID", ( SELECT count ( * ) FROM nf_posts WHERE CATEGORY_ID IN ( SELECT CATEGORY_ID FROM NF_CATEGORY START WITH CATEGORY_ID = CAT.CATEGORY_ +ID CONNECT BY parent_id = PRIOR CATEGORY_ID ) AND parent_id is null ) topics, ( SELECT count ( * ) FROM nf_posts WHERE CATEGORY_ID IN ( SELECT CATEGORY_ID FROM NF_CATEGORY START WITH CATEGORY_ID = CAT.CATEGORY_ +ID CONNECT BY parent_id = PRIOR CATEGORY_ID ) AND parent_id is not null ) replies FROM "NF_CATEGORY" cat WHERE Parent_ID = :P4_CATEGORY_ID AND nf_user_can_str ( :APP_USER, CATEGORY_ID, 'view' ) = 'T'

And the whole module:

package SQL::Tidy; use strict; use warnings; our $VERSION = 0.01; use SQL::Tokenizer; use Data::Dumper; use constant KEYWORDS => ( 'select', 'from', 'where', 'order', 'group', 'join', ); sub new { my $class = shift; my %args = ( # Some defaults indent => ' ', width => 75, keywords => [ KEYWORDS ], margin => '', @_ ); my $self = bless {}, ref($class) || $class; my $keywords = delete($args{'keywords'}); while (my ($k, $v) = each(%args)) { $self->$k($v); } $self->add_keywords(@$keywords); return $self; } sub add_keywords { my ($self, @keywords) = @_; for my $keyword (@keywords) { $self->{'_keywords'}{lc($keyword)} = 1; } return; } sub _is_keyword { exists(shift->{'_keywords'}{lc(shift)}) } sub newline { my $self = shift; $self->{line}++; my $current_group = $self->{groups}->[-1]; my $prefix = $self->margin ; if (defined $current_group) { #add group indentation level $prefix .= ' ' x $current_group->{indent}; }; $prefix .= $self->indent x $self->{level}; $self->{column} = length($prefix); #reset colum to end of pre +fix $self->{retval} .= "\n" . $prefix; $self->{newline} = 0; } sub tidy { my ($self, $query) = @_; my @tokens = grep !/^\s+$/, SQL::Tokenizer->tokenize($query); my $retval; #array of hash, including base indentation for the whole group and # including the line it starts on my $keywords = qw/^SELECT|FROM|WHERE|LIMIT|BY|GROUP|CONNECT|AND|OR +|JOIN$/; my @groups; $self->{column} = length($self->margin); $self->{line} = 0; $self->{newline} = 0; my ($token,$next_token, $last_token) = ('','',''); while (1) { last unless @tokens || $next_token; $last_token = $token; $token = $next_token || shift @tokens; $next_token = shift @tokens; $token = uc($token) if $token =~ /$keywords/i; $self->{level} += 2 if $last_token =~ /WHERE/i; if ($token eq 'SELECT' and $last_token ne '(') { $self->{level}++; } elsif ($token eq '(' ) { $self->newline() if ($next_token eq 'SELECT'); push @{$self->{groups}}, { level => $self->{level}, start_line => $self->{line}, indent => $self->{column} + length($tok +en) + 1}; $self->{level} = 1; } elsif ($token =~ /FROM|WHERE|AND/i) { $self->{newline} = 1; $self->{level}-- if $token =~ /FROM|WHERE/i; } elsif ($token eq ')') { my $group = pop @{$self->{groups}}; if ($group) { $self->{level} = $group->{level}; + if ($self->{line} != $group->{start_line}) { $self->newline(); }; } } # if we arn't forcing a newline then check if the lenght # of the token + $column +1 is larger than the width # where column = current indentation if ( !$self->{newline} && length($token) + $self->{column} + 1 > $self->width) { $self->{level}++; $self->{newline} = 1; } #if flagged for newline before current token increment line co +unter, # calc indentation prefix, and current indentatio +n length # add output to $retval string. $self->newline() if $self->{newline}; unless ($self->{newline} || $token eq ',') { #append space unless newline or token is a , $self->{retval} .= ' '; $self->{column}++; } $self->{retval} .= $token; $self->{column} += length($token); } $self->{retval} .= "\n"; return $self->{retval}; } # Generate accessors: for my $method (qw(width indent margin keywords)) { no strict 'refs'; *{$method} = sub { my $self = shift; if (@_) { $self->{'_' . $method} = shift; } return $self->{'_' . $method}; }; } 1;

___________
Eric Hodges

Comment on Help SQL::Tidy
Select or Download Code
Re: Help SQL::Tidy
by stiller (Friar) on Feb 19, 2008 at 11:36 UTC
    This is great, I realy don't see any problem with getting ( * ) in stead of (*), and otherwise too this seems fine, unless I read you post too fast and missed something?
    cheers
Re: Help SQL::Tidy
by tilly (Archbishop) on Feb 19, 2008 at 21:37 UTC
    Instead of having to have a list of all functions, you can just look at the next token. If the next token is '(' then assume you have a function and format appropriately.

    While I won't be using this module (because I already write my SQL in a nicely formatted form), I should point out that one thing you need to figure out how to format is a CASE statement. I say this because I frequently write things like this:

    SELECT e.eventid , SUM(c.revenue) as total_revenue , SUM( CASE WHEN c.cart_type = 'ST' THEN c.revenue END ) as standard_revenue FROM event e JOIN cart c ON e.eventid = c.eventid GROUP BY e.eventid
    and the expressions within the CASE statement can get surprisingly complicated...

      Thanks. This is for autodocumenting an oracle system where I don't write all the SQL, but I still want it to look nice. I've got a new version i'll post up afer some more testing, re-arranged completly and its getting pretty nice. At least acceptable enough for this document (thats probably going to end up sitting in a safe and never looked at agian. lmao).


      ___________
      Eric Hodges
        Hi There, I am wondering where this stands? I sure could use a nice SQL Formatter and I saw that you were working on this. Any plans for a CPAN release? Or at least making your new code available? Thanks for letting me know! Eric J
Re: Help SQL::Tidy
by andreas1234567 (Vicar) on Feb 27, 2009 at 11:24 UTC
    It seems there now exists a SQL::Beautify that does largely what SQL::Tidy did.
    --
    No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://668678]
Approved by Corion
Front-paged by planetscape
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (13)
As of 2014-12-19 20:33 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (91 votes), past polls