Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

Help SQL::Tidy

by eric256 (Parson)
on Feb 18, 2008 at 22:12 UTC ( #668678=perlquestion: print w/ replies, xml ) Need Help??
eric256 has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to fix up SQL::Tidy to do a few more things (and working on contacting the author about patching and why its not on CPAN ;) ) but I've run into an issue I'm not sure how to approach. The code uses SQL::Tokenizer to break up the SQL. However it ends up breaking count(*) into ['count', '(','*',')']. I can't figure out how to get from that back to the proper output, unless I guess I supply a list of functions? However since I could have user defined functions I'm not really sure what to do here. Can anyone see how I could get back to the original form?

Input SQL

SELECT "CATEGORY_ID", "CATEGORY_NAME", "DESCRIPTION", "PARENT_ID", (SELECT count(*) FROM nf_posts WHERE category_id IN ( SELECT category_id FROM NF_CATEGORY START WITH category_id = cat.category_id CONNECT BY parent_id = PRIOR category_id ) AND parent_id is null) topics, (SELECT count(*) FROM nf_posts WHERE category_id IN ( SELECT category_id FROM NF_CATEGORY START WITH category_id = cat.category_id CONNECT BY parent_id = PRIOR category_id ) AND parent_id is not null) replies from "NF_CATEGORY" cat where Parent_ID = :P4_Category_ID and nf_user_can_str(:APP_USER, category_id, 'view') = 'T'

Formatting with my version gives me:

SELECT "CATEGORY_ID", "CATEGORY_NAME", "DESCRIPTION", "PARENT_ID", ( SELECT count ( * ) FROM nf_posts WHERE CATEGORY_ID IN ( SELECT CATEGORY_ID FROM NF_CATEGORY START WITH CATEGORY_ID = CAT.CATEGORY_ +ID CONNECT BY parent_id = PRIOR CATEGORY_ID ) AND parent_id is null ) topics, ( SELECT count ( * ) FROM nf_posts WHERE CATEGORY_ID IN ( SELECT CATEGORY_ID FROM NF_CATEGORY START WITH CATEGORY_ID = CAT.CATEGORY_ +ID CONNECT BY parent_id = PRIOR CATEGORY_ID ) AND parent_id is not null ) replies FROM "NF_CATEGORY" cat WHERE Parent_ID = :P4_CATEGORY_ID AND nf_user_can_str ( :APP_USER, CATEGORY_ID, 'view' ) = 'T'

And the whole module:

package SQL::Tidy; use strict; use warnings; our $VERSION = 0.01; use SQL::Tokenizer; use Data::Dumper; use constant KEYWORDS => ( 'select', 'from', 'where', 'order', 'group', 'join', ); sub new { my $class = shift; my %args = ( # Some defaults indent => ' ', width => 75, keywords => [ KEYWORDS ], margin => '', @_ ); my $self = bless {}, ref($class) || $class; my $keywords = delete($args{'keywords'}); while (my ($k, $v) = each(%args)) { $self->$k($v); } $self->add_keywords(@$keywords); return $self; } sub add_keywords { my ($self, @keywords) = @_; for my $keyword (@keywords) { $self->{'_keywords'}{lc($keyword)} = 1; } return; } sub _is_keyword { exists(shift->{'_keywords'}{lc(shift)}) } sub newline { my $self = shift; $self->{line}++; my $current_group = $self->{groups}->[-1]; my $prefix = $self->margin ; if (defined $current_group) { #add group indentation level $prefix .= ' ' x $current_group->{indent}; }; $prefix .= $self->indent x $self->{level}; $self->{column} = length($prefix); #reset colum to end of pre +fix $self->{retval} .= "\n" . $prefix; $self->{newline} = 0; } sub tidy { my ($self, $query) = @_; my @tokens = grep !/^\s+$/, SQL::Tokenizer->tokenize($query); my $retval; #array of hash, including base indentation for the whole group and # including the line it starts on my $keywords = qw/^SELECT|FROM|WHERE|LIMIT|BY|GROUP|CONNECT|AND|OR +|JOIN$/; my @groups; $self->{column} = length($self->margin); $self->{line} = 0; $self->{newline} = 0; my ($token,$next_token, $last_token) = ('','',''); while (1) { last unless @tokens || $next_token; $last_token = $token; $token = $next_token || shift @tokens; $next_token = shift @tokens; $token = uc($token) if $token =~ /$keywords/i; $self->{level} += 2 if $last_token =~ /WHERE/i; if ($token eq 'SELECT' and $last_token ne '(') { $self->{level}++; } elsif ($token eq '(' ) { $self->newline() if ($next_token eq 'SELECT'); push @{$self->{groups}}, { level => $self->{level}, start_line => $self->{line}, indent => $self->{column} + length($tok +en) + 1}; $self->{level} = 1; } elsif ($token =~ /FROM|WHERE|AND/i) { $self->{newline} = 1; $self->{level}-- if $token =~ /FROM|WHERE/i; } elsif ($token eq ')') { my $group = pop @{$self->{groups}}; if ($group) { $self->{level} = $group->{level}; + if ($self->{line} != $group->{start_line}) { $self->newline(); }; } } # if we arn't forcing a newline then check if the lenght # of the token + $column +1 is larger than the width # where column = current indentation if ( !$self->{newline} && length($token) + $self->{column} + 1 > $self->width) { $self->{level}++; $self->{newline} = 1; } #if flagged for newline before current token increment line co +unter, # calc indentation prefix, and current indentatio +n length # add output to $retval string. $self->newline() if $self->{newline}; unless ($self->{newline} || $token eq ',') { #append space unless newline or token is a , $self->{retval} .= ' '; $self->{column}++; } $self->{retval} .= $token; $self->{column} += length($token); } $self->{retval} .= "\n"; return $self->{retval}; } # Generate accessors: for my $method (qw(width indent margin keywords)) { no strict 'refs'; *{$method} = sub { my $self = shift; if (@_) { $self->{'_' . $method} = shift; } return $self->{'_' . $method}; }; } 1;

___________
Eric Hodges

Comment on Help SQL::Tidy
Select or Download Code
Re: Help SQL::Tidy
by stiller (Friar) on Feb 19, 2008 at 11:36 UTC
    This is great, I realy don't see any problem with getting ( * ) in stead of (*), and otherwise too this seems fine, unless I read you post too fast and missed something?
    cheers
Re: Help SQL::Tidy
by tilly (Archbishop) on Feb 19, 2008 at 21:37 UTC
    Instead of having to have a list of all functions, you can just look at the next token. If the next token is '(' then assume you have a function and format appropriately.

    While I won't be using this module (because I already write my SQL in a nicely formatted form), I should point out that one thing you need to figure out how to format is a CASE statement. I say this because I frequently write things like this:

    SELECT e.eventid , SUM(c.revenue) as total_revenue , SUM( CASE WHEN c.cart_type = 'ST' THEN c.revenue END ) as standard_revenue FROM event e JOIN cart c ON e.eventid = c.eventid GROUP BY e.eventid
    and the expressions within the CASE statement can get surprisingly complicated...

      Thanks. This is for autodocumenting an oracle system where I don't write all the SQL, but I still want it to look nice. I've got a new version i'll post up afer some more testing, re-arranged completly and its getting pretty nice. At least acceptable enough for this document (thats probably going to end up sitting in a safe and never looked at agian. lmao).


      ___________
      Eric Hodges
        Hi There, I am wondering where this stands? I sure could use a nice SQL Formatter and I saw that you were working on this. Any plans for a CPAN release? Or at least making your new code available? Thanks for letting me know! Eric J
Re: Help SQL::Tidy
by andreas1234567 (Vicar) on Feb 27, 2009 at 11:24 UTC
    It seems there now exists a SQL::Beautify that does largely what SQL::Tidy did.
    --
    No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://668678]
Approved by Corion
Front-paged by planetscape
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (4)
As of 2015-07-05 14:13 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (67 votes), past polls