Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

Re^2: Regex: Identifying comments

by pvaldes (Chaplain)
on Aug 30, 2012 at 18:29 UTC ( #990816=note: print w/ replies, xml ) Need Help??


in reply to Re: Regex: Identifying comments
in thread Regex: Identifying comments

IMHO the basic problem with your sql files is that don't look really like sql files, thus anything can be a comment and anything can be wrong, even if perl is acting fine.

I don't know all databases in the whole world so don't take this advice as immutable, of course, but sql sentences normally aren't complete (and don't run) until you find a semicolon. Is also surprising that you could select a field named '--foo' when you can't even create a table like this:

create table (--foo int); # error create table ('--foo' int); # error create table ("--foo" int); # error create table (\-\-foo int); # big error

So you are probably trying to solve a nonexistent problem. Comments are always after a semicolon or in your own line. And it you have i.e this:

-- ; select * from mytable where foo = "--34"; -- hi ', hi'. "hi'

The complete line IS a comment here. I'm quoting lines that I don't want to run (temporally) all the time in sql files, it's not unusual to see something like this.

So you should be pointing your arrows to ";" instead to "'". You probably should check and sanitize first what's in your files, before to pass this files to perl


Comment on Re^2: Regex: Identifying comments
Select or Download Code
Re^3: Regex: Identifying comments
by remiah (Hermit) on Aug 30, 2012 at 21:58 UTC

    Thanks for reply, pvaldes.

    I noticed I misunderstood how to escape single quotes in sql. I thought of the case like this

    insert into foo (bar,boo) values('''a', 'b');
    And this is nothing surprise.
    select '--foo';
    I changed test data with real SQL with your advice. SQLite and Postgres accept these commands.
    drop table foo; create table foo( --tabel foo bar text, --fld name boo text --again ); --populate insert into foo (bar,boo) values('''a', 'b'); -- insert quote's sake insert into foo (bar,boo) values('c', 'd'); -- another line select 'test' || 'abc' as a ; -- maybe you use dual wi +th oracle? select 'test''s' || 'abc' as a ; -- maybe you use dual wi +th oracle? select '--foo';
    So far, so good.

      create table foo( --tabel foo bar text, --fld name boo text --again );

      mmh... this is an interesting example, Yes. Perfectly legitim third type of comments, and difficult to debug with a regex... maybe you can profit that the fact that the type of data are limited with something like:

      m/(\(|text|int|integer|char \(\d+\)),*\s*--(.*)$/

      select '--foo';

      This is not a very probable situation, but certainly is possible too. In any case this false comment is not after a semicolon, nor at the beginning of the line or inside a table, so if there are a ^\s*select in the same line you probably could safely ignore it. But then you could have something like this:

      select field from mytable where field = 'text, --foo important information here about to be lost';

      The safest actitude (although maybe a little paranoic) should be to isolate and examinate personally any case so special, the idea is: "if you found two - after a ' or a " and before a ";" in a line having the string "select" I want to see it personally"

      You can improve your regex if you check previously for troublesome data:

      select * from mitable where field1 like '%--%' or field2 like '%--%' or field3 like '%--%'... etc ;

        umm..
        No.

        You will not see my regex fault with your examples. My regex will stumble with this sql.

        update set bar = bar - 1 ; -- subtraction symbol may disappear.

        I expected to see SQL parser solution in this thread, like this

        my $p = SQLParser->new(type=>'mysql', sql=>$sql) or die SQLParser->error(); $p->prettyprint(1); $p->without_comment(1); print $p->sql;
        At first I looked SQL::Parser. It seems quite near for such tasks, but I couldn't find good solution to rip off comments. Do you know such module?

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://990816]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chilling in the Monastery: (12)
As of 2015-07-07 12:45 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (88 votes), past polls