PerlMonks  

Re^4: Regex: Identifying comments

by pvaldes (Chaplain)
on Aug 30, 2012 at 23:42 UTC (#990879)


in reply to Re^3: Regex: Identifying comments
in thread Regex: Identifying comments

create table foo( --tabel foo
  bar text, --fld name
  boo text --again
);

Hmm... this is an interesting example, yes. A perfectly legitimate third type of comment, and a difficult one to handle with a regex... maybe you can take advantage of the fact that the set of data types is limited, with something like:

m/(\(|text|int|integer|char \(\d+\)),*\s*--(.*)$/
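As a rough illustration, the pattern can be tried against the three lines of the create table example plus the quoted `'--foo'` case. This is only a sketch: the helper name `trailing_comment` and the sample list are made up for the demonstration, and the data-type list is just the one already in the regex.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text of a trailing "--" comment when it
# directly follows "(" or one of the listed data types, otherwise undef.
sub trailing_comment {
    my ($line) = @_;
    return $line =~ m/(\(|text|int|integer|char \(\d+\)),*\s*--(.*)$/ ? $2 : undef;
}

for my $line ('create table foo( --tabel foo',
              'bar text, --fld name',
              'boo text --again',
              "select '--foo';") {
    my $comment = trailing_comment($line);
    printf "%-32s => %s\n", $line,
        defined $comment ? "comment: $comment" : 'no comment';
}
```

The quoted `'--foo'` is left alone here only because no data type or opening parenthesis comes right before it, so the check is a heuristic and not a parser.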

select '--foo';

This is not a very likely situation, but it is certainly possible. In any case, this false comment is not after a semicolon, nor at the beginning of the line, nor inside a table definition, so if the same line matches ^\s*select you could probably ignore it safely. But then you could have something like this:

select field from mytable where field = 'text, --foo important information here about to be lost';

The safest attitude (although maybe a little paranoid) would be to isolate and personally examine any such special case. The idea is: "if you find two consecutive - after a ' or a " and before a ; in a line containing the string select, I want to see it personally."
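That rule could be sketched roughly as follows. The helper name `needs_review` is hypothetical, and the pattern is just one literal reading of the sentence above (a quote, then `--`, then a semicolon, with no other semicolon in between), so treat it as a starting point for flagging lines, not as a decision about what is a comment.

```perl
use strict;
use warnings;

# Hypothetical helper: true when a line contains "select" and a "--" that
# appears after a quote character and before a semicolon.
sub needs_review {
    my ($line) = @_;
    return 0 unless $line =~ /\bselect\b/i;
    return $line =~ /['"][^;]*--[^;]*;/ ? 1 : 0;
}

for my $line (q{select '--foo';},
              q{select field from mytable where field = 'text, --foo important';},
              q{select 'text' from foo; -- comment},
              q{bar text, --fld name}) {
    print needs_review($line) ? 'REVIEW ' : 'ok     ', $line, "\n";
}
```

Lines that come back flagged would be set aside for a human to look at; everything else goes on to the regular comment regex.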

You can make your regex safer if you first check the data itself for troublesome values:

select * from mytable where field1 like '%--%' or field2 like '%--%' or field3 like '%--%'... etc ;


Re^5: Regex: Identifying comments
by remiah (Hermit) on Aug 31, 2012 at 05:37 UTC

    umm..
    No.

    You will not see my regex's fault with your examples. My regex will stumble on this SQL:

    update set bar = bar - 1 ; -- subtraction symbol may disappear.

    I expected to see an SQL parser solution in this thread, like this:

    my $p = SQLParser->new(type=>'mysql', sql=>$sql) or die SQLParser->error();
    $p->prettyprint(1);
    $p->without_comment(1);
    print $p->sql;
    At first I looked at SQL::Parser. It seems quite close to what such tasks need, but I couldn't find a good way to strip comments with it. Do you know of such a module?

      use strict;
      use warnings;
      while (<DATA>) {
          chomp;
          next if /^\s*--/;
          print $_,"\n" if !/--/;
          elsif (/--/ && !/--(.*?);/){
              s/--(.*?)$//;
              print $_,"\n"}
          elsif (/--(.*?);\s*--(.*?)$/){
              s/\s*--$2//;
              print $_,"\n"}
          else {
              print $_,"\n"}
      }
      __DATA__
      select 'text' from foo; -- comment
      select '--Not comment' from foo; --But this is
      select q from z; -- as is this
      select '--Not this' + '--either' from foo;
      select 'qaws' + make from "a"; -- comment with 'a' quote
      select 'a' from 'b' with 'c' -- comment with 'a --' comment -- test comment
      (add1) select 'text\'s' from foo --escaped ...
      (add2) select 'text\'s' from foo --escaped' ...
      (add3) update set bar = bar - 1 ; -- subtraction symbol preserved
      create table ( -- fo
      field text, -- fufufu
      field int)

        I think you need to revisit this...

        syntax error at 991028.pl line 7, near "elsif"
        syntax error at 991028.pl line 10, near "elsif"
        syntax error at 991028.pl line 12, near ""\n"}"
        Execution of 991028.pl aborted due to compilation errors.
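
For comparison, one way to get at what the script above seems to be aiming for, without the elsif chain, is to let a single substitution match quoted strings and `--` comments in the same pass and keep only the strings. This is a sketch and not a real SQL parser: the name `strip_sql_comment` is made up, and it assumes that quotes are escaped with a backslash or by doubling, that strings do not span lines, and that `--` outside a string always starts a comment.

```perl
use strict;
use warnings;

# Sketch: remove "--" comments unless the "--" sits inside a quoted string.
sub strip_sql_comment {
    my ($line) = @_;
    $line =~ s{
        ( '(?: [^'\\] | \\. | '' )*'      # single-quoted string, kept as is
        | "(?: [^"\\] | \\. | "" )*" )    # double-quoted string, kept as is
        | \s*--.*                         # outside a string: comment to end of line
    }{ defined $1 ? $1 : '' }gex;
    return $line;
}

while (my $line = <DATA>) {
    chomp $line;
    my $clean = strip_sql_comment($line);
    print $clean, "\n" if length $clean;
}

__DATA__
select 'text' from foo; -- comment
select '--Not comment' from foo; --But this is
select '--Not this' + '--either' from foo;
update set bar = bar - 1 ; -- subtraction symbol preserved
```

Because a string is always tried before a comment at each position, a `--` inside quotes is consumed as part of the string and never reaches the comment branch, and a single `-` used for subtraction does not match at all.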
