in reply to Regex: Identifying comments

This should work for simple cases (no newlines inside single quotes, no single quotes in comments):

#!/usr/bin/perl
use warnings;
use strict;

while (<DATA>) {
    # Whatever follows the last single quote cannot be inside a string.
    my $last = (split /'/)[-1];
    # .* stops before the newline, so add one back when printing.
    print "$1\n" if $last =~ /(--.*)/;
}

__DATA__
select 'text' from foo --This is a comment
select '--Not a valid comment' from foo --But this is
select q from z -- as is this
select '--This is not a valid comment' from foo
select '--Not this' + '--either' from foo

To handle single quotes in comments, you might need to change the script to the following:

#!/usr/bin/perl
use warnings;
use strict;

while (<DATA>) {
    my @items = split /'/;
    until (not @items or $items[0] =~ s/.*?--/--/) {
        shift @items for 1, 2; # Remove the quoted part, too.
    }
    print join "'", @items;
}

__DATA__
select 'text' from foo --This is a comment
select '--Not a valid comment' from foo --But this is
select q from z -- as is this
select '--This is not a valid comment' from foo
select '--Not this' + '--either' from foo
select 'qaws' + make from "a" -- comment with 'a' quote
select 'a' from 'b' with 'c' -- comment with 'a --' comment

To get the code instead of the comments, just invert the logic:

while (<DATA>) {
    chomp;
    my @code;
    my @items = split /'/;
    until (not @items or $items[0] =~ s/--.*//) {
        # Move the unquoted chunk and the quoted string after it into @code.
        @items and push @code, shift @items for 1, 2;
    }
    print join("'", @code),
          @items ? (@code ? "'" : q()) . "$items[0]" : q(),
          "\n";
}

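If you need both parts at once, the two loops can probably be folded into a single subroutine. The following is only a sketch: split_code_comment is a name I made up for illustration, and it still assumes no newlines and no doubled or escaped quotes inside strings. Passing -1 as the limit to split keeps trailing empty fields, so a line that ends in a quote keeps that quote.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper (not part of the scripts above): returns the code part
# and the trailing "--" comment of one line, using the same split-on-quote idea.
# Assumes no newlines and no escaped/doubled quotes inside strings.
sub split_code_comment {
    my ($line) = @_;
    chomp $line;
    my @items = split /'/, $line, -1;    # -1 keeps trailing empty fields
    my @code;
    while (@items) {
        my $outside = shift @items;      # chunk that lies outside any quotes
        if ($outside =~ /^(.*?)(--.*)$/) {
            my ($before, $comment) = ($1, $2);
            # Everything still left in @items belongs to the comment.
            return (join("'", @code, $before), join("'", $comment, @items));
        }
        push @code, $outside;
        push @code, shift @items if @items;    # the quoted string itself
    }
    return (join("'", @code), '');       # no comment found
}

for my $line ("select 'text' from foo --This is a comment",
              "select '--Not this' + '--either' from foo") {
    my ($code, $comment) = split_code_comment($line);
    print "code: [$code] comment: [$comment]\n";
}
```

The second value comes back as an empty string when the line has no comment, so the caller can test it directly.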
لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ