PerlMonks  

Re: Regex to strip comments (match strings)

by tye (Cardinal)
on Oct 01, 2012 at 04:31 UTC (#996579)


in reply to Regex to strip comments

s{
    (\s*/[*].*?[*]/\s*)         # $1: /* comment */
|   \s*//[^\n]*                 # // comment
|   (                           # $2: something to keep:
        '(?:[^\\']+|\\.)*'      # '\t' (non-capturing, so (.) stays $3)
    |   "(?:[^\\"]+|\\.)*"      # "string"
    |   /(?![/*])               # Non-comment /
    |   [^'"/]+                 # Other code
    )
|   (.)                         # $3: A syntax error (unclosed ' or ")
}{
    if( defined $3 ) {
        # $-[0] is the offset where this match started
        warn "Ignoring syntax error ($3) at byte ", $-[0], $/;
    }
    $1 ? ' ' :                  # "foo/*...*/bar" => "foo bar"
    defined $2 ? $2 :           # Keep non-comment as-is
    defined $3 ? $3             # Keep syntax error as-is
    : ''                        # "foo// ...\n" => "foo\n"
}gsex;

You just have to teach your regex to match the things that can contain '/*' without it starting a comment. That mostly boils down to string literals. If there is a chance of "// end-of-line" comments, then you have to match those as well. My code above strips them too.
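To see why the string alternatives matter, here is a minimal usage sketch. It assumes the substitution is wrapped in a helper (strip_comments is a hypothetical name, not something from the original question) and that the groups inside the string alternatives are non-capturing, so that (.) really is $3. The warning for unclosed quotes is left out to keep it short.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of $code with /* ... */ and // comments
# stripped, leaving string literals and lone '/' characters alone.
sub strip_comments {
    my( $code ) = @_;
    $code =~ s{
        (\s*/[*].*?[*]/\s*)         # $1: /* comment */
    |   \s*//[^\n]*                 # // comment
    |   (                           # $2: something to keep:
            '(?:[^\\']+|\\.)*'      #   single-quoted literal
        |   "(?:[^\\"]+|\\.)*"      #   double-quoted literal
        |   /(?![/*])               #   a / that does not start a comment
        |   [^'"/]+                 #   other code
        )
    |   (.)                         # $3: stray quote character
    }{
        $1 ? ' ' : defined $2 ? $2 : defined $3 ? $3 : ''
    }gsex;
    return $code;
}

# The '/*' inside the string literal must survive; the real comments must not.
my $in = qq{x = "/* not a comment */"; /* real */ y = a / b; // tail\nz = 1;\n};
print strip_comments( $in );
```

The string literal and the division come through untouched, while "/* real */" and "// tail" are gone. Without the two string alternatives, the first branch would happily start a "comment" inside the quotes.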

(Update, added shortly after posting:)

If you want to be defensive against mistakes in your regex or in your understanding of the syntax you are trying to parse, then you can wrap the regex in \G(?: and ) so that it cannot silently skip over unhandled stuff. You can then also explicitly match "end of string" for similar reasons. I think the "(.)" case is simple enough that I have little worry about getting that part of the regex wrong, and it serves both purposes ("misunderstood syntax" and "don't skip bits, including at end of string") well enough.
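A rough sketch of that defensive idea follows. Note that it is a variation rather than the exact form described: instead of s///g it uses a m/\G.../gc loop that builds the output, drops the "(.)" catch-all, and checks afterwards that the match position reached the end of the string. The name strip_comments_strict is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: like the substitution, but dies instead of skipping
# anything the regex does not understand.
sub strip_comments_strict {
    my( $code ) = @_;
    my $out = '';
    pos( $code ) = 0;
    while( $code =~ m{
        \G(?:                           # must continue where the last match ended
            (\s*/[*].*?[*]/\s*)         # $1: /* comment */
        |   \s*//[^\n]*                 # // comment
        |   (                           # $2: something to keep:
                '(?:[^\\']+|\\.)*'      #   single-quoted literal
            |   "(?:[^\\"]+|\\.)*"      #   double-quoted literal
            |   /(?![/*])               #   a / that does not start a comment
            |   [^'"/]+                 #   other code
            )
        )
    }gcsx ) {
        $out .= $1 ? ' ' : defined $2 ? $2 : '';
    }
    # With /c, a failed match leaves pos() where matching stopped, so anything
    # short of the full length means some input was not handled.
    die "Unhandled input at byte ", pos( $code ), "\n"
        if pos( $code ) < length $code;
    return $out;
}

print strip_comments_strict( qq{a /* c */ b // d\n} );
```

Every alternative consumes at least one character, so the loop cannot spin on an empty match, and the final length check plays the role of the explicit "end of string" test.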

- tye        

