http://www.perlmonks.org?node_id=996580


in reply to Re: Regex to strip comments
in thread Regex to strip comments

String literals aren't parsed inside of comments, as your code seems to assume. It is only the string literals outside of comments where '/*' needs to be ignored. (And, despite the OP's claim, '*/' in a string literal isn't a problem.)

[Update: After posting, I see that your node, despite being earlier, is listed below my other node. So it must have a negative rep. FYI, I didn't down-vote it. I guess I figured posting the correction might well be punishment enough. :) ]

- tye        

Re^3: Regex to strip comments (out not in)
by BrowserUk (Patriarch) on Oct 01, 2012 at 05:31 UTC
    (And, despite the OP's claim, '*/' in a string literal isn't a problem.)

    The OP didn't identify the language involved, so I took him at his word.

    Seems I don't have the 'I-know-better-than-the-OP' gene that you and several others around here have. I don't miss it.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    RIP Neil Armstrong

      (And, despite the OP's claim, '*/' in a string literal isn't a problem.)
      The OP didn't identify the language involved, so I took him at his word.

      No offense intended, but that's hilarious. If you took them at their word with regard to '*/', then you must not have with regard to '/*'.

      Seems I don't have the 'I-know-better-than-the-OP' gene that you and several others around here have. I don't miss it.

      Where did I say a specific language that I was assuming? I guess you have the "You know better than some" gene, since you aren't taking me at my word.

      That insult is so comical in the face of you presuming that this unspecified language accepts the following for string literals: "(?:\\\\|\\[abfnrt]|\\u[0-9a-fA-F]{4}|\\x[0-9a-fA-F]{2,4}|\\"|[^"])+?". That is an impressively specific set of features for an unspecified language.

      I tried to be quite polite. But I am not surprised that you found it so very hard to admit to even a simple mistake that you responded with an attack. Just saddened.

      The OP didn't say that they are using a language that parses string literals inside of block comments. They certainly didn't say that their language parses string literals only inside of block comments.

      The idea of there even existing a language where string literals are parsed only inside of block comments is quite humorous. But that is all that your regex tries to handle.

      But composing the regex 'backward' in that way is quite a simple mistake. The kind of mistake I make all the time. Most people do.

      I am sorry my correction caused you distress (or seemed to). That wasn't my intent. I would appreciate it if you would at least refrain from responding with another insult. (I thought pointing out the mistake was actually more polite than down-voting and not commenting, something you've complained about repeatedly.)

      - tye        

        I tried to be quite polite.

        Such a shame. You try sooo hard -- and yet still somehow always fail.

        I am not surprised that you found it so very hard to admit to even a simple mistake

        And it comes as no surprise to me that you would try to bluff, bluster and bore your way around that limited imagination with which you view the world, and which you try to impose on others.

        Here is a C file that compiles clean:

        C:\test>type junk.c
        char x[] = "/* comment */";
        int n = 1;

        C:\test>cl /Wall -c junk.c
        Microsoft (R) C/C++ Optimizing Compiler Version 15.00.21022.08 for x64
        Copyright (C) Microsoft Corporation. All rights reserved.

        junk.c

        And here it is with a part of it commented out:

        C:\test>type junk.c
        /*
        char x[] = "/* comment */";
        */
        int n = 1;

        C:\test>cl /Wall -c junk.c
        Microsoft (R) C/C++ Optimizing Compiler Version 15.00.21022.08 for x64
        Copyright (C) Microsoft Corporation. All rights reserved.

        junk.c
        junk.c(2) : error C2001: newline in constant
        junk.c(2) : error C2059: syntax error : 'string'
        junk.c(3) : warning C4138: '*/' found outside of comment

        Just because the C designers made that mistake, it doesn't mean everyone has to. Maybe there is a language out there that allows a block comment to span an arbitrary chunk of valid code without breaking. Maybe it's even closer to you than you think.
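        To make that alternative concrete, here is a sketch of a stripper for a purely hypothetical language in which string literals are also recognized inside block comments, so a `*/` within a quoted string does not end the comment. The rules and the names (`STRING`, `COMMENT`, `strip_comments_string_aware`) are assumptions for illustration, written in Python, and do not describe any specific real language.

```python
import re

# Hypothetical grammar, assumed for illustration only.
# A string literal: double quotes with backslash escapes.
STRING = r'"(?:\\.|[^"\\])*"'

# A block comment whose body is a run of: whole string literals, any
# character other than '"' or '*', or a '*' not followed by '/'.
# Because literals are skipped whole, a '*/' inside one cannot close
# the comment.
COMMENT = r'/\*(?:' + STRING + r'|[^"*]|\*(?!/))*\*/'

TOKEN = re.compile(
    r'(?P<string>' + STRING + r')|(?P<comment>' + COMMENT + r')',
    re.DOTALL,
)

def strip_comments_string_aware(source: str) -> str:
    """Replace each comment with one space; string literals outside comments are kept."""
    def keep_strings(match):
        if match.group("string") is not None:
            return match.group(0)
        return " "
    return TOKEN.sub(keep_strings, source)

if __name__ == "__main__":
    commented_out = '/*\nchar x[] = "/* comment */";\n*/\nint n = 1;'
    print(strip_comments_string_aware(commented_out))
```

        The trade-off under this assumed rule is that a comment containing an unbalanced double quote no longer parses as a comment at all, so free-form prose inside comments becomes fragile.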

