
Re: Extracting C-Style Comments (Revisited)

by chipmunk (Parson)
on Feb 18, 2002 at 21:54 UTC (#146264)

in reply to Extracting C-Style Comments (Revisited)

There are two things that are tripping you up. The first is the greediness of [^"'/]+. The part of the regex that matches a regular expression looks for an equal sign or left paren followed by a slash; unfortunately, the equal sign or left paren has already been gobbled up! You could fix this by adding = and ( to the character class. On the other hand, since you're substituting in place, you don't even need that part of the regex. Just remove [^"'/]+ | and it should work fine.
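To make the greediness problem concrete, here is a minimal sketch. The patterns `$re_literal` and `$comment` and the two helper subs are simplified, hypothetical stand-ins for the branches of the full regex, not the original code. The only difference between the two subs is whether the `[^"'/]+` alternative is present.

```perl
use strict;
use warnings;

# Simplified, hypothetical stand-ins for two branches of the full regex.
my $re_literal = qr{ [(=] \s* / (?: [^\\/\n]+ | \\. )* / [gi]* }x;
my $comment    = qr{ / \* .*? \* / }xs;

# With the greedy alternative: it swallows the "=" before the
# regex-literal branch ever gets a chance to see it.
sub strip_greedy {
    my ($js) = @_;
    $js =~ s{ ( [^"'/]+ | $re_literal ) | $comment }{ defined $1 ? $1 : '' }gex;
    return $js;
}

# Without it: the regex-literal branch can start at "=" and protect
# the slashes inside the literal.
sub strip_fixed {
    my ($js) = @_;
    $js =~ s{ ( $re_literal ) | $comment }{ defined $1 ? $1 : '' }gex;
    return $js;
}

my $js = 'var re = /\/*/; x = 1; /* c */';
print strip_greedy($js), "\n";   # var re = /\
print strip_fixed($js),  "\n";   # var re = /\/*/; x = 1;
```

In the greedy version the `/*` inside the regex literal is mistaken for the start of a comment, so everything up to the final `*/` is deleted. Characters that match no alternative are simply left in place by `s///g`, which is why the skip-everything-else alternative is not needed when substituting in place.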

The other problem is this curious regex in the JS: mystring.match(/[/\\*?"<>\:~|]/gi);. That regex would not be valid in Perl, because it contains an unescaped forward slash. Is it really valid in JavaScript? If so, you'll need to extend your regex so that it allows unescaped slashes within square brackets.
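For what it's worth, current ECMAScript grammars do accept an unescaped `/` inside a character class in a regex literal. A rough sketch of the extension, assuming that behavior: the names `$naive`, `$class`, `$aware`, and `first_literal` are made up for this illustration and are not from the original code.

```perl
use strict;
use warnings;

# Hypothetical patterns for illustration; neither appears in the original post.
# Naive: the literal ends at the first unescaped slash.
my $naive = qr{ / (?: [^\\/\n]+ | \\. )+ / [gi]* }x;

# Class-aware: a [...] group is consumed as a unit, so a slash inside it
# does not terminate the literal.
my $class = qr{ \[ (?: [^\]\\\n]+ | \\. )* \] }x;
my $aware = qr{ / (?: $class | \\. | [^\[\\/\n]+ )+ / [gi]* }x;

# Return the first substring of $text matched by $pattern, or undef.
sub first_literal {
    my ($pattern, $text) = @_;
    return $text =~ /($pattern)/ ? $1 : undef;
}

# Single-quoted heredoc: backslashes reach the pattern untouched.
my $line = <<'JS';
mystring.match(/[/\\*?"<>\:~|]/gi);
JS
chomp $line;

print first_literal($naive, $line), "\n";   # /[/
print first_literal($aware, $line), "\n";   # /[/\\*?"<>\:~|]/gi
```

The naive pattern stops at the slash inside the brackets, while the class-aware one runs through to the real closing slash and its flags.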

Replies are listed 'Best First'.
Re: Re: Extracting C-Style Comments (Revisited)
by Incognito (Pilgrim) on Feb 18, 2002 at 23:58 UTC
    Yes, the greediness of [^"'/]+ was definitely the problem. The new regular expression to strip C-style comments from a JavaScript file is:
        $strOutput =~ s{
            # First, we'll list things we want
            # to match, but not throw away
            (
                # Match a regular expression (they start with ( or =).
                # Then they have a slash, and end with a slash.
                # The first slash must not be followed by * and cannot contain
                # newline chars. eg: var "re = /\*/;" or "a = b.match (/x/);"
                (?:
                    [\(=] \s* /
                    (?:
                        # char class contents
                        \[ \^? ]? (?: [^]\\]+ | \\. )* ]
                      |
                        # escaped and regular chars (\/ and \.)
                        (?: [^[\\\/]+ | \\. )*
                    )*
                    /[gi]*
                )
              |
                # or double quoted string
                (?: "[^"\\]* (?:\\.[^"\\]*)*" [^"'/]* )+
              |
                # or single quoted constant
                (?: '[^'\\]* (?:\\.[^'\\]*)*' [^"'/]* )+
            )
          |
            # or we'll match a comment. Since it's not in the
            # $1 parentheses above, the comments will disappear
            # when we use $1 as the replacement text.
            /   # (all comments start with a slash)
            (?:
                # traditional C comments
                (?: \* [^*]* \*+ (?: [^/*] [^*]* \*+ )* / )
              |
                # or C++ //-style comments
                (?: / [^\n]* )
            )
        }{$1}gsx;
    I'll do some further testing, but it looks like this huge regex will do the trick! Thanks and ++ to you chipmunk.
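As a quick sanity check, the substitution can be wrapped in a small helper and run on a short sample. This is only a sketch: `strip_comments` is a hypothetical name, the explanatory comments are trimmed, and the replacement uses `defined $1 ? $1 : ''` with `/e` instead of a bare `$1`, purely to avoid "uninitialized value" warnings when the comment branch matches.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution from this reply.
sub strip_comments {
    my ($src) = @_;
    $src =~ s{
        (
            # regex literal after ( or =
            (?:
                [\(=] \s* /
                (?:
                    \[ \^? ]? (?: [^]\\]+ | \\. )* ]
                  |
                    (?: [^[\\\/]+ | \\. )*
                )*
                /[gi]*
            )
          |
            # double-quoted string
            (?: "[^"\\]* (?:\\.[^"\\]*)*" [^"'/]* )+
          |
            # single-quoted string
            (?: '[^'\\]* (?:\\.[^'\\]*)*' [^"'/]* )+
        )
      |
        # comment: /* ... */ or // ...
        /
        (?:
            (?: \* [^*]* \*+ (?: [^/*] [^*]* \*+ )* / )
          |
            (?: / [^\n]* )
        )
    }{ defined $1 ? $1 : '' }gsxe;
    return $src;
}

# Sample input: a regex literal with a slash in a character class,
# a string that looks like a comment, and two real comments.
my $js = <<'JS';
var re = /[/*]/g; // strip me
var s = "/* keep */"; /* gone */
JS

print strip_comments($js);
```

On this sample both comments are removed, while the regex literal and the string containing `/* keep */` come through unchanged. The whitespace that preceded each comment is left behind at the end of its line.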
