PerlMonks  

RegEx - comments should not matches

by Chief of Chaos (Friar)
on Jul 13, 2004 at 11:35 UTC ( [id://373924]=perlquestion )

Chief of Chaos has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I would be very glad if you could share your wisdom with me.

I am trying to find a regex that matches only lines that are not commented out.
Comments are marked with '//' and '/* ... */'.
E.g.:

    doApply()        <== match
    // doNotApply()  <== NOT match
    /*
    doWhatEver()     <== NOT match
    */

Thanks,
CoC

Replies are listed 'Best First'.
Re: RegEx - comments should not matches
by Corion (Patriarch) on Jul 13, 2004 at 11:42 UTC

    Weeding out comments is not as easy as you might think:

        println( "This starts a comment? // Or does it\n" ); <== match ?
        // /*
        print( "*/ What happens now ?\n" ); <== match ??

    You will need a parser for your language that understands the basics of that language at least well enough to know when the "comment starter markers" are within a string and when they actually apply. There are Text::Balanced and Regexp::Common, which have prefabricated regexes that attempt this task, and if they are not suitable for the language you are trying to parse, Parse::RecDescent or Parse::Yapp can be used to write your own parser.
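
    As a rough illustration of the point about strings, here is a minimal sketch (not from the thread) of a single substitution that matches quoted literals first, so that comment markers inside them are left alone. The helper name strip_comments is hypothetical, and the sketch assumes C-like syntax with backslash escapes in literals; it is not a full tokenizer.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: remove line and block comments
# from C-like source while leaving quoted literals untouched.
sub strip_comments {
    my ($src) = @_;
    $src =~ s{
        (                            # group 1: a quoted literal, kept as-is
            " (?: \\. | [^"\\] )* "  #   double-quoted string
          | ' (?: \\. | [^'\\] )* '  #   single-quoted literal
        )
      | /\* .*? \*/                  # block comment, possibly multi-line
      | // [^\n]*                    # line comment up to end of line
    }{ defined $1 ? $1 : '' }gsex;
    return $src;
}

# The tricky lines from the post above.
my $source = <<'END';
println( "This starts a comment? // Or does it\n" );
// /*
print( "*/ What happens now ?\n" );
END

print strip_comments($source);
```

    With this input, both statements survive unchanged and only the "// /*" line is emptied, because the alternation consumes each string literal before it can be mistaken for a comment start.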

      Thanks,
      I will try Text::Balanced and Regexp::Common.

      This sounds good:
      Regexp::Common::comment: Provides regexes for comments of various languages (43 languages currently).

      Greetings,
      CoC

      Regexp::Common::comment doesn't do proper tokenizing. IIRC, it will not catch the cases you presented correctly. I remember seeing a regex for catching C-style comments correctly (don't remember where, sorry), and it's quite ugly (but not as bad as the e-mail address regex).

      ----
      send money to your kernel via the boot loader.. This and more wisdom available from Markov Hardburn.

Re: RegEx - comments should not matches
by tomhukins (Curate) on Jul 14, 2004 at 00:21 UTC
Re: RegEx - comments should not matches
by Happy-the-monk (Canon) on Jul 13, 2004 at 11:39 UTC

    $string !~ m{^/[*/]}; # ^-beginning of the line, /- a slash [*/]-character class containing * and /

    See   perldoc perlre.

    Update:   !~   negates the match.

    Update2: Thanks to Corion who pointed out that my approach didn't catch multiple line comments.

    # a loop could do it like this:

    $block_comment = 0;
    while ( <INPUT> ) {
        if    ( m{^//} )         { next; }
        elsif ( m{^/\*} )        { $block_comment = 1; next; }  # start block
        elsif ( m{^\*/} )        { $block_comment = 0; next; }  # end block
        elsif ( $block_comment ) { next; }                      # inside block
        else {
            # ... do something with the line, that is no comment ...
        }
    }

    Cheers, Sören

      Hi,
      if I run this, it does not filter out the
      commented line with doWhatEver().
      But thanks.
      CoC

      #!/usr/local/bin/perl -w
      while (<DATA>) {
          if ( $_ !~ m{^/[*/]} ) {
              print $_;
          }
      }
      __DATA__
      doApply()
      // doNotApply()
      /*
      doWhatEver()
      */
      Test4()

      will show:

      huxl:~>./tst.pl
      doApply()
      doWhatEver()
      */
      Test4()
Re: RegEx - comments should not matches
by danielcid (Scribe) on Jul 13, 2004 at 12:28 UTC

    It looks like you are trying to read a C/C++ file.
    Be careful: you can have "doApply() // comment" on
    the same line. I wrote a little example for you that
    will only ignore the line if the comment is at the
    beginning of the line or after tabs/spaces.

    #!/usr/bin/perl
    use strict;
    use warnings;

    if(!@ARGV) { die "program file.c\n"; }
    open(FILE,"<$ARGV[0]")|| die "Impossible to open $ARGV[0]\n";
    my $ig=0;
    while(<FILE>) {
        # a line containing "*/" closes the block comment and is skipped
        if($_ =~ /\*\//) { $ig=0; next; }
        # skip lines that start (after optional whitespace) with a comment
        next if(($_ =~ /(^\s+|^\t+|^)\/\//)||($_ =~ /(^\s+|^\t+|^)\/\*.+\*\//));
        # a "/*" opens a block comment; skip lines until it is closed
        $ig=1 if($_ =~ /\/\*/);
        next if($ig == 1);
        print "$_";
    }
    exit;


    -DBC
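
    For the "doApply() // comment" case mentioned above, a line-based filter could also strip the comment and keep the code in front of it. The following is only a sketch under a loud assumption: comment markers never occur inside string literals. The helper name code_lines and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: return the code part of each
# input line, dropping full-line, trailing, and block comments.
# Assumption: comment markers never occur inside string literals.
sub code_lines {
    my @lines    = @_;
    my $in_block = 0;    # true while inside an unclosed block comment
    my @out;
    for my $line (@lines) {
        my $text = $line;
        chomp $text;
        if ($in_block) {
            # Skip the line unless the block comment ends on it.
            next unless $text =~ s{^.*?\*/}{};
            $in_block = 0;
        }
        $text =~ s{/\*.*?\*/}{}g;               # block comments closed on this line
        $text =~ s{//.*}{};                     # trailing line comment
        $in_block = 1 if $text =~ s{/\*.*}{};   # block comment left open
        $text =~ s/\s+$//;                      # trailing whitespace
        push @out, $text if length $text;
    }
    return @out;
}

my @sample = (
    'doApply() // comment',
    '// doNotApply()',
    '/*',
    'doWhatEver()',
    '*/',
    'Test4()',
);
print "$_\n" for code_lines(@sample);
```

    On the sample data this prints doApply() and Test4(). If markers can appear inside strings, a string-aware approach is needed instead.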
