PerlMonks  

Splitting outside comments

by Sprad (Hermit)
on May 03, 2006 at 18:20 UTC

Sprad has asked for the wisdom of the Perl Monks concerning the following question:

I want to split a string on commas, but not commas that are inside /* C-style comments */. I've seen examples of how to do this with quotes (or other single characters), but I'm not sure how to handle it with a multi-character delimiter like /*.

I was thinking I could do something like this:

split(/,(?!$foo*\*\/)/, $str);
where $foo stands for "any character that isn't part of /*". But I don't know how to say that.

---
A fair fight is a sign of poor planning.

Replies are listed 'Best First'.
Re: Splitting outside comments
by Roy Johnson (Monsignor) on May 03, 2006 at 18:37 UTC
    See perldoc -q comments for a cautionary explanation of why handling comments with a regex may not be a good idea (or at least may not be as simple as you expect). It also gives an expression you can use to skip comments.
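To illustrate the "skip comments" idea, here is a minimal sketch that matches fields instead of splitting. It is not the expression from the FAQ, just a simplified stand-in; the sample string and the `@fields` name are made up for the example, and it assumes comments are well formed and never appear inside quoted strings.

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the FAQ.
my $str = 'a, /* x, y */ b, c';

# Each field is a run of whole comments or non-comma characters.
# The comment alternative is tried first, so commas inside a
# terminated comment are consumed as part of the comment.
my @fields = $str =~ m{ ( (?: /\* .*? \*/ | [^,] )+ ) }gxs;

print "[$_]\n" for @fields;
```

Because the group uses `+`, empty fields (two adjacent commas) are dropped, and an unterminated comment falls back to being read character by character.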

    Caution: Contents may have been coded under pressure.
Re: Splitting outside comments
by graff (Chancellor) on May 04, 2006 at 03:16 UTC
    Assuming that you want to retain the comment regions in the resulting array -- and assuming that the code does not pose any nasty challenges (e.g. quoted strings that contain just "/*" or just "*/", and similar perversions) -- something like this might get you started:
    my $code;
    { local $/; $code = <DATA>; }
    # split on comment delimiters; the capturing parens keep the
    # delimiters in @pieces so the tests below can see them
    my @pieces = split m{(/\*|\*/)}, $code;
    my $incomment = 0;
    my @csv = ( '' );
    for ( @pieces ) {
        if ( m{/\*} ) {
            $incomment = 1;
            $csv[$#csv] .= $_;
        }
        elsif ( m{\*/} ) {
            $incomment = 0;
            $csv[$#csv] .= $_;
        }
        elsif ( $incomment ) {
            $csv[$#csv] .= $_;
        }
        else {
            my ( $first, @rest ) = split /,/, $_, -1;  # don't truncate trailing commas
            $csv[$#csv] .= $first if defined $first;   # an empty piece yields no fields
            push @csv, @rest if ( @rest );
        }
    }
    (not tested)

    But the more I think about it, the more likely it seems that you really just want to delete comments first, and split whatever is left on commas. In which case, you should probably still slurp the whole input, in case comments are allowed to span multiple lines:

    { local $/; $code = <>; }
    $code =~ s{ */\*.*?\*/ *}{ }gs;  # /s allows "." to match "\n"
    # now split on commas (and newlines?)
    (also not tested)
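As a rough illustration of the delete-then-split approach, the sketch below runs the same substitution on a made-up string (the input and the `@fields` name are assumptions for the example, not real data) and then splits what is left on commas.

```perl
use strict;
use warnings;

# Hypothetical input; a real program would slurp a file instead.
my $code = "a int, /* first, with comma */ b char,\n/* spans\n   lines, too */ c long";

# Replace each comment (plus surrounding spaces) with a single space.
# /s lets "." match newlines, so multi-line comments are removed whole.
( my $stripped = $code ) =~ s{ */\*.*?\*/ *}{ }gs;

# Split on commas, then trim leading and trailing whitespace.
my @fields = split /,/, $stripped;
s/^\s+//, s/\s+$// for @fields;

print join( '|', @fields ), "\n";   # prints "a int|b char|c long"
```

This only fits if the comments can be thrown away; it also assumes no "/*" or "*/" appears inside a quoted string.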

    If you're still struggling, it would be good to show us some sample input, what the resulting array should hold, and any real code you've actually tried so far. Your initial problem statement was not very detailed.

      I don't want to delete the comments; I need to have them as-is in the final array. I just don't want to consider commas within the comments when I do the split.

      Here's a typical input string (which I read in as a single line):

      attr_def_OID OID, varname vartype1, /* Explanation */ varname vartype2, /* Explanation, details */ varname vartype3, /* Explanation, details, details */ varname vartype4 /* Explanation */
      I want the resulting array to have 4 elements:
      1: varname vartype1
      2: /* Explanation */ varname vartype2
      3: /* Explanation, details */ varname vartype3
      4: /* Explanation, details, details */ varname vartype4 /* Explanation */
      True, the comments don't get grouped with their intended lines, but that's not a problem for my application. I'm just doing some more text transformations, then joining them all back together. The important thing is that the comments come along for the ride.

      This is what I'm using now to do the split, and it's working for the cases I've tried it with so far:

      @elements = split(/,(?!(?:[^\/]|\/(?!\*))+\*\/)/, $str);
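For readers trying to follow that pattern, here is the same regex spelled out with /x and comments, run on a shortened version of the sample input. The shortened string is an assumption made to keep the example small; the logic is meant to be identical to the one-liner above.

```perl
use strict;
use warnings;

my $str = 'attr_def_OID OID, varname vartype1, /* Explanation */ varname vartype2, /* Explanation, details */ varname vartype3';

my @elements = split m{
    ,                   # a comma ...
    (?!                 # ... that is NOT followed by
        (?: [^/]        #   a run of non-slash characters
          | / (?!\*)    #   or slashes that do not open a comment
        )+
        \*/             #   and then a comment terminator
    )
}x, $str;

print "[$_]\n" for @elements;
```

A comma followed by "*/" with no "/*" in between must be inside a comment, so the lookahead rejects it. Two things to watch: the pieces keep their leading spaces, and because the inner group uses `+`, a comma sitting directly before "*/" (as in "/* a,*/") would still be treated as a split point.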

      ---
      A fair fight is a sign of poor planning.

Re: Splitting outside comments
by philcrow (Priest) on May 03, 2006 at 20:47 UTC
    It's usually a good idea to use Text::Balanced to strip comments before working on the real code.

    Phil
