PerlMonks  

Re: Splitting outside comments

by graff (Chancellor)
on May 04, 2006 at 03:16 UTC ( [id://547333]=note )


in reply to Splitting outside comments

Assuming that you want to retain the comment regions in the resulting array -- and assuming that the code does not pose any nasty challenges (e.g. quoted strings that contain just "/*" or just "*/", and similar perversions) -- something like this might get you started:
    my $code;
    { local $/; $code = <DATA>; }   # slurp the whole input

    # split on comment delimiters; the capturing parens keep
    # the delimiters themselves in the resulting list
    my @pieces = split m{(/\*|\*/)}, $code;
    my $incomment = 0;
    my @csv = ( '' );
    for ( @pieces ) {
        if ( m{/\*} ) {
            $incomment = 1;
            $csv[$#csv] .= $_;
        }
        elsif ( m{\*/} ) {
            $incomment = 0;
            $csv[$#csv] .= $_;
        }
        elsif ( $incomment ) {
            $csv[$#csv] .= $_;
        }
        else {
            my ( $first, @rest ) = split /,/, $_, -1;  # don't truncate trailing commas
            $csv[$#csv] .= $first if defined $first;   # an empty piece yields no fields
            push @csv, @rest if ( @rest );
        }
    }
(not tested)
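
The loop above only works if the comment delimiters survive the split as list items of their own, and Perl's split returns the separators only when the pattern contains capturing parentheses. A minimal sketch of the difference, using a made-up sample string ($text is an assumption, not from the original question):

```perl
use strict;
use warnings;

my $text = 'a, /* b, c */ d';   # hypothetical sample string

# Without capturing parentheses the delimiters are discarded.
my @plain = split m{/\*|\*/}, $text;

# With capturing parentheses each matched delimiter comes back
# as its own list element, between the pieces it separated.
my @kept = split m{(/\*|\*/)}, $text;

print join('|', @plain), "\n";   # a, | b, c | d
print join('|', @kept),  "\n";   # a, |/*| b, c |*/| d
```

With the delimiters kept, a piece that is exactly "/*" or "*/" can be used to flip an in-comment flag while walking the list.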

But the more I think about it, the more likely it seems that you really just want to delete comments first, and split whatever is left on commas. In which case, you should probably still slurp the whole input, in case comments are allowed to span multiple lines:

    { local $/; $code = <>; }
    $code =~ s{ */\*.*?\*/ *}{ }gs;  # allow "." to match "\n"
    # now split on commas (and newlines?)
(also not tested)
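
To make the delete-then-split idea concrete, here is a small sketch that carries it through to the final split. The sample input and the choice to swallow whitespace around each comma are assumptions for illustration only; the thread does not say how newlines should be treated.

```perl
use strict;
use warnings;

# Hypothetical input; the second comment spans two lines.
my $code = "a int, /* first, with comma */ b char,\n"
         . "c float /* spans\n lines, too */, d int";

# Replace each comment, plus any spaces around it, with one space.
# The /s flag lets "." match newlines, so multi-line comments go too.
$code =~ s{ */\*.*?\*/ *}{ }gs;

# Split on commas, discarding whitespace (including newlines) around them.
my @fields = split /\s*,\s*/, $code;

print "[$_]\n" for @fields;
```

On this input the commas inside the comments are gone before the split ever sees them, so the result is the four fields "a int", "b char", "c float" and "d int".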

If you're still struggling, it would be good to show us some sample input, what the resulting array should hold, and any real code you've actually tried so far. Your initial problem statement was not very detailed.

Replies are listed 'Best First'.
Re^2: Splitting outside comments
by Sprad (Hermit) on May 04, 2006 at 15:48 UTC
    I don't want to delete the comments; I need to have them as-is in the final array. I just don't want to consider commas within the comments when I do the split.

    Here's a typical input string (which I read in as a single line):

    attr_def_OID OID, varname vartype1, /* Explanation */ varname vartype2, /* Explanation, details */ varname vartype3, /* Explanation, details, details */ varname vartype4 /* Explanation */
    I want the resulting array to have 4 elements:
    1: varname vartype1
    2: /* Explanation */ varname vartype2
    3: /* Explanation, details */ varname vartype3
    4: /* Explanation, details, details */ varname vartype4 /* Explanation */
    True, the comments don't get grouped with their intended lines, but that's not a problem for my application. I'm just doing some more text transformations, then joining them all back together. The important thing is that the comments come along for the ride.

    This is what I'm using now to do the split, and it's working for the cases I've tried it with so far:

    @elements = split(/,(?!(?:[^\/]|\/(?!\*))+\*\/)/, $str);

    ---
    A fair fight is a sign of poor planning.
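
For readers trying to follow the split pattern in the reply above, here is the same regex written with the /x modifier and run against the sample string from the post. The layout and comments are added for illustration; the pattern itself should be equivalent to the one-line version.

```perl
use strict;
use warnings;

# Sample input string from the post above.
my $str = 'attr_def_OID OID, varname vartype1, /* Explanation */ varname vartype2, '
        . '/* Explanation, details */ varname vartype3, '
        . '/* Explanation, details, details */ varname vartype4 /* Explanation */';

my @elements = split m{
    ,                    # a comma ...
    (?!                  # ... that is NOT followed by:
        (?: [^/]         #   any character other than a slash,
          | / (?!\*)     #   or a slash that does not open a comment,
        )+               #   one or more times,
        \*/              #   and then a comment terminator
    )
}x, $str;

print scalar(@elements), " elements\n";
print "[$_]\n" for @elements;
```

In other words, a comma is a split point unless a "*/" can be reached from it without first passing a "/*". On the sample string exactly as shown, the leading "attr_def_OID OID" field comes out as an element of its own ahead of the four listed, and every later element keeps its leading space. The approach still assumes that comments are properly terminated and that "/*" and "*/" never appear inside quoted strings.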
