http://www.perlmonks.org?node_id=839616

Kob has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

If I want to make sure that none of the comments written in a Perl source make it into the compiled code, must I strip them manually, or can I trust any compiler to strip them for me?
Specifically, I am now using the ActiveState PDK to compile a lengthy Perl source under the .NET environment.

I would be thankful for any words of wisdom on this.

Replies are listed 'Best First'.
Re: Stripping Comments from Source
by LanX (Saint) on May 12, 2010 at 11:34 UTC
    There are (unfortunately) no comments in the compiled code; you can check this with B::Deparse.

    But I think you have a wrong understanding of "compiling" in Perl, and what you really want is to obfuscate your code.

    Actually (in most cases) B::Deparse will produce code without comments.
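
    To illustrate the B::Deparse point, here is a minimal sketch using only the core B::Deparse module. The helper name deparsed_text is made up for this example. It regenerates source text from an already compiled sub, and the comments from the original source are absent from that text. This only shows what perl itself keeps after compiling; it says nothing about what a packager such as the PDK stores in its output.

    ```perl
    use strict;
    use warnings;
    use B::Deparse;

    # Hypothetical helper: rebuild source text from a compiled code reference.
    sub deparsed_text {
        my ($coderef) = @_;
        return B::Deparse->new->coderef2text($coderef);
    }

    my $code = sub {
        # internal note: this comment exists only in the source file
        my ($x) = @_;
        return $x * 2;    # trailing comment, also only in the source
    };

    # The regenerated text contains the statements but neither comment.
    print deparsed_text($code), "\n";
    ```

    For a whole script, the command-line form is perl -MO=Deparse script.pl.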

    You may also want to check if perltidy has an option to strip comments.

    UPDATE: here they are http://perltidy.sourceforge.net/perltidy.html#other_controls

    Cheers Rolf

Re: Stripping Comments from Source
by Kob (Novice) on May 12, 2010 at 13:28 UTC
    Rolf, thanks for the pointers. B::Deparse appears not to be useful here since the PDK compiles the code into an undocumented mash of things - not a pure intermediate code for sure. PerlNET keeps the complete source inside the compiled PerlNET exe or dll target, at least when building with debug information. This is visible when debugging a PerlNET application, as the debugger can actually show the full source code, calling it a "remote" file, without opening the perl source that was used for the build. I would be content if I could confirm that the non-debug PerlNET build only stores B code.

    perltidy looks promising and we are testing it right now. I will report back when done.

    sierpinski, thanks for the comment. As Rolf already indicated, a stripping script can easily handle simple formats, but it becomes fairly complicated to try to handle all cases. And yes, the code is complex, a lot of smarts in the algorithms that need commenting for maintenance, and the code goes through multiple revision/distribution cycles.

    JavaFan, the comments may contain material that is not appropriate for public view, either explanations of complex, proprietary business logic or personal comments among the dev team members. We need to verify that the code is sanitized before distribution.
    Now, since the debug build of PerlNET contains the source code, and its format is undocumented, I cannot risk the content of the release build without a good reference - so I want to strip the comments.


    UPDATE: perltidy looks like a great tool, and has a delete all comments option (-dac). I just tried it on my otherwise working script and it gave a fatal parsing error. It seems the perltidy parser is not 100% the same as perl. Still, if all else fails I would consider manually simplifying the code to be compatible with perltidy.
      OK ....I have to admit that I have no clue what "PDK compilations" look like...

      IMHO processing the Perl sources before piping them into another make/compilation process should be safe!

      HTH!

      Cheers Rolf

        Your comments were most helpful. Thank you.
Re: Stripping Comments from Source
by nagalenoj (Friar) on May 12, 2010 at 13:23 UTC
    I've tried using perltidy to strip comments and it worked for all my scripts.

    Perltidy can selectively delete comments and/or pod documentation. The command -dac or --delete-all-comments will delete all comments and all pod documentation, leaving just code and any leading system control lines.

Re: Stripping Comments from Source
by scorpio17 (Canon) on May 12, 2010 at 17:22 UTC

    Sounds like you need to come up with a special "best practice" kind of rule for your developers: internal "eyes only" comments not deemed fit for viewing by the unwashed masses must be designated with a special, easy to parse token.

    For example:

    # this comment will survive the cut
    ### secret internal memo - the boss is an idiot! (must delete)

    Then just strip all lines beginning with '###'.
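
    A minimal sketch of that filter, assuming internal comments always sit on a line of their own and start with '###' (optionally after leading whitespace). The helper name strip_internal_comments is made up for this example. Note that it is purely line based, so a line starting with '###' inside a here-document or POD block would be dropped as well.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: drop every line whose first non-blank
    # characters are '###'; all other lines pass through unchanged.
    sub strip_internal_comments {
        my ($source) = @_;
        # split /^/ breaks the text into lines and keeps the newlines.
        return join '', grep { !/^\s*###/ } split /^/, $source;
    }

    my $source = join "\n",
        '# this comment will survive the cut',
        '### secret internal memo (must delete)',
        'print "hello\n";',
        '';

    # Prints the first and third lines only.
    print strip_internal_comments($source);
    ```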

Re: Stripping Comments from Source
by sierpinski (Chaplain) on May 12, 2010 at 12:30 UTC
    If you're skilled with Perl, any reason why you can't write a script (a Perl script, of course) to use a regex to strip out comments from your other Perl script?

    That's beside the points that were mentioned above, of course, but why would you bother putting comments in the script in the first place? Why are comments a problem in "compiled" code? How is your Perl going to be compiled within .NET?

    If this is something that you plan on doing more than a couple times (stripping comments out of a Perl script), I recommend a commentstrip.pl script that you can run on anything and everything that ends in .pl.
      If you're skilled with Perl, any reason why you can't write a script (a Perl script, of course) to use a regex to strip out comments from your other Perl script?

      It's not trivial to parse Perl even for a skilled programmer!

      e.g. regarding comments, you have to handle # delimiters and # in strings

      $a="/#/"; $a=~ s#/##g;

      (update) and these are only simple cases, even perltidy and PPI can fail in pathological cases. For a detailed discussion search for perl static parsing!
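
      To make the failure concrete, here is a small sketch of the naive approach applied to the line above. The helper name naive_strip is made up for this example. It treats everything from the first '#' to the end of the line as a comment, which cuts working code in half.

      ```perl
      use strict;
      use warnings;

      # Naive (and wrong) approach: everything from the first '#'
      # to the end of the line is assumed to be a comment.
      sub naive_strip {
          my ($line) = @_;
          $line =~ s/#.*//;
          return $line;
      }

      # The '#' inside the string literal is mistaken for a comment start.
      print naive_strip('$a="/#/"; $a=~ s#/##g;'), "\n";    # prints: $a="/

      # $#array (highest index of @array) is damaged the same way.
      print naive_strip('my $last = $#array;'), "\n";       # prints: my $last = $
      ```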

      Cheers Rolf

        Not to mention my own favorite syntax-highlighting-editor-confusers  $#array (highest index of @array) and the  $# special variable!

        # FIXME Why doesn't this change behaviour when I compile it?
        print <<'END_TEXT';
        Just stripping the remaining chars starting with # will not work
        END_TEXT
Re: Stripping Comments from Source
by JavaFan (Canon) on May 12, 2010 at 12:01 UTC
    I would be thankful for any words of wisdom on this.
    Words of wisdom? Without knowing why you think it matters whether comments get "stripped" or not (I wonder, how do you expect a compiler to treat comments, other than to ignore them?), no words of wisdom.