
Comment Stripper script for unix

by hsinclai (Deacon)
on Jun 14, 2004 at 01:55 UTC ( #366388=sourcecode )
Category: Utility Scripts
Author/Contact Info devel
invoke as "e" or "ee"
Comment stripper for unix, useful during system administration. Removes blank lines, writes output file, strips "#" or ";". Tries to preserve shell scripts.
Please see the POD
#!/usr/bin/perl -w

#   (invoke as e or ee)
#            Please see the POD for install and licensing details

use strict;

###### globals
my $version = "0.9";
my $comm;
my @stripped;
my $topline;

######  how we were called
use File::Basename qw(basename);
my $us = basename($0);    # no shell round-trip, safe with odd characters in $0
if ( $us eq "ee" ) { $comm = ';'; } else { $comm = '#'; }

######  parse args
$#ARGV >= 2 && die("\n No more than 2 arguments\n\n"); 
defined $ARGV[0] || die(&usage($us));
my $ifile=$ARGV[0];
-e $ifile || die("\n Input file nonexistent.\n\n");

open(IFIL, '<', $ifile) or die("problem opening input_file: $!\n");
my @inputfile=<IFIL>;

######  main
if ( $us eq "ee" ) {
   $topline = shift(@inputfile);
   die(&pwarn($comm)) if $topline =~ /\#\!.*perl/i ;
} elsif ( $us eq "e" ) {
     $topline = shift(@inputfile);
     if ( $topline =~ /(\s+)\#\!/ ) {
       } else {

######  final output
if ( $ARGV[1] ) {
    open(OFIL, '>', $ARGV[1]) or die("problem creating output_file: $!\n");
    for ( @stripped ) { print OFIL "$_\n"; }
    print "\n Done stripping $ifile\n     -\>  wrote output file \"$AR
} else {
    for ( @stripped ) { print "$_\n"; }
exit $?;

######  subs

sub stripper {
    for ( @_ ) {
        next if /^$comm|^(\s*)$comm|^(\s*)$/;
        $_ =~ s/$comm.*$//;
    return @stripped;

sub usage {
 print qq[
   Usage:   e filename [outputfilename]
            e strips comments and blank lines from an existing file.
            e to remove # comments, and ee to strip ; comments.
            See "perldoc"
   v$version                                        invoked as \'$us\'


sub pwarn {
 print  qq[
 WARNING:   Input file "$ifile" looks like a Perl script
            The first line was:   $topline
            When invoked as \'$us\', strips out semicolons,
            which might not be very useful for looking at a Perl scrip
            If this assumption is wrong, remove the first line tempora



=head1 NAME

e (and ee), symbolic links to

=head1 VERSION

Version 0.9


 e   (, to be invoked as either "e" or "ee")

 e   args
ee   args


B<e> (invoked as "e" or "ee") is a small program to strip unix style comments ( e.g., "#" or ";" ) from scripts and configuration files. It might be useful during system administration. It is called "e" simply for brevity.

B<e> also removes blank lines, makes some effort not to destroy shell scripts and shebangs, and tries to avoid mangling Perl scripts it encounters.

B<e> is meant to be run on Unix systems where #, #!, and ; are common 

B<e> requires at least one argument, a filename to be processed.

B<e> tries to detect if the first line of the input file contains the #! character sequence, and tries to preserve it, assuming it might be a shell script.

B<e> will stop and warn you about removing semi-colons from a file it thinks is a Perl script.


Install the main file somewhere in your path, then in the same directory, do

  ln -s e
  ln -s ee

Use e or ee, depending on what character you want to strip.

Invoking directly breaks it.

If you already have an e or ee on your system, you may use other symbolic links. If you rename these files, you will have to adjust the main script accordingly.


=over 4

=item B<e> I<input_filename>

Strips # comments and blank lines out of "filename" and sends the result to your screen.

=item B<e> I<input_filename> [I<output_filename>] 

Same as above, but the result will be written to a new file "output_filename" in the current directory.

=item B<ee> I<input_filename> [I<output_filename>] 

Same as above, but with the semicolon as the comment character.

=back


=head1 BUGS

Might not be able to preserve the shebang line in a shell script, when the shebang line is preceded by one or more blank lines.


Does not remove C style comments.

Inefficiently written, so uses lots of memory when input files get large.

Cannot detect a "here" document, and will happily destroy the contents of one when it encounters a comment character somewhere in there.

=head1 AUTHOR

Harold Sinclair
devel at hastek


Copyright © 2004 hastek. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
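
The POD says e tries to keep a first-line shebang, and its BUGS section notes that the whole input is held in memory. A minimal sketch of both ideas, assuming a line-at-a-time filter, might look like the following. The name strip_stream and its argument order are hypothetical and not part of the original script, and the comment match is deliberately as naive as the original: it cuts at the first comment character, wherever that appears.

```perl
use strict;
use warnings;

# Hypothetical helper, not the original stripper(): reads from $in and
# writes to $out one line at a time, so memory use does not grow with
# the size of the input file.
sub strip_stream {
    my ($in, $out, $comm) = @_;
    my $q     = quotemeta $comm;          # '#' or ';' is matched literally
    my $first = 1;
    while ( my $line = <$in> ) {
        chomp $line;
        my $is_first = $first;
        $first = 0;
        # Keep a shebang only when "#!" is at the very start of the file.
        if ( $is_first && $line =~ /^#!/ ) {
            print {$out} "$line\n";
            next;
        }
        $line =~ s/$q.*$//;               # naive: cut at the first comment char
        $line =~ s/\s+$//;                # drop whitespace left before the comment
        next unless $line =~ /\S/;        # skip blank and comment-only lines
        print {$out} "$line\n";
    }
    return;
}

# Usage sketch (file name is a placeholder):
#   open my $in, '<', 'squid.conf' or die "cannot open: $!\n";
#   strip_stream($in, \*STDOUT, '#');
```

Because nothing is slurped into an array, this form avoids the memory growth listed under BUGS, but it inherits every other limitation there, including here-documents and quoted comment characters.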


Replies are listed 'Best First'.
Re: Comment Stripper script for unix
by Zaxo (Archbishop) on Jun 14, 2004 at 02:49 UTC

    I tried applying this script to itself. That was to check if significant uses of '#' were handled properly. The results were, uhhh . . . unfortunate.

    1. It stripped the shebang line, which doesn't look exotic at all.
    2. It did
      -if ( $us eq "ee" ) { $comm = ';'; } else { $comm = '#'; }
      +if ( $us eq "ee" ) { $comm = ';'; } else { $comm = '
      leaving an unclosed quote in the code.
    3. It did
      - die(&pwarn($comm)) if $topline =~ /\#\!.*perl/i ;
      + die(&pwarn($comm)) if $topline =~ /\
      leaving an open regex match.
    4. It did
      - if ( $topline =~ /(\s+)\#\!/ ) {
      + if ( $topline =~ /(\s+)\
      to the same effect.

    I think your e can only be applied in the simplest circumstances.

    Don't feel too bad, the saying goes, "Only perl can parse Perl." To do this sort of thing properly really does require a parser.

    After Compline,

      Don't feel too bad, the saying goes, "Only perl can parse Perl." To do this sort of thing properly really does require a parser.
      ... or take a look at perltidy, which does a really good job on perl code formatting and also has a switch for stripping comments.
      Whoa - that's terrible - obviously I didn't test it with Perl scripts enough - I only used it with config files and shell scripts really - way too hasty ...

      This plain doesn't work and should be removed from the code catacombs - you all are too kind! Or maybe moved to the "don't let this happen to you" section?

      I didn't know Perltidy removed comments, so thanks for that eserte.

Re: Comment Stripper script for unix
by Abigail-II (Bishop) on Jun 14, 2004 at 15:06 UTC
    #!/bin/bash
    # This is a comment.
    echo "# This is not a comment"
    echo \# and neither is this.

    echo "
    echo \
    Your program will strip she-bang lines unless such a line starts with whitespace. However, leading whitespace isn't an option there: the first 2 bytes of the file need to be #!, and the kernel isn't going to skip over whitespace (and whitespace certainly isn't mandatory). Furthermore, the base of your program is an extremely simplistic regex - it just removes anything on a line starting at the first #. Your program could as well have been:
    perl -nle 's/#.*//; print if /\S/'
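
The bash example above breaks because a # inside quotes or after a backslash is not a comment. As a rough sketch only, and not a shell parser, a stripper could track quoting state character by character. The helper name strip_unquoted_comment is hypothetical; it ignores here-documents, treats any unquoted comment character as a comment start even in the middle of a word (which a real shell does not), and leaves shebang handling to the caller.

```perl
use strict;
use warnings;

# Hypothetical helper: returns $line with a trailing comment removed,
# treating $comm as a comment start only when it is outside quotes and
# not preceded by a backslash.
sub strip_unquoted_comment {
    my ($line, $comm) = @_;
    my $out   = '';
    my $quote = '';                        # '' outside quotes, else ' or "
    my @chars = split //, $line;
    while ( @chars ) {
        my $c = shift @chars;
        if ( $c eq '\\' && $quote ne "'" ) {
            # A backslash protects the next character (except inside '...').
            $out .= $c;
            $out .= shift @chars if @chars;
            next;
        }
        if ( $quote ) {
            $quote = '' if $c eq $quote;   # closing quote
        }
        elsif ( $c eq '"' || $c eq "'" ) {
            $quote = $c;                   # opening quote
        }
        elsif ( $c eq $comm ) {
            last;                          # unquoted comment: drop the rest
        }
        $out .= $c;
    }
    $out =~ s/\s+$//;                      # trim whitespace before the comment
    return $out;
}
```

Applied line by line, this keeps both echo lines from the example intact and reduces the pure comment line to an empty string, which the caller can then skip.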

    But my biggest question is, why do you think this is useful for system administration? I don't know any system administrator who wants to remove comments from his configuration files or from his shell scripts.


      This is an annoying trend that's driving me nuts where I work too. Somehow they are justifying it in the name of security. (Even to the point of stripping comments from all applications.)

        I tend to ask people to elaborate on that, and ask them to explain how this is helping security. I also might point out that $ > /secret/file works even better (sure, it has some side-effects, but isn't security important enough that we can justify some side-effects?)


      Hi Abigail,

      Your program will strip she-bang lines unless such a line starts with whitespace.
      Are you sure about that? The shebang line is not stripped, if it is the first line, which gets preserved and re-inserted back into the final output..
      update- you're totally right about that, I screwed it up..

      why do you think this is useful for system administration..

      Because removing commented lines lets you get a quick view only of active lines - in a file that might have only a few active lines among several screens of commented lines, e.g. a stock squid.conf file..

      Thanks for the feedback!
        Because removing commented lines lets you get a quick view only of active lines - in a file that might have only a few active lines among several screens of commented lines, e.g. a stock squid.conf file..
        Well, a simple grep -v ^\# will do that. If an "active" line has a trailing comment, it doesn't matter. It also doesn't explain why you want to remove comments from a shell script.
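
In Perl, the same "show me only the active lines" view could be sketched as a filter that drops whole-line comments and blank lines but never edits the rest of a line. The helper active_lines is a hypothetical name; unlike grep -v ^\#, this sketch also skips blank lines and comments indented with whitespace, which is an assumption about what is wanted when paging through a large config file.

```perl
use strict;
use warnings;

# Hypothetical helper: keeps every line that is neither blank nor a
# whole-line comment. Trailing comments on active lines are left alone,
# so quoted or escaped comment characters can never be damaged.
sub active_lines {
    my ($comm, @lines) = @_;
    my $q = quotemeta $comm;               # '#' or ';' is matched literally
    return grep { /\S/ && !/^\s*$q/ } @lines;
}

# Usage sketch (file name is a placeholder):
#   open my $fh, '<', 'squid.conf' or die "cannot open: $!\n";
#   print active_lines('#', <$fh>);
```

Since no line is modified, the output is always a subset of the input, which makes it safe for viewing but not a replacement for a real comment stripper.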

