Comment Stripper script for unix
hsinclai
created: 2004-06-13 21:55:14
#!/usr/bin/perl -w

#     e.pl   (invoke as e or ee)
#            Please see the POD for install and licensing details

use strict;

###### globals
my $version = "0.9";
my $comm;
my @stripped;
my $topline;


######  how we were called
chomp(my $us = qx!basename $0!);
if ( $us eq "ee" ) { $comm = ';'; } else { $comm = '#'; }


######  parse args
$#ARGV >= 2 && die("\n No more than 2 arguments\n\n"); 
defined $ARGV[0] || die(&usage($us));
my $ifile=$ARGV[0];
-e $ifile || die("\n Input file nonexistent.\n\n");

open(IFIL,"<$ifile") or die("problem opening input_file");
my @inputfile=<IFIL>;
close(IFIL); 


######  main
if ( $us eq "ee" ) {
   $topline = shift(@inputfile);
   die(&pwarn($comm)) if $topline =~ /\#\!.*perl/i ;
   unshift(@inputfile,$topline);
   &stripper(@inputfile);
} elsif ( $us eq "e" ) {
     $topline = shift(@inputfile);
     if ( $topline =~ /(\s+)\#\!/ ) {
        &stripper(@inputfile);
        unshift(@stripped,$topline);
       } else {
        unshift(@inputfile,$topline);
        &stripper(@inputfile);
     }
  } 


######  final output
if ( $ARGV[1] ) {
    open(OFIL,">$ARGV[1]") or die("problem creating output_file"); 
    for ( @stripped ) { print OFIL "$_\n"; }
    print "\n Done stripping $ifile\n     -\>  wrote output file \"$ARGV[1]\"\n\n";
    close(OFIL);
} else {
    for ( @stripped ) { print "$_\n"; }
  }
exit $?;




######  subs

sub stripper {
    for ( @_ ) {
        chomp;
        next if /^$comm|^(\s*)$comm|^(\s*)$/;
        $_ =~ s/$comm.*$//;
        push(@stripped,$_);
    }
    return @stripped;
}

sub usage {
 print qq[
   Usage:   e filename [outputfilename]
            ································································
            e strips comments and blank lines from an existing file.
            e to remove # comments, and ee to strip ; comments.
            
            See "perldoc e.pl"
            ································································
            e.pl v$version                                        invoked as \'$us\'

]; 
exit(1);
}

sub pwarn {
 print  qq[
 WARNING:   Input file "$ifile" looks like a Perl script
            
            The first line was:   $topline
            When invoked as \'$us\', e.pl strips out semicolons,
            which might not be very useful for looking at a Perl script.
            If this assumption is wrong, remove the first line temporarily.


];
&usage;
exit(1);
}


__END__

=head1 NAME


e (and ee), symbolic links to e.pl



=head1 VERSION


Version 0.9



=head1 SYNOPSIS


 e   (e.pl, to be invoked as either "e" or "ee")

 e   args
ee   args




=head1 DESCRIPTION


B<e> (invoked as "e" or "ee") is a small program to strip unix style comments ( e.g., "#" or ";" ) from scripts and configuration files. It might be useful during system administration. It is called "e" simply for brevity.

B<e> also removes blank lines, makes some effort not to destroy shell scripts and shebangs, and tries to avoid mangling Perl scripts it encounters.

B<e> is meant to be run on Unix systems where #, #!, and ; are common comments/patterns.

B<e> requires at least one argument, a filename to be processed.

B<e> tries to detect if the first line of the input file contains the #! character sequence, and tries to preserve it, assuming it might be a shell script.

B<ee> will stop and warn you about removing semi-colons from a file it thinks is a Perl script.




=head1 INSTALLATION


Install the main file, e.pl, somewhere in your path, then in the same directory, do

  ln -s e.pl e
  ln -s e.pl ee

Use e or ee, depending on what character you want to strip.

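Invoking e.pl directly does not work: the script decides which comment character to strip from the name it was called by.

If you already have an e or ee on your system, you may use other names for the symbolic links.
If you rename these files, you will have to adjust the main script accordingly.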


=head1 EXAMPLES


=over 4

=item B<e> I<filename>

Strips # comments and blank lines out of "filename" and sends the result to your screen.



=item B<e> I<filename> [I<output_filename>]

Same as above, but the result will be written to a new file "output_filename" in the current directory.


=item B<ee> I<filename> [I<output_filename>]

Same as above, but semicolon as the comment character.

=back



=head1 BUGS

Might not be able to preserve the shebang line in a shell script, when the shebang line is preceded by one or more blank lines.



=head1 LIMITATIONS

Does not remove C style comments.

Inefficiently written, so uses lots of memory when input files get larger.

Cannot detect a "here" document, and will happily destroy the contents of one when it encounters a comment character somewhere in there.


=head1 AUTHOR

Harold Sinclair
devel at hastek


=head1 COPYRIGHT

Copyright ©2004 hastek. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.


=cut

#EOF
Re: Comment Stripper script for unix
created: 2004-06-13 22:49:05

I tried applying this script to itself. That was to check if significant uses of '#' were handled properly. The results were, uhhh . . . unfortunate.

  1. It stripped the shebang line, which doesn't look exotic at all.
  2. It did
    -if ( $us eq "ee" ) { $comm = ';'; } else { $comm = '#'; }
    +if ( $us eq "ee" ) { $comm = ';'; } else { $comm = '
    leaving an unclosed quote in the code.
  3. It did
    -   die(&pwarn($comm)) if $topline =~ /\#\!.*perl/i ;
    +   die(&pwarn($comm)) if $topline =~ /\
    leaving an open regex match.
  4. It did
    -     if ( $topline =~ /(\s+)\#\!/ ) {
    +     if ( $topline =~ /(\s+)\
    to the same effect.

I think your e can only be applied in the simplest circumstances.

Don't feel too bad, the saying goes, "Only perl can parse Perl." To do this sort of thing properly really does require a parser.

After Compline,
Zaxo

Re^2: Comment Stripper script for unix
created: 2004-06-14 05:19:34
Don't feel too bad, the saying goes, "Only perl can parse Perl." To do this sort of thing properly really does require a parser.
... or take a look at perltidy, which does a really good job on perl code formatting and also has a switch for stripping comments.
Re^2: Comment Stripper script for unix
created: 2004-06-14 07:55:36
Whoa - that's terrible - obviously I didn't test it with Perl scripts enough - I only used it with config files and shell scripts really - way too hasty ...

This plainly doesn't work and should be removed from the code catacombs - you all are too kind! Or maybe moved to the "don't let this happen to you" section?

I didn't know Perltidy removed comments, so thanks for that [eserte].




Re: Comment Stripper script for unix
created: 2004-06-14 11:06:40
Input:
#!/bin/bash

# This is a comment.

echo  "# This is not a comment"
echo  \# and neither is this.
Output:
echo  "
echo  \
Your program will strip she-bang lines unless such a line starts with whitespace. However, leading whitespace isn't an option: the first 2 bytes of the file need to be #!, and the kernel isn't going to skip over whitespace (and whitespace certainly isn't mandatory). Furthermore, the base of your program is an extremely simplistic regex: it just removes everything on a line from the first # onward. Your program could just as well have been:
perl -nle 's/#.*//; print if /\S/'

But my biggest question is, why do you think this is useful for system administration? I don't know any system administrator who wants to remove comments from his configuration files or from his shell scripts.

Abigail

Re^2: Comment Stripper script for unix
created: 2004-06-14 11:10:28

This is an annoying trend that's driving me nuts where I work too. Somehow they are justifying it in the name of security (even to the point of stripping comments from all applications).

Re: Comment Stripper script for unix
created: 2004-06-14 11:34:45
I tend to ask people to elaborate on that, and ask them to explain how this is helping security. I also might point out that $ > /secret/file works even better (sure, it has some side-effects, but isn't security important enough that we can justify some side-effects?)

Abigail

Re^2: Comment Stripper script for unix
created: 2004-06-14 11:35:48
Hi Abigail,

Your program will strip she-bang lines unless such a line starts with whitespace.
Are you sure about that? The shebang line is not stripped if it is the first line, which gets preserved and re-inserted into the final output.
Update: you're totally right about that, I screwed it up.

why do you think this is useful for system administration..

Because removing commented lines lets you get a quick view only of active lines - in a file that might have only a few active lines among several screens of commented lines, e.g. a stock squid.conf file..

Thanks for the feedback!
Re: Comment Stripper script for unix
created: 2004-06-14 13:22:05
Because removing commented lines lets you get a quick view only of active lines - in a file that might have only a few active lines among several screens of commented lines, e.g. a stock squid.conf file..
Well, a simple grep -v ^\# will do that. If an "active" line has a trailing comment, it doesn't matter. It also doesn't explain why you want to remove comments from a shell script.

Abigail
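A sketch of the grep approach on a config file. The file contents are invented for the example, and the pattern is a slight extension of the plain ^# above (an assumption about what one would want): it also drops blank lines and comments that are indented.

```shell
#!/bin/sh
# A hypothetical config file with mostly commented-out lines.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Sample config (hypothetical contents)
#http_port 8080
http_port 3128   # trailing comment stays

   # indented comment
cache_dir ufs /var/spool/squid 100 16 256
EOF

# -v inverts the match, so lines that are blank or whose first
# non-blank character is '#' are dropped; nothing else is touched.
active=$(grep -Ev '^[[:space:]]*(#|$)' "$conf")
printf '%s\n' "$active"

rm -f "$conf"
```

Only the two active lines are printed, and the trailing comment on the first one is left in place, which is harmless when all you want is a quick view.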

perlmonks.org content © perlmonks.org and Abigail-II, coreolyn, eserte, hsinclai, Zaxo
