Nico has asked for the wisdom of the Perl Monks concerning the following question:
Hey all! New to PerlMonks and Perl in general. I have been trying to create a script as a side project (personal knowledge, no monetary gain) that goes through about 30 files and pulls out lines that match a specific set of search criteria.
My code works, but I wanted to ask if there is a more "clean" way to do it.
1) I set some variables for the HTML report and the log file. Currently I don't have any HTML syntax being written, that will come later once I get everything nailed down.
2) I create the two files that I am going to be writing to.
3) I check to see if the file exists. If it does, I open the HTML file and the log file for appending. If it doesn't, I write to the log file
4) I pull out only the lines I want, and then write them to the HTML file.
5) I close the HTML file and the log file.
#!/usr/bin/perl
use warnings;
use strict;
use 5.010;
use Time::Local;
my $file_01_txt = "file_01.txt";
($sec, $min, $hour, $mday, $mon, $year, $j1,$j2, $j3) =localtime(time)
+;
$year = $year+1900;
$mon = $mon+1;
$outfolder = "REPORT_output\_$year\_$mon\_$mday\_\_$hour\_$min\_$sec";
mkdir $outfolder;
mkdir "$outfolder/log";
$html_file = <$outfolder/REPORT.html>;
$log_file = <$outfolder/log/log.txt>;
open($html_report, '>', $html_file) or die "Cannot open file $html_fil
+e";
open($log_report, '>', $log_file) or die "Cannot open file $log_file";
main();
sub main {
if (-e $file_01_txt) {
# We are going to open the log file for appending.
open($log_report, '>>', $log_file) or die "Cannot open $log_fi
+le.";
# We are going to open the HTML file for appending.
open($html_report, '>>', $html_file) or die "Cannot open file
+$html_file";
# If the file exists, we are going to open it, so we can read
+from it.
print $log_report "$file_01_txt has been found.\n";
open (my $fh, "<", $file_01_txt) or die "Cannot open file $fil
+e_01_txt";
print $html_report "\n";
print $html_report "#### Now Analyzing '$file_01_txt'\n";
print $html_report "\n";
while (my $line = <$fh>) {
chomp $line;
if ($line =~ /search criteria 01/i) {
print $html_report "$line\n";
}
if ($line =~ /search criteria 02/i) {
print $html_report "$line\n";
}
if ($line =~ /search criteria 03/i) {
print $html_report "$line\n";
print $html_report "\n";
}
}
close $log_report;
close $html_report;
} else {
# We are going to open the log file for appending.
open($log_report, '>>', $log_file) or die "Cannot open $log_fi
+le.";
# We are going to log that the file has not been found.
print $log_report "$file_01_txt has not been found.\n";
close $log_report;
}
}
Is there a more clean way of opening, writing to, and closing the HTML and log files?
Right now, I have one "main" sub that holds all of the "if" statements for each file. Should I break those up into different subs?"
Thanks for any help you might be able to provide!
Nico
Re: Perl File Parsing - My Code Works, but it's Ugly!
by aaron_baugher (Curate) on May 31, 2015 at 18:02 UTC
|
Nothing huge, but some cleanup things:
Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.
| [reply] [d/l] [select] |
|
Thank you for your response! I have no idea what I was thinking with those opens and closes. I only need to specify them once at the beginning of the script, and I can use them throughout the script.
I removed the junk variables and removed Time::Local. It works!
I will need to figure out how to error check my mkdir statements. But luckily I only have those two statements there so shouldn't be too hard
| [reply] |
|
mkdir $newdir or die "Unable to mkdir $newdir : $!";
That's true of most commands in Perl; they return true if successful and false on failure, so you can use "do_this() or show_error()" logic. (One exception is system, which returns the return value of the underlying command, which (on Unix at least) is zero on success, so you have to watch out for that.)
Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.
| [reply] [d/l] |
Re: Perl File Parsing - My Code Works, but it's Ugly!
by stevieb (Canon) on May 31, 2015 at 16:43 UTC
|
Here is some example code that depicts one way you can eliminate the if() statements. Put your regexes into an array, and loop over them all at once per line. The $i variable is simply there to show that all regexes are being searched, and performs no functional use.
#!/usr/bin/perl
use warnings;
use strict;
my @regexes = qw(one two three);
my $line = "this is thREE";
my $i = 1;
for my $regex (@regexes){
if ($line =~ /$regex/i){
print "$i\n";
}
$i++;
}
Also, although just preference, I like to write my opens like the following. I feel it's cleaner. Note the use of $! in the die() statement; it explains *why* the file couldn't be opened. Also note the use of 'my'... you should be using this to define each variable. You'll have to do this when you implement use strict; as I stated in an earlier post.
open my $log_report, '>>', $log_file
or die "Can't open the file $log_file: $!";
-stevieb | [reply] [d/l] [select] |
|
| [reply] |
Re: Perl File Parsing - My Code Works, but it's Ugly!
by golux (Chaplain) on May 31, 2015 at 19:58 UTC
|
$outfolder = "REPORT_output\_$year\_$mon\_$mday\_\_$hour\_$min
+\_$sec";
The underscore character doesn't need to be escaped like this everywhere in a string. If you were doing so to separate it from the preceding variable (ie. "$year\_$mon" in order to interpolate $year instead of $year_), there's another, more common way to do so which is arguably a little easier to read:
$outfolder = "REPORT_output_${year}_${mon}_${mday}__${hour}_${
+min}_$sec";
This syntax is also often used to separate variables from other chars which could be part of the variable name; eg.:
my $msg = "Hello world";
my $color = 101; # Red background
print "\e[${color}m $msg \e[m\n"; # Embed $color in escape sequence
say
substr+lc crypt(qw $i3 SI$),4,5
| [reply] [d/l] [select] |
|
my $outfolder = join '_', 'REPORT', 'output', $year, $mon, $mday, '',
+$hour, $min, $sec;
Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.
| [reply] [d/l] |
|
$outfolder = sprintf "REPORT_output_%d_%d_%d__%d_%d_%d",
$year, $mon, $mday, $hour, $min, $sec;
With the advantage that you could do something like:
$outfolder = sprintf "REPORT_output_%04d_%02d_%02d__%02d_%02d_%02d",
$year, $mon, $mday, $hour, $min, $sec;
giving
REPORT_output_2015_06_01__00_19_42
| [reply] [d/l] [select] |
|
I prefer the function strftime in POSIX for formatting dates. The only downside is that documentation for the format codes is "borrowed" from "C" and not included in perl's documentation.
use strict;
use warnings;
use POSIX 'strftime';
my $outfolder =
strftime "Report_Output_%Y_%m_%d__%H_%M_%S", localtime();
print $outfolder;
| [reply] [d/l] |
Re: Perl File Parsing - My Code Works, but it's Ugly!
by pme (Monsignor) on May 31, 2015 at 15:44 UTC
|
Hi Nico,
Welcome to the monastery!
Your code does not compile. Could you share your working code? | [reply] |
|
#!/usr/bin/perl
use 5.010;
use Time::Local;
my $file_01_txt = "file_01.txt";
($sec, $min, $hour, $mday, $mon, $year, $j1,$j2, $j3) =localtime(time)
+;
$year = $year+1900;
$mon = $mon+1;
$outfolder = "REPORT_output\_$year\_$mon\_$mday\_\_$hour\_$min\_$sec";
mkdir $outfolder;
mkdir "$outfolder/log";
$html_file = <$outfolder/REPORT.html>;
$log_file = <$outfolder/log/log.txt>;
open($html_report, '>', $html_file) or die "Cannot open file $html_fil
+e";
open($log_report, '>', $log_file) or die "Cannot open file $log_file";
main();
sub main {
if (-e $file_01_txt) {
# We are going to open the log file for appending.
open($log_report, '>>', $log_file) or die "Cannot open $log_fi
+le.";
# We are going to open the HTML file for appending.
open($html_report, '>>', $html_file) or die "Cannot open file
+$html_file";
# If the file exists, we are going to open it, so we can read
+from it.
print $log_report "$file_01_txt has been found.\n";
open (my $fh, "<", $file_01_txt) or die "Cannot open file $fil
+e_01_txt";
print $html_report "\n";
print $html_report "#### Now Analyzing '$file_01_txt'\n";
print $html_report "\n";
while (my $line = <$fh>) {
chomp $line;
if ($line =~ /search criteria 01/i) {
print $html_report "$line\n";
}
if ($line =~ /search criteria 02/i) {
print $html_report "$line\n";
}
if ($line =~ /search criteria 03/i) {
print $html_report "$line\n";
print $html_report "\n";
}
}
close $log_report;
close $html_report;
} else {
# We are going to open the log file for appending.
open($log_report, '>>', $log_file) or die "Cannot open $log_fi
+le.";
# We are going to log that the file has not been found.
print $log_report "$file_01_txt has not been found.\n";
close $log_report;
}
}
That should work. I just ran it and everything came back as it should.
Please let me know if I need to give you anything else! | [reply] [d/l] |
Re: Perl File Parsing - My Code Works, but it's Ugly!
by stevieb (Canon) on May 31, 2015 at 16:31 UTC
|
This would be pretty trivial to clean up, but please replace your code in the original post with the code you posted in a below post.
Also, append in a different code tag section a snippet of the input file, and specify exactly what your 'search criteria' actually is, so we can test it for real.
One more thing, put:
use warnings;
use strict;
at the top of your script (under the shebang line), rerun it and fix any errors that are shown.
-stevieb | [reply] [d/l] |
|
use warnings;
use strict;
and received a bunch of error messages in regards to "explicit package name" requirements. For example:
Global symbol "$sec" requires explicit package name
Time to figure out what that means!
| [reply] [d/l] |
|
| [reply] |
|
|
Thank you for your reply!
I have updated the code in the original post, and added what you suggested.
I am going to re-run the script now and see what I get.
| [reply] |
|
Given the advice you took, no downvotes, handslaps or similar.... BUT FOR FUTURE REFERENCE:
Please, do NOT ever remove anything you've posted*1* because doing so can take away the context of replies you may have received (or, perhaps, are even being written as you make you change). Instead, use <strike> content to be deleted before closing the tag </strike> to disavow, show that you've changed something, etc. AND add a warning about your changes -- something like "NOTE: updated (did such and such" -- in a way that makes clear what you did and why.
And, of course, WELCOME to PM!
*1* There are exceptions but they are beyond the scope of this advice.
As a newcomer, you'll find these worth reading.
| [reply] |
|
|