Firstly, some notes on what you've provided.
- Subroutine definitions
  You've defined your subroutine (with sub) inside a while loop.
  In rare cases it may be appropriate to do this; this is not one of them.
  With the code you have here, a better place for the definition would be after close (file1);.
- Subroutine calls
  You've called your subroutine as grep_pattern; it would be better to call it as grep_pattern().
  See perlsub for more details.
- Opening files
  The three-argument form of open is preferred.
  If open() fails, $! holds the reason why; you can use this in your error messages.
- Feedback on your code
  Perl will provide you with feedback when you make mistakes or attempt dangerous operations.
  To get this feedback you need to use the strict and warnings pragmata.
  To get more verbose messages, also use the diagnostics pragma.
- Maintenance
  Your code will be easier to read and maintain if it is laid out in a consistent manner.
  See perlstyle.
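To make the first three points concrete, here's a rough sketch (not your actual code). The subroutine name matching_lines and the use of @ARGV for the file names are my own inventions for illustration; the point is only that the definition sits outside the loop, the call uses parentheses, and the three-argument open reports $! on failure.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# The loop only *calls* the subroutine, with parentheses and arguments.
# Assumption for this sketch: file names are given on the command line.
for my $file (@ARGV) {
    print qq{[$file] $_\n} for matching_lines($file, qr{hello});
}

# Hypothetical helper, defined once, outside (here: after) the loop.
# Returns the chomped lines of $file that match $re, or an empty list
# (after a warning) if the file cannot be opened.
sub matching_lines {
    my ($file, $re) = @_;

    # Three-argument open; $! carries the reason for any failure.
    open my $fh, q{<}, $file or do {
        warn qq{Can't open $file: $!\n};
        return;
    };

    my @matches = grep { $_ =~ $re } <$fh>;
    close $fh;
    chomp @matches;

    return @matches;
}
```

Because the call uses parentheses, Perl doesn't need to have seen the definition before it reaches the call, so the sub can live at the bottom of the file.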
Here's a script which (hopefully) does everything you want.
It doesn't use a subroutine and avoids the intermediate files.
The hash %entry_seen keeps track, as the name suggests, of the entries you've seen.
#!/usr/bin/env perl

use strict;
use warnings;

my $out_file = q{result.list};
my $search_re = qr{hello};
my %entry_seen = ();

print q{Enter reference filename: };
chomp(my $ref_file = <STDIN>);

open my $fh_ref, q{<}, $ref_file or die qq{Can't open $ref_file: $!};
open my $fh_out, q{>}, $out_file or die qq{Can't open $out_file: $!};

while (defined(my $txt_file = <$fh_ref>)) {
    chomp $txt_file;
    open my $fh_txt, q{<}, $txt_file or do {
        warn qq{! SKIPPING: $txt_file: $!};
        next;
    };

    print qq{PROCESSING: $txt_file\n};

    while (defined(my $txt_line = <$fh_txt>)) {
        chomp $txt_line;
        next if $txt_line !~ $search_re;

        if (! $entry_seen{$txt_line}++) {
            print $fh_out qq{[$txt_file] $txt_line\n};
        }
    }

    close $fh_txt;
}

close $fh_out;
close $fh_ref;
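In case the %entry_seen test in the inner loop looks cryptic, here's the same idiom in isolation. The sample data in @lines and the names %seen and @unique are made up for this sketch.

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only.
my @lines = ('hello', 'hello, world', 'hello', 'hello world', 'hello, world');

my %seen;

# $seen{$_}++ returns the count *before* incrementing (0 the first time),
# so the negated test is true only for the first occurrence of each line.
my @unique = grep { !$seen{$_}++ } @lines;

print "$_\n" for @unique;
```

This prints each distinct line once, in the order first seen: hello, then hello, world, then hello world. As a side effect, %seen ends up holding how many times each line occurred.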
Here are the contents of the various files used, followed by an example run.
ken@ganymede: ~/tmp/PM_DIFF
$ cat a.txt
b.txt
c.txt
dummy.txt
d.txt
ken@ganymede: ~/tmp/PM_DIFF
$ cat b.txt
Hello
hello
hello, world
goodbye
ken@ganymede: ~/tmp/PM_DIFF
$ cat c.txt
hello world
hullo
shellow
ken@ganymede: ~/tmp/PM_DIFF
$ cat d.txt
hello, world
hello and goodbye
Hello, world
ken@ganymede: ~/tmp/PM_DIFF
$ cat result.list
ken@ganymede: ~/tmp/PM_DIFF
$ pm_diff.pl
Enter reference filename: not_a_file
Can't open not_a_file: No such file or directory at ./pm_diff.pl line 13, <STDIN> line 1.
ken@ganymede: ~/tmp/PM_DIFF
$ cat result.list
ken@ganymede: ~/tmp/PM_DIFF
$ pm_diff.pl
Enter reference filename: a.txt
PROCESSING: b.txt
PROCESSING: c.txt
! SKIPPING: dummy.txt: No such file or directory at ./pm_diff.pl line 19, <$fh_ref> line 3.
PROCESSING: d.txt
ken@ganymede: ~/tmp/PM_DIFF
$ cat result.list
[b.txt] hello
[b.txt] hello, world
[c.txt] hello world
[c.txt] shellow
[d.txt] hello and goodbye
ken@ganymede: ~/tmp/PM_DIFF
$
Update: s/valid to do this/appropriate to do this/