PerlMonks  

Bolt on where a match is not found to a print script

by Anonymous Monk
on Dec 05, 2017 at 11:19 UTC ( #1204943=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

The script below works very well: it matches and prints a line each time the number appears in one or more of the files. What I would like to add is a way to identify when there is no match at all for that number in any of the files (i.e. "I didn't find the match in any of the files, and I'm letting you know"). To clarify, I still need to match and print every time the number is matched in every file, not just the first time it is matched. It is also critical that the matching line itself is printed. Ultimately this addition is just to show whether the number was not matched anywhere in any of the files.

#!/perl/bin/perl
use strict;
use warnings;

my @files = <c:/perl64/myfiles/*>;

foreach my $file ( @files ) {
    open my $file_h, '<', $file or die "Can't open $file: $!";

    while ( <$file_h> ) {
        print "$file $_" if /\b1203\b/;
        print "$file $_" if /\b1204\b/;
        print "$file $_" if /\b1207\b/;
    }
}

Replies are listed 'Best First'.
Re: Bolt on where a match is not found to a print script
by hippo (Chancellor) on Dec 05, 2017 at 11:30 UTC

    Keep a tally:

    #!/perl/bin/perl
    use strict;
    use warnings;

    my @files = <c:/perl64/myfiles/*>;
    my $total = 0;

    foreach my $file ( @files ) {
        open my $file_h, '<', $file or die "Can't open $file: $!";

        while ( <$file_h> ) {
            if (/\b120[347]\b/) {
                print "$file $_";
                $total++;
            }
        }
    }

    print "No files matched.\n" unless $total;

    (Edited for typos)

      I think they wanted to check for each number separately. But this is most of the way there -- I'd just add a hash for the match, something like $matched{$1} = 1, and check that at the end.
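A minimal sketch of that idea, run against in-memory lines rather than files. The helper name unmatched_numbers and the sample lines are assumptions for illustration only. Note that it records just the first number matched on each line.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the numbers from @$nums that never
# appear as a whole word in any of @$lines.
sub unmatched_numbers {
    my ($nums, $lines) = @_;
    my $alternation = join '|', map { quotemeta } @$nums;
    my $regex       = qr/\b($alternation)\b/;

    my %matched;
    for my $line (@$lines) {
        # $1 holds whichever alternative matched on this line
        $matched{$1} = 1 if $line =~ $regex;
    }

    # Keep only the numbers that were never flagged
    return grep { !$matched{$_} } @$nums;
}

my @lines   = ("order 1203 shipped\n", "nothing here\n", "ref 1207 open\n");
my @missing = unmatched_numbers([qw(1203 1204 1207)], \@lines);
print "$_ not found\n" for @missing;    # prints "1204 not found"
```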

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

        Hmmm. Yes, you could well be right about each one requiring a separate flag, although that's not how I originally read the requirement. At least our anonymous brother has 2 marginally differing approaches to choose between. The whole thing does have something of an XY ring to it. Perhaps we'll find out.

Re: Bolt on where a match is not found to a print script
by QM (Parson) on Dec 05, 2017 at 11:44 UTC
    Something like this (untested):
    #!/usr/bin/env perl
    use strict;
    use warnings;

    my @files = <c:/perl64/myfiles/*>;

    # record matching regexes
    our %matched;

    my @nums = ('1203', '1204', '1207');
    my $regex = '\b(' . join('|', @nums) . ')\b'; # update: fixed + to .

    for my $file ( @files ) {
        open my $file_h, '<', $file or die "Can't open $file: $!";

        while ( <$file_h> ) {
            if (my ($match) = m/$regex/) {
                $matched{$match} = 1;
                print "$file $_";
            }
        }
    }

    # Check all nums have been seen
    for my $num (@nums) {
        if (not exists($matched{$num})) {
            print "$num not found\n";
        }
    }

    I would probably remove the @files and pass them on the command line (changing the for loop to while (<>)).
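A rough sketch of the while (<>) variant, assuming the files are named on the command line. The helper name scan_files and the script name in the usage comment are illustrative, not part of the original script. The magic <> handle reads each file listed in @ARGV in turn and keeps the current file name in $ARGV.

```perl
use strict;
use warnings;

# Hypothetical helper: scans the named files through the magic <> handle
# and returns every matching line, prefixed with the file it came from.
sub scan_files {
    my ($regex, @files) = @_;
    return unless @files;       # with an empty @ARGV, <> would read STDIN

    my @hits;
    local @ARGV = @files;       # <> works through the files listed in @ARGV
    while (my $line = <>) {
        # $ARGV is the name of the file currently being read
        push @hits, "$ARGV $line" if $line =~ $regex;
    }
    return @hits;
}

# Usage (script name is illustrative): perl match.pl file1.txt file2.txt
print scan_files(qr/\b120[347]\b/, @ARGV) if @ARGV;
```

Keep in mind that cmd.exe on Windows does not expand wildcards, so a pattern such as c:/perl64/myfiles/* would still have to be expanded inside the script.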

    You might also do something more generic around the numbers, and allow those to be passed on the command line as well -- but this will take some command line parsing, or one of the Getopt modules.
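One possible shape for that, sketched with Getopt::Long. The option name --num and the helper parse_args are assumptions made for this example. The numbers may be given as a comma-separated list, as repeated options, or both; whatever is left over is treated as the file list.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: pulls --num options out of an argument list and
# returns (\@nums, \@remaining_args).
sub parse_args {
    my @args = @_;
    my @nums;
    GetOptionsFromArray(\@args, 'num=s' => \@nums)
        or die "usage: $0 --num 1203,1204 [--num 1207] file...\n";

    # Accept repeated options as well as comma-separated lists
    @nums = map { split /,/, $_ } @nums;
    return (\@nums, \@args);
}

my ($nums, $files) = parse_args(@ARGV);
if (@$nums) {
    # Build the same kind of capturing alternation used in the script above
    my $regex = '\b(' . join('|', map { quotemeta } @$nums) . ')\b';
    print "pattern: $regex\n";
    print "files:   @$files\n";
}
```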

    Update: Fixed capture problem, as noted by AnomalousMonk.

    Update 2: Fixed concatenation issue.

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

      my @nums = ('1203', '1204', '1207');
      my $regex = '\b(?:' + join('|', @nums) + ')\b';

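      No way is provided for if (my ($match) = m/$regex/) { ... } to capture anything: with only a non-capturing group, a successful list-context match returns just 1, so $match can never hold the matched number. Also, only a single match per line is assumed. (Also, + (addition) is used where . (concatenation) is intended.)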
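To make those three points concrete, here is a small sketch of what a list-context match returns in each case, and of what + does to the pattern string. The helper names capture_demo and plus_instead_of_dot are invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: shows what a list-context match hands back.
sub capture_demo {
    my ($line) = @_;

    # Non-capturing group, no /g: success returns the list (1)
    my ($flag) = $line =~ /\b(?:1203|1204|1207)\b/;

    # Capturing group, no /g: returns the captured text of the first match
    my ($first) = $line =~ /\b(1203|1204|1207)\b/;

    # With /g and no capture groups: returns every matched string
    my @all = $line =~ /\b(?:1203|1204|1207)\b/g;

    return ($flag, $first, \@all);
}

# Hypothetical helper: reproduces the + for . slip. Each operand is
# converted to a number, so the pattern string collapses to a number.
sub plus_instead_of_dot {
    my @nums = @_;
    no warnings 'numeric';
    return '\b(?:' + join('|', @nums) + ')\b';
}

my ($flag, $first, $all) = capture_demo('w 1207 x 1203 y 9999 z');
my $broken = plus_instead_of_dot(1203, 1204, 1207);
print "flag=$flag first=$first all=@$all broken=$broken\n";
# prints: flag=1 first=1207 all=1207 1203 broken=1203
```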

      I would suggest something along the lines of

      c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
      "my @nums = ('1203', '1204', '1207');
       my $regex = '\b(?:' . join('|', @nums) . ')\b';
       print qq{'$regex'};
       ;;
       my @matches = 'w 1207 x 1203 y 9999 z' =~ m{ $regex }xmsg;
       dd \@matches;
       "
      '\b(?:1203|1204|1207)\b'
      [1207, 1203]

      with the loop being (also untested):

      while ( <$file_h> ) {
          if (my @matches = m/$regex/g) {
              ++$matched{$_} for @matches;
              print "$file $_";
          }
      }

      Update: Changed code example to include 9999 group.

      Update 2: Shorter, IMHO sweeter:

      c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
      "my @nums = qw(1203 1204 1207 1111);
       my ($regex) = map qr{ \b (?: $_) \b }xms, join ' | ', @nums;
       print $regex;
       ;;
       my %seen;
       for (
         'w 1207 x 1203 y 9999 z',
         'w 11203 x 12033 y 112033 z',
         'w 1207 x 1203 y 9999 z 1207 zz',
         ) {
         print qq{>$_<} if map ++$seen{$_}, m{ $regex }xmsg;
         }
       dd \%seen;
       ;;
       my $not_seen = join ' ', grep !$seen{$_}, @nums;
       print 'num(s) not seen: ', $not_seen || '(none)';
       "
      (?msx-i: \b (?: 1203 | 1204 | 1207 | 1111) \b )
      >w 1207 x 1203 y 9999 z<
      >w 1207 x 1203 y 9999 z 1207 zz<
      { 1203 => 2, 1207 => 3 }
      num(s) not seen: 1204 1111


      Give a man a fish:  <%-{-{-{-<

        Ah, yes, I started down one path, then changed tacks.

        I believe if you change this line:

        my $regex = '\b(?:' . join('|', @nums) . ')\b'; # update: fixed concatenation issue

        to this:

        my $regex = '\b(' . join('|', @nums) . ')\b'; # update: fixed concatenation issue

        it does the job.

        For instance, from the debugger (which, note, doesn't play well with my):

        DB<1> $_ = "oh blah dee"
        DB<2> ($x) = m/(blah)/
        DB<3> x $x
        0  'blah'

        Update: Fixed concatenation issue

        -QM
        --
        Quantum Mechanics: The dreams stuff is made of

Re: Bolt on where a match is not found to a print script
by shmem (Chancellor) on Dec 05, 2017 at 23:54 UTC

    This is how I would do it:

    # *perl*
    use strict;
    use warnings;
    use Getopt::Std;

    our ($opt_n, $opt_g);
    getopt('ng');    # both -n and -g take an argument

    if (! $opt_n || ! $opt_g) {
        print "usage: $0 -n numberlist -g globpattern\n"
            . " where numberlist are numbers separated by commas,\n"
            . " e.g. -n 1203,0815,22222\n";
        exit;
    }

    # every requested number starts out unmatched
    my %matches;
    $matches{$_} = 0 for split /,/, $opt_n;

    # turn the comma list into a capturing alternation
    $opt_n =~ s/,/|/g;
    my $re = qr{\b($opt_n)\b};

    @ARGV = glob $opt_g; # set @ARGV to found files

    while(<>) {
        my @m;
        # flag every number found on this line, then print the line
        @m = $_ =~ /$re/g
            and @matches{@m} = (1) x @m
            and print "$ARGV $_";
    }

    my @nomatch = grep { ! $matches{$_} } keys %matches;
    if(@nomatch) {
        print "$_ not matched\n" for @nomatch;
    }

    C:\some\path>perl match.pl -n 1203,1204,1207 -g c:/perl64/myfiles/*

    Of course, TIMTOWTDI (there is more than one way to do it)

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'

Node Type: perlquestion [id://1204943]
Approved by haukex