http://www.perlmonks.org?node_id=604548


in reply to Re^2: Search and delete lines based on string matching
in thread Search and delete lines based on string matching

How is your input data formatted?
  1. bin den mig
  2. bin
    den
    mig
  3. bin deg
    mig

Re^4: Search and delete lines based on string matching
by brut (Initiate) on Mar 13, 2007 at 14:46 UTC
    It's as in option 2, that is, a newline character after each string. The code you provided has a problem: it does not delete strings like bin[0], bin 12 and bin234. Can you please help with this as well?
      Ah, you specified that words had to be removed, not tokens that could be part of a word. For the token 'bin', which of the following should be removed?
      1. foobin
      2. binary
      3. bin1
      If all of those should be deleted then you can change that pattern from:
      my $pattern = '\b(?:' . join('|', @tokens) . ')\b';
      To:
      my $pattern = join('|', @tokens);
      If you only want to match lines that start with a token such as 'bin', followed only by non-alphabetic characters, then use this:
      my $pattern = '^(?:' . join('|', @tokens) . ')[^a-zA-Z]*$';
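To see how the three patterns above differ, here is a minimal sketch that runs each one against a few sample strings. The token list and the `matches` helper are assumptions made for illustration; in the thread the tokens come from the pattern file.

```perl
use strict;
use warnings;

# Hypothetical token list; the real script reads these from a file.
my @tokens = ('bin', 'den');
my $alt    = join('|', @tokens);

# The three pattern variants discussed in the reply.
my %patterns = (
    whole_word => '\b(?:' . $alt . ')\b',
    anywhere   => $alt,
    prefix     => '^(?:' . $alt . ')[^a-zA-Z]*$',
);

# Hypothetical helper: returns 1 if $line matches the named pattern, else 0.
sub matches {
    my ($name, $line) = @_;
    my $re = $patterns{$name};
    return $line =~ /$re/ ? 1 : 0;
}

for my $line ('bin', 'foobin', 'binary', 'bin1', 'bin[0]', 'bin 12') {
    printf "%-8s whole_word=%d anywhere=%d prefix=%d\n",
        $line, map { matches($_, $line) } qw(whole_word anywhere prefix);
}
```

Only the unanchored pattern removes 'foobin' and 'binary'. The anchored pattern removes 'bin1', 'bin[0]' and 'bin 12' but leaves 'binary' alone.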
      A revised copy that handles the deletion of tokens with a purely line based input:
      #!/usr/local/bin/perl
      use strict;
      use warnings;

      if (@ARGV != 3) {
          print "Usage: $0 <pattern file> <input file> <output file>\n";
          exit;
      }

      my ($pattern_filename, $source_filename, $dest_filename) = @ARGV;

      # Read one token per line from the pattern file.
      open my $pattern_fh, '<', $pattern_filename
          or die "Failed to open $pattern_filename: $!";
      my @tokens = ();
      while (my $line = <$pattern_fh>) {
          chomp $line;
          push @tokens, $line;
      }
      close($pattern_fh);

      # Match lines that start with a token followed only by non-alpha characters.
      my $pattern = '^(?:' . join('|', @tokens) . ')[^a-zA-Z]*$';
      print "Search pattern: $pattern\n";

      open my $infile, "<", $source_filename
          or die "Failed to open $source_filename: $!";
      # Note: the output file is opened in append mode.
      open my $outfile, ">>", $dest_filename
          or die "Failed to open $dest_filename: $!";

      while (my $line = <$infile>) {
          print "input : $line";
          if ($line =~ /$pattern/) {
              next;    # matching lines are dropped
          }
          print "output: $line";
          print $outfile $line;
      }

      close($infile);
      close($outfile);
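The script above interpolates each token into the regex as-is, so a token containing metacharacters (for example a literal `bin[0]` in the pattern file) would be read as regex syntax rather than as plain text. If the tokens are meant as literal strings, one option is to escape them with `quotemeta` before joining. This is a sketch under that assumption, not part of the original reply, and `build_pattern` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the anchored pattern from literal tokens,
# escaping regex metacharacters in each token with quotemeta.
sub build_pattern {
    my @tokens = @_;
    my $alt = join '|', map { quotemeta($_) } @tokens;
    return qr/^(?:$alt)[^a-zA-Z]*$/;
}

my $re = build_pattern('bin[0]', 'den');

# Print only the lines that do NOT match, mirroring the filter loop above.
for my $line ("bin[0]\n", "bin0\n", "den 12\n", "dental\n") {
    print $line unless $line =~ $re;
}
```

Without the escaping, the token `bin[0]` would act as `bin` followed by the character class `[0]`, which matches `bin0` but not the literal text `bin[0]`.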
Re^4: Search and delete lines based on string matching
by brut (Initiate) on Mar 13, 2007 at 14:53 UTC
    It's as in option 2, that is, a newline after every string in both A and B. Also, with your code, strings like bin[5], bin [43] and bin[123] (like array elements with the element number in square brackets) are not getting deleted from B. Can you help with this?