PerlMonks  

Re: Search and delete lines based on string matching

by ptum (Priest)
on Mar 13, 2007 at 14:00 UTC (#604532)


in reply to Search and delete lines based on string matching

Since you posted as Anonymonk, you can't go back and edit your post, but next time, please use <code> tags.

What error are you seeing? You didn't tell us.

Is this a homework problem? We don't mind helping, but we're not particularly inclined to do your homework.

To solve a problem like this, I would generally read in the contents of file A into a hash, since you just want to use those words as a lookup. Then I would open files B and C, step through the contents of file B a line at a time, and, whenever the line of B contains a word in my hash, drop it on the floor -- otherwise, write that line to file C. I don't think that opening the file handles inside your loop is a good idea.

You're not really clear as to whether file B contains single words or longer strings -- if longer strings, then you might want to split the line into individual tokens (which can then be individually compared to your hash from file A) or (if the number of words in file A is small enough) you may prefer to build a regular expression by which you evaluate each string. A little more detail might help us to help you more effectively.
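The two options for multi-word lines could look roughly like the following. This is only a sketch: the sub names `line_has_listed_word` and `build_word_regex` are made up for illustration, the word list is assumed to consist of plain alphanumeric words (so that `\b` word boundaries behave sensibly), and none of it has been run against the original poster's files.

```perl
use strict;
use warnings;

# Option 1: split the line on whitespace and look each token up in the hash.
# $delete_words is a reference to a hash whose keys are the words from file A.
sub line_has_listed_word {
    my ($line, $delete_words) = @_;
    for my $token (split ' ', $line) {
        return 1 if exists $delete_words->{$token};
    }
    return 0;
}

# Option 2: build a single alternation regex from the word list.
# Only sensible when the list is reasonably small.
sub build_word_regex {
    my (@words) = @_;
    return qr/(?!)/ unless @words;    # empty list: a pattern that never matches
    my $alternation = join '|', map { quotemeta($_) } @words;
    return qr/\b(?:$alternation)\b/;
}
```

With the hash approach each token costs one lookup no matter how many words file A holds, whereas the regex gets slower as the alternation grows, which is why the regex only makes sense for a short list.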


Re^2: Search and delete lines based on string matching
by Anonymous Monk on Mar 13, 2007 at 14:11 UTC
    Hey man, sorry for the untidy question. I have single-word strings in both files A and B, and both are sorted. For example, A will have bin and hye; B will have something like bin, den and mig. C shouldn't contain any of the words from A that are matched in B, so C should be den and mig, since bin was matched from A. This is not homework; it's for work. I'd really appreciate it if you could provide the code for your solution: "To solve a problem like this, I would generally read in the contents of file A into a hash, since you just want to use those words as a lookup. Then I would open files B and C, step through the contents of file B a line at a time, and, whenever the line of B contains a word in my hash, drop it on the floor -- otherwise, write that line to file C. I don't think that opening the file handles inside your loop is a good idea."

      Hmmmm. You didn't answer our question about what error you were seeing from your original code, and (based on the simplicity of the problem) I'm not entirely convinced it isn't homework. Generally, if you want help here at PerlMonks, it is better to show a little more effort, rather than just asking us to provide code. Even so, I'll help to steer you in the right direction with a few untested code snippets.

      Read the contents of file A into a hash:

      use strict;
      use warnings;

      my $fh;
      my $myfile = '/path/to/file/a';
      unless (open($fh, "<", $myfile)) {
          die "Can't open $myfile: $!\n";
      }
      my %delete_words = ();
      while (<$fh>) {
          chomp;
          $delete_words{$_}++;
      }
      close($fh);

      So now you have all the words in your delete list in the hash. Next you want to open file B for reading and file C for writing (in much the same way as we opened file A) and step through the lines of file B, one at a time. Each time you have a line of file B, you want to test whether it exists in your hash. If file B contained multiple words per line, you would have to jump through more hoops, but since your file B isn't very complicated, for each line in file B you can just do something like this:

      if (exists($delete_words{$_})) {
          # do nothing
      }
      else {
          # write to file C
      }

      That's really all there is to it, except you'll want to explicitly close files B and C.
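Put together, the steps described in this reply might look something like the sketch below. It assumes one word per line in both input files; the sub names `load_delete_words` and `filter_file` and the paths in the usage comment are placeholders, not anything from the original post. Note that each line of file B is chomped before the lookup, because the hash keys were chomped when file A was read.

```perl
use strict;
use warnings;

# Build the lookup hash from file A (assumed: one word per line).
sub load_delete_words {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!\n";
    my %delete_words;
    while (my $line = <$fh>) {
        chomp $line;
        $delete_words{$line} = 1;
    }
    close($fh);
    return \%delete_words;
}

# Copy every line of file B that is not a key in the hash to file C.
# Both handles are opened once, outside the loop. Returns the number of lines kept.
sub filter_file {
    my ($delete_words, $in_path, $out_path) = @_;
    open(my $in,  '<', $in_path)  or die "Can't open $in_path: $!\n";
    open(my $out, '>', $out_path) or die "Can't open $out_path: $!\n";
    my $kept = 0;
    while (my $line = <$in>) {
        chomp $line;
        next if exists $delete_words->{$line};    # drop it on the floor
        print {$out} "$line\n";
        $kept++;
    }
    close($in);
    close($out) or die "Can't close $out_path: $!\n";
    return $kept;
}

# Hypothetical usage; the three paths are placeholders:
# my $words = load_delete_words('/path/to/file/a');
# filter_file($words, '/path/to/file/b', '/path/to/file/c');
```

Because the lookup is a hash, the fact that both files are sorted is not needed; the order of lines in file B is simply preserved in file C.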
