comparing and deleting some words from file

by perlbeginner10 (Acolyte)
on Nov 09, 2005 at 07:21 UTC ( [id://506994]=perlquestion )

perlbeginner10 has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys, I am a beginner in Perl. Please help me with this problem: I want to compare two (similar) files, and if some words are the same in both files, delete them from one of the files. For example, File 1 contains: cancer, lung cancer, heart, abdomen, stomach... and File 2 contains: Jim, John, abdomen, Jack... I want to delete abdomen from File 2, and keep scanning for other words in the files. Thanks.

Replies are listed 'Best First'.
Re: comparing and deleting some words from file
by Roger (Parson) on Nov 09, 2005 at 07:42 UTC
    Let me describe a quick way of doing this...

    0. assumption - you will never modify file 1, because you will only delete from file 2;

    1. read the first file into a hash table, having each word as the hash key;

    2. create a third file;

    3. while scanning the second file, check the hash table built in step 1 for the existence of each word;
    if the word exists, do not print it to the third file;
    if the word does not exist, print it to the third file;

    4. replace file 2 with the third file.

    #!/usr/bin/perl -w
    use strict;
    use IO::File;

    # Step 1: read file 1 into a hash, one key per word
    my %hash = ();
    my $f = IO::File->new("file1.txt", "r") or die "can not open file 1";
    while (my $line = <$f>) {
        chomp $line;
        for my $word (split /\s*,\s*/, $line) {
            $hash{$word}++;
        }
    }

    # Steps 2 and 3: copy file 2 to file 3, skipping words seen in file 1
    my $f3 = IO::File->new("file3.txt", "w") or die "can not create file 3";
    my $f2 = IO::File->new("file2.txt", "r") or die "can not open file 2";
    while (my $line = <$f2>) {
        chomp $line;
        my @words = ();
        for my $word (split /\s*,\s*/, $line) {
            if (! exists $hash{$word}) {
                push @words, $word;
            }
        }
        # write the surviving words once per input line
        print $f3 join(",", @words), "\n";
    }

    # close all three handles
    undef $f;
    undef $f2;
    undef $f3;

    # then replace file 2 with file 3...
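The script above stops at the comment about replacing file 2 with file 3. One possible way to do that last step is Perl's built-in rename, sketched here under the assumption that both files live on the same filesystem. The helper name replace_file is hypothetical and not part of the original reply.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): move the filtered
# copy over the original file. rename() is a Perl built-in that returns
# false on failure and sets $!.
sub replace_file {
    my ($filtered, $original) = @_;
    rename $filtered, $original
        or die "cannot rename $filtered to $original: $!";
    return 1;
}

# Usage with the file names from the script above (commented out so
# that loading this sketch does not touch any files):
# replace_file("file3.txt", "file2.txt");
```

Call it only after all file handles have been closed, so that everything written to file 3 has been flushed to disk.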
Re: comparing and deleting some words from file
by Amar (Sexton) on Nov 09, 2005 at 10:26 UTC
    hi,
    Assumption: each word is separated by a comma only.

    The code below is kept basic, since the person asking this question is a beginner.
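    #!c:/perl/bin/perl.exe
    use strict;

    my ($file1, $file2, $file1_contents, $file2_contents, @file1_words,
        @file2_words, $word, $file2_new_contents);

    $file1 = "C:/file1.txt";
    $file2 = "C:/file2.txt";

    open(FH1, "<$file1") || die "$!\n";
    open(FH2, "<$file2") || die "$!\n";

    # Read file 1 and split it into words; trim only around the commas
    # so that multi-word entries such as "lung cancer" stay intact
    $file1_contents .= $_ while (<FH1>);
    $file1_contents =~ s/^\s+|\s+$//g;
    @file1_words = split(/\s*,\s*/, $file1_contents);

    # Read file 2 and split it on commas, keeping its spacing as-is
    $file2_contents .= $_ while (<FH2>);
    @file2_words = split(/,/, $file2_contents);

    close(FH1);
    close(FH2);

    # Drop every file 2 word that also appears in file 1;
    # \Q...\E makes the word match literally inside the pattern
    foreach $word (@file1_words) {
        @file2_words = grep { !/^\s*\Q$word\E\s*$/ } @file2_words;
    }

    # Rejoin the remaining words (no trailing comma) and overwrite file 2
    $file2_new_contents = join(",", @file2_words);
    open(FH2, ">$file2") || die "$!\n";
    print FH2 $file2_new_contents;
    close(FH2);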

    Hope it is useful
    amar
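The grep filter in this reply interpolates each word from file 1 into a regular expression. If a word happens to contain regex metacharacters (a dot, parentheses, and so on), it may match more or less than intended. A minimal sketch of the same filter with the word quoted by \Q...\E follows; the helper name remove_word and the sample words are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the entries of @$words that do NOT equal
# $word once surrounding whitespace is ignored. \Q...\E makes any regex
# metacharacters in $word match literally.
sub remove_word {
    my ($word, $words) = @_;
    return grep { !/^\s*\Q$word\E\s*$/ } @$words;
}

# Made-up sample data: "a.b" contains a dot, which unquoted would
# match any character and so would also remove "aXb".
my @file2_words = ("Jim", " a.b", "aXb");
my @kept = remove_word("a.b", \@file2_words);
print join(",", @kept), "\n";   # prints "Jim,aXb"
```

For exact word matches, the hash lookup used in the earlier reply avoids the regular expression entirely and so sidesteps the quoting question.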
