PerlMonks  

Re: Filtering out stop words

by Eily (Monsignor)
on Feb 25, 2020 at 14:04 UTC (#11113404=note)


in reply to Filtering out stop words

Yet another way to do it, which might be much faster if you have very few words to check (your example has only one, but this might be done in a loop), is to do it the other way around: collect the words to be tested first, then check whether any of the words in your dictionary match them:

my @words = get_words_to_check();
my %hash = map { $_ => 1 } @words;
while (my $line = <>) {
    chomp $line;
    # The "if exists" isn't required here, but it does make it look cleaner
    delete $hash{$line} if exists $hash{$line};
}
my @good_words = grep { exists $hash{$_} } @words; # Keep the original order
my @good_words_2 = keys %hash;                     # Don't care about the original order
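For anyone who wants to try this without a dictionary file at hand, here is a minimal self-contained sketch of the same idea. The sub name filter_stop_words, the in-memory filehandle, and the sample words are all made up for the demonstration; they are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the words from @$words that do not appear
# in the dictionary read from $dict_fh (one word per line).
sub filter_stop_words {
    my ($dict_fh, $words) = @_;
    my %candidate = map { $_ => 1 } @$words;
    while (my $line = <$dict_fh>) {
        chomp $line;
        delete $candidate{$line};    # harmless if the word is not a candidate
    }
    # grep over the original list so the input order is kept
    return grep { exists $candidate{$_} } @$words;
}

# Made-up sample data: a tiny stop word list held in an in-memory filehandle
my $dictionary = "a\nof\nthe\n";
open my $fh, '<', \$dictionary or die "Cannot open in-memory file: $!";

my @good = filter_stop_words($fh, [qw(the quick brown fox of doom)]);
print "@good\n";    # prints "quick brown fox doom"
```

The dictionary is read exactly once and only the (small) list of candidate words is held in memory, which is where the expected speed-up comes from.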

Or, if both word lists are sorted, something like this might work:

my @good_words;
my $index = 0;
LINE: while (my $line = <>) {
    chomp $line;
    # While the word in the dictionary is past (or equal to) the word to check
    while ($line ge $words[$index]) {
        # Store the word as OK, unless it is equal to the current dict word
        push @good_words, $words[$index] unless $line eq $words[$index];
        # Use the next word from the list of words to check, if any
        last LINE if ++$index == @words;
    }
}
# Words left over once the dictionary is exhausted cannot be in it
push @good_words, @words[$index .. $#words];
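Here is a hedged, self-contained sketch of that sorted-list walk, in case a runnable version helps. The sub name filter_sorted and the sample data are assumptions for illustration. It relies on both lists being sorted in the same string order (the order used by ge), and it adds two details of my own to the loop above: a guard for an empty word list, and a final step that keeps the words sorting after the last dictionary entry.

```perl
use strict;
use warnings;

# Hypothetical helper: $dict_fh yields sorted dictionary words, one per line;
# @$words is sorted the same way. Returns the words not in the dictionary.
sub filter_sorted {
    my ($dict_fh, $words) = @_;
    my @good;
    return @good unless @$words;    # nothing to check
    my $index = 0;
    LINE: while (my $line = <$dict_fh>) {
        chomp $line;
        # Consume every candidate that sorts at or before the dictionary word
        while ($line ge $words->[$index]) {
            push @good, $words->[$index] unless $line eq $words->[$index];
            last LINE if ++$index == @$words;
        }
    }
    # Candidates sorting after the last dictionary word cannot be in it
    push @good, @{$words}[$index .. $#$words];
    return @good;
}

# Made-up sample data, already sorted
my $sorted_dict = "a\nof\nthe\n";
open my $sorted_fh, '<', \$sorted_dict or die "Cannot open in-memory file: $!";

my @kept = filter_sorted($sorted_fh, [qw(brown doom fox of quick the zebra)]);
print "@kept\n";    # prints "brown doom fox quick zebra"
```

Unlike the hash version, this needs no lookup structure at all, and it can stop reading the dictionary as soon as the last candidate has been handled.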
