PerlMonks  


Yet another way to do it, which might be much faster if you have very few words to check (your example has only one, but this could be done in a loop), is to do it the other way around: collect the words to be tested first, then check whether any of the words in your dictionary match them:

    my @words = get_words_to_check();
    my %hash  = map { $_ => 1 } @words;
    while (my $line = <>) {
        chomp $line;
        delete $hash{$line} if exists $hash{$line}; # The "if exists" isn't required here, but it does make it look cleaner
    }
    my @good_words   = grep { exists $hash{$_} } @words; # Keep the original order
    my @good_words_2 = keys %hash;                        # Don't care about the original order
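For anyone who wants to try the idea without a dictionary file, here is a minimal self-contained sketch. The helper name filter_stop_words and the sample data are my own, purely for illustration; it reads the dictionary from an in-memory filehandle instead of <>, and stops early once every candidate has been ruled out:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the words from
# @$words that never show up as a line on $dict_fh, in their original order.
sub filter_stop_words {
    my ($words, $dict_fh) = @_;
    my %candidate = map { $_ => 1 } @$words;
    while (my $line = <$dict_fh>) {
        chomp $line;
        delete $candidate{$line};   # deleting a missing key is harmless
        last unless %candidate;     # nothing left to rule out
    }
    return grep { exists $candidate{$_} } @$words;
}

# Sample data, assumed for the example only
my $dictionary = "a\nan\nand\nthe\n";
open my $fh, '<', \$dictionary or die "Cannot open in-memory file: $!";

my @words      = qw(the quick brown fox and the lazy dog);
my @good_words = filter_stop_words(\@words, $fh);
print "@good_words\n";   # prints "quick brown fox lazy dog"
```

Memory use here depends on the number of words to check, not on the size of the dictionary, which is why this tends to pay off when the list of words is short.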

Or, if both word lists are sorted (in the same string order that ge and eq compare by), something like this might work:

    my @good_words;
    my $index = 0;
    LINE: while (my $line = <>) {
        chomp $line;
        # While the word in the dictionary is past (or equal to) the word to check
        while ($line ge $words[$index]) {
            # Store the word as OK, unless it is equal to the current dict word
            push @good_words, $words[$index] unless $line eq $words[$index];
            # Use the next word from the list of words to check, if any
            last LINE if ++$index == @words;
        }
    }
    # Words that sort after the last dictionary word were never reached, so they are OK too
    push @good_words, @words[$index .. $#words];
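A runnable variation on the same merge idea, with the loops swapped so that the words to check drive the outer loop. That way words sorting after the last dictionary entry are kept without any extra step. This is only a sketch: the helper name filter_sorted and the sample lists are made up for the example, and both lists are assumed to be sorted with a plain sort (string order):

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post). Both array refs must be
# sorted in Perl's default string order, as produced by a plain sort.
sub filter_sorted {
    my ($words, $stop_words) = @_;
    my @good_words;
    my $i = 0;   # current position in @$stop_words
    for my $word (@$words) {
        # Skip stop words that sort before the current word
        $i++ while $i < @$stop_words && $stop_words->[$i] lt $word;
        # Keep the word unless the stop list holds it at this position
        push @good_words, $word
            unless $i < @$stop_words && $stop_words->[$i] eq $word;
    }
    return @good_words;
}

# Sample data, assumed for the example only
my @words      = sort qw(the quick brown fox and lazy dog);
my @stop_words = sort qw(a an and the);
print join(' ', filter_sorted(\@words, \@stop_words)), "\n";
# prints "brown dog fox lazy quick"
```

Each list is walked exactly once, so the cost is linear in the combined length of the two lists and no hash is needed.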


In reply to Re: Filtering out stop words by Eily
in thread Filtering out stop words by IB2017
