
Re^2: matching the words

by venky4289 (Novice)
on Feb 16, 2013 at 20:16 UTC (#1019072)

in reply to Re: matching the words
in thread matching the words

I have stored all the dictionary words in one array, and all the words I need to search for in another. For example:

    @array1 = ("fil_", "t_xt");            # words that need to be filled in
    @array2 = ("file", "text", "fils");    # words from the dictionary file

Now I need to match the elements of @array1 against @array2, and I should get this output:

    fil_ : file,fils
    t_xt : text

Thanks for your replies.

Replies are listed 'Best First'.
Re^3: matching the words
by Kenosis (Priest) on Feb 16, 2013 at 21:10 UTC

    Yes, that helped! Here's one option that substitutes the "_" with ".+" for use in a regex (use "." if you want only one letter to match):

    use strict;
    use warnings;

    my @array1 = qw (fil_ t_xt _erl);
    my @array2 = qw (Merlin file text fils perl filled);

    for my $stem (@array1) {
        my $re = $stem;
        $re =~ s/_/.+/;
        /\b$re\b/ and print "$stem: $_\n" for @array2;
    }


    fil_: file
    fil_: fils
    fil_: filled
    t_xt: text
    _erl: perl

    Only whole words are matched, because word boundaries (\b) are used in the regex. If they are omitted, the regex will also match substrings within the dictionary words.
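    A minimal sketch of that difference, using the same ".+" substitution. The helper matches_for() is hypothetical and only exists for this illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the code above): return the dictionary
# words matched by a stem, with or without \b around the pattern.
sub matches_for {
    my ($stem, $anchored, @dict) = @_;
    (my $re = $stem) =~ s/_/.+/;                      # "_" stands for one or more characters
    my $pattern = $anchored ? qr/\b$re\b/ : qr/$re/;  # whole word vs. substring
    return grep { $_ =~ $pattern } @dict;
}

my @dict = qw(Merlin file text fils perl filled);
print "with \\b:    @{[ matches_for('_erl', 1, @dict) ]}\n";   # perl
print "without \\b: @{[ matches_for('_erl', 0, @dict) ]}\n";   # Merlin perl
```

    Without the boundaries, "_erl" also hits "Merlin", because "Merl" is a substring of it.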

    Hope this helps!

    Update: Below is an updated version which adapts BrowserUk's preferred solution:

    use strict;
    use warnings;

    my @array1 = qw (fil_ t_xt _erl);
    my @array2 = qw (Merlin file text fils perl filled);
    my $words  = join ' ', @array2;

    for my $stem (@array1) {
        my $re = $stem;
        $re =~ s/_/./;
        print "$stem: $1\n" while $words =~ /\b($re)\b/g;
    }


    fil_: file
    fil_: fils
    t_xt: text
    _erl: perl

    If you want the "_" to be matched by more than one letter in the dictionary words, change the substitution to $re =~ s/_/\\S+/;.
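    To get the grouped output asked for in the question (fil_ : file,fils), the matches can be collected per stem before printing. This is only a sketch, assuming each "_" stands for exactly one character. The helper name group_matches is hypothetical, and building the pattern with split and quotemeta is an addition so that stems with several blanks or regex metacharacters still work:

```perl
use strict;
use warnings;

# Hypothetical helper: map each stem to the list of dictionary words it matches.
sub group_matches {
    my ($stems, $dict) = @_;
    my %found;
    for my $stem (@$stems) {
        # Escape the literal pieces, then let every "_" stand for one character.
        my $re = join '.', map { quotemeta } split /_/, $stem, -1;
        $found{$stem} = [ grep { /\A$re\z/ } @$dict ];
    }
    return \%found;
}

my @array1 = qw(fil_ t_xt);
my @array2 = qw(file text fils);
my $found  = group_matches(\@array1, \@array2);

# Prints "fil_ : file,fils" and "t_xt : text"
print "$_ : ", join(',', @{ $found->{$_} }), "\n" for @array1;
```

    This version tests each dictionary word separately, so it is the simpler but slower approach; for a large dictionary the single-string search is the better fit.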

      Rather than loading the real words as an array, I'd load them as a single whitespace delimited string.

      It is far faster (about 20 times in the benchmark below) to invoke the regex engine once, searching for one word in a string containing hundreds or thousands of words, than to invoke it hundreds or thousands of times to match against one word at a time:

      #! perl -slw
      use strict;
      use Benchmark qw[ cmpthese ];

      our $words = do{ local( @ARGV, $/ ) = 'words.txt'; <> };
      our @words = split ' ', $words;

      our $P //= 0.01;
      our @toLookFor = map {
          rand() > $P ? () : do {
              my $w = $_;
              my $p = int( rand length()-1 );
              $w =~ s[.{$p}\K.][.];
              $w;
          };
      } @words;

      printf "Looking for %d terms amongst %d words\n",
          scalar @toLookFor, scalar @words;

      cmpthese 1, {
          a => q[
              for my $re ( @toLookFor ) {
                  m[^$re$] #and print "a:$re :: $_"
                      for @words;
              }
          ],
          b => q[
              $words =~ m[\b($_)\b] #and print "b:$_ :: $1"
                  for @toLookFor;
          ],
      }
      __END__
      C:\test>junk42
      Looking for 1846 terms amongst 178691 words
        s/iter     a     b
      a   85.2    --  -95%
      b   3.94 2065%    --

      C:\test>junk42 -P=0.02
      Looking for 3564 terms amongst 178691 words
        s/iter     a     b
      a    166    --  -95%
      b   7.83 2022%    --


        This makes much sense. Will add an adapted version to my original response. Greatly appreciate this!
