For this matching, is there only one user per line? Once a user is found on a line, should it never be looked for again? I've written samples for you which answer those questions differently. There may be ways to optimize each of these for your specific problem. It just depends on which problem you are solving.

There is a bug in your original code - you said keys %$users and then dereferenced the key directly like $user->{'Pattern'}. $user is a plain string so that is a symbolic reference. Using strict would have caught that bug for you. You meant to write $users->{ $user }{ 'Pattern' } which properly looks up the value named $user in the hash reference $users.
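To make the difference concrete, here is a minimal sketch. The contents of $users are made up for illustration (I am assuming a hash of hashes keyed by user name), and patterns_by_user and buggy_lookup_error are hypothetical helper names, not anything from your code.

```perl
use strict;
use warnings;

# Hypothetical data; the real shape of $users is assumed from the discussion.
my $users = {
    alice => { Pattern => 'alice\d*' },
    bob   => { Pattern => 'bob' },
};

# Correct form: look the key up in $users, then fetch 'Pattern'.
sub patterns_by_user {
    my ($users) = @_;
    my %pattern_for;
    for my $user ( keys %$users ) {
        # $user is a plain string here, not a reference.
        $pattern_for{$user} = $users->{$user}{'Pattern'};
    }
    return \%pattern_for;
}

# Buggy form: treats the key string as a hash reference. Under strict
# refs this dies at run time; the error text is returned for inspection.
sub buggy_lookup_error {
    my ($user) = @_;
    my $ok = eval { my $pattern = $user->{'Pattern'}; 1 };
    return $ok ? '' : $@;
}

my $pattern_for = patterns_by_user($users);
print "$_ => $pattern_for->{$_}\n" for sort keys %$pattern_for;
print buggy_lookup_error('alice');
```

Without strict, the buggy form would quietly read a global hash named after the user (%alice here) and hand back undef, which is why the bug is easy to miss.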

There is a potential bug depending on your data. The string "aa" is matched by both the pattern "a" and the pattern "aa". If you stop at the first match, the more complete, perhaps more correct, match is never attempted. You may need to adjust your logic to compare the lengths of the matches to see which pattern matched "better". None of my examples correct for this.
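The four samples leave that out. For reference, a length-aware comparison could look roughly like this. It is only a sketch: best_match is a hypothetical helper, the data is made up, and "longest match wins, ties go to the alphabetically first user" is my assumption about what "better" means for you.

```perl
use strict;
use warnings;

# Hypothetical helper: among all users whose pattern matches $line,
# return the name of the one with the longest match, or undef if none
# match. Ties go to the alphabetically first user name.
sub best_match {
    my ( $users, $line ) = @_;
    my ( $best_user, $best_len );
    for my $user ( sort keys %$users ) {
        my $re = qr/$users->{$user}{'Pattern'}/i;
        next unless $line =~ $re;
        my $len = $+[0] - $-[0];    # length of the whole match
        if ( !defined $best_len or $len > $best_len ) {
            ( $best_user, $best_len ) = ( $user, $len );
        }
    }
    return $best_user;
}

# Made-up data reproducing the "a" versus "aa" case.
my $users = {
    short => { Pattern => 'a' },
    long  => { Pattern => 'aa' },
};

print best_match( $users, "aa\n" ), "\n";    # prints "long"
```

This tries every pattern on every line, so it gives up the early exit that first provides in the samples below.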

Each line may match multiple users, and once a user is found it is not looked for anymore. This may be the fastest because it can shrink the search space by several users with each iteration.

    # Precompile all the patterns and store them into the key
    # CompiledPattern
    $_->{'CompiledPattern'} = qr/$_->{'Pattern'}/i for values %$users;

    # Users not yet found; only the keys matter.
    my %unmatched_users;
    @unmatched_users{ keys %$users } = ();

    while ( my $line = <> ) {
        # ...
        my @users = grep $line =~ $users->{$_}{'CompiledPattern'},
            keys %unmatched_users;
        if ( @users ) {
            warn "Great, we found "
                . join( ', ', map $_->{'Pattern'}, @{$users}{ @users } )
                . " user(s)!\n";
            # Stop looking for the users we just found.
            delete @unmatched_users{ @users };
        }
        else {
            warn "$line didn't match any users.\n";
        }
    }

Each line may match one user. Once a user is found, it is not looked for anymore. This may be the fastest because it reduces the search space with each successful match and stops looking as soon as any match is found.

    use List::Util 'first';

    # Precompile all the patterns and store them into the key
    # CompiledPattern
    $_->{'CompiledPattern'} = qr/$_->{'Pattern'}/i for values %$users;

    # Users not yet found; only the keys matter.
    my %unmatched_users;
    @unmatched_users{ keys %$users } = ();

    while ( my $line = <> ) {
        # ...
        # first() stops at the first user whose pattern matches.
        my $user = first { $line =~ $users->{$_}{'CompiledPattern'} }
            keys %unmatched_users;
        if ( defined $user ) {
            # $user is a key, so look the pattern up through $users.
            warn "Great, we found pattern $users->{$user}{'Pattern'}!\n";
            delete $unmatched_users{ $user };
        }
        else {
            warn "$line didn't match any users.\n";
        }
    }

Each line may match *one* user but users may be found on multiple lines. The search space remains constant.

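    use List::Util 'first';

    # Precompile all the patterns and store them into the key
    # CompiledPattern
    $_->{'CompiledPattern'} = qr/$_->{'Pattern'}/i for values %$users;

    while ( my $line = <> ) {
        # ...
        # first() stops at the first user whose pattern matches.
        my $user = first { $line =~ $users->{$_}{'CompiledPattern'} }
            keys %$users;
        # Test definedness so a user named "0" still counts as found.
        if ( defined $user ) {
            # $user is a key, so look the pattern up through $users.
            warn "Great, we found pattern $users->{$user}{'Pattern'}!\n";
        }
        else {
            warn "$line didn't match any users.\n";
        }
    }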

Each line may match multiple users and users may be found on multiple lines. This is the worst case sample you already had.

    # Precompile all the patterns and store them into the key
    # CompiledPattern
    $_->{'CompiledPattern'} = qr/$_->{'Pattern'}/i for values %$users;

    while ( my $line = <> ) {
        # ...
        # Every pattern is tried against every line.
        my @users = grep $line =~ $users->{$_}{'CompiledPattern'},
            keys %$users;
        if ( @users ) {
            warn "Great, we found "
                . join( ', ', map $_->{'Pattern'}, @{$users}{ @users } )
                . " user(s)!\n";
        }
        else {
            warn "$line didn't match any users.\n";
        }
    }

In reply to Re: Matching against list of patterns by diotalevi
in thread Matching against list of patterns by Eyck
