PerlMonks  

Re: Comparing a value to a list of numbers

by davido (Cardinal)
on Jan 29, 2021 at 17:29 UTC ( #11127651=note )


in reply to Comparing a value to a list of numbers

The hashtable lookup is by far the fastest approach if you can spare the time to build it in the first place. You wouldn't use the hash approach if you are only doing a search one time on the data set. But if you are doing it many times, the hash approach quickly shines.

Other viable approaches include regex alternation, any, and grep. Observe:

#!/usr/bin/env perl

use strict;
use warnings;
use List::Util qw(any);
use Benchmark qw(cmpthese);

my @numlist = map {int(rand(100_000))} 1..10_000;

my %numlist_table;
@numlist_table{@numlist} = ();

my $numlist_re = do {
    my $numlist_str = join('|', @numlist);
    qr/^(?:$numlist_str)$/;
};

my $target = 42;

sub sgrep {
    return grep {$target == $_} @numlist;
}

sub table {
    return exists $numlist_table{$target};
}

sub alternation {
    return $target =~ m/$numlist_re/;
}

sub sany {
    return any {$target == $_} @numlist;
}

cmpthese(-3, {
    grep        => \&sgrep,
    table       => \&table,
    alternation => \&alternation,
    any         => \&sany,
});

The results:

                  Rate     grep     any alternation table
grep            4055/s       --    -28%       -100% -100%
any             5641/s      39%      --       -100% -100%
alternation  3440637/s   84745%  60895%          --  -93%
table       48177809/s 1187946% 853994%       1300%    --

However, if you consider the time it takes to build the alternation regex, and to build the hash, they are less efficient options for small data sets, or for small numbers of lookups. For large datasets with large numbers of lookups, the hash is a good answer (unless memory is tight).
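To get a rough feel for that setup cost, here is a minimal sketch (not part of the benchmark above) in which the hash is rebuilt on every call, so each iteration pays for construction plus one lookup, compared against a plain any scan. The sub names build_and_lookup and any_lookup are made up for illustration, and the numbers will vary with machine and list size.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(any);
use Benchmark qw(cmpthese);

my @numlist = map { int(rand(100_000)) } 1 .. 10_000;
my $target  = 42;

# Hypothetical one-off case: build the hash, then do a single lookup.
sub build_and_lookup {
    my %table;
    @table{@numlist} = ();    # hash slice: every list value becomes a key
    return exists $table{$target};
}

# Linear scan that stops at the first match; needs no setup.
sub any_lookup {
    return any { $target == $_ } @numlist;
}

# Run each candidate for at least 1 CPU second.
cmpthese(-1, {
    build_and_lookup => \&build_and_lookup,
    any_lookup       => \&any_lookup,
});
```

If the same list is searched more than a handful of times, the construction cost is paid once and spread over all lookups, which is the case the first benchmark measures.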


Dave

Replies are listed 'Best First'.
Re^2: Comparing a value to a list of numbers
by LanX (Cardinal) on Jan 29, 2021 at 17:40 UTC
    Well, there is another question lurking behind this one, which is IMHO not too tangential:

    What if large ranges are involved? The OP showed at least a little one with 41-56.

    I'm sure a binary search would outperform a hash then.

    (Of course not if using random keys like you did)

    Cheers Rolf
    (addicted to the Perl Programming Language :)

    update

    Well, I need to be more precise: large ranges are sometimes not practical with hashes because of their memory footprint, which can lead to heavy swapping.

    Slightly slower code using a binary search, O(log n), is often better in that case than a hash lookup with O(1).
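A minimal sketch of the range idea, assuming the list can be stored as sorted, non-overlapping inclusive ranges. Only 41-56 comes from the OP; the other ranges and the sub name in_ranges are invented for illustration. Memory grows with the number of ranges, not with the number of values they cover.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Sorted, non-overlapping inclusive ranges [low, high].
# A single value n is stored as [n, n]. Only 41-56 is from the OP.
my @ranges = ([3, 3], [17, 20], [41, 56], [1_000, 2_000_000]);

# Binary search over the ranges: O(log n) in the number of ranges.
sub in_ranges {
    my ($target, $ranges) = @_;
    my ($lo, $hi) = (0, $#{$ranges});
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        my ($start, $end) = @{ $ranges->[$mid] };
        if    ($target < $start) { $hi = $mid - 1 }    # look in the left half
        elsif ($target > $end)   { $lo = $mid + 1 }    # look in the right half
        else                     { return 1 }          # start <= target <= end
    }
    return 0;
}

for my $n (42, 57) {
    printf "%d is %s\n", $n, in_ranges($n, \@ranges) ? 'in the list' : 'not in the list';
}
```

The last range covers about two million values with one array entry, where a hash would need one key per value.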

Re^2: Comparing a value to a list of numbers
by LanX (Cardinal) on Jan 30, 2021 at 02:30 UTC
    Hi Dave,

    > my $target = 42;

    I think using a fixed number might not lead to accurate results when some of the routines do a linear search, since their cost depends on where (or whether) the target occurs in the list.

    But I have to admit, I don't know yet how best to combine a random $target with Benchmark.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
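One possible way to combine random targets with Benchmark, offered as a sketch and not as an established recipe: generate the targets before timing starts, and let each candidate walk the same list with its own cursor. That keeps rand() out of the timed code and gives every routine the same mix of hits and misses. The helper make_cycler and the pool size of 1_000 are assumptions for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(any);
use Benchmark qw(cmpthese);

my @numlist = map { int(rand(100_000)) } 1 .. 10_000;
my %numlist_table;
@numlist_table{@numlist} = ();

# Pre-generate the targets so rand() is not charged to the timed code.
my @targets = map { int(rand(100_000)) } 1 .. 1_000;

# Hypothetical helper: wrap a lookup so each call takes the next target,
# wrapping around at the end. Each wrapper has a private cursor.
sub make_cycler {
    my ($lookup) = @_;
    my $i = 0;
    return sub {
        my $target = $targets[ $i++ % @targets ];
        return $lookup->($target);
    };
}

cmpthese(-1, {
    grep  => make_cycler(sub { my $t = shift; scalar grep { $t == $_ } @numlist }),
    any   => make_cycler(sub { my $t = shift; any { $t == $_ } @numlist }),
    table => make_cycler(sub { exists $numlist_table{ $_[0] } }),
});
```

The array index and modulo add a small constant cost to every candidate, so the very fast ones (the hash lookup in particular) will look somewhat slower than in a fixed-target benchmark.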
