Re^5: Hash versus chain of elsifs

G'day mldvx4,

I agree with others that a hash is likely to be more efficient than a chain of elsifs. Having said that, as a general rule of thumb, you should benchmark (see the core Benchmark module): Perl may have already optimised what you're trying to do (in which case you'd be both wasting your time and bloating your code); different algorithms may be more or less efficient depending on the data (e.g. number of strings, length of individual strings, total size of data); and so on. Don't guess; benchmark.
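As a rough illustration of that advice, here is a minimal sketch using cmpthese from the core Benchmark module. The sub names by_hash and by_elsif are made up for this sketch, and the site names are just the ones from this thread's example; real data could well give different relative results.

```perl
#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;

use Benchmark qw{cmpthese};

# Assumed test data, for illustration only.
my @junk       = qw{x.com www.x.com z.com www.z.com};
my @test_sites = qw{x.com y.com www.z.com www.y.com};

my %is_junk = map +($_, 1), @junk;

# Hypothetical sub: a single hash lookup.
sub by_hash {
    my ($key) = @_;
    return exists $is_junk{$key} ? 1 : 0;
}

# Hypothetical sub: the equivalent chain of string comparisons.
sub by_elsif {
    my ($key) = @_;
    if    ($key eq 'x.com')     { return 1 }
    elsif ($key eq 'www.x.com') { return 1 }
    elsif ($key eq 'z.com')     { return 1 }
    elsif ($key eq 'www.z.com') { return 1 }
    return 0;
}

# Both versions must agree before the timings mean anything.
for my $site (@test_sites) {
    die "Mismatch for '$site'" unless by_hash($site) == by_elsif($site);
}

# A negative count means "run each for at least that many CPU seconds".
cmpthese(-1, {
    'hash'        => sub { by_hash($_)  for @test_sites },
    'elsif_chain' => sub { by_elsif($_) for @test_sites },
});
```

With only four strings the difference may be small either way; rerun with data that resembles your real input before drawing conclusions.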

"Any other performance and style tips or pointers welcome."

When asked for sample code, provide code that we can run, and output that shows it runs correctly. If you can't get your code to produce the desired output, indicate what you expected and show what you actually got (including all error and warning messages verbatim between <code>...</code> tags). I suggest you read "SSCCE".
Your package name should probably only contain one 'n', i.e. JunkSites.
Always put use strict; and use warnings; at the top of your code.
Don't use $a or $b as general variables. They're special: sort uses them as its comparison variables. See "$a".
Use state, instead of my, to declare persistent variables. Note that state was introduced in Perl v5.10.
The code you show for sub KnownJunkSite {...} looks very wrong. See my example code below for what I think is closer to what you're after.

Example code:

#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;

my @test_sites = qw{x.com y.com www.z.com www.y.com};
check_junk($_) for @test_sites;

sub KnownJunkSite {
    my ($key) = @_;

    # Built once, on the first call; later calls reuse the same hashref.
    state $is_junksite = { map +($_, 1), qw{
        x.com www.x.com z.com www.z.com
    } };

    return exists $is_junksite->{$key} ? 1 : 0;
}

sub check_junk {
    my ($key) = @_;

    say "$key: ", KnownJunkSite($key);
}

Output:

x.com: 1
y.com: 0
www.z.com: 1
www.y.com: 0
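To make the tips about state and about $a/$b more concrete, here is a small sketch. The sub names count_with_my and count_with_state are invented purely for illustration; they are not from your code.

```perl
#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;

# Hypothetical helper: a 'my' variable is re-initialised on every call.
sub count_with_my {
    my $count = 0;
    return ++$count;
}

# Hypothetical helper: a 'state' variable is initialised once
# and keeps its value between calls.
sub count_with_state {
    state $count = 0;
    return ++$count;
}

my @my_results    = map count_with_my(),    1 .. 3;
my @state_results = map count_with_state(), 1 .. 3;

say "my:    @my_results";
say "state: @state_results";

# $a and $b are the variables sort uses for each comparison,
# which is why they shouldn't be reused for anything else.
my @sorted = sort { $a <=> $b } (10, 9, 100);
say "sorted: @sorted";
```

This prints "my:    1 1 1", then "state: 1 2 3", then "sorted: 9 10 100".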

— Ken
