Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things

General Advice

by yoda54 (Monk)
on Jun 27, 2013 at 01:11 UTC ( #1040885=perlquestion: print w/replies, xml ) Need Help??
yoda54 has asked for the wisdom of the Perl Monks concerning the following question:


I'm writing a search to go through an array of data, I'm looking for any suggestions on algorithms, structuring, or any language advice to make my code better. Also, what is a better way to build my regexes?


search(["bobcat", "boomerang", "beer", "bat"], ["-bo"], ["b"]); sub search { my ($data, $negative_criteria, $positive_criteria = @_; my $temp_array; my $results; my $match_regex = join "|", @$positive_criteria; # remove trailing "-" which denotes negated criteria my $negative_regex = [map { s/^-//, $_ } @$negative_criteria]; $negative_regex = join "|", @$negative_criteria; # filter through all matches to array foreach(@$data) { if ($_ =~ m/$match_regex/) { push @$temp_array, $_; } } # remove negative criteria and build new array foreach(@$temp_array) { if ($_ !~ m/$negative_regex/) { push @$results, $_; } } my $count = 0; foreach(@$results) { print "$count - $_\n"; $count++; } return $results; } Output: 0 - beer 1 - bat

Replies are listed 'Best First'.
Re: General Advice
by hbm (Hermit) on Jun 27, 2013 at 01:31 UTC

    use strict; use warnings; catches one error - a missing paren.

    I'd build $negative_regex in one pass:

    my $negative_regex = join '|', map { s/^-//, $_ } @$negative_criteria;

    And test both regexes in one pass:

    foreach(@$data) { push @$results, $_ if /$match_regex/ && !/$negative_regex/; }

      use strict; use warnings; catches one error - a missing paren.

      No , perl catches syntax errors

Re: General Advice
by NetWallah (Canon) on Jun 27, 2013 at 04:57 UTC
    One regex to rule them all !
    Uses negative look-ahead assertion.
    use strict; use warnings; search(["bobcat", "boomerang", "beer", "bat"], ["-bo"], ["b"]); sub search { my ($data, $negative_criteria, $positive_criteria )= @_; my $re="^(?!" . join("|", map{ s/^-//;$_ }@$negative_criteria) . ") +" . join "|", @$positive_criteria; $re=qr|$re|; my @results = grep { m/$re/ } @$data; for (0..$#results){ print $_+1 . " - $results[$_]\n"; } return \@results; }

                 "The trouble with the Internet is that it's replacing masturbation as a leisure activity."
            -- Patrick Murray

Re: General Advice
by hdb (Monsignor) on Jun 27, 2013 at 07:56 UTC

    I recommend using grep in this case. One thing to observe on everything proposed so far: the use of s/// within sub search has side effects outside of the the sub as it modifies the input! If it is an anonymous array reference it does not make any difference, but otherwise it might create problems later. See the difference between the line commented out below and the following one.

    use strict; use warnings; sub search { my ( $data, $neg, $pos ) = @_; my $regp = join "|", @$pos; # my $regn = join "|", map { s/^-//, $_ } @$neg; # warning: si +de effects outside of sub search! my $regn = join "|", map { /^-?(.*)/ } @$neg; # no side effect +s my @results = grep { /$regp/ && !/$regn/ } @$data; return \@results; } my $neg = ["-bo"]; my $results = search(["bobcat", "boomerang", "beer", "bat", "aaa" ], $ +neg, ["b"]); print "@$results\n"; print "@$neg\n";
Re: General Advice
by kcott (Chancellor) on Jun 27, 2013 at 06:51 UTC

    G'day yoda54,

    Try to match shorter patterns before longer ones. While your example only shows single items for your positive and negative criteria, your code suggests multiple items are possible. You haven't indicated the source of the criteria but, if qw{book and lots of other patterns then b} was valid positive criteria, you'd fail to match "book" in any of the $data items only to later successfully match "b" in all of them.

    ... join '|' => sort { length $a <=> length $b } @$xxxxxx_criteria;

    You've shown interpolating double quotes in many places where you don't actually want to interpolate anything. They may not be an issue in your minimal example; however, consider what happens if you're searching for, say, email addresses:

    ... "" ... # Oops!

    If the items in your criteria lists are mutually exclusive, consider wrapping your alternations in (?> ... ) to avoid backtracking. See perlre - Extended Patterns and perlre - Backtracking.

    -- Ken

      Thank you!
Re: General Advice
by clueless newbie (Chaplain) on Jun 27, 2013 at 11:56 UTC

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1040885]
Approved by Athanasius
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (4)
As of 2018-03-18 06:16 GMT
Find Nodes?
    Voting Booth?
    When I think of a mole I think of:

    Results (228 votes). Check out past polls.