Re: Finding values position in array

in reply to Finding values position in array

You could also refer to the very complete Getting Matching Items From An Array.

At first I said, "of course GrandFather is right", but upon further testing the situation is not so clear. The optimizations in grep give it an edge over the C style for if you have to look approximately 1/2 way into the array. This did not surprise me as much as the poor performance of a binary search. ( code borrowed from Binary search). Though I think after struggling to coax grep to return a list value into a scalar context I would avoid that solution unless you really want multiple values returned.

oko1 gives an implementation of grep but I was thinking more something like: grep {$array[$_] eq $value} (0..$#array) as shown in the code below. Which, as I said, may be just quick enough.

#!/usr/bin/perl

use Benchmark 'cmpthese';
use warnings;

@array = (  'a'..'zz' );
$value = 'mz';
my $res;
my $position;

print "for(): ".traditional().
      "\tforeach: ".for_each().
      "\tmap: ".maptohash().
      "\tgrep_if: ".grep_if().
      "\tbinary search: ".bin_search().
"\n";

cmpthese ( 1000, { for_loop =>  '$res = traditional()',
                       map => '$res = maptohash()',
                  for_each => '$res = for_each()',
                   grep_if => '$res = grep_if()',
                bin_search => '$res = bin_search()',
                }
 );

sub traditional{
    for ($position=0; $position <= $#array; $position++) {
        last if ($array[$position] eq $value);
    }
    return $position;
}

sub for_each{
    for  our $position ( 0..$#array ) {
        last if ($array[$position] eq $value) ;
    }
    return $position;
}

sub maptohash{
    %indexed = map{ $array[$_], $_ } (0..$#array);
    return $indexed{$value};
}

sub grep_if{
    my  @res = grep {$array[$_] eq $value} (0..$#array);
   return @res[0];
}

sub bin_search {
    return BinSearch( $value, \&cmpFunc, \@array );
}

sub BinSearch
{
my ($target, $cmp) = @_;
my @array = @{$_[2]};

my $posmin = 0;
my $posmax = $#array;

return -0.5 if &$cmp (0, \@array, $target) > 0;
return $#array + 0.5 if &$cmp ($#array, \@array, $target) < 0;

while (1)
  {
  my $mid = int (($posmin + $posmax) / 2);
  my $result = &$cmp ($mid, \@array, $target);

  if ($result < 0)
    {
    $posmin = $posmax, next if $mid == $posmin && $posmax != $posmin;
    return $mid + 0.5 if $mid == $posmin;
    $posmin = $mid;
    }
  elsif ($result > 0)
    {
    $posmax = $posmin, next if $mid == $posmax && $posmax != $posmin;
    return $mid - 0.5 if $mid == $posmax;
    $posmax = $mid;
    }
  else
    {
    return $mid;
    }
  }
}

sub cmpFunc
{
my ($index, $arrayRef, $target) = @_;
my $item = $$arrayRef[$index];

return $item cmp $target;
}


 %perl foo.pl                                                         
+    [10:49pm]
Scalar value @res[0] better written as $res[0] at foo.pl line 47.
for(): 363      foreach: 363    map: 363        grep_if: 363    binary
+ search: 363
             Rate        map   for_loop    grep_if bin_search   for_ea
+ch
map         200/s         --       -69%       -72%       -81%       -8
+8%
for_loop    650/s       224%         --        -8%       -39%       -6
+0%
grep_if     707/s       253%         9%         --       -33%       -5
+7%
bin_search 1058/s       428%        63%        50%         --       -3
+6%
for_each   1641/s       719%       153%       132%        55%         
+--
[download]

s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}

Comment on Re: Finding values position in array Select or Download Code

Replies are listed 'Best First'.
Re^2: Finding values position in array by Not_a_Number (Prior) on Apr 14, 2008 at 11:52 UTC
You should maybe point out that your subs are not actually equivalent. If `$value` occurs more than once in `@array`, then `maptohash` will return the index of the last occurrence, while the others will return that of the first occurrence.	[reply] [d/l] [select]
Re^3: Finding values position in array by starbolin (Hermit) on Apr 14, 2008 at 17:55 UTC
No need to as Not a Number has done it for me ;-)> Also significant is in some application is that the run-times of map_to_hash and a binary search are constant over the solution set where as any of the iterators will be significantly faster/slower depending on $value. I was thinking all this as I was coding, and that is part of the fun of this whole programming business, but I was just trying to get something to run with some kind of relevance to the OP's question. I find myself using grep or map more and more and almost never use the C style for() loop. I like the programming flexibility that mapping the indexes into a hash gives me even though that is the slowest as far as the run-times. As an example, the run-times of my current project are almost totally dominated by the time required to seek to the files. I am just about finished rolling all the file accesses into their own loops using predominately greps and maps. Something like this only using a HOH `@file_handles = map { some_code_to_open_files } @huge_filelist; @wanted_files = grep { some_file_access } @file_handles; @wanted_data = grep { some_more_file_access } @wanted_files; @return_codes = map { close $_ } @file_handles;` [download] This has the effect of eliminating a lot of nested parenthesis and makes the code look 'flat'. It also forces the procedural code outside of the loops and the data access code inside the loops. This drives the code naturally toward an OO style and will allow me to re-factor what was a very procedural script into OO code. What's more I've tested the results and the run-times are identical. I don't claim credit for this approach. Most of this was not my own idea. I got the inspiration from reading "Intermediate Perl". s//----->\t/;$~="JAPH";s//\r<$~~/;{s\|~$~-\|-~$~\|\|\|s \|-$~~\|$~~-\|\|\|s,<$~~,<~$~,,s,~$~>,$~~>,, $\|=1,select$,,$,,$,,1e-1;print;redo}	[reply] [d/l]
Re^3: Finding values position in array by starbolin (Hermit) on Apr 15, 2008 at 00:54 UTC
No need to as Not a Number has done it for me ;-)> Also significant is some application is that the run-times of map_to_hash and a binary search are constant over the solution set where as any of the iterators will be significantly faster/slower depending on $value. I was thinking all this as I was coding, and that is part of the fun of this whole programming business, but I was just trying to get something to run with some kind of relevance to the OP's question. I find myself using grep or map more and more and almost never use the C style for() loop. I like the programming flexibility that mapping the indexes into a hash gives me even though that is the slowest as far as the run-times. As an example, the run-times of my current project are almost totally dominated by the time required to seek to the files. I am just about finished rolling all the file accesses into their own loops using predominately greps and maps. Something like this only using a HOH `@file_handles = map { some_code_to_open_files } @huge_filelist; @wanted_files = grep { some_file_access } @file_handles; @wanted_data = grep { some_more_file_access } @wanted_files; @return_codes = map { close $_ } @file_handles;` [download] This has the effect of eliminating a lot of nested parenthesis and makes the code look 'flat'. It also forces the procedural code outside of the loops and the data access code inside the loops. This drives the code naturally toward an OO style and will allow me to re-factor what was a <st>very</st> procedural script into OO code. What's more I've tested the results and the run-times are identical. I don't claim credit for this approach. Most of this was not my own idea. I got the inspiration from reading "Intermediate Perl". s//----->\t/;$~="JAPH";s//\r<$~~/;{s\|~$~-\|-~$~\|\|\|s \|-$~~\|$~~-\|\|\|s,<$~~,<~$~,,s,~$~>,$~~>,, $\|=1,select$,,$,,$,,1e-1;print;redo}	[reply] [d/l]

In Section Seekers of Perl Wisdom