http://www.perlmonks.org?node_id=680214


in reply to Finding values position in array

You could also refer to the very complete Getting Matching Items From An Array.

At first I said, "of course GrandFather is right", but upon further testing the situation is not so clear. The optimizations in grep give it an edge over the C style for if you have to look approximately 1/2 way into the array. This did not surprise me as much as the poor performance of a binary search. ( code borrowed from Binary search). Though I think after struggling to coax grep to return a list value into a scalar context I would avoid that solution unless you really want multiple values returned.

oko1 gives an implementation of grep but I was thinking more something like: grep {$array[$_] eq $value} (0..$#array) as shown in the code below. Which, as I said, may be just quick enough.

#!/usr/bin/perl use Benchmark 'cmpthese'; use warnings; @array = ( 'a'..'zz' ); $value = 'mz'; my $res; my $position; print "for(): ".traditional(). "\tforeach: ".for_each(). "\tmap: ".maptohash(). "\tgrep_if: ".grep_if(). "\tbinary search: ".bin_search(). "\n"; cmpthese ( 1000, { for_loop => '$res = traditional()', map => '$res = maptohash()', for_each => '$res = for_each()', grep_if => '$res = grep_if()', bin_search => '$res = bin_search()', } ); sub traditional{ for ($position=0; $position <= $#array; $position++) { last if ($array[$position] eq $value); } return $position; } sub for_each{ for our $position ( 0..$#array ) { last if ($array[$position] eq $value) ; } return $position; } sub maptohash{ %indexed = map{ $array[$_], $_ } (0..$#array); return $indexed{$value}; } sub grep_if{ my @res = grep {$array[$_] eq $value} (0..$#array); return @res[0]; } sub bin_search { return BinSearch( $value, \&cmpFunc, \@array ); } sub BinSearch { my ($target, $cmp) = @_; my @array = @{$_[2]}; my $posmin = 0; my $posmax = $#array; return -0.5 if &$cmp (0, \@array, $target) > 0; return $#array + 0.5 if &$cmp ($#array, \@array, $target) < 0; while (1) { my $mid = int (($posmin + $posmax) / 2); my $result = &$cmp ($mid, \@array, $target); if ($result < 0) { $posmin = $posmax, next if $mid == $posmin && $posmax != $posmin; return $mid + 0.5 if $mid == $posmin; $posmin = $mid; } elsif ($result > 0) { $posmax = $posmin, next if $mid == $posmax && $posmax != $posmin; return $mid - 0.5 if $mid == $posmax; $posmax = $mid; } else { return $mid; } } } sub cmpFunc { my ($index, $arrayRef, $target) = @_; my $item = $$arrayRef[$index]; return $item cmp $target; } %perl foo.pl + [10:49pm] Scalar value @res[0] better written as $res[0] at foo.pl line 47. for(): 363 foreach: 363 map: 363 grep_if: 363 binary + search: 363 Rate map for_loop grep_if bin_search for_ea +ch map 200/s -- -69% -72% -81% -8 +8% for_loop 650/s 224% -- -8% -39% -6 +0% grep_if 707/s 253% 9% -- -33% -5 +7% bin_search 1058/s 428% 63% 50% -- -3 +6% for_each 1641/s 719% 153% 132% 55% +--


s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}

Replies are listed 'Best First'.
Re^2: Finding values position in array
by Not_a_Number (Prior) on Apr 14, 2008 at 11:52 UTC

    You should maybe point out that your subs are not actually equivalent.

    If $value occurs more than once in @array, then maptohash will return the index of the last occurrence, while the others will return that of the first occurrence.

      No need to as Not a Number has done it for me ;-)>

      Also significant is in some application is that the run-times of map_to_hash and a binary search are constant over the solution set where as any of the iterators will be significantly faster/slower depending on $value. I was thinking all this as I was coding, and that is part of the fun of this whole programming business, but I was just trying to get something to run with some kind of relevance to the OP's question.

      I find myself using grep or map more and more and almost never use the C style for() loop. I like the programming flexibility that mapping the indexes into a hash gives me even though that is the slowest as far as the run-times.

      As an example, the run-times of my current project are almost totally dominated by the time required to seek to the files. I am just about finished rolling all the file accesses into their own loops using predominately greps and maps. Something like this only using a HOH

      @file_handles = map { some_code_to_open_files } @huge_filelist; @wanted_files = grep { some_file_access } @file_handles; @wanted_data = grep { some_more_file_access } @wanted_files; @return_codes = map { close $_ } @file_handles;

      This has the effect of eliminating a lot of nested parenthesis and makes the code look 'flat'. It also forces the procedural code outside of the loops and the data access code inside the loops. This drives the code naturally toward an OO style and will allow me to re-factor what was a very procedural script into OO code. What's more I've tested the results and the run-times are identical.

      I don't claim credit for this approach. Most of this was not my own idea. I got the inspiration from reading "Intermediate Perl".


      s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}

      No need to as Not a Number has done it for me ;-)>

      Also significant is some application is that the run-times of map_to_hash and a binary search are constant over the solution set where as any of the iterators will be significantly faster/slower depending on $value. I was thinking all this as I was coding, and that is part of the fun of this whole programming business, but I was just trying to get something to run with some kind of relevance to the OP's question.

      I find myself using grep or map more and more and almost never use the C style for() loop. I like the programming flexibility that mapping the indexes into a hash gives me even though that is the slowest as far as the run-times.

      As an example, the run-times of my current project are almost totally dominated by the time required to seek to the files. I am just about finished rolling all the file accesses into their own loops using predominately greps and maps. Something like this only using a HOH

      @file_handles = map { some_code_to_open_files } @huge_filelist; @wanted_files = grep { some_file_access } @file_handles; @wanted_data = grep { some_more_file_access } @wanted_files; @return_codes = map { close $_ } @file_handles;

      This has the effect of eliminating a lot of nested parenthesis and makes the code look 'flat'. It also forces the procedural code outside of the loops and the data access code inside the loops. This drives the code naturally toward an OO style and will allow me to re-factor what was a <st>very</st> procedural script into OO code. What's more I've tested the results and the run-times are identical.

      I don't claim credit for this approach. Most of this was not my own idea. I got the inspiration from reading "Intermediate Perl".


      s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}