Gentlemen,
1st of all - huge thank you to everybody. It’s as always a pleasure to be here and have your questions answered. I *love* this community of wise, intelligent and professional hackers:-), (-- although I am not too much into Perl:-)- (sorry))..
A few general remarks:
1. I am sorry if my initial post caused a bit of frustration for some of you. It wasn’t intended to be. My point was:
I expect a higher quality of answers in FAQ than in general posts. And the more F is a Q, the higher quality the A (I assume) should be. That’s all.
It’s just my opinion of course, but I think many people treat “FAQ” as the only source of truth...
2. Thanks a lot for the two _binary search_ answers. Both seem to work, I’ll do some corner-case and performance testing and compare them to hash implementation as well, -- and probably post the results here (if nobody minds).
2-a). I’ve made two changes in find_int_in_array :
A)
#my ( $arref, $targ ) = @_; # args are array ref, int value
my ( $targ, $arref ) = @_; # args are: (int, array ref)
this is just a “style” thing, easier to remember (int in arr, not arr in int), plus - easier to test together with other impl-s by the same driver;
B)
Changed
my $nextidx = $asize / 2;
my $nextinc = $nextidx / 2;
to:
my $nextidx = int($asize / 2);
my $nextinc = int($nextidx / 2);
(Otherwise, for an array of 5 elements, it’d return an index = 2.5 ;-...)).
3. Re: using <code> v.s <pre>:
I do know the rules and tried not to, but,,, on my screen, it’s either <code> is too small (cannot read), or with the font size increase, everything else’s too big and bold..... Maybe smb. can review this policy? (Minor thing of course).
4. ($#array + 1) vs. scalar(@array) :
You’re going to laugh, but I did a performance test..:). The results is: $#array is ~10-15% faster than the other one. Intuitively, that’s what one would expect (my wild guess is that $#array is probably an internal counter that is always kept up-to-date, and the scalar(@array) gets calculated).
....This is a pure academical question of course:-)... It’s hard to imagine an application that would suffer from using the 2nd approach.
5. Using a hash vs. Binary search.
Again, hash works just fine, except for three implications:
- memory;
- slows down for huge amounts of data (vs. bisearch which is always linear);
- complex:-)
Enough said about ## 1 & 2; let me explain the 3d one:
... Imagine an application that updates an array rarely and looks up often. Say you have 10,000,000 customers / books / whatever that get added / published only once a minute, but information is requested 1000 per second.
So, re-building the hash for lookup 1000 times a second is clearly not an option. Right?
A solution to that might be (I’ll think in some pseudo-language now):
class Array_with_Hash:
@the_array = ()
% the_hash = ();
init(@a)
{
put (a =>> the_array);
recreate_hash (the_array, the_hash);
}
pop(elem):
{
array.pop(el)
update_hash(el);
}
...
Well, if you know for sure that push and pop is *all* that you do, that might be possible. However, creating a generic class like this might be really tricky.
- array must be locked (no access to it directly, -- only by this class’s methods;
- hash values probably have to be arrays_of_indexes...
- implementing sort(), slice-and-dice etc. might take some time;
- it’s quite easy to “forget” about something...
- ...
This only means that the hash approach only works if
- cost(recreating the hash every time) == nothing
- OR
- cost(programming time) == nothing..:)
.
In other words, -- thanks again for the binary search in Perl! (And it should be in FAQ, too...:-)...
-
Are you posting in the right place? Check out Where do I post X? to know for sure.
-
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big>
<blockquote> <br /> <dd>
<dl> <dt> <em> <font>
<h1> <h2> <h3> <h4>
<h5> <h6> <hr /> <i>
<li> <nbsp> <ol> <p>
<small> <strike> <strong>
<sub> <sup> <table>
<td> <th> <tr> <tt>
<u> <ul>
-
Snippets of code should be wrapped in
<code> tags not
<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).
-
Want more info? How to link
or How to display code and escape characters
are good places to start.