<?xml version="1.0" encoding="windows-1252"?>
<node id="1015415" title="Re^2: Array vs. Hash for sparsely integer-indexed data" created="2013-01-25 16:48:12" updated="2013-01-25 16:48:12">
<type id="11">
note</type>
<author id="171588">
BrowserUk</author>
<data>
<field name="doctext">
&lt;blockquote&gt;&lt;i&gt;Also not to be excluded entirely is ... uhh ... an array of integers that is searched sequentially each time. &lt;/i&gt;&lt;/blockquote&gt;

&lt;p&gt;That is a surprise winner for my 'Worst Advice of the Month (January, 2013)' prize. 

&lt;blockquote&gt;&lt;i&gt;... set up a few short test-runs ...&lt;/i&gt;&lt;/blockquote&gt;

&lt;p&gt;I'd suggest you take your own advice before doling it out to others. 

&lt;P&gt;&lt;b&gt;NB: The following uses 1/10 the number of values and 1/10 the number of lookups than the tests above for arrays, hashes and bit vectors, because it takes so long to complete.&lt;/b&gt;:&lt;code&gt;
@a = map $_*20, 1 .. 1e4;;
say total_size \@a;;
320176

$found=0; say time; for my $tgt ( 1 .. 20e4 ) { $tgt == $_ and ++$found and last for @a; }; say time; say $found;;
1359149323.02034
1359149541.80419
10000
&lt;/code&gt;

&lt;p&gt;Using that as the basis for estimation, the total size will be 3.2MB -- 12 times that required by the bitvector solution.

&lt;P&gt;And the lookups that took hashes 1 second; arrays 2/5ths of a second and bitvectors 1/3rd of a second; would take: &lt;b&gt;25 days!&lt;/b&gt;

&lt;p&gt;It is really hard to see any circumstance when this would be a viable solution.

&lt;div class="pmsig"&gt;&lt;div class="pmsig-171588"&gt;
&lt;hr /&gt;
&lt;font size=1 &gt;
&lt;div&gt;With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'&lt;/div&gt;
&lt;div&gt;Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.&lt;/div&gt;
&lt;div&gt;"Science is about questioning the status quo. Questioning authority". &lt;/div&gt;
&lt;div&gt;In the absence of evidence, opinion is indistinguishable from prejudice.
&lt;/div&gt;
&lt;/font&gt;

&lt;/div&gt;&lt;/div&gt;
&lt;!--
@a = map $_*20, 1 .. 1e5;;
say total_size \@a;;
3200176

$found=0; say time; for my $tgt ( 1 .. 20e4 ) { $tgt == $_ and ++$found and last for @a; }; say time; say $found;;
1359151579.12203
1359153746.14252
10000
--&gt;</field>
<field name="root_node">
1015388</field>
<field name="parent_node">
1015401</field>
</data>
</node>
