comment on

I have been thinking about the pros and cons of using hash or array.

So I made myself some data with a script approximating something like orbitals example, using the rand()-funktion on CD#, pdf-name, unknown digits in parenthesis and the different strings in brackets.
I know it's not the real thing but for what I trying to find out it seems OK to me.

The two approaches I've tested is either using a hash, straightforward and easy on the programmer, see my previous post. Or using an array and tucking the filepos to the right side, see petrals post.

The tests were run on a measly Pentium 133 with 64MB RAM and the results are as of the table below(YMWV).

The empty cells indicate heavy use of virtual memory.
Impatience being a virtue I only gave it double the expected time before I hit Ctrl-C.

kLines Filesize Hash Array

MB sec sec

100 6,16 11 12

150 9,24 16 18

175 10,78 19 22

190 11,704 22 24

200 12,32 23 25

210 12,936 24 26

220 13,552 25 27

230 14,168 62 29

250 15,4 31

300 18,48 37

350 21,56 44

400 24,64 107

500 30,8

My interpretation is that if you got the memory then the hash method is slightly faster, but using the array method will take you about twice as far in term of possibel file sizes.

Thinking about it, it seems rather logical considering that the hash is both key and value whilst array is value only.

It's also nice to see that both variations behave linearly with increasing volume until VM sets in, just as it's written in A Fresh Look at Efficient Perl Sorting by Uri Guttman and Larry Rosler.

In reply to Re: Slow at sorting? by guha
in thread Slow at sorting? by orbital

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


Pathologically Eclectic Rubbish Lister
	PerlMonks