Hashes and arrays can both be implemented as in memory data structures, or on disk data structures. Therefore it is perfectly reasonable to talk about what each looks like in memory and on disk.
In Perl the two options don't even look different for hashes, you just add an appropriate tie. The difference is slightly larger for arrays because you don't want to use the built-in sort on a large array that lives on disk.
As for my duplicates comment, I am not denying that a hash is better than an array whether or not there are duplicates. However the speed of accessing a hash is basically independent of how many duplicates there are in your incoming data structure. (There are subtle variations depending on, for instance, whether you are just before or after a hash split, but let's ignore that.) The speed of doing a merge sort that eliminates duplicates ASAP varies greatly depending on the mix of duplicates in your incoming data structure. Therefore the array solution improves relative to the hash as you increase the number of duplicates. This doesn't make the array solution
In fact in the extreme case where you have a fixed number of distinct lines in your data structure, the array solution improves from O(n log(n)) to O(n). The hash solution does not improve, it is O(n) regardless.
Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
Read Where should I post X? if you're not absolutely sure you're posting in the right place.
Please read these before you post! —
Posts may use any of the Perl Monks Approved HTML tags:
You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
- a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
Link using PerlMonks shortcuts! What shortcuts can I use for linking?
See Writeup Formatting Tips and other pages linked from there for more info.
| & || & |
| < || < |
| > || > |
| [ || [ |
| ] || ] ||