Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Re: What makes an array sorted and a hash unsorted?

by jethro (Monsignor)
on Jun 04, 2009 at 12:14 UTC ( #768415=note: print w/ replies, xml ) Need Help??


in reply to What makes an array sorted and a hash unsorted?

That means that hashes are just as sorted as arrays in theory

In theory (i.e. computer science theory) there are no arrays in perl
> perl -we '$a[4]=1; print shift @a; ' Use of uninitialized value within @a in print at -e line 1.

(ok, we could define UNDEF as a defined/actual/existing value of an array, but then why does scalar(@a) report 5 when there is a value in $a[37865]? )

In theory perls closest thing to arrays is an array, since for example $value==pop(push @a,$value)) (a fundamental property of theoretical arrays) while pop and push for hashes isn't even defined (even though someone could define it somehow in perl5.11 just to make a point ;-)

In practice perls data structures are as shifty as the language itself and you can implement hashes in arrays as well as arrays in hashes. So the sentence "arrays are ordered and hashes are not" (I hope we all concur by now that 'sorted' was the wrong word) is merely a short form for "arrays provide built-in infrastructure to manipulate ordered values in an efficient manner, while hashes do not"

After all, when we try to dissuade others from using hashes in some situations, it is not because it is impossible with hashes, but inefficient or awkward. So we can't compare arrays and hashes without looking at the costs to do operations or their built-in access methods. Because without that, i.e. "in theory" perl hashes and arrays are equivalent/just storage/interchangeable

UPDATE: changed a really idiotic length(@a) to scalar(@a).


Comment on Re: What makes an array sorted and a hash unsorted?
Select or Download Code
Re^2: What makes an array sorted and a hash unsorted?
by ikegami (Pope) on Jun 04, 2009 at 15:03 UTC

    but then why does length(@a) report 5 when there is a value in $a[37865]? )

    Because the string "37866" has five characters in it. You want 0+@a, scalar(@a) or similar to get the number of entries in the array.

    Did you think you were proving Perl arrays are sparse? That's not the case. Quite the opposite, Perl usually allocates more memory than needed to allow future growth.

    $ perl -MDevel::Peek -e'$a[37865]=1; Dump \@a,1; push @a,2; Dump \@a,1 +' SV = RV(0x8176690) at 0x814ed9c SV = PVAV(0x8153c64) at 0x814f6a8 FILL = 37865 <-- Index of last used ele MAX = 37865 <-- Index of last allocated ele SV = RV(0x8176690) at 0x814f684 SV = PVAV(0x8153c64) at 0x814f6a8 FILL = 37866 <-- Index of last used ele MAX = 65531 <-- Index of last allocated ele

    (Output trimmed to the relevant.)

    So we can't compare arrays and hashes without looking at the costs to do operations or their built-in access methods.

    Indeed. They each have their strengths.

      My only excuse for the mistake is I'm just editing some old c sources and working with char arrays is damaging to the mind, I even made the same mistake there and used length() instead of strlen(). Naturally I meant scalar(@a)

      Did you think you were proving Perl arrays are sparse?

      No, I was trying to prove that a perl array is not what a computer science book would call an array. In such a book an idealized data type 'array' wouldn't have holes. And after thinking about it, it wouldn't have a push or pop operator, that would be the data type 'stack'

        No, I was trying to prove that a perl array is not what a computer science book would call an array.

        I don't see why you think they aren't. Perl arrays are a continuous series of equally sized records, allowing for instant addressing of any element. That definitely fits the traditional formal definition.

        In such a book an idealized data type 'array' wouldn't have holes.

        Perl arrays aren't sparse. There are no holes. Every element exists. (Those three statements are synonymous.)

        And after thinking about it, it wouldn't have a push or pop operator, that would be the data type 'stack'

        Stacks (and queues, and ...) can be implemented using an array. They're not mutually exclusive.

        The presence of utility functions to manipulate a data structure doesn't change the data structure.

Re^2: What makes an array sorted and a hash unsorted?
by herveus (Parson) on Jun 04, 2009 at 17:25 UTC
    Howdy!

    In practice perls data structures are as shifty as the language itself and you can implement hashes in arrays as well as arrays in hashes. So the sentence "arrays are ordered and hashes are not" (I hope we all concur by now that 'sorted' was the wrong word) is merely a short form for "arrays provide built-in infrastructure to manipulate ordered values in an efficient manner, while hashes do not"

    Hashes and arrays are reasonably well defined fundamental data types in Perl. I'd say "clearly defined", but this thread casts doubt on that assertion. Only at a conceptual level can you really assert that equivalency in the second sentence. Arrays and hashes are very strong data types in Perl. You cannot cast one into the other. You can use certain operations on each to produce instances of the other (conflating lists with arrays here). Those operators are not type casters, however.

    In your code example, you first put a value in to the array at index 4. This causes indexes 0 through 3 to spring into existence, but, naturally, they have no defined value. When you then shift off the first value to print it, you get index 0 (the "first" value) and, being undefined, triggers the warning that you asked to have happen. It is unclear what that was meant to demonstrate.

    Later you speak of push/pop and how they are not defined for hashes. Of course. Push and pop (and shift and unshift) are all curried versions of splice, which is the general purpose function for mucking about with arrays. Since a hash isn't (and can't be coerced into) an array, splice can't do anything with one. Likewise, keys() and values() and each() are not defined for arrays. The two data types are very distinct. I'm not sure what you are trying to say there. It makes no sense.

    yours,
    Michael
      Your second-to-last sentence is part of what I am trying to say. Ikegami (as I understand his post) proposes that the ordering of keys intrinsic to an array isn't a difference to a hash because a hash can simulate that (and because internally there is an ordering in a hash anyway). My argument is that
      a) what perl does internally is beside the point and
      b) if we argue that a hash is as ordered as an array because we can make it so, then all arguments about differences of data types become meaningless because every data type can simulate every other data type.
        Howdy!

        I posted some sample code that demonstrates that inserting a key into a hash can change the order of the keys that were present before that insertion. That shows that the keys are inherently unordered. If you us one of the modules out there to hack the inner workings of a hash to control the order of keys you get when calling keys(), it's not a plain hash anymore.

        Through the magic of tie and overload and XS, you can subvert and pervert the behavior of things, but that doesn't change the fundamental data types of scalar, hash, and array. Claiming that a hash is ordered is silly, on a par with claiming that pi is actually equal to 3. Being ordered implies a level of stability. If key X comes "before" key Y now, then orderedness implies that no sequence of insertions and deletions that do not invove X or Y will change that. It turned out to be trivial to demonstrate that that condition does not hold.

        yours,
        Michael

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://768415]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (9)
As of 2014-07-28 18:25 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (205 votes), past polls