The Concept of References
by leriksen (Curate) on Apr 14, 2005 at 06:19 UTC
Some people have a hard time understanding references in Perl - I know, because I did. Over time, though, and because I have ended up using references a lot in the code I write, I have become a lot more comfortable with them. Here's how I explain references to new Perl users.
Picture how your code may be laid out in memory. A statement like my $count = 10; might get laid out as follows:

    SYMBOL TABLE                   MEMORY      Address
    _________________              ___________
    ...                    $count->| 10      | 1000
    _________________              ___________
    | $count | 1000 |              |         | 1001
    _________________              ___________
    ...                            |         | 1002
    _________________              ___________

Here Perl has indicated that the contents of the variable $count are stored at memory location 1000. It stores that information in the symbol table, and in memory, at location 1000, it writes the value 10. When Perl wants to access the value associated with $count, it looks in the symbol table first, then it looks at the memory location indicated.

Now suppose we add some code like my $copy = $count; The resultant memory layout could be something like this:

    SYMBOL TABLE                   MEMORY      Address
    _________________              ___________
    ...                    $count->| 10      | 1000
    _________________              ___________
    | $count | 1000 |      $copy ->| 10      | 1001
    _________________              ___________
    | $copy  | 1001 |              |         | 1002
    _________________              ___________
    ...                            |         | 1003
    _________________              ___________

That is, $copy is a new entry in the symbol table, and the symbol table entry indicates that the place in memory holding $copy's value is different from the location of $count. Once the symbol table entry is created, Perl looks at the value in memory for $count (by looking in the symbol table for where to look in memory) and places the same value in $copy's memory location. So the values are the same, but the locations are different.

But neither of these is a reference. Let's see how some code that uses references, like my $ref = \$count; might end up getting laid out in memory. The resultant memory layout could be something like this:

    SYMBOL TABLE                   MEMORY      Address
    _________________              ___________
    ...                    $count->| 10      | 1000
    _________________              ___________
    | $count | 1000 |      $copy ->| 10      | 1001
    _________________              ___________
    | $copy  | 1001 |      $ref  ->| 1000    | 1002
    _________________              ___________
    | $ref   | 1002 |              |         | 1003
    _________________              ___________
    ...                            |         | 1004
    _________________              ___________

So the symbol table entry for $ref looks pretty much the same - it indicates where $ref's value will be stored in memory. And what is that value? Well, because we said to initialise it to be a reference to $count, it stores $count's memory address. If we printed $ref we'd see something like SCALAR(0x3e8). This says that $ref is a reference to a scalar stored at 3e8 (which is 1000 written in hex).

So what does that get us? Well, $ref is a reference to a scalar stored at 1000, so to get the value stored at that location we need to 'dereference' $ref - and we do this by putting an extra $ in front of it: $$ref.

Now that may seem a hard way to access $count's value. After all, we can just use $count instead of $$ref. But we can take references to other things - arrays, hashes, subroutines etc. Let's look at how an array reference could be done. Let's start with our symbol table and memory map layouts.
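Before the array layouts, here is the scalar walk-through so far as a minimal runnable sketch. It uses only the variable names from the text; the printed address is whatever Perl happens to choose on a given run, so it will not be the 0x3e8 of the diagrams.

```perl
use strict;
use warnings;

my $count = 10;        # a plain scalar
my $copy  = $count;    # an independent copy of the value
my $ref   = \$count;   # a reference: records where $count lives

# Stringifying a reference shows its type and an address.
# The address differs from run to run.
print "ref:   $ref\n";

# An extra sigil dereferences: read the value the reference points at.
print "value: $$ref\n";

# Writing through the reference changes $count, but not $copy.
$$ref = 11;
print "count: $count, copy: $copy\n";
```

The last line is the practical difference between the second and third diagrams: $copy has its own storage, while $ref shares $count's.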
For an array holding 10, 20 and 30:

    SYMBOL TABLE                   MEMORY      Address
    _________________              ___________
    ...                    @array->| 10      | 1000
    _________________              ___________
    | @array | 1000 |              | 20      | 1001
    _________________              ___________
    ...                            | 30      | 1002
    _________________              ___________
                                   |         | 1003
                                   ___________

So here we see that the symbol table says that @array is stored at memory location 1000, and the memory map shows how the initial values might be laid out - in this case memory location 1000 is where the first element goes, 1001 is the second element, and so on. Later we have some code like my @copy = @array;
    SYMBOL TABLE                   MEMORY      Address
    _________________              ___________
    ...                    @array->| 10      | 1000
    _________________              ___________
    | @array | 1000 |              | 20      | 1001
    _________________              ___________
    | @copy  | 1003 |              | 30      | 1002
    _________________              ___________
    ...                    @copy ->| 10      | 1003
    _________________              ___________
                                   | 20      | 1004
                                   ___________
                                   | 30      | 1005
                                   ___________
                                   |         | 1006
                                   ___________

This all seems pretty much what we would expect: every element is copied to a new location. What about a reference, say my $ref = \@array?
    SYMBOL TABLE                   MEMORY      Address
    _________________              ___________
    ...                    @array->| 10      | 1000
    _________________              ___________
    | @array | 1000 |              | 20      | 1001
    _________________              ___________
    | @copy  | 1003 |              | 30      | 1002
    _________________              ___________
    | $ref   | 1006 |      @copy ->| 10      | 1003
    _________________              ___________
    ...                            | 20      | 1004
    _________________              ___________
                                   | 30      | 1005
                                   ___________
                           $ref  ->| 1000    | 1006
                                   ___________
                                   |         | 1007
                                   ___________

So $ref is just like before - it holds where in memory @array is stored. And we can see this if we print $ref, which gives something like ARRAY(0x3e8). This time we see that $ref is a reference to an array - $ref knows what it is referring to. And we can dereference it too - the most common notations for this are $ref->[1] or $$ref[1]. I like the first notation - the arrow seems to read "go to the referenced value at index 1". YMMV.

What about hashes?
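Before turning to hashes, the array steps above can be sketched as runnable Perl. This is only an illustration: $second_arrow, $second_sigil and $length are names made up here, and the printed address will differ on every run.

```perl
use strict;
use warnings;

my @array = (10, 20, 30);
my @copy  = @array;     # copies all three elements
my $ref   = \@array;    # stores only where @array lives

print "ref: $ref\n";    # ARRAY(0x...) with a run-specific address

# Two ways to reach the element at index 1 through the reference.
my $second_arrow = $ref->[1];   # arrow notation
my $second_sigil = $$ref[1];    # double-sigil notation, same element

# A change made through the reference shows up in @array, not in @copy.
$ref->[1] = 21;
print "array: @array\n";
print "copy:  @copy\n";

# Dereferencing the whole array gives access to all of its elements.
my $length = scalar @$ref;
print "length: $length\n";
```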
A hash such as %hash = (a => 1, b => 2) might be laid out like this:

    SYMBOL TABLE                   MEMORY      Address
    _________________              ___________
    ...                    %hash ->| a       | 1000
    _________________              ___________
    | %hash  | 1000 |              | 1       | 1001
    _________________              ___________
    ...                            | b       | 1002
    _________________              ___________
                                   | 2       | 1003
                                   ___________
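The hash layout above, together with the reference added in the next diagram, can be sketched as follows. The keys and values come from the diagram; $a_value, $b_value and @keys are names invented for this sketch, and the dereferencing syntax simply mirrors the array case with braces instead of brackets.

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 2);
my $ref  = \%hash;      # a reference to the whole hash

print "ref: $ref\n";    # HASH(0x...) with a run-specific address

# Same two notations as for arrays, with braces instead of brackets.
my $a_value = $ref->{a};
my $b_value = $$ref{b};

# Adding a key through the reference changes %hash itself.
$ref->{c} = 3;

my @keys = sort keys %$ref;
print "keys: @keys\n";
```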
Taking a reference with my $ref = \%hash; adds one more entry:

    SYMBOL TABLE                   MEMORY      Address
    _________________              ___________
    ...                    %hash ->| a       | 1000
    _________________              ___________
    | %hash  | 1000 |              | 1       | 1001
    _________________              ___________
    | $ref   | 1004 |              | b       | 1002
    _________________              ___________
    ...                            | 2       | 1003
    _________________              ___________
                           $ref  ->| 1000    | 1004
                                   ___________

I won't go into references to other things (subs, filehandles etc.) but the concept is the same.

So what do we _do_ with references? Well, they let us make more complicated data structures, for one. Arrays are only allowed to hold scalar values - if we want an entry in an array to be another array, we can't just put the inner array in the list, because its elements get flattened into the outer one. But a reference is a scalar, so we can store a reference to the inner array instead. Hence we can make arrays of arrays (AoAs), hashes of hashes (HoHs) or AoHoAoHoHoA...

Secondly, they let us be more efficient about using large arrays or hashes in data structures. For example, say @ten_million had ten million elements - something like func(@ten_million); would be inefficient. This code passes @ten_million's contents to the function func as one flat list - all ten million of them - and a func that unpacks its arguments in the usual way (my @args = @_;) copies every one. A reference saves us all that overhead - just the location in memory goes to the subroutine.
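Both uses can be sketched in a few lines. Everything named here is an assumption made for illustration: @inner, @flat, @nested, the helper sum_of, and @big, a thousand-element stand-in for the ten-million-element array.

```perl
use strict;
use warnings;

my @inner = (1, 2, 3);

# Putting an array inside a list flattens it: five scalars, no nesting.
my @flat = (0, @inner, 4);

# Storing a reference keeps the inner array as a single element.
my @nested = (0, \@inner, 4);

print scalar(@flat),   " elements when flattened\n";
print scalar(@nested), " elements when nested\n";
print "inner element: $nested[1][2]\n";   # short for $nested[1]->[2]

# sum_of is a hypothetical helper: it receives one scalar (the reference)
# instead of a long argument list.
sub sum_of {
    my ($numbers) = @_;
    my $total = 0;
    $total += $_ for @$numbers;
    return $total;
}

my @big   = (1 .. 1000);      # stand-in for a much larger array
my $total = sum_of(\@big);    # only the reference is passed
print "total: $total\n";
```

The trade-off is that sum_of now works on the caller's array, so any change it made through $numbers would be visible outside the sub.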
What happens if we try to manipulate the reference, for example by adding one to it and printing the result? Look at that! $ref is not a reference anymore - the 'ARRAY' word has disappeared, and now it's just an ordinary number. So you can take a reference to something, but you can't do arithmetic on it to make it refer to something else. Languages like C let you do this - it's called pointer manipulation and it's the cause of more core dumps and corruption than I can count. It's cool, but very, very easy to get wrong...

So there is a conceptual guide to references - HTH!

And one last point - not one single thing I have described here is what _actually_ happens in Perl - I believe the concepts are correct, but the implementation is vastly more complicated.

...it is better to be approximately right than precisely wrong. - Warren Buffett
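To see the loss of "reference-ness" described above, here is a minimal sketch. The names $address and $next are invented for it, and the numbers printed depend on where Perl places @array on a given run.

```perl
use strict;
use warnings;

my @array = (10, 20, 30);
my $ref   = \@array;

my $address = $ref + 0;   # a reference used as a number gives its address
my $next    = $ref + 1;   # arithmetic yields a plain number, not a reference

print "before: $ref\n";   # ARRAY(0x...)
print "after:  $next\n";  # just digits

# ref() returns the type name for a reference and an empty string otherwise.
print "still a reference? ", (ref($next) ? "yes" : "no"), "\n";
```

Under use strict, trying to dereference $next would die rather than read whatever happens to sit at the neighbouring address.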