What's in a Reference?

Xiong has asked for the wisdom of the Perl Monks concerning the following question:

Please ignore this node, which has been superseded by What's Really in a Reference?. This is not a reflection on any replies, only on the original post.

Original content of this node was so badly written that the fumes nearly tore my head off when I read it over in the light of day. It confused and outraged some readers (and rightly so). I apologize for the entire mess, especially for the senile trip down Machine Memory Lane; I cannot even figure out my original motivation for inflicting it on the Monastery.

Having rewritten the node in what I hope to be an improvement, I shoveled the steaming dungheap into Bit Hell, since NodeReaper was too busy drinking piña coladas to touch it. However, tye has demanded its return; so, here it is.

--- Scrap Heap Below ---

Let's define, for demo purposes, an odd little computer with byte-size memory words but only four of them. Since there are only four words, two bits suffice to represent any memory location:
|--- DECIMAL ---|       |----- BINARY ------|
ADDRESS     VALUE       ADDRESS       VALUE
=======     =====       =======     =========

     0         2            00      0000 0010
     1         7            01      0000 0111
     2         3            10      0000 0011
     3        42            11      0010 1010
It's clear that address 0 contains the value 2, which is a valid address. Address 1 contains the value 7, which is not. Address 2 contains the value 3, also a valid address; address 3 contains 42.

Since the value 3 in address 2 is itself a valid address, it's meaningful to say, at some level of abstraction, that address 2 contains 42 indirectly. Even address 0 may be said to contain 42 through double indirection.

Right away we run the risk of confusion. If we leave off the English word 'address', then we obtain peculiar statements such as '0 contains 2'. We want a notation such as '$0 contains 2'. The sigil '$' means 'address'. There is no need for a special notation for 2; 2 is 2. That's what is really in that memory location (for some value of 'really'). Nothing is contained in $0 except twoness.

We can abbreviate this to '$0 is 2' at some risk. We have to expand '$...is...' to mean 'address...contains the value...'.

We can then say '$2 is 3'. But then how do we express the relationship between address 0 and the value 3?

Explicitly, we can say '$0 is a pointer to $2 and $2 is 3'. We're advertising the fact that we consider the value 2, stored in $0, to be itself an address. We might shorten this to '$$0 is 3'.

We can continue the process, saying '$$$0 is 42'. No problem.
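The toy machine and its '$', '$$', '$$$' notation can be sketched in a few lines of real Perl. This is only an illustration under assumptions: the array @memory and the helper follow() are made-up names, not anything from the original node, and an array index stands in for an address.

```perl
use strict;
use warnings;

# Hypothetical model of the four-word machine:
# array index = address, array element = value stored there.
my @memory = (2, 7, 3, 42);

# Follow $levels levels of indirection, starting at $address.
# Each step treats the current value as an address and fetches from it.
sub follow {
    my ($address, $levels) = @_;
    my $value = $address;
    for (1 .. $levels) {
        die "value $value is not a valid address\n"
            unless $value >= 0 && $value < @memory;
        $value = $memory[$value];
    }
    return $value;
}

print follow(0, 1), "\n";   # '$0'   in the notation above: 2
print follow(0, 2), "\n";   # '$$0'  in the notation above: 3
print follow(0, 3), "\n";   # '$$$0' in the notation above: 42
```

Calling follow(1, 2) dies, because address 1 holds 7 and 7 is not a valid address on this machine.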

Now, please don't be misled into thinking I'm writing Perl here. This is just theoretical notation that happens to coincide.

Perl has a much more complex structure, besides running (you hope) on a machine with more memory. Besides the actual value data shown above, there is also metadata stored. We don't speak of addresses because we don't want to know anything about internal machine state. We do still speak of values.

When we write an expression such as:
$foo    = 88;
... we are, in some sense, creating a link from 'foo' to 88. It's not considered a symbolic reference; but in some way 'foo' represents the address, pointer, or handle by which we can grab ahold of 88. If we wrote only 'foo' then we would mean some representation of the string 'foo'. These are very different:
$3      = 42;
3       = 42;
The first is syntactically valid Perl; the second fails to compile. (Let's please ignore the fact that $0, $1, $2, etc. are special.)

Now scalar variables like $foo and $3 are so common that we just call them, well, 'variables'. We say 'The variable $3 contains the value 42' or just '$3 is 42'.

We are able, in some way, to store '3' itself in another variable, say $2. We're not really storing machine addresses, at least not purely, any more than we're storing the pure machine value 42 or 0010 1010. But we understand that, somehow, we can use $$2 instead of $3.

One Perl syntax for this is:
$3      = 42;
$2      = \$3;
... and the associated jargon is take a reference to $3. To go the other way:
print $$2;
... which prints 42. The associated jargon is dereference $2. Well and good.
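Because $2 and $3 really are special in Perl, a runnable version of the same idea needs ordinary names. The sketch below assumes the stand-in names $three and $two (not in the original) and shows taking a reference, dereferencing it, and assigning through it.

```perl
use strict;
use warnings;

# $three and $two are stand-in names for the post's $3 and $2.
my $three = 42;
my $two   = \$three;        # take a reference to $three

print ref($two), "\n";      # SCALAR: $two holds a reference to a scalar
print $$two, "\n";          # 42: dereference $two

$$two = 43;                 # assigning through the reference...
print $three, "\n";         # 43: ...changes $three itself
```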

Now for the hard part. This node exists to ask and answer one question; I need to ask that replies stick to this, since otherwise the noise (and even wise, off-topic enlightenment) will drown out the simple answer. This is a question about jargon|terminology, not in any way about Perl syntax|code. There are many Perlish ways to enreference and dereference; no need to go over them here. This question is all about the jargon and it can be answered correctly with exactly one (for some value of 'one') word.

In Perl, 42 is called the value of the variable $3. Again, $2 is called a reference to the variable $3.

Q: Fill in the blank: 3 is the ________ of $2. Do not choose the word 'value', for it may be confused with 42.

2010-07-15:

This script takes some references and dumps them with Devel::Peek:
use feature 'say';
use Devel::Peek;

my $A;
my $B;
my $C = 42;

$A = \$B;
$B = \$C;

say '$A:'; Dump( $A );
say '$B:'; Dump( $B );
say '$C:'; Dump( $C );

__END__

Output: 

$A:
SV = RV(0x8a060b4) at 0x8a060a8
  REFCNT = 1
  FLAGS = (PADMY,ROK)
  RV = 0x8bc9a38
  SV = RV(0x8bc9a44) at 0x8bc9a38
    REFCNT = 2
    FLAGS = (PADMY,ROK)
    RV = 0x8bc9a48
    SV = IV(0x8bc9a44) at 0x8bc9a48
      REFCNT = 2
      FLAGS = (PADMY,IOK,pIOK)
      IV = 42
$B:
SV = RV(0x8bc9a44) at 0x8bc9a38
  REFCNT = 2
  FLAGS = (PADMY,ROK)
  RV = 0x8bc9a48
  SV = IV(0x8bc9a44) at 0x8bc9a48
    REFCNT = 2
    FLAGS = (PADMY,IOK,pIOK)
    IV = 42
$C:
SV = IV(0x8bc9a44) at 0x8bc9a48
  REFCNT = 2
  FLAGS = (PADMY,IOK,pIOK)
  IV = 42
From this, I think it's clear that somebody is filling in the blank with RV. Certainly, the values shown above for RV are examples of what I want to talk about.
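For anyone who wants to look at that number without Devel::Peek, here is a minimal sketch using refaddr from the core Scalar::Util module. As far as I know, it returns the same address that Dump prints on the RV line, but treat that correspondence as my assumption rather than something this node establishes; the variable $other is added purely for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $C = 42;
my $B = \$C;
my $A = \$B;

# refaddr returns the address of the referent as a plain number.
printf "B holds 0x%x\n", refaddr($B);
printf "A holds 0x%x\n", refaddr($A);

# Two references to the same variable carry the same number.
my $other = \$C;
print "same referent\n" if refaddr($B) == refaddr($other);
```

The actual hex values differ from run to run and machine to machine, so only the equality check is meaningful.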

But I'd like to see something clearer; there's usually more than one way to talk about something. 'RV' and even 'reference value' do not seem quite to be clear and unambiguous.

In replies here, I see 'address', 'referent', and 'thingy'.

Back in the Bad Old Days, if memory location B held, as a value, another memory location C (which held some value, perhaps 42), we would say the 'pointer' B held an 'address'. We also spoke of 'indirect addressing'. But I think this is not quite correct and perhaps misleading in Perl.

I like referent; but the word is so similar to 'reference' that it's nearly as risky in quick conversation as 'reference value'.

Note: I'm not trying to discuss reference counts or what happens when various identifiers go out of scope. In the example above, all variables are in scope for the life of the script; then the world burns. In another example, $C might go out of scope while $B still held, indirectly, the value 42. That's all another topic. I just want to know, no matter what else is in scope, a good name for RV.

---cut---


Replies are listed 'Best First'.
Re: What's in a Reference? by BrowserUk (Patriarch) on Jul 15, 2010 at 07:48 UTC
Q: Fill in the blank: 3 is the ________ of $2.

"3" has nothing to do with $2! "3" is not the same as $3, much less "\$3". \$3 is the value contained in $2, and that value is a reference to $3.

Your confusion comes in two parts: by naming your variables with numbers, and by dropping the '\' reference-taking operator, you create an apparent anomaly that doesn't exist.

It's akin to the old trick to confuse kids: counting on your fingers 1, 2, 3, 4, 5, ... 10, then going backward, 10, 9, 8, 7, 6, plus 5 equals 11.
Re: What's in a Reference? by almut (Canon) on Jul 15, 2010 at 07:19 UTC
use Devel::Peek;
my $x = 42;
my $y = \$x;
Dump $y;
__END__
SV = RV(0x63fde0) at 0x604fc0
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,ROK)
  RV = 0x604fa0   <---
  SV = IV(0x62de60) at 0x604fa0
    REFCNT = 2
    FLAGS = (PADBUSY,PADMY,IOK,pIOK)
    IV = 42

I'd call it an address. But in contrast to a C pointer, a Perl reference also maintains other info, and doesn't allow you to directly fiddle with the address.

(Note that the `SV ... at 0x604fa0` is not part of the reference itself; Devel::Peek just dumps the referenced target, too, for convenience.)
Re^2: What's in a Reference? by ikegami (Patriarch) on Jul 15, 2010 at 07:30 UTC
Different question than last night. He's asking for the relation between the following and $y.

SV = RV(0x63fde0) at 0x604fc0
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,ROK)
  RV = 0x604fa0
  SV = IV(0x62de60) at 0x604fa0
    REFCNT = 2
    FLAGS = (PADBUSY,PADMY,IOK,pIOK)
    IV = 42   <---
Re^3: What's in a Reference? by almut (Canon) on Jul 15, 2010 at 07:40 UTC
Well, I don't know what was being discussed last night, but I understand the question kind of like this: "A C reference/pointer holds an address, what would you call the analogous thing which a Perl reference holds?" As the subject says: "What's in a reference?" Anyhow, only Xiong can tell what he really meant.
Re^4: What's in a Reference? by ikegami (Patriarch) on Jul 15, 2010 at 08:05 UTC
Re^5: What's in a Reference? by Anonymous Monk on Jul 15, 2010 at 09:59 UTC
Re: What's in a Reference? by cdarke (Prior) on Jul 15, 2010 at 07:46 UTC
3 is the referent of $2, or, if you like, thingy, which is what the Camel book called it before going all academic on us. Call it what you like, I like the name Kevin.

However, I'm not sure I follow you all the way. My understanding is that the reference is to the value, not the variable. The variable is just a name tag to that value. When a value has a name then it increases its reference count (values can be anonymous). When that name drops out of scope the reference count is decremented, but the value is retained if something else references it. For example:

my $ref;
{
    my $x = 42;
    $ref = \$x;
}
print "$$ref\n";

The value still exists and has a reference to it, but the name, $x, does not.
Re^2: What's in a Reference? (language) by tye (Sage) on Jul 15, 2010 at 19:05 UTC
My understanding is that the reference is to the value, not the variable.

You are demonstrating that, like "value", the term "variable" is not uniquely defined. There are more than 2 levels of abstraction and "value" is typically used for several of the lower layers and "variable" for several of the upper layers. Context often further clarifies which layer(s) are being discussed and many discussions are simultaneously valid when applied to more than one layer so the layer often doesn't need to be precisely defined.

But you are assigning the term "value" to a layer of abstraction that is, IME, much more commonly referred to as "variable".

If "variable" only applied to the name, then we'd never need to use the term "variable name". And we'd never talk in Perl of "anonymous variables". Such is more often spelled "anonymous array" or "anonymous hash". But "anonymous array variable" makes more sense than "anonymous array value" because an anonymous array is, in fact, variable (and thus should be considered "a variable"). You can push to an anonymous array:

push @{ ['anonymous','array'] }, 'variable';

If I throw in a named scalar variable to hold a reference to that anonymous array, then I could even keep the anonymous array around in order to demonstrate that its list of values has varied:

my $aRef= [ 'anonymous', 'array' ];
print "@$aRef\n";
push @$aRef, 'variable';
print "@$aRef\n";

Not that I object to the term "anonymous array value". But, to me, it just means that you are using an anonymous array variable in a way where you only care about the value.

A symbolic reference is to a variable by name or "a reference to a variable name" (and which underlying variable is referenced can thus change). Real references are references to the variable underlying zero or more variable names.

Here is a list of lots of layers of abstraction as examples:

3, the (numeric) concept (anything that == 3)

my $x= 3;
my $y= '3.0';
print '$x and $y have the same value' if $x == $y; # read "if" as "because"

"3", the string (anything that eq "3")

my $x= 3;
my $y= "3";
print '$x and $y have the same value' if $x eq $y;

3 as an IV

Any 4 (or 8) bytes containing the local 2's complement representation of 3.

my $x= 3;
my $y= pack "j", 3;

The IV part of $x and the string buffer of $y each contain the same value (and in the same format, but stored quite differently in terms of how you can access them).

a scalar value (conceptual)

my $x= 3;
my $y= 3;
my $z= "$x";

In most contexts, it is valid to assert that $x, $y, and $z all contain the same "scalar value" (or, of course, just "the same value"). This "conceptual scalar value" should be the default abstraction layer that one starts at when seeing "value" used in a Perl-related conversation. In most conceptual situations, the slight differences don't matter. This goes hand-in-hand with the fact that in most Perl coding situations, the slight differences also don't matter. That was the design goal, IMHO.

an SV (contents)

Depending on context, you might get away with saying "$x and $y represent two SVs that are the same" or "$x and $z represent two SVs that are the same" (using the $x, $y, and $z from the prior layer description). You are more likely to be corrected if you try to say "$y and $z represent two SVs that are the same".

In some ways, this is a particularly messy abstraction layer, IMHO. There are a lot of things stored in a perl SV struct (actually, the term "SV" is used to refer to a whole family of different C 'struct's).
If somebody resorts to this abstraction layer, then they are likely just trying to point out some subtle difference between the implementation of two scalar values without getting bogged down in the specific struct member names or specific bits involved. Actually, I think this abstraction layer is more likely to be used to say "$x and $y contain very similar scalar values but their SVs contain differences".

In case you were wondering, at a (relatively) high level, the main differences between $x's, $y's, and $z's SVs are: $y's only contains an IV (integer), $z's only contains a PV (string), and $x's contains both an IV and a PV.

Note how $x's SV was changed merely by copying the stringification of $x's value into $z. In most contexts, you don't consider that $x's value changed when `my $z= "$x";` was run. You have to get deep enough to be worrying about what are mostly implementation details for this change to $x's SV to matter. I say "mostly" because there are relatively rare practical situations where such details can matter (such as when computing `$x | $y`).

an SV (instance)

use vars qw< $x >;
my $y;
*x= \$y;

Now $x and $y represent the same SV. They refer to the same SV (but they aren't what we normally call "references" in Perl, so this phrasing should be clarified when it can't be avoided). The best term for this abstraction layer is "aliases". $x and $y are aliases of/for/to the same variable (or of/for/to the same SV). You usually create aliases by calling a function or using for, map, or grep.

This is the point at which we switch from "values" to "variables".

a variable

use vars qw< $x >;
{
    my $x= 'one';
    *x= \$x;
    for my $x ( $x ) {
        $x= 'two';
    }
}
print $x;

The above code uses three different variables. All three variables are named $x. All three variables end up being aliases to the same SV. But each of the three variables has a different scope and a different lifetime.

a variable instance

sub blog {
    my( $x )= @_;
    print "\$x = $x\n";
    if( $x ) {
        blog( $x - 1 );
        print "\$x still $x\n"
    }
}
blog( 1 );
__END__
$x = 1
$x = 0
$x still 1

The above code uses one lexical variable, $x. But since the scope where that variable is declared gets entered more than once, we end up with multiple instances of that lexical variable. You could also get away with talking about this code using two different variables, both named $x, at least in some situations.

a variable name

use vars qw< $x >;
$x= 'glo';
{
    my $x= 'lex';
    my $y= 'x';
    print $$y;          # 'glo'
    print eval '$'.$y;  # 'lex'
}

When we consider list values and list variables, we add more possible abstraction layers (some of which are even less important). Similarly, when talking about Perl references, there are a few more possible abstraction layers.

Most of the abstraction layers I listed above will be referred to simply as either "value" or "variable" with no further explicit clarification in a lot of conversations. We don't coin separate words to uniquely label each abstraction layer. Most of the abstraction layers aren't important enough to do such for them.

Note that insisting that "value" or "variable" can only be validly used to refer to just one specific abstraction layer is a pretty silly proposition (and just leads to not understanding people and documentation and not being understood by people).

And not all concepts even need nouns. Coining a new noun does no good for those who haven't read one's manifesto where one coined it. So rather than conducting a poll on what noun to use for "the value of the address stored in a Perl reference that indicates which variable the reference refers to", just say "$x and $y refer to the same variable" (or a negated version, if appropriate). That way you'll actually be understood and won't have to keep rehashing the coining process in order to try to get each new person to understand your coined noun.
This is as bad an idea as talking about "the second transitus" (or whatever was recently proposed in another thread) instead of "the passed-in subdirectory name".

- tye
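The "aliases" layer and the advice to simply say "$x and $y refer to the same variable" can be checked directly in code. The sketch below is an illustration only: it uses `our` where the reply uses `use vars`, and it relies on the fact that two references compare equal with == when they point at the same variable.

```perl
use strict;
use warnings;

our $x;
my $y = 'one';
*x = \$y;            # alias the package variable $x to the lexical $y

$x = 'two';          # assigning through one name...
print "$y\n";        # two: ...is visible through the other

# References taken through either name compare equal,
# which is one way to say "$x and $y refer to the same variable".
print "same variable\n" if \$x == \$y;
```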
Re: What's in a Reference? by ikegami (Patriarch) on Jul 15, 2010 at 07:18 UTC
it's meaningful to say, at some level of abstraction, that address 2 contains 42 indirectly.

No. Given

$bar = 42;
$foo = \$bar;

one might say "foo is 42" in some contexts. But if you remove the abstraction and start talking of addresses, the statement becomes false. There isn't a 42 at address 2.

3 is the ________ of $2

You filled in too many words. "3 is referenced by $2" or "$2 references 3".

Address (of $3)

Update: Misread.
Re: What's in a Reference? by ikegami (Patriarch) on Jul 15, 2010 at 18:02 UTC
A couple of extra notes.

Back in the Bad Old Days, if memory location B held, as a value, another memory location C (which held some value, perhaps 42), we would say the 'pointer' B held an 'address'.

It's not language dependent. A pointer or reference contains an address by definition.

We also spoke of 'indirect addressing'. But I think this is not quite correct and perhaps misleading in Perl.

The concept simply doesn't apply since one never works with addresses directly in Perl. Even if you tried to stretch the metaphor beyond its breaking point, I don't see how one could say Perl ever uses indirect addressing. Dereferencing is done explicitly by the user.

my $ref = \$var;
print($var);   # print accesses $var directly
print($$ref);  # print accesses $var directly, doesn't even see $ref

I like referent

$2 references $3. In that relationship, $2 is a referrer, and $3 is a referent. 3 and 42 are not involved. Nothing references 3, so it's not the referent in any reference.

Update: Added last quote and its reply.
Re^2: What's in a Reference? by Xiong (Hermit) on Jul 15, 2010 at 18:11 UTC
~~This is a what-is-your-jargon node. I'm looking for a single word in reply. Is your choice address?~~
Re^3: What's in a Reference? by ikegami (Patriarch) on Jul 15, 2010 at 18:25 UTC
The post to which you replied contains my comments on the large addition you made to the OP today. That addition contains no questions. My post explores three weaknesses in that addition.
Re: What's in a Reference? by Anonymous Monk on Jul 15, 2010 at 07:20 UTC
Do not choose the word 'value', for it may be confused with 42.

I choose Value :P

referent, i.e. that which is referenced, i.e. target
Re: What's in a Reference? by petecm99 (Pilgrim) on Jul 15, 2010 at 13:34 UTC
synonym ;)
Re: What's in a Reference? by FloydATC (Deacon) on Jul 15, 2010 at 21:10 UTC
I tend to think of it as simply the target of a reference.

-- Time flies when you don't know what you're doing
Re^2: What's in a Reference? by ikegami (Patriarch) on Jul 15, 2010 at 23:02 UTC
I would consider $3 the target of the reference, not 3.
Re: What's in a Reference? by rowdog (Curate) on Jul 15, 2010 at 23:48 UTC
Q: Fill in the blank: 3 is the ________ of $2. Do not choose the word 'value', for it may be confused with 42.

Nothing. On the other hand, \$3 is the value of $2 and, IMHO, talking about it as anything other than "the value of $2 is a reference to $3" is confusing. As for the 42 argument, I would say that $2 may evaluate to 42 but the value is still \$3.
Re: What's in a Reference? by ikegami (Patriarch) on Jul 16, 2010 at 17:13 UTC
my $pest = 'flea';
$dog = \$pest;
$cat = \$pest;
$bob = \$dog;
$sue = \$cat;

$pest ___(1)___ 'flea'
$dog is a reference to $pest
$dog ___(2)___ 'flea'
$bob is a reference to $dog
$bob is a ___(3)___ to $pest
$bob ___(4)___ 'flea'
'flea' is the value of the variable $pest
$pest is the ___(5)___ of $dog
'flea' is the ___(6)___ of $dog
$dog is the ___(7)___ of $bob
$pest is the ___(8)___ of $bob
'flea' is the ___(9)___ of $bob
'flea' is the ___(10)___ of $dog
$dog and $cat are ___(11)___ {~similarity}
$bob and $sue are ___(12)___ {~similarity}

only what to call stuff when speaking English to you

You're asking for what I would say, but you provide sentence structures I wouldn't use. I would say:

The pest is a flea
The dog has fleas
Bob's dog's pest
Bob's dog has fleas
The pest of the dog
Fleas have infested the dog
The dog is owned by Bob
The pest of Bob's dog
Fleas infest Bob's dog
Fleas infest the dog
The dog and the cat have fleas
Bob's and Sue's pets have fleas

You, on the other hand, might say,

$pest is 'flea'
$pest contains 'flea'
$pest holds 'flea'
$pest's value is 'flea'
[ $dog has no direct relationship to 'flea' ]
[ $bob has no direct relationship to $pest ]
[ $bob has no direct relationship to 'flea' ]
$pest is ~~the~~ referenced by of $dog
[ 'flea' has no direct relationship to $dog ]
$dog is ~~the~~ referenced by of $bob
[ $pest has no direct relationship to $bob ]
[ 'flea' has no direct relationship to $bob ]
[ 'flea' has no direct relationship to $dog ]
$dog and $cat are equal
$bob and $sue are both lexicals containing a reference
Re^2: What's in a Reference? by Xiong (Hermit) on Jul 16, 2010 at 18:09 UTC
I understand that in several cases there is no direct relationship. I'm asking for words to describe the indirect relationships.

In particular, I want words that fill the blanks without altering surrounding grammar. I understand that you might avoid the grammar I've chosen, in which case you may not want to supply a word to fill that blank.

In (12), I'm looking for a word or short phrase describing the similarity of $bob and $sue. Since $$$bob and $$$sue both evaluate to 'flea', $bob and $sue are not unrelated and, in some sense, are "the same". However, they are not "the same" in the same way as $dog and $cat. I would like to be able to make the distinction explicit and precise.

Thank you for your effort.
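The distinction being asked about can at least be demonstrated in code, even if it lacks a tidy name. This is a hedged sketch built on the $pest/$dog/$cat/$bob/$sue example from the parent reply, with `my` added so it runs under strict; it relies on == comparing two references by the address of what they point at.

```perl
use strict;
use warnings;

my $pest = 'flea';
my $dog  = \$pest;
my $cat  = \$pest;
my $bob  = \$dog;
my $sue  = \$cat;

# $dog and $cat hold references to the same variable, so they compare equal.
print "dog == cat\n" if $dog == $cat;

# $bob and $sue point at different variables ($dog and $cat)...
print "bob != sue\n" if $bob != $sue;

# ...but one level down they meet at the same variable, $pest.
print '$$bob == $$sue', "\n" if $$bob == $$sue;
print $$$bob, ' ', $$$sue, "\n";   # flea flea
```

So $dog and $cat are "the same" at the first level, while $bob and $sue only become "the same" after one dereference.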
Re^3: What's in a Reference? by ikegami (Patriarch) on Jul 16, 2010 at 18:53 UTC
I understand that in several cases there is no direct relationship. I'm asking for words to describe the indirect relationships

And I provided as much.

In particular, I want words that fill the blanks without altering surrounding grammar.

You asked what you "should call stuff", and your fill-in-the-blanks prevented that from being answered.

In (12), I'm looking for a word or short phrase describing the similarity of $bob and $sue.

Like I said, their pets both have fleas.

