PerlMonks  

What's in a Reference?

by Xiong (Hermit)
on Jul 15, 2010 at 06:51 UTC

Xiong has asked for the wisdom of the Perl Monks concerning the following question:

Please ignore this node, which has been superseded by What's Really in a Reference?. This is not a reflection on any replies, only on the original post.

The original content of this node was so badly written that the fumes nearly tore my head off when I read it over in the light of day. It confused and outraged some readers (and rightly so). I apologize for the entire mess, especially for the senile trip down Machine Memory Lane; I cannot even figure out my original motivation for inflicting it on the Monastery.

Having rewritten the node in what I hope to be an improvement, I shoveled the steaming dungheap into Bit Hell, since NodeReaper was too busy drinking piña coladas to touch it. However, tye has demanded its return; so, here it is.

--- Scrap Heap Below ---

Let's define, for demo purposes, an odd little computer with byte-size memory words but only four of them. Since there are only four words, two bits suffice to represent any memory location:

|--- DECIMAL ---|    |------ BINARY ------|
ADDRESS   VALUE      ADDRESS   VALUE
=======   =====      =======   =========
   0         2          00     0000 0010
   1         7          01     0000 0111
   2         3          10     0000 0011
   3        42          11     0010 1010

It's clear that address 0 contains the value 2, which is a valid address. Address 1 contains the value 7, which is not. Address 2 contains the value 3, also a valid address; address 3 contains 42.

Since the value 3 in address 2 is itself a valid address, it's meaningful to say, at some level of abstraction, that address 2 contains 42 indirectly. Even address 0 may be said to contain 42 through double indirection.

Right away we run the risk of confusion. If we leave off the English word 'address', then we obtain peculiar statements such as '0 contains 2'. We want a notation such as '$0 contains 2'. The sigil '$' means 'address'. There is no need for a special notation for 2; 2 is 2. That's what is really in that memory location (for some value of 'really'). Nothing is contained in $0 except twoness.

We can abbreviate this to '$0 is 2' at some risk. We have to expand '$...is...' to mean 'address...contains the value...'.

We can then say '$2 is 3'. But then how do we express the relationship between address 0 and the value 3?

Explicitly, we can say '$0 is a pointer to $2 and $2 is 3'. We're advertising the fact that we consider the value 2, stored in $0, to be itself an address. We might shorten this to '$$0 is 3'.

We can continue the process, saying '$$$0 is 42'. No problem.
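
The toy machine can be sketched in a few lines of real Perl. This is only an illustration of the indirection described above, not the notation itself: the array @mem and the helper follow() are hypothetical names, and the array index stands in for the address.

```perl
use strict;
use warnings;

# The four-word toy memory from the table above: index = address.
my @mem = (2, 7, 3, 42);

# follow($addr, $levels): start at an address and fetch $levels times,
# treating each fetched value as the next address (hypothetical helper).
sub follow {
    my ($addr, $levels) = @_;
    my $value = $addr;
    for (1 .. $levels) {
        die "value $value is not a valid address\n"
            unless $value >= 0 && $value <= $#mem;
        $value = $mem[$value];
    }
    return $value;
}

print follow(0, 1), "\n";   # '$0'   in the notation above: 2
print follow(0, 2), "\n";   # '$$0'  in the notation above: 3
print follow(0, 3), "\n";   # '$$$0' in the notation above: 42
```

Starting from address 1 fails at the second fetch, because the value 7 stored there is not a valid address on this machine.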

Now, please don't be misled into thinking I'm writing Perl here. This is just theoretical notation that happens to coincide.

Perl has a much more complex structure, besides running (you hope) on a machine with more memory. Besides the actual value data shown above, there is also metadata stored. We don't speak of addresses because we don't want to know anything about internal machine state. We do still speak of values.

When we write an expression such as:

$foo = 88;

... we are, in some sense, creating a link from 'foo' to 88. It's not considered a symbolic reference; but in some way 'foo' represents the address, pointer, or handle by which we can grab ahold of 88. If we wrote only 'foo' then we would mean some representation of the string 'foo'. These are very different:

$3 = 42;
3 = 42;

The first is valid Perl; the second raises an exception. (Let's please ignore the fact that $0, $1, $2, etc. are special.)

Now scalar variables like $foo and $3 are so common that we just call them, well, 'variables'. We say 'The variable $3 contains the value 42' or just '$3 is 42'.

We are able, in some way, to store '3' itself in another variable, say $2. We're not really storing machine addresses, at least not purely, any more than we're storing the pure machine value 42 or 0010 1010. But we understand that, somehow, we can use $$2 instead of $3.

One Perl syntax for this is:

$3 = 42;
$2 = \$3;

... and the associated jargon is take a reference to $3. To go the other way:

print $$2;

... which prints 42. The associated jargon is dereference $2. Well and good.
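
Since $2 and $3 are special, a runnable sketch needs ordinary names. The version below assumes lexicals $A, $B, and $C standing in for $0, $2, and $3; it takes references with \ and dereferences with extra $ sigils, and uses the builtin ref() to show what kind of thing each reference refers to.

```perl
use strict;
use warnings;

my $C = 42;     # stands in for $3
my $B = \$C;    # stands in for $2: a reference to $C
my $A = \$B;    # stands in for $0: a reference to a reference

print ref($B), "\n";    # SCALAR: $B refers to a plain scalar
print ref($A), "\n";    # REF:    $A refers to another reference
print $$B,  "\n";       # 42, one dereference
print $$$A, "\n";       # 42, two dereferences

$$B = 43;               # assigning through the reference changes $C itself
print $C, "\n";         # 43
```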

Now for the hard part. This node exists to ask and answer one question; I need to ask that replies stick to this, since otherwise the noise (and even wise, off-topic enlightenment) will drown out the simple answer. This is a question about jargon|terminology, not in any way about Perl syntax|code. There are many Perlish ways to enreference and dereference; no need to go over them here. This question is all about the jargon and it can be answered correctly with exactly one (for some value of 'one') word.

In Perl, 42 is called the value of the variable $3. Again, $2 is called a reference to the variable $3.

Q: Fill in the blank: 3 is the ________ of $2. Do not choose the word 'value', for it may be confused with 42.

2010-07-15:

This script takes some references and dumps them with Devel::Peek:

use feature 'say';   # needed for say
use Devel::Peek;     # provides Dump

my $A ;
my $B ;
my $C = 42;

$A = \$B;
$B = \$C;

say '$A:'; Dump( $A );
say '$B:'; Dump( $B );
say '$C:'; Dump( $C );

__END__

Output:

$A:
SV = RV(0x8a060b4) at 0x8a060a8
  REFCNT = 1
  FLAGS = (PADMY,ROK)
  RV = 0x8bc9a38
  SV = RV(0x8bc9a44) at 0x8bc9a38
    REFCNT = 2
    FLAGS = (PADMY,ROK)
    RV = 0x8bc9a48
    SV = IV(0x8bc9a44) at 0x8bc9a48
      REFCNT = 2
      FLAGS = (PADMY,IOK,pIOK)
      IV = 42
$B:
SV = RV(0x8bc9a44) at 0x8bc9a38
  REFCNT = 2
  FLAGS = (PADMY,ROK)
  RV = 0x8bc9a48
  SV = IV(0x8bc9a44) at 0x8bc9a48
    REFCNT = 2
    FLAGS = (PADMY,IOK,pIOK)
    IV = 42
$C:
SV = IV(0x8bc9a44) at 0x8bc9a48
  REFCNT = 2
  FLAGS = (PADMY,IOK,pIOK)
  IV = 42

From this, I think it's clear that somebody is filling in the blank with RV. Certainly, the values shown above for RV are examples of what I want to talk about.

But I'd like to see something clearer; there's usually more than one way to talk about something. 'RV' and even 'reference value' do not seem quite to be clear and unambiguous.

In replies here, I see 'address', 'referent', and 'thingy'.

Back in the Bad Old Days, if memory location B held, as a value, another memory location C (which held some value, perhaps 42), we would say the 'pointer' B held an 'address'. We also spoke of 'indirect addressing'. But I think this is not quite correct and perhaps misleading in Perl.

I like referent; but the word is so similar to 'reference' that it's nearly as risky in quick conversation as 'reference value'.

Note: I'm not trying to discuss reference counts or what happens when various identifiers go out of scope. In the example above, all variables are in scope for the life of the script; then the world burns. In another example, $C might go out of scope while $B still held, indirectly, the value 42. That's all another topic. I just want to know, no matter what else in scope, a good name for RV.

---cut---

Replies are listed 'Best First'.
Re: What's in a Reference?
by BrowserUk (Patriarch) on Jul 15, 2010 at 07:48 UTC
    Q: Fill in the blank: 3 is the ________ of $2.

    "3" is nothing to do with $2!

    "3" is not the same as $3, much less "\$3".

    \$3 is the value contained in $2; and that value is a reference to $3.

    Your confusion comes in two parts:

    • by naming your variables with numbers;
    • and dropping the '\' reference-taking operator;

    You create an apparent anomaly that doesn't exist.

    It's akin to the old trick to confuse kids. Counting on your fingers 1, 2, 3, 4, 5, ... 10. Then going backward. 10, 9, 8, 7, 6 plus 5 equals 11.

Re: What's in a Reference?
by almut (Canon) on Jul 15, 2010 at 07:19 UTC
    use Devel::Peek;
    my $x = 42;
    my $y = \$x;
    Dump $y;
    __END__
    SV = RV(0x63fde0) at 0x604fc0
      REFCNT = 1
      FLAGS = (PADBUSY,PADMY,ROK)
      RV = 0x604fa0     <---
      SV = IV(0x62de60) at 0x604fa0
        REFCNT = 2
        FLAGS = (PADBUSY,PADMY,IOK,pIOK)
        IV = 42

    I'd call it an address.  But in contrast to a C pointer, a Perl reference also maintains other info, and doesn't allow you to directly fiddle with the address.

    (Note that the SV ... at 0x604fa0 is not part of the reference itself — Devel::Peek just dumps the referenced target, too, for convenience.)

      Different question than last night. He's asking for the relation between the following and $y.
      SV = RV(0x63fde0) at 0x604fc0
        REFCNT = 1
        FLAGS = (PADBUSY,PADMY,ROK)
        RV = 0x604fa0
        SV = IV(0x62de60) at 0x604fa0
          REFCNT = 2
          FLAGS = (PADBUSY,PADMY,IOK,pIOK)
          IV = 42     <---

        Well, I don't know what was being discussed last night, but I understand the question kind of like this:

        "A C reference/pointer holds an address, what would you call the analogous thing which a Perl reference holds?"  As the subject says: "What's in a reference?"

        Anyhow, only Xiong can tell what he really meant.

Re: What's in a Reference?
by cdarke (Prior) on Jul 15, 2010 at 07:46 UTC
    3 is the referent of $2, or, if you like, thingy, which is what the Camel book called it before going all academic on us.

    Call it what you like, I like the name Kevin.

    However, I'm not sure I follow you all the way. My understanding is that the reference is to the value, not the variable. The variable is just a name tag to that value. When a value has a name then it increases its reference count (values can be anonymous). When that name drops out of scope the reference count is decremented, but the value is retained if something else references it. For example:
    my $ref;
    {
        my $x = 42;
        $ref = \$x;
    }
    print "$$ref\n";
    The value still exists and has a reference to it, but the name, $x, does not.
      My understanding is that the reference is to the value, not the variable.

      You are demonstrating that, like "value", the term "variable" is not uniquely defined. There are more than 2 levels of abstraction and "value" is typically used for several of the lower layers and "variable" for several of the upper layers. Context often further clarifies which layer(s) are being discussed and many discussions are simultaneously valid when applied to more than one layer so the layer often doesn't need to be precisely defined.

      But you are assigning the term "value" to a layer of abstraction that is, IME, much more commonly referred to as "variable". If "variable" only applied to the name, then we'd never need to use the term "variable name".

      And we'd never talk in Perl of "anonymous variables". Such is more often spelled "anonymous array" or "anonymous hash". But "anonymous array variable" makes more sense than "anonymous array value" because an anonymous array is, in fact, variable (and thus should be considered "a variable"). You can push to an anonymous array:

      push @{ ['anonymous','array'] }, 'variable';

      If I throw in a named scalar variable to hold a reference to that anonymous array, then I could even keep the anonymous array around in order to demonstrate that its list of values has varied:

      my $aRef= [ 'anonymous', 'array' ];
      print "@$aRef\n";
      push @$aRef, 'variable';
      print "@$aRef\n";

      Not that I object to the term "anonymous array value". But, to me, it just means that you are using an anonymous array variable in a way where you only care about the value.

      A symbolic reference is to a variable by name or "a reference to a variable name" (and which underlying variable is referenced can thus change). Real references are references to the variable underlying zero or more variable names.

      Here is a list of lots of layers of abstraction as examples:

      3, the (numeric) concept (anything that == 3)
      my $x= 3;
      my $y= '3.0';
      print '$x and $y have the same value' if $x == $y; # read "if" as "because"
      "3", the string (anything that eq "3")
      my $x= 3;
      my $y= "3";
      print '$x and $y have the same value' if $x eq $y;
      3 as an IV

      Any 4 (or 8) bytes containing the local 2's complement representation of 3.

      my $x= 3;
      my $y= pack "j", 3;

      The IV part of $x and the string buffer of $y each contain the same value (and in the same format, but stored quite differently in terms of how you can access them).

      a scalar value (conceptual)
      my $x= 3;
      my $y= 3;
      my $z= "$y";

      In most contexts, it is valid to assert that $x, $y, and $z all contain the same "scalar value" (or, of course, just "the same value"). This "conceptual scalar value" should be the default abstraction layer that one starts at when seeing "value" used in a Perl-related conversation. In most conceptual situations, the slight differences don't matter. This goes hand-in-hand with the fact that in most Perl coding situations, the slight differences also don't matter. That was the design goal, IMHO.

      an SV (contents)

      Depending on context, you might get away with saying "$x and $y represent two SVs that are the same" or "$x and $z represent two SVs that are the same" (using the $x, $y, and $z from the prior layer description). You are more likely to be corrected if you try to say "$y and $z represent two SVs that are the same".

      In some ways, this is a particularly messy abstraction layer, IMHO. There are a lot of things stored in a perl SV struct (actually, the term "SV" is used to refer to a whole family of different C 'struct's). If somebody resorts to this abstraction layer, then they are likely just trying to point out some subtle difference between the implementation of two scalar values without getting bogged down in the specific struct member names or specific bits involved.

      Actually, I think this abstraction layer is more likely to be used to say "$x and $y contain very similar scalar values but their SVs contain differences".

      In case you were wondering, at a (relatively) high level, the main differences between $x's, $y's, and $z's SVs are: $y's only contains an IV (integer), $z's only contains a PV (string), and $x's contains both an IV and a PV.

      Note how $x's SV was changed merely by copying the stringification of $x's value into $z. In most contexts, you don't consider that $x's value changed when my $z= "$x"; was run. You have to get deep enough to be worrying about what are mostly implementation details for this change to $x's SV to matter. I say "mostly" because there are relatively rare practical situations where such details can matter (such as when computing $x | $y).

      an SV (instance)
      use vars qw< $x >;
      my $y;
      *x= \$y;

      Now $x and $y represent the same SV. They refer to the same SV (but they aren't what we normally call "references" in Perl, so this phrasing should be clarified when it can't be avoided).

      The best term for this abstraction layer is "aliases". $x and $y are aliases of/for/to the same variable (or of/for/to the same SV). You usually create aliases by calling a function or using for, map, or grep.

      This is the point at which we switch from "values" to "variables".

      a variable
      use vars qw< $x >;
      {
          my $x= 'one';
          *x= \$x;
          for my $x ( $x ) {
              $x= 'two';
          }
      }
      print $x;

      The above code uses three different variables. All three variables are named $x. All three variables end up being aliases to the same SV. But each of the three variables has a different scope and a different lifetime.

      a variable instance
      sub blog {
          my( $x )= @_;
          print "\$x = $x\n";
          if( $x ) {
              blog( $x - 1 );
              print "\$x still $x\n"
          }
      }
      blog( 1 );
      __END__
      $x = 1
      $x = 0
      $x still 1

      The above code uses one lexical variable, $x. But since the scope where that variable is declared gets entered more than once, we end up with multiple instances of that lexical variable. You could also get away with talking about this code using two different variables, both named $x, at least in some situations.

      a variable name
      use vars qw< $x >;
      $x= 'glo';
      {
          my $x= 'lex';
          my $y= 'x';
          print $$y;          # 'glo'
          print eval '$'.$y;  # 'lex'
      }

      When we consider list values and list variables, we add more possible abstraction layers (some of which are even less important). Similarly, when talking about Perl references, there are a few more possible abstraction layers.

      Most of the abstraction layers I listed above will be referred to simply as either "value" or "variable" with no further explicit clarification in a lot of conversations. We don't coin separate words to uniquely label each abstraction layer. Most of the abstraction layers aren't important enough to do such for them.

      Note that insisting that "value" or "variable" can only be validly used to refer to just one specific abstraction layer is a pretty silly proposition (and just leads to not understanding people and documentation and not being understood by people).

      And not all concepts even need nouns. Coining a new noun does no good for those who haven't read one's manifesto where one coined it.

      So rather than conducting a poll on what noun to use for "the value of the address stored in a Perl reference that indicates which variable the reference refers to", just say "$x and $y refer to the same variable" (or a negated version, if appropriate). That way you'll actually be understood and won't have to keep rehashing the coining process in order to try to get each new person to understand your coined noun.

      This is as bad an idea as talking about "the second transitus" (or whatever was recently proposed in another thread) instead of "the passed-in subdirectory name".
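
      The statement "$x and $y refer to the same variable" can also be checked in code. A minimal sketch, assuming Scalar::Util's refaddr and a hypothetical helper same_variable(); the variable names are made up for the example:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $first  = 42;
my $second = 42;        # same value, but a different variable

my $r1 = \$first;
my $r2 = \$first;       # a second reference to the same variable
my $r3 = \$second;

# True when both references refer to the same underlying variable
# (hypothetical helper; compares the addresses reported by refaddr).
sub same_variable {
    my ($ref_a, $ref_b) = @_;
    return refaddr($ref_a) == refaddr($ref_b);
}

print same_variable($r1, $r2) ? "same variable\n" : "different variables\n";
print same_variable($r1, $r3) ? "same variable\n" : "different variables\n";
print $$r1 == $$r3            ? "same value\n"    : "different values\n";
```

      Here $r1 and $r2 refer to the same variable, while $r1 and $r3 refer to different variables that merely hold the same value, and none of that needs a new noun.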

      - tye        

Re: What's in a Reference?
by ikegami (Patriarch) on Jul 15, 2010 at 07:18 UTC

    it's meaningful to say, at some level of abstraction, that address 2 contains 42 indirectly.

    No. Given

    $bar = 42; $foo = \$bar;

    one might say "foo is 42" in some contexts. But if you remove the abstraction and start talking of addresses, the statement becomes false. There isn't a 42 at address 2.

    3 is the ________ of $2

    You filled in too many words. "3 is referenced by $2" or "$2 references 3".

    Address (of $3)

    Update: Misread.

Re: What's in a Reference?
by ikegami (Patriarch) on Jul 15, 2010 at 18:02 UTC

    A couple of extra notes.

    Back in the Bad Old Days, if memory location B held, as a value, another memory location C (which held some value, perhaps 42), we would say the 'pointer' B held an 'address'.

    It's not language dependent. A pointer or reference contains an address by definition.

    We also spoke of 'indirect addressing'. But I think this is not quite correct and perhaps misleading in Perl.

    The concept simply doesn't apply since one never works with addresses directly in Perl. Even if you tried to stretch the metaphor beyond its breaking point, I don't see how one could say Perl ever uses indirect addressing. Dereferencing is done explicitly by the user.

    my $ref = \$var;
    print($var);   # print accesses $var directly
    print($$ref);  # print accesses $var directly, doesn't even see $ref

    I like referent

    $2 references $3. In that relationship, $2 is a referer, and $3 is a referent. 3 and 42 are not involved. Nothing references 3, so it's not the referent in any reference.

    Update: Added last quote and its reply.

      This is a what-is-your-jargon node. I'm looking for a single word in reply. Is your choice address?

        The post to which you replied contains my comments on the large addition you made to the OP today. That addition contains no questions. My post explores three weaknesses in that addition.
Re: What's in a Reference?
by Anonymous Monk on Jul 15, 2010 at 07:20 UTC
    Do not choose the word 'value', for it may be confused with 42.

    I choose Value :P referent , ie that which is referenced, ie target

Re: What's in a Reference?
by petecm99 (Pilgrim) on Jul 15, 2010 at 13:34 UTC
    synonym ;)
Re: What's in a Reference?
by FloydATC (Deacon) on Jul 15, 2010 at 21:10 UTC
    I tend to think of it as simply the target of a reference.

    -- Time flies when you don't know what you're doing
      I would consider $3 the target of the reference, not 3.
Re: What's in a Reference?
by rowdog (Curate) on Jul 15, 2010 at 23:48 UTC
    Q: Fill in the blank: 3 is the ________ of $2. Do not choose the word 'value', for it may be confused with 42.

    Nothing. On the other hand, \$3 is the value of $2 and, IMHO, talking about it as anything other than "the value of $2 is a reference to $3" is confusing.

    As for the 42 argument, I would say that $2 may evaluate to 42 but the value is still \$3.

Re: What's in a Reference?
by ikegami (Patriarch) on Jul 16, 2010 at 17:13 UTC
    my $pest = 'flea';
    $dog = \$pest;
    $cat = \$pest;
    $bob = \$dog;
    $sue = \$cat;
    • $pest ___(1)___ 'flea'
    • $dog is a reference to $pest
    • $dog ___(2)___ 'flea'
    • $bob is a reference to $dog
    • $bob is a ___(3)___ to $pest
    • $bob ___(4)___ 'flea'
    • 'flea' is the value of the variable $pest
    • $pest is the ___(5)___ of $dog
    • 'flea' is the ___(6)___ of $dog
    • $dog is the ___(7)___ of $bob
    • $pest is the ___(8)___ of $bob
    • 'flea' is the ___(9)___ of $bob
    • 'flea' is the ___(10)___ of $dog
    • $dog and $cat are ___(11)___ {~similarity}
    • $bob and $sue are ___(12)___ {~similarity}

    only what to call stuff when speaking English to you

    You're asking for what I would say, but you provide sentence structures I wouldn't use. I would say:

    1. The pest is a flea
    2. The dog has fleas
    3. Bob's dog's pest
    4. Bob's dog has fleas
    5. The pest of the dog
    6. Fleas have infested the dog
    7. The dog is owned by Bob
    8. The pest of Bob's dog
    9. Fleas infest Bob's dog
    10. Fleas infest the dog
    11. The dog and the cat have fleas
    12. Bob's and Sue's pets have fleas

    You, on the other hand, might say,

    1. $pest is 'flea'
      $pest contains 'flea'
      $pest holds 'flea'
      $pest's value is 'flea'
    2. [ $dog has no direct relationship to 'flea' ]
    3. [ $bob has no direct relationship to $pest ]
    4. [ $bob has no direct relationship to 'flea' ]
    5. $pest is the referenced by of $dog
    6. [ 'flea' has no direct relationship to $dog ]
    7. $dog is the referenced by of $bob
    8. [ $pest has no direct relationship to $bob ]
    9. [ 'flea' has no direct relationship to $bob ]
    10. [ 'flea' has no direct relationship to $dog ]
    11. $dog and $cat are equal
    12. $bob and $sue are both lexicals containing a reference

      I understand that in several cases there is no direct relationship. I'm asking for words to describe the indirect relationships.

      In particular, I want words that fill the blanks without altering surrounding grammar. I understand that you might avoid the grammar I've chosen, in which case you may not want to supply a word to fill that blank.

      In (12), I'm looking for a word or short phrase describing the similarity of $bob and $sue. Since $$$bob and $$$sue both evaluate to 'flea', $bob and $sue are not unrelated and, in some sense, are "the same". However, they are not "the same" in the same way as $dog and $cat. I would like to be able to make the distinction explicit and precise.

      Thank you for your effort.

        I understand that in several cases there is no direct relationship. I'm asking for words to describe the indirect relationships

        And I provided as much.

        In particular, I want words that fill the blanks without altering surrounding grammar.

        You asked what you "should call stuff", and your fill in the blanks prevented that from being answered.

        In (12), I'm looking for a word or short phrase describing the similarity of $bob and $sue.

        Like I said, both their pets have fleas.

Node Type: perlquestion [id://849713]
Approved by Corion
Front-paged by BrowserUk