gaal has asked for the wisdom of the Perl Monks concerning the following question:
When is the return value of the ref builtin "REF"? What does this say about the argument, and why would I care? Is it possible to concoct a variable with this ref value from pure Perl?
What the ref builtin returns "depends on the type of thing the reference is a reference to", says perlfunc. But according to all the discussions I've seen, a reference in Perl is a scalar. I can print ref \"something" to support this.
Presumably, when using the API, I can create an RV and that would return REF. The following is from sv.c:
switch (SvTYPE(sv)) {
/* [snip] */
case SVt_RV:    /* [snip IV, PV, etc., all falling through] */
case SVt_PVBM:
    if (SvROK(sv))
        s = "REF";
    else
        s = "SCALAR";
    break;
case SVt_PVLV:
    s = SvROK(sv) ? "REF"
        /* tied lvalues should appear to be
         * scalars for backwards compatitbility */
        : (LvTYPE(sv) == 't' || LvTYPE(sv) == 'T')
            ? "SCALAR" : "LVALUE";
    break;
case SVt_PVAV:  s = "ARRAY"; break;
case SVt_PVHV:  s = "HASH";  break;
/* [snip CODE etc.] */
(Spookiness about "backwards compatibility" deliberately left in.)
So where is this REF thing useful? Where do you even get one? How come my pure perl reference above was not an RV (if it were, I wouldn't get SCALAR as a result of the builtin)?
Re: ref eq "REF"
by Joost (Canon) on Oct 18, 2004 at 10:12 UTC
ref() returns "REF" for a reference to a reference:
> perl -e 'print ref \\"test"'
REF
In general, it's not smart to rely on any specific value for ref() besides true or false, except when you really have to, e.g. for serialization. The value of ref() has changed for some types (regexes (?), filehandles...), and AFAIK there is no guarantee it won't change again.
edit: after reading perlfunc on it, it appears you can count on the return values given there (which do not cover all perl data types):
SCALAR
ARRAY
HASH
CODE
REF
GLOB
LVALUE
and the package name for a blessed object (though you shouldn't use that)
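As a quick check of the list above, here is a minimal sketch that builds one reference of each kind and prints what ref reports for it. The helper name sample_refs and the class name My::Class are made up for illustration; the LVALUE case assumes that taking a reference to a substr call yields an lvalue reference, which is how perlfunc describes it.

```perl
use strict;
use warnings;

# sample_refs() is a hypothetical helper: it returns [label, reference] pairs,
# one for each value that ref is documented to return.
sub sample_refs {
    my $text = "foo";
    return (
        [ SCALAR      => \"text" ],
        [ ARRAY       => [] ],
        [ HASH        => {} ],
        [ CODE        => sub { } ],
        [ REF         => \\"text" ],
        [ GLOB        => \*STDOUT ],
        [ LVALUE      => \substr($text, 1, 1) ],
        [ 'My::Class' => bless({}, 'My::Class') ],
    );
}

# Print the expected label next to what ref actually returns.
printf "%-10s %s\n", $_->[0], ref $_->[1] for sample_refs();
```

The last pair shows why the package name is a poor thing to test for: a blessed hash no longer reports HASH at all.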
Ah yes, I realize now that there was no reason to expect
ref \"string"
to return anything but SCALAR, since the referred-to object was, indeed, not a reference. Thanks for the clarification.
I'm still not sure what use REF is, though, except perhaps as an unreliable "there is more to dereference here" indicator.
What's unreliable about it? ref tells you "the type of thing the reference is a reference to." (quote from ref). You're used to using references to hashes and arrays (I assume), so it makes sense to you that ref(\%hash) returns HASH, since \%hash is obviously a reference to a hash. So isn't it perfectly logical (and reliable) that ref(\\$anything) returns REF, since the reference \\$anything is a reference to a reference?
I'm sure Data::Dumper couldn't have been written without ref to a reference returning REF.
Re: ref == "REF"
by Prior Nacre V (Hermit) on Oct 18, 2004 at 10:25 UTC
SCALAR is a reference to a scalar; HASH is a reference to a hash; and so on (ARRAY, CODE, ...).
REF is a reference to any of those, e.g.
$ perl -wle 'use strict; my $x = 1; my $y = \$x; my $z = \$y; print "Y: ", ref($y), " Z: ", ref($z)'
Y: SCALAR Z: REF
$
I vaguely recall using this feature some time ago when testing some deep copying of a complex data structure, although I can't remember the exact details. The basic idea was to continually dereference to a scalar while ref() returned REF (e.g. $deref = $$ref), then to dereference to the appropriate datatype (e.g. @array = @$ref if ref($ref) eq 'ARRAY').
I can't provide any help with the internals.
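The dereference-then-dispatch idea can be sketched roughly as follows. This is only an illustration under my own assumptions, not the original code: deep_copy is a made-up helper, it handles only REF, SCALAR, ARRAY and HASH, it passes blessed objects and everything else through untouched, and it would recurse forever on a circular structure.

```perl
use strict;
use warnings;

# deep_copy() is a hypothetical helper written for this illustration.
sub deep_copy {
    my ($item) = @_;
    my $type = ref $item;

    # A reference to a reference (REF) or to a plain value (SCALAR):
    # copy whatever it points at, then return a reference to the copy.
    if ($type eq 'REF' || $type eq 'SCALAR') {
        my $copy = deep_copy($$item);
        return \$copy;
    }
    if ($type eq 'ARRAY') {
        return [ map { deep_copy($_) } @$item ];
    }
    if ($type eq 'HASH') {
        my %copy;
        $copy{$_} = deep_copy($item->{$_}) for keys %$item;
        return \%copy;
    }
    return $item;    # plain values (and anything unhandled) pass through
}

my $hash     = { name => 'x', list => [1, 2] };
my $original = \\$hash;                 # REF -> REF -> HASH
my $clone    = deep_copy($original);

# The clone has the same shape, level by level.
print ref($clone), ' ', ref($$clone), ' ', ref($$$clone), "\n";
```

Seeing REF is what tells the code that one more scalar dereference is needed before it reaches a real container.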
Re: ref == "REF"
by demerphq (Chancellor) on Oct 18, 2004 at 19:32 UTC
I think part of the problem in this thread (and it's one that ambrus has tried to touch on) is that there are important distinctions between "SV" and the colloquial term "scalar" in Perl, compounded by the fact that somehow Larry let ref() get used for things that it probably shouldn't have, and over time p5p have let it get quite confusing.
Now to one aspect of the direct question, which is essentially: is it useful to know that a given ref, once dereferenced, need not be dereferenced further to use? IMO the answer is yes, and SCALAR overall is useful as well. (When writing a data serialization module I found knowing it was a SCALAR versus a REF was useful. But this is a special case.)
But there is the other aspect, which is essentially: does the behaviour of ref make sense when applied to refs to things normally considered to be scalar values? IMO the answer is no. Not because of SCALAR or REF, but because of GLOB, LVALUE, and "Regexp". The fact that the object's blessedness gets thrown in there makes ref even less useful.
Overall ref() is a broken metaphor sitting on top of mechanisms that have been stretched so far past their original design intentions that expecting it to be completely consistent doesn't make a lot of sense.
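To make the serialization case concrete, here is a small sketch of how a dumper-style routine might use the SCALAR versus REF distinction. It is not taken from any real module: describe and quote_value are hypothetical names, and only a few reference types are handled.

```perl
use strict;
use warnings;

# quote_value() and describe() are hypothetical helpers for illustration.
sub quote_value {
    my ($value) = @_;
    return defined $value ? "'$value'" : 'undef';
}

sub describe {
    my ($thing) = @_;
    my $type = ref $thing;

    # Not a reference at all: a plain value.
    return quote_value($thing) if $type eq '';

    # SCALAR: the target is a plain value, so one dereference finishes the job.
    return 'ref to ' . quote_value($$thing) if $type eq 'SCALAR';

    # REF: the target is itself a reference, so keep descending.
    return 'ref to ' . describe($$thing) if $type eq 'REF';

    return '[' . join(', ', map { describe($_) } @$thing) . ']'
        if $type eq 'ARRAY';

    return "<$type>";    # anything this sketch does not handle
}

print describe(\"x"), "\n";         # ref to 'x'
print describe(\\[1, 'a']), "\n";   # ref to ref to ['1', 'a']
```

The SCALAR branch never has to recurse, which is the "need not be dereferenced further" property in practice.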
I have every hope and expectation that Perl6 clears these matters up.
---
demerphq
First they ignore you, then they laugh at you, then they fight you, then you win.
-- Gandhi
Flux8
• Update:
Minor edits.
Re: ref == "REF"
by Anonymous Monk on Oct 18, 2004 at 16:11 UTC
It makes sense that a reference to a reference returns REF and not just scalar.
If a reference to a reference returned 'SCALAR', it would be hard to tell the difference between $x = \"3"; and $x = \\%hash;
About Data::Dumper, have you checked the Dumper.xs file as well?
By that logic, you'd expect NUMBER and STRING return values from ref (or even INTEGER, UNSIGNED, and DOUBLE)!
I don't think it's harder. If you write recursive code to handle deep references, you just find out you have a reference one level deeper. Hey, maybe we should also have a REFREF value, too, and REFREFREF, ...
As for Dumper.xs, I hadn't looked there, because my question was about Perl usage, which XS is not; it uses the perl API and uses SvTYPE directly. *That* bit of code never sees what ref would return on a variable.
$a = "xyz";
$b = \$a;
$c = \$b;
$d = $c;
while (ref $d eq "REF") { $d = $$d; }
$, = " & ";
print ($a, $b, $c, $d, $dd);
If you look at the output, $d ends up with the same value as $b.
What would be different if there were no REF type? If you needed a reference (for whatever purpose) at the end of the while loop, you would have to create a new reference, and it would not refer to the same scalar as $b does:
$a = "xyz";
$b = \$a;
$c = \$b;
$d = $c;
while (ref $d eq "SCALAR" or ref $d eq "REF") { $d = $$d; }
$, = " & ";
$e = \$d;
print ($a, $b, $c, $d, $dd, $e);
$dd is of course undefined, and $e refers to the scalar $d, not to the scalar that $b refers to. Which means you went one step too far in the iteration.
I don't know whether this is useful or useless since I haven't really needed it yet, but if I need it I sure will be happy to have it. Else that second loop would need some extra code...
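The same point can be packaged as a small check. This is only a sketch: innermost_ref is a name I made up, and the loop assumes the chain is not circular (a scalar that refers to itself would never terminate).

```perl
use strict;
use warnings;

# innermost_ref() is a hypothetical helper: follow references only while the
# target is itself a reference, and stop at the last reference in the chain.
sub innermost_ref {
    my ($ref) = @_;
    $ref = $$ref while ref($ref) eq 'REF';
    return $ref;
}

my $value = "xyz";
my $inner = \$value;    # ref($inner) is 'SCALAR'
my $outer = \$inner;    # ref($outer) is 'REF'

# Comparing references numerically compares their addresses.
my $found = innermost_ref($outer);
print "same reference as \$inner\n" if $found == $inner;

${$found} = "changed";  # writes through to $value
print "$value\n";
```

Because the loop stops one level early, the result still points at the original scalar and can be used to modify it.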
Re: ref == "REF"
by ambrus (Abbot) on Oct 18, 2004 at 18:05 UTC
This does not make sense imo, but ref can return (at least) four different values for an (unblessed) scalar reference: SCALAR is the most common, but GLOB, REF, LVALUE can also appear:
DB<1> x ref(\($x=5))
0 'SCALAR'
DB<2> x ref(\($x=\$x))
0 'REF'
DB<3> x ref(\*x)
0 'GLOB'
DB<5> x ref(\substr("foo",1,1))
0 'LVALUE'
Update: I didn't say why it doesn't make sense. Here's why: ref($x) can change even if $x itself does not change. Look:
perl -we '$x = \$y; for $z ( 2, \5, *u, vec(2,1,1) ) { $y = $z; warn ref($x), " ", $x; }'
And the result is:
SCALAR SCALAR(0x813c4d0) at -e line 1.
REF REF(0x813c4d0) at -e line 1.
GLOB GLOB(0x813c4d0) at -e line 1.
SCALAR SCALAR(0x813c4d0) at -e line 1.
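The same effect can be reproduced under strict with lexicals. The following is a sketch only; ref_history is a hypothetical helper, and *STDOUT stands in for the *u glob used in the one-liner.

```perl
use strict;
use warnings;

# ref_history() is a hypothetical helper: store each value in one scalar and
# record what ref reports for a single, unchanging reference to that scalar.
sub ref_history {
    my @values = @_;
    my $target;
    my $ref = \$target;    # taken once, never reassigned
    my @seen;
    for my $value (@values) {
        $target = $value;
        push @seen, ref $ref;
    }
    return @seen;
}

# A number, a reference, a glob, and a string again.
print join(' ', ref_history(2, \5, *STDOUT, 'text')), "\n";
```

The reference itself never changes here; only the contents of the scalar it points at do.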
Returning SCALAR for GLOBs would definitely be incorrect. GLOBs contain scalars, not the other way around.
LVALUE has a better case for utility than REF, because this is how you find out if something *is* an lvalue. (I think? Do you know another way?) For references, you already have the ref builtin itself. See my comment to Anonymous Monk.
Returning SCALAR for GLOBs would definitely be incorrect. GLOBs contain scalars, not the other way around.
Not exactly.
A glob is really a kind of a scalar value, just like an integer or a string is.
(Update 3:) The confusion comes from this:
a glob has certain fields: SCALAR, ARRAY, HASH, GLOB (not really a field...), etc. This does not imply that there is a separate data type for each, that it can point to.
It's just pure coincidence that perl happens to have scalar, array, hash, and code as the four most important data types, and these are the four most well-known elements of a glob too.
Let's discuss this more (but I might be wrong,
as I don't know perl internals as much as some other
monks here, so correct me please).
It is clear, that the four most-well known elements
of a glob are distinct data types:
*foo{SCALAR} is a reference to a scalar,
*foo{ARRAY} is a ref to an array,
*foo{HASH} is a ref to a hash,
and *foo{CODE} is a ref to a code.
If $x is a ref to a scalar for example,
you can not make it become a ref to an array without
changing $x itself.
On the other hand, *FOO{GLOB} (which is really the same as *FOO)
is really a reference to a scalar, and you can see it is
because of this:
$x = \$y; $y = *_; print $x; $y = 5; print $x;
prints
GLOB(0x813bda0)
SCALAR(0x813bda0)
You can not store anything in $y that would make $x a ref to an array for instance.
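For anyone who wants to poke at the glob slots directly, here is a minimal sketch. The name foo is arbitrary, and the variables are package variables (declared with our) because lexicals do not live in globs.

```perl
use strict;
use warnings;

# The name "foo" is arbitrary; fill several slots of the glob *foo.
our $foo = 'scalar slot';
our @foo = (1, 2, 3);
our %foo = (key => 'value');
sub foo { return 'code slot' }

print ref(*foo{SCALAR}), "\n";   # SCALAR
print ref(*foo{ARRAY}),  "\n";   # ARRAY
print ref(*foo{HASH}),   "\n";   # HASH
print ref(*foo{CODE}),   "\n";   # CODE
print ref(*foo{GLOB}),   "\n";   # GLOB
print ref(\*foo),        "\n";   # GLOB

# Each slot reference is the same thing the backslash operator gives you.
print "array slot matches\n" if *foo{ARRAY} == \@foo;
```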
(Update: minor reformatting of the last few paragraphs.)
I've left out IO handles, FORMATs, (Update 4:) Regexps, and packages from this discussion because I don't really know them well.
IO handles is a fifth kind of data type I think, but I'm not sure.
(A bit off-topic:)
Also note that these four (or more) types are not the types an expression can have. That's something very different. The value of a perl expression is a scalar, a list, or nothing depending on its run-time context (which depends on the callers of the expression, but not the expression itself). It's just plain wrong that scalar and list contexts are sometimes associated with scalar and array data types. For example, the scalar glob element
*FOO{SCALAR} is magically linked with
$FOO, which returns the content of the scalar field of *FOO in scalar context, or a list of a single element (the contents of the scalar field) in list context.
Things are not so simple with *FOO{ARRAY}, which has five constructs linked to it:
@FOO, $#FOO, $FOO[EXPR], @FOO[EXPR], \@FOO
which more or less return values related to *FOO{ARRAY}. Note that \@FOO is not just the \ operator applied to @FOO. The \ operator normally returns a list of references to the elements of the list it gets as its argument (i.e. a list of scalar refs), or a single scalar ref in scalar context.
(Most of these constructs can also be used in lvalue context, such as
\$FOO[EXPR], push @FOO, EXPR, $FOO[EXPR] = EXPR etc.,
which makes things even more complicated.)
What shows that these are indeed associated with the *FOO{ARRAY} array reference is that you can put any array reference in place of FOO in any of the above constructs (but you have to add braces unless the expression yielding the reference starts with $).
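A short sketch of that substitution, using a lexical array and a reference to it. The variable names are mine; the last part shows the parenthesized form \(@FOO), where the \ operator distributes over the elements instead of producing one array reference.

```perl
use strict;
use warnings;

my @FOO  = (10, 20, 30);
my $aref = \@FOO;                 # one reference to the whole array

# Each array construct, with a braced reference in place of the name FOO.
my $count = @{$aref};             # like scalar(@FOO)
my $last  = $#{$aref};            # like $#FOO
my $elem  = ${$aref}[1];          # like $FOO[1]
my @slice = @{$aref}[0, 2];       # like @FOO[0, 2]

# With parentheses, \ distributes over the elements instead.
my @elem_refs = \(@FOO);

print "$count $last $elem @slice\n";
print scalar(@elem_refs), " element refs, first is a ",
    ref($elem_refs[0]), "\n";
```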
The same goes with hashes and subroutines, the special expressions associated with them are
%FOO, $FOO{EXPR}, @FOO{EXPR}, and \%FOO with hashes;
and FOO(EXPR), &FOO(EXPR), &FOO, \&FOO, FOO EXPR (if declared), FOO INDIROB EXPR (if prototyped so), FOO {STMTS} EXPR (if prototyped so);
(I might have missed some).
Update:
I forgot to answer this one:
LVALUE has a better case for utility than REF, because this is how you find out
if something *is* an lvalue. (I think? Do you know another way?) For
references, you already have the ref builtin itself. See my comment to
Anonymous Monk.
No. Actually, most (or all?) scalar references are lvalues. For example, $x = \$y; is an lvalue reference, as you can say $$x = 5;.
(Update 3: trying to add some more context to the second and third paragraph to make it cleaner.)
Re: ref == "REF"
by adamk (Chaplain) on Oct 19, 2004 at 01:53 UTC
I'm not sure what everyone is so confused about...
A SCALAR is a reference to a scalar.
A REF is a reference to a reference.
A SCALAR is a reference to data.
A REF is a reference to a pointer (of sorts).
Personally, I use it all the time for writing Decorator classes.
Go take a look at Object::Destroyer in CPAN, it wraps around another object containing circular references and makes sure it gets destroyed normally.
my $Destroyer = bless \$Object, 'Object::Destroyer';
So if you call Scalar::Util::reftype on $Destroyer, you get 'REF'. Which is as it should be, because unlike $Object, which could be anything of any type, $Destroyer just points directly to the object it is decorating.
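Here is a stand-alone sketch of that wrapping pattern. It does not use Object::Destroyer itself; My::Widget and My::Wrapper are hypothetical class names standing in for the decorated object and the decorator.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# My::Widget and My::Wrapper are hypothetical class names standing in for the
# decorated object and the decorator.
my $object  = bless {}, 'My::Widget';
my $wrapper = bless \$object, 'My::Wrapper';

print ref($wrapper),      "\n";   # My::Wrapper (the blessed package)
print reftype($wrapper),  "\n";   # REF  (the wrapper points at a reference)
print reftype($object),   "\n";   # HASH (the object itself is a hash)
print blessed($$wrapper), "\n";   # My::Widget, reached with one dereference
```

Since ref reports the package for blessed references, reftype is what exposes the underlying REF here.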