PerlMonks  

Scalars, Lists, and Arrays

by ton (Friar)
on Apr 13, 2001 at 03:52 UTC (#72247, perlquestion)
ton has asked for the wisdom of the Perl Monks concerning the following question:

I am posting this because I think I finally get it, and I want to check with the senior monks to see if I'm right. Also, I want other monks to share in my epiphany. :)

When I first started learning Perl, I had a heck of a time distinguishing between scalar and array context. After I got that distinction, I had an even harder time figuring out the difference between array and list context. Here's some code that demonstrates the differences:

    use strict;
    my @array = ('foo', 'bar', 'baz');
    my $temp;
    $temp = @array;
    print "$temp\n";
    $temp = ('foo', 'bar', 'baz');
    print "$temp\n";
    ($temp) = @array;
    print "$temp\n";
    ($temp) = @array[0..2];
    print "$temp\n";
    $temp = @array[0..2];
    print "$temp\n";
Think these all print the same thing? Think again. On running the above script, you will get
    3
    baz
    foo
    foo
    baz
as your output. Here's why:
  1. This assignment has a scalar on the left hand side. This forces the array to be evaluated in a scalar context, which gives the length of the array.
  2. This assignment has a scalar on the left hand side. This forces the list to be evaluated in a scalar context, which gives the last element of the list.
  3. This assignment has a list on the left hand side. @array is thus evaluated in an array context. When Perl assigns one array to another, it sets the first element of one array equal to the first element of the other array, the second element equal to the next element, etc. Extra values are thrown away, so $temp is now equal to the first value of @array, or 'foo'.
  4. This assignment also has a list on the left hand side, and an array slice (containing all the elements of the original array) on the right. The behavior is the same as in 3.
  5. This assignment has a scalar on the left and the array slice (a.k.a. a list) on the right. This looks like a combination of 1. and 4., so you would think that the answer would be 3 or 'foo'. It turns out to be 'baz', because lists assigned to scalar context return the last element of the list.
So my current understanding is this:

  1. The difference between a list and an array is that an array is an allocated chunk of memory, whereas a list is a bunch of scalars that are the result of an expression.
  2. Arrays and lists act the same in list context.
  3. Arrays and lists act very differently in scalar context; the former return their size, and the latter return their last element.
This last difference can be very subtle; for instance, these two subroutines do NOT return the same value (set their results equal to scalars to see how):
    sub func1 {
        return('foo', 'bar', 'baz');
    }
    sub func2 {
        my @array = ('foo', 'bar', 'baz');
        return @array;
    }
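A minimal sketch of the comparison suggested here, calling both subroutines in scalar context. The variable names $from_list and $from_array are illustrative only, and the comma between 'bar' and 'baz' is included so that func1 compiles:

```perl
use strict;
use warnings;

sub func1 { return ('foo', 'bar', 'baz'); }
sub func2 { my @array = ('foo', 'bar', 'baz'); return @array; }

# Assigning to a scalar puts each call, and so each return
# expression, in scalar context.
my $from_list  = func1();   # comma expression: value of the last item
my $from_array = func2();   # array: number of elements

print "func1: $from_list\n";    # func1: baz
print "func2: $from_array\n";   # func2: 3
```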
Is my understanding right? Anything else about lists that I should know about?

-Ton
-----
Be bloody, bold, and resolute; laugh to scorn
The power of man...

Re: Scalars, Lists, and Arrays
by Dominus (Parson) on Apr 13, 2001 at 04:41 UTC
    Says Ton:
    Is my understanding right? Anything else about lists that I should know about?
    It seems to me that there's a major point that you are missing.

    There is a difference between an expression and its value. An expression is a sequence of characters in the source code. When the program is run, Perl evaluates the expression, and the result is a value. A value is not part of the program source code; it only exists inside the computer's memory at run time. As a result, it's hard to talk about value. We say things like this:

    "The value of the expression (localtime) is the list (36,38,2,2,3,69,3,91,0)"
    But this is not really correct, because (36,38,2,2,3,69,3,91,0) is not a list. It is an expression whose value happens to be a list. What we really mean here is that the value of the expression (localtime) is the same as the value of the expression (36,38,2,2,3,69,3,91,0).

    We like to say that the expression (36,38,2,2,3,69,3,91,0) is a list expression, because it has a certain simple form, and its value is a list.

    Lots of people are confused about this, because it is rarely explained clearly, and because most people do not use the terminology consistently. That is why you see beginners asking questions like this:

    My program reads in a line from a file, adds 1, and prints it out. The line is: 037. The manual says that Perl considers numbers that begin with 0 to be octal constants. Why did my program print out 38 instead of 040?
    The answer is that expressions that contain constants beginning with 0 represent octal constants. But there is no such thing as an octal value. Values aren't octal or decimal or binary; they're hidden inside the computer, and it is None of Your Business how they are represented. In the beginner's example, Perl is converting a string value to a numeric value, and to do that it always interprets the string value as a decimal numeral.
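The beginner's puzzle can be reproduced in a few lines. This is only a sketch: $line stands in for the line read from the file, and the use of oct() is an addition to show how an explicit octal conversion would look.

```perl
use strict;
use warnings;

my $line = "037";                    # stand-in for the line read from a file

my $from_string  = $line + 1;        # string value, converted as decimal: 37 + 1
my $from_literal = 037 + 1;          # octal constant in the source: 31 + 1
my $from_oct     = oct($line) + 1;   # explicit conversion of the string as octal: 31 + 1

print "$from_string $from_literal $from_oct\n";   # 38 32 32
```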

    Now, the thing I think you may have missed about context is that it applies to expressions, and never to values. Every time you say "the list on the right-hand side" you are making a mistake. In scalar context, there is no list.

    @a is an array expression. In a list context, its value is the list of elements from the array. In scalar context, its value is the length of the array.

    (1,2,3) is not a list. It is a comma expression. In list context, its value is a list of the values of the items separated by the commas. In scalar context, its value is the value of the last item.

    @a[0..2] is not a list. It is not "a.k.a. a list". It is an array slice expression. In list context, its value is a list of the array elements you selected. In scalar context, its value is the value of the last array element you selected.

    localtime() is not a list. It is a localtime expression. In list context, its value is a list of the current seconds, minutes, hours, days, and so on. In scalar context, its value is a string representing the current time.

    This is why you have to understand that context applies to expressions, not to values. If you think context applies to values, you get the wrong answer for localtime(). You would say (as you did in (5) above) "Oh, localtime's value is a list of seconds, minutes, hours, and so on, and since it's a list in scalar context, the result is the last element from the list, which is the $dst value." Wrong wrong wrong!

    In scalar context, localtime does not produce a list. No Perl expression produces a list in scalar context.
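The localtime case is easy to check directly. In this sketch the argument 0 (the epoch) is an arbitrary choice made so that the call does not depend on the current time; the exact string printed still depends on the local time zone.

```perl
use strict;
use warnings;

my @parts = localtime(0);   # list context: sec, min, hour, mday, mon, year, wday, yday, isdst
my $stamp = localtime(0);   # scalar context: a formatted date string, not $parts[-1]

print scalar(@parts), " fields\n";   # 9 fields
print "$stamp\n";                    # a date string such as "Thu Jan  1 00:00:00 1970"
```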

    Your explanation has a bunch of oddities. In (3) you talk about Perl assigning one array to another. But there's only one array there; the ($temp) is not an array.

    In the section titled my current understanding:

    2.Arrays and lists act the same in list context.
    3.Arrays and lists act very differently in scalar context; the former return their size, and the latter return their last element.
    This is really missing it, because it's impossible to have a list in scalar context; there is no such thing. You can have a list expression, but that isn't the same as an array slice expression, or a localtime expression, or any other expression whose value might be a list if it were in list context. But the context effect applies to the expression itself, not to its value, because all contextual effects occur at compile time, never at run time, and at compile time there are no values; the values do not appear until run time, after all the contextual effects are resolved.

    Perl5-porters got email about a year ago from a guy who was reporting the following bug:

    Following is an account of what could be a bug in perl.
      $ perl -e 'print "count is ", scalar(undef, undef), "\n"'
        count is
    Note that count is undefined here, though it should have been 2.
    
    This guy had the same idea you did: (undef, undef) is a list of two items, and scalar() should count the items. But scalar does no such thing. It changes the way the program is compiled, so that the comma operator returns its second operand, instead of constructing a list value.
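For comparison, here is a sketch of what the comma operator yields in scalar context, next to one common way of actually counting items: a list assignment evaluated in scalar context returns the number of elements on its right-hand side. The helper subs first() and second() are hypothetical and exist only to avoid "useless use" warnings on constants.

```perl
use strict;
use warnings;

sub first  { return 'a' }
sub second { return 'b' }

# Comma operator in scalar context: the value of the right operand.
my $last = (first(), second());

# Assignment to an empty list, itself in scalar context: the item count.
my $count       = () = (first(), second());
my $undef_count = () = (undef, undef);

print "$last $count $undef_count\n";   # b 2 2
```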

    Hope this helps.

      Dominus,

      Thank you for the explanation. Let me see if I have this right:

      1. There are two contexts in which an expression can be evaluated: scalar and list.
      2. There are three major types of variables in perl: scalars, arrays, and hashes. 'Scalar' variable type and 'scalar' context are not the same thing, although scalar variables usually impose scalar context.
      3. Expressions are either constants, the values of variables (either lvalue or rvalue), or the results of functions which may take other expressions as arguments.
      So my principal error lies in trying to compare lists and arrays, which are different types of things. The confusion is partly caused by 'scalar' being both a context and a variable type, and thus capable of being compared with either. Hence, comparisons are not transitive!

      Is this right?

      -Ton
      -----
      Be bloody, bold, and resolute; laugh to scorn
      The power of man...

        I've also found it important to thump the desk when I'm teaching about this topic, over two very specific things:
        1. A list never gets produced in a scalar context.
        2. What a construct does produce in a scalar context must be learned, not derived by any general rule.
        So, if you learn about what @foo does in a scalar context, that tells you nothing about what @foo[3..5] does in a scalar context, or even what ($x,$y,$z) does in a scalar context. You must learn them each individually.
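Two further constructs, not mentioned in the thread, illustrate the same point; each has its own documented scalar-context behaviour that could not be guessed from the others. A small sketch with illustrative variable names:

```perl
use strict;
use warnings;

my @words = ('ab', 'cd');

my @rev_list   = reverse @words;              # list context: elements in reverse order
my $rev_scalar = reverse @words;              # scalar context: concatenate, then reverse the string

my @big_list  = grep { $_ > 1 } (1, 2, 3);    # list context: the matching elements
my $big_count = grep { $_ > 1 } (1, 2, 3);    # scalar context: how many matched

print "@rev_list | $rev_scalar\n";   # cd ab | dcba
print "@big_list | $big_count\n";    # 2 3 | 2
```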

        As it turns out, there is a bit of consistency in it (it's not just random madness), but that's available only after the fact. {grin}

        -- Randal L. Schwartz, Perl hacker

        There are other contexts as well. Camel III lists three of them. They are:
        • Boolean context, such as when we test for the existence of values in an array.
          while (@files) {
              my $file = shift @files;
              unlink $file or warn "Can't delete $file: $!\n";
          }
        • Void context. It is unclear from Camel III what this context does. The example Camel III gives is a statement consisting of the quoted string "Camel Lot"; Apparently the -w option on the perl interpreter does not like this context, and returns a warning on it. Maybe it's best to avoid this one altogether.

        • Interpolative context, which occurs inside quoted strings when one includes a variable name. The substitution operator s/// and some other expressions also occur in interpolative context.

          Boolean context is ubiquitous, so let's give this little brother of list and scalar its due!
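A short sketch of an array in Boolean and interpolative context. The file names are made up for illustration, and nothing is deleted here:

```perl
use strict;
use warnings;

my @files = ('a.txt', 'b.txt');   # hypothetical names
my @none  = ();

# Boolean context: an array is true when it has at least one element.
my $has_files = @files ? 'yes' : 'no';
my $has_none  = @none  ? 'yes' : 'no';

# Interpolative context: elements are joined with $" (a single space by default).
my $text = "Files: @files";

print "$has_files $has_none\n";   # yes no
print "$text\n";                  # Files: a.txt b.txt
```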

      Excellent answer.

      I find that trying to nail down "list" to mean something very specific like you have done just runs into problems down the road, though. For example, you say "(1,2,3) is not a list" but that contradicts parts of the Perl documentation. I understand what you mean in the context of this discussion but I think it is important for people to come away from the discussion and realize that many of the things that you say about "lists" above won't match things that a lot of other people (and documentation and books) are going to say about "lists" (because they are using a different definition of "list" than you).

      See (tye)Re: @_0 vs. $_0, (tye)Re2: tr doesn't *really* use a list, (tye)Re3: tr doesn't *really* use a list, (tye)Re2: List context or not?, and (tye)Re3: List context or not? for my thoughts on "lists" in Perl.

              - tye (but my friends call me "Tye")
        The Perl documentation has two different things that both end up being casually called "list": "list values" and "list literals". A "list literal" is that comma-ish thing that turns into the comma operator in a scalar context (last value wins). A "list value" is the result from executing an operator in a list context. And yes, a list literal yields a list value in a list context, hence leading to the casual use of "list" for both.

        In an ideal world, all naked "list" mentions would be expanded to either "list value" or "list literal" as appropriate. But most people generally figure it out by context (gah!) anyway.

        -- Randal L. Schwartz, Perl hacker

      Hrm, I just noticed a clear mistake in your node.

      But scalar does no such thing. It changes the way [...] the program is compiled, so that the comma operator returns its second operand, instead of constructing a list value.

      No, the comma operator is actually compiled exactly the same way. What changes at compile time is that a scalar op-node is added which changes what the other op-nodes do at run time, preventing them from building a list of scalar values on the stack (well, for most operations that can produce a list of scalar values on the stack, but not for all of them, slices being the most obvious exception).

      But the context effect applies to the expression itself, not to its value, because all contextual effects occur at compile time, never at run time, and at compile time there are no values; the values do not appear until run time, after all the contextual effects are resolved.

      No, all contextual effects certainly cannot occur at compile time. A simple counter-example is:

      sub contextSensitive { (foo(),bar()) }
      contextSensitive();
      my $s= contextSensitive();
      my @a= contextSensitive();

      That code doesn't compile two or three different versions of the contextSensitive() subroutine, one for each possible context. It compiles the (foo(),bar()) code exactly one way and at run time the behavior is changed depending on what context the sub was used in.
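The built-in wantarray makes the run-time nature of a sub's calling context visible. In this sketch the sub probe() and the array @seen are hypothetical names; the sub is compiled once and records a different context on each call:

```perl
use strict;
use warnings;

my @seen;

sub probe {
    # wantarray is true in list context, false but defined in scalar
    # context, and undef in void context.
    push @seen, wantarray         ? 'list'
              : defined(wantarray) ? 'scalar'
              :                      'void';
    return;
}

probe();            # void context
my $s = probe();    # scalar context
my @a = probe();    # list context

print "@seen\n";    # void scalar list
```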

      In fact, my testing shows that most contextual effects occur at run time. Any contextual effects happening at compile time would be optimizations requiring extra code so I'm not surprised that I couldn't find any cases of that. It would also require some static analysis of the parsed code to even determine if the context can be known at compile time. This type of static analysis is pretty rare in perl.

      - tye        

        > No, the comma operator is actually compiled exactly the same way.
        Please reread the sentence you quoted. I did not say the comma operator is compiled differently. I said the scalar operator changes the way the program is compiled. And it does change the way the program is compiled, by causing the insertion of an additional op node into the optree.

        Then I said that the change to the way the program was compiled causes the comma operator to return its second operand, instead of constructing a list value. As you explained, the additional op node has precisely this effect.

        Any contextual effects happening at compile time would be optimizations requiring extra code so I'm not surprised that I couldn't find any cases of that. It would also require some static analysis of the parsed code to even determine if the context can be known at compile time. This type of static analysis is pretty rare in perl.
        But that analysis is exactly what happens - the syntax tree is descended and every operation applies context (or doesn't) to its operands, marking them as having list, scalar, or void context as determined by the operation. Then anything left unmarked gets context from the call stack at runtime.
        --
        A math joke: r = | |csc(θ)|+|sec(θ)|-||csc(θ)|-|sec(θ)|| |
Re: Scalars, Lists, and Arrays
by kha0z (Scribe) on Apr 13, 2001 at 23:37 UTC
    I found Figure 1-1 on page 11 of the Camel book (3rd Ed.) to give a nice visual explanation of arrays and associative arrays. It also helps to see the containment of scalar values within such arrays.

    Good Hunting,
    kha0z

      Yet another observation: parentheses and square brackets treat the arrays inside them differently. Something like my @data = ( @xvalues, @yvalues ) concatenates @xvalues and @yvalues: the comma operator flattens both arrays into a single list. If they are instead wrapped in [ ] (which builds anonymous arrays, as in ( [@xvalues], [@yvalues] )), they retain their identity.
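A minimal sketch of the difference, assuming two small arrays named as in the post. Square brackets build anonymous arrays and yield references to them, so each one stays a single element of the outer array:

```perl
use strict;
use warnings;

my @xvalues = (1, 2);
my @yvalues = (3, 4);

my @flat   = (@xvalues, @yvalues);       # flattened: four plain scalars
my @nested = ([@xvalues], [@yvalues]);   # two array references

print scalar(@flat), " ", scalar(@nested), "\n";   # 4 2
print "$flat[2] $nested[1][0]\n";                  # 3 3
```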
