http://www.perlmonks.org?node_id=450740

So, I've been writing a bit of Python recently, mostly just to be more multi-lingual, and last night I shifted gears to writing some serious Perl for the first time in about a week. I bumped up against some nasty weirdness as the result of my brain not having fully transitioned from Python mode to Perl mode, and I thought I'd share my thoughts as they seem to elucidate the dangers of some of Perl's more promiscuous inclinations.

Now, in Perl, you use () to construct an array, and [] to construct an array but return the reference to it. In Python, () constructs a tuple, whereas [] constructs a list, the former being the same as the latter except that it is immutable. So, in Python, imagine the following two lines of code...

foo = (1, 2, 3)
bar = [1, 2, 3]

In the first case, you get a tuple assigned to foo, and in the second case a list assigned to bar. There's nothing confusing about this. Now consider the following Perl code...

@foo = (1, 2, 3);
$bar = [1, 2, 3];
$baz = (1, 2, 3);
@biz = [1, 2, 3];

The first gives you an array as you'd expect, and the second is similar but you end up with a reference to such an array. So far, so good... In the third case, you end up with a variable that holds the length of the array on the RHS. In the fourth case, you end up with an array of just one element, and that element is a reference to the array that was on the RHS. Ugh...

Not having mentally transitioned from Python, I fell for the trap of the fourth case. I was perplexed to have an array whose length always seemed to be one. So, what is Perl's justification for having these latter two cases? In the third case, wouldn't it be a hell of a lot safer to have the "length" function have a list version that calculated the length of an array? Sure, the while($i<@foo) idiom is nice, but is it worth it? In the fourth case, why isn't this just an outright syntax error? I can't imagine ever intentionally wanting to do that.

This segues nicely into my second gripe of the day. Perl's built-in "length" function is downright silly. Its intended purpose is to take a string and return its length. The instructions for its usage go so far as to explicitly say that you should not try to use it to calculate the length of an array. So, why is length(@foo) perfectly acceptable syntax? Furthermore, since it is valid syntax, why does it do something so thoroughly dumbfounding? It turns out that length(@foo) coerces @foo to a scalar, which is to say its length, and then calculates the string length of the returned integer. If your array has length 0-9, the result will be 1, length 10-99 will yield 2, etc. Gah! Since in Python you do len(lst) to calculate the length of the list variable lst, back in Perl land last night I was carelessly doing length(@foo) and getting extremely erratic results. Why can't Perl's "length" function be reasonable? Either provide a list version of it, or make such an invocation a syntax error.
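A minimal sketch of the behavior just described, assuming Perl 5; the array `@foo` and its 13 elements are made up for illustration. Warnings are left off here because newer perls flag `length(@foo)` at compile time when they are enabled.

```perl
use strict;

# Hypothetical data: an array with 13 elements.
my @foo = (1 .. 13);

# An array in scalar context yields its element count.
my $count = scalar(@foo);

# length() puts its argument in scalar context, so it sees the
# number 13 and reports the length of the string "13".
my $digits = length(@foo);

print "count=$count digits=$digits\n";   # count=13 digits=2
```

The element count is what `len(lst)` would return in Python; in Perl, `scalar(@foo)` is the spelling that asks for it explicitly.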

Mind you, I love Perl, and I find Python needlessly aggravating in many ways. Perl is easily my favorite language. However, it would clearly seem to have dropped the ball in these regards.

Replies are listed 'Best First'.
Re: Some Insights from a Traveler Between Languages
by TimToady (Parson) on Apr 23, 2005 at 19:28 UTC
    In the first case, you get a tuple assigned to foo, and in the second case a list assigned to bar. There's nothing confusing about this.

    Except you've just forced the new user to learn the difference between mutable and immutable. Perl largely succeeds in hiding this difference from new users.

    In the third case, you end up with a variable that holds the length of the array on the RHS.

    Only by accident, in this case. Perl 5 is returning the final value of the list, on the assumption that the user meant to use a C-style comma operator, which is a bad assumption in this case. Perl 6 does not have the C-style comma operator, so you automatically get a reference to the list into $baz. If you use that reference in numeric context, you'll get the length, but if you use it as if it were a tuple, it'll behave as a tuple, and if you use it as a boolean, it'll still behave reasonably.

    @biz will still end up with a single element, since it is still assumed that if you use [...] in list context, you really mean it. But if you want more Pythonly semantics for all of those, just use the new := binding operator instead of =, since Python really only has binding, not assignment.

    Perl's built-in "length" function is downright silly.

    Indeed, I agree, but for different reasons. Perl 6 will not have a length function at all, because "length" is too general a concept in the age of Unicode. You have to be more specific, and specify the units. So we have the following methods:

    $str.bytes     # length in bytes
    $str.codes     # length in codepoints
    $str.graphs    # length in graphemes
    $str.chars     # length in characters
    @array.elems   # length in elements
    Note that this also lets you ask for things like
    @array.chars # length of array in characters
    But having said all that, I also agree with your reason, which is why Perl 6 no longer forces evaluation of scalars in scalar context, but autoenreferences anything it can autoenreference. The underlying problem with Perl 5's length(@array) is that it's forcing evaluation of @array in generic scalar context without knowing whether the array is eventually going to be used in a boolean, string, array, or reference context. Perl 6 straightens out this mess without breaking the behavior that Perl 5 programmers actually want. They don't want an array to return its length in scalar context. They want the array to return its length in numeric context. And they don't actually want its length in boolean context--they merely want to know if the array thinks of itself as "true". In Perl 6, scalar context means "I don't know how this object is going to be used yet, so let's not commit yet."

    Perl is easily my favorite language. However, it would clearly seem to have dropped the ball in these regards.

    I hope you will discover that Perl 6 has again picked up most of the balls it dropped in the past, hopefully without dropping too many new balls. It would be possible (though unpopular) to go back through all the recent nodes on PM that complain about any trap in Perl 5, and add annotations about whether Perl 6 addresses the problem. My rough estimate is that at least 90% of those annotations would say "Fixed in Perl 6". I have to sit on myself not to follow up every time I see one of those nodes. It's hard, because I like to brag. That's the problem with Hubris... :-)

      I have to sit on myself not to follow up every time I see one of those nodes.

      That's also fairly difficult, at least in a Euclidean space.

      It's hard, because I like to brag. That's the problem with Hubris... :-)

      Of everyone who has ever posted on this Monastery, I think that you have the most right to brag. In fact, I'd almost say you have an obligation to brag. *grins*

      On the question of why $length = length(@array) can't do what you expect it to do:
      But having said all that, I also agree with your reason, which is why Perl 6 no longer forces evaluation of scalars in scalar context, but autoenreferences anything it can autoenreference. The underlying problem with Perl 5's length(@array) is that it's forcing evaluation of @array in generic scalar context without knowing whether the array is eventually going to be used in a boolean, string, array, or reference context. Perl 6 straightens out this mess without breaking the behavior that Perl 5 programmers actually want. They don't want an array to return its length in scalar context. They want the array to return its length in numeric context. And they don't actually want its length in boolean context--they merely want to know if the array thinks of itself as "true". In Perl 6, scalar context means "I don't know how this object is going to be used yet, so let's not commit yet."
      Isn't it easier to just say "Yeah, Larry blew it here, but it's being fixed in perl 6"? (Update: though actually, that essentially is what you were saying... sorry, need to read more carefully.)

      Perl's polymorphism based on "context" rather than "type" can genuinely be pretty confusing... maybe this is a case where DWIM should've won out over consistency.

      Anyway, it sounds like perl 6 will clear this up in a few different ways: (1) there are more contexts (numeric, string, etc), so it's easier for something to do the Right Thing, (2) functions like "length" are turning into object methods (or properties?), so you can make it clear what you're trying to get the length of: @array.length.

      The array length business really is a bit of a mess in perl 5. "How do you get the length of an array" is one of the first questions I seem to get from beginning perl programmers; and speaking for myself it took me forever to get to the point where I could remember which of the two ways was which. (Of course, one of the reasons they ask that question is that they think they need to iterate explicitly over the array, but that's another story.)

      Lots of other languages seem more sensible in this respect... e.g. I was doing some emacs lisp programming recently, where "lists" and "strings" are both types of "sequences", and "length" gets you the property of a sequence, so it works on both, no problem. (Though there I had a problem with looking through the section of the manual on string processing to try and find the "length" function, when really it's off in the section on sequences...)

      I am very much looking forward to the release of Perl 6. I have high hopes that it will still be everything I love about Perl but with significantly less nonsense. Here's to unreasonable expectations! :-)
Re: Some Insights from a Traveler Between Languages
by merlyn (Sage) on Apr 23, 2005 at 17:27 UTC
    but is it worth it?
    I absolutely appreciate the terseness of:
    if (@foo) { ... there's something in @foo .. }
    and
    while (@todo) { ... do something while there's something to do ... }
    I think the whole context thing is an elegant solution, and mimics what we do naturally in human languages. Quick.. how do you pronounce "record"? Or "wind"? Can't tell, until you get the rest of the sentence. Yet we handle that just fine.
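The two idioms above can be sketched as a runnable fragment. This is only an illustration, assuming Perl 5; the `@todo` queue contents and the `@done` array are hypothetical.

```perl
use strict;
use warnings;

my @todo = ('parse', 'compile', 'run');   # hypothetical work queue
my @done;

# An array in boolean context is true while it holds at least one element.
while (@todo) {
    my $task = shift @todo;
    push @done, $task;
}

print "nothing left to do\n" if !@todo;
print "finished: @done\n";
```

The loop ends on its own once `shift` has emptied the array, with no explicit length comparison anywhere.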

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      But, paradoxically, even though it mimics human languages, humans seem rather prone to slip on Perl's use of context. In contrast, the lack of context dependence in other languages may be somewhat inconvenient but is not a banana peel. My point is to question the notion that this particular feature makes Perl somehow more in tune with the way humans think.

      the lowliest monk

        Funny, though, that you didn't write:

        My point are to questions the notion that this particular features make Perl somehow more in tune with the way human thinks.

        There are some corner cases in English where subject-verb number agreement is difficult, but I know a lot of people who do pretty well with it. Why should it be any more difficult?

      I appreciate its terseness as well, but it's a double-edged sword. Also, it's not clear to me why you are citing homographs as a defense. To me, it seems that homographs are a misfeature of natural language, resulting from the fact that natural languages are the product of clumsy evolution and combination over the course of eons. Clearly it would be better if languages didn't have homographs as it makes life more difficult needlessly. Yes, we can handle them, but they don't buy us anything, so why should we use that kind of thing as an argument when deliberately designing artificial languages?
        Well, it's easy to come up with egregious examples and show how English could have been better designed, but you overgeneralize. The fact is that we rely on multimethod dispatch all the time in any natural language, and it's just a minor lexical miracle that you don't even notice that you're using homophones with different meanings:
        The chicken is ready to eat.
        The children are ready to eat.
        In short, you're relying heavily on MMD yourself when you use overloaded words like:
        appreciate well clear product clumsy combination course as makes life handle buy should use kind argument
        MMD is useful because it lets you express metaphorical correspondences. To trot out the Inkling's favorite example, a "piercing sweetness" may be neither piercing nor sweet, literally speaking, but your MMD dispatcher is even smart enough to autogenerate missing methods and dispatch to them.
      But the python version is slightly shorter...
      if foo:           # empty list [] evaluates to False
         do_something()
      
      and
      while foo:
         x = foo.pop()
         do_something(x)
      
      Cheers, --OH.
Re: Some Insights from a Traveler Between Languages
by japhy (Canon) on Apr 23, 2005 at 19:18 UTC
    Your analysis of the third case is wrong. What do you expect from $x = (10, 20, 30)? If you expect 3, you're wrong -- you get 30. In this case, (10, 20, 30) is not even a list. The parentheses are used to change precedence here. Inside the parentheses are three expressions separated by the comma operator which, in scalar context (which this is), evaluates the left-hand operand and returns the right-hand operand. If we didn't have parentheses, we'd set $x to 10, and the values 20 and 30 would be in void context.

    As for the fourth case, I can think of cases where I want to initialize an array to hold just one element, an array ref. More often than not, I'd expect I'd have that array ref be empty, but it's really a matter of what I'm doing.
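A short sketch of the third and fourth cases as explained above, assuming Perl 5. The values 10, 20, 30 follow the example in the text; the `no warnings 'void'` line is an addition so the discarded operands do not produce noise.

```perl
use strict;
use warnings;
no warnings 'void';   # the discarded 10 and 20 would otherwise trigger warnings

my $x   = (10, 20, 30);   # comma operator in scalar context
my @biz = [10, 20, 30];   # one element: a reference to an anonymous array

print "x = $x\n";                                                      # x = 30
print "elements in \@biz = ", scalar(@biz), "\n";                      # 1
print "elements behind the reference = ", scalar(@{ $biz[0] }), "\n";  # 3
```

Using values that are neither sequential nor equal to the element count makes it obvious that `$x` holds the last operand rather than a length.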


    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
      Yeah, I totally dropped the ball on the third example. It was careless of me to use the values (1, 2, 3) as that was a mine field of coincidence and I paid the price in looking the fool. As for the fourth case, though, I really can't imagine why we would want such a thing. Surely it's not a terrible burden to have to put parentheses around the stuff you are assigning to the array.
        Surely it's not a terrible burden to have to put parentheses around the stuff you are assigning to the array.

        Then do.

        I don't because it's unnecessary. Except for slicing the values returned from a function and creating a one-element lvalue list, I cannot think of a case where parentheses create lists. (I can also argue that in the former case, they don't either; they just mark a context.)

        Put another way, if there are no precedence issues, why do you need to group a one-element list explicitly?

Re: Some Insights from a Traveler Between Languages
by brian_d_foy (Abbot) on Apr 23, 2005 at 21:59 UTC

    Your example in Python illustrates one of its major downfalls though. Certainly you can tell what's going on when you see

    foo = (1, 2, 3)
    bar = [1, 2, 3]

    What happens after that when you use foo or bar? You have to know not only what type of data it holds, but also how it was created.

    This was my biggest stumbling block with Python: I couldn't use the variable names I wanted and I had to remember extra things about the variables I did see. Some people don't like sigils, but I love them. People keep trying to invent them with various notations and styles of variable naming.

    The trick to any language is to think like the language and forget anything you think is "reasonable" or "makes sense". Instead of fitting the language into the way that you think, just take the language for the way it is.

    For instance, the length() function operates on a scalar and is documented that way. The documentation is very clear on that. It doesn't matter to me what I think it should do as long as I know what it does do. Think like that and all sorts of problems disappear. You might still be offended by some missing symmetry, but life is messy.

    As for your list cases, it's not that Perl allows those cases in particular, but that Perl allows these cases:

    @array = SCALAR;          # promotes to ( SCALAR );
    ... SCALAR, SCALAR ...    # comma operator in scalar context

    Once you have those general expressions, you can put them into larger expressions. It's actually more consistent to keep those because they are extensions of the same rules that allow other things (that you probably don't object to). Once you start adding special cases and exceptions, things get a lot more complicated and a lot harder to explain or remember.

    They can't be syntax errors because they follow the rules. They might be candidates for warnings, however (just as an array slice with one element is legal, but generates a warning).

    --
    brian d foy <brian@stonehenge.com>
      For instance, the length() function operates on a scalar and is documented that way. The documentation is very clear on that. It doesn't matter to me what I think it should do as long as I know what it does do.

      You're just substituting one form of memory work for another. To maintain python, you have to remember type information: to maintain Perl, you need to embed the counter-intuitive behaviour of length() in your mind, in what purports to be a DWIM language. Both are awkward bits of memory work, in my mind.

        "Counter-intuitive" is a relative term. I've never had a problem with length because I'm not the type to guess at what things do, and if I do guess and get it wrong, I just read the docs. I've never wondered about the "length" of an array because "length" doesn't mean "count" to me. I don't say "what's the length of that bunch of bananas?", just like I don't say "what's the count of this hiking trail?".

        You may think some things are odd, but that doesn't objectively make them odd. Instead of intuiting things, you can usually save the hassle of being wrong by reading the docs.

        --
        brian d foy <brian@stonehenge.com>
Re: Some Insights from a Traveler Between Languages
by Juerd (Abbot) on Apr 23, 2005 at 19:28 UTC

    in Perl, you use () to construct an array

    () does not construct an array.

    In an array assignment, my @array = (1, 2, 3);, the right hand side is a list, not an array. The parentheses in this case are needed only because if you wrote my @array = 1, 2, 3;, precedence would cause that to be interpreted as (my @array = 1), 2, 3;, thus not storing the 2 and 3 in @array. Contrast this to [] that always create a new array and return a reference to it. The thing inside [] is also a list, not an array. Only here you don't need the parens, because there is no difference in precedence with or without. But if you want, you can of course write [ (1, 2, 3) ]. Also note how my @array = (1, 2, (3, 4)); and my $arrayref = [ 1, 2, [ 3, 4 ] ]; are different in that in the former, the four elements are flattened and in the latter, you have a nested data structure.
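The points in the paragraph above can be checked with a small sketch, assuming Perl 5. The variable `@short` is a hypothetical name added here to show the precedence case; everything else mirrors the examples in the text.

```perl
use strict;
use warnings;

my @array    = (1, 2, (3, 4));      # inner parentheses only group; the list flattens
my $arrayref = [ 1, 2, [ 3, 4 ] ];  # inner brackets build a nested anonymous array

print scalar(@array), "\n";         # 4
print scalar(@$arrayref), "\n";     # 3
print ref($arrayref->[2]), "\n";    # ARRAY
print $arrayref->[2][1], "\n";      # 4

# Without parentheses, assignment binds tighter than the comma operator.
my @short;
{
    no warnings 'void';             # the stray 2 and 3 would otherwise warn
    @short = 1, 2, 3;               # parsed as (@short = 1), 2, 3
}
print scalar(@short), "\n";         # 1
```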

    The difference between the constructors [] and {} on one hand and the grouping () on the other, is reason enough for me to write them differently. That is why I put extra whitespace inside the constructors.

    In the third case, you end up with a variable that holds the length of the array on the RHS.

    Firstly, the RHS is not an array. In scalar context, it is not even a list, because a list can only exist in list context. Because it's not an array and not a list, it is impossible to get the number of its elements. Instead, you get the last value supplied. Your mistake here was to use 1, 2, 3. If you then get 3, you can't immediately know how to interpret it. In this kind of example or experimental code, always use values that do not naturally occur. Do not begin with 1 and do not end with the number of items you provide. Along those lines you should also not use something like 0, 1, 2. Instead, try a set like 42, 15, 69.

    why is length(@foo) perfectly acceptable syntax?

    Because there's nothing that says you shouldn't use an array in scalar context. In scalar context, an array evaluates to its number of elements. That may not be a useful value to provide to length, but it is to many other functions. And to get a consistent language, it shouldn't be made invalid to determine the number of digits of a number of elements.

    Since in Python you do len(lst) to calculate the length of the list variable lst

    Python does not have context, which is the fundamental difference. It cannot see the list variable as anything else than a list variable, and a function can not be smart about what it returns depending on the context in which it is used. Note, by the way, that Python's list is like Perl's array. In Perl, we have both arrays and lists and they are not the same thing!
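As an illustration of a function being "smart about what it returns depending on the context", here is a minimal sketch using Perl 5's built-in `wantarray`. The function name `context_of_call` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: reports the context in which it was called.
# wantarray is true in list context and false in scalar context.
sub context_of_call {
    return wantarray ? 'list' : 'scalar';
}

my @in_list   = context_of_call();   # list assignment imposes list context
my $in_scalar = context_of_call();   # scalar assignment imposes scalar context

print "$in_list[0] $in_scalar\n";    # list scalar
```

The same call expression produces different results purely because of what sits on the left-hand side of the assignment.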

    Why can't Perl's "length" function be reasonable?

    It is perfectly reasonable, it just doesn't do what you expected. And that is because your expectations are based on a Python world of physical laws, while Perl functions do not live in that world. They live in a Perl world, where Perl's laws and principles are used. Of course, to a traveler, a new world and its culture may at first seem very weird. But as a traveler, you should know better than to judge so quickly!

    Learn our culture, including context and the difference between arrays and lists, and eventually you will feel right at home. Here is an incomplete list of things, copied from http://juerd.nl/perladvice, that you will need to understand:

    • An object is a reference to a blessed variable.
    • A list is not the same as an array.
    • There are three main contexts: void context, scalar context and list context.
    • Things are named or anonymous.
    • The language is Perl, the implementation is perl. Never write PERL.
    • There are different operators for strings and numbers.
    • Some operators perform short circuit logical operations, and these have high and low precedence versions.
    • There are lexical variables, package global variables and package global variables that are always in the main namespace.
    • Parameters are expected, arguments are passed.
    • An operator is either a unary, binary or ternary operator, or a list operator.
    • A statement consists of one or more expressions.
    • You can use alternative delimiters to avoid the leaning toothpick syndrome.

    Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

      Nevertheless, skynight is correct that Perl 5 contains some fundamental semantic traps. But it is also the case that we have not hesitated to break any existing surface feature of Perl 5 in order to target those deep traps. Perl 6 is simultaneously a better Perl and a completely new language, because it still depends critically on context, but completely revamps how (and when) context works, hopefully in a way that will seem even more Perlish in the long run.

      Many of the traps in Perl arise because of colliding contexts, and we've tried very hard to arrange contexts in Perl 6 so they don't collide so often, and when they do, they prioritize in the expected fashion. That's why there really aren't very many keywords to speak of anymore, so you can override print if you want to, because any lexical or package scope takes precedence over the global scope in which print is defined. That's why we now distinguish modules and classes from packages, and methods from subroutines, so the compiler can know the intent of the programmer better. That's why we revamped the precedence tables to get rid of longstanding traps inherited from C. That's why patterns are now considered a real language, and not just interpolated strings. That's why scalar context no longer forces premature evaluation. It all comes down to linguistics, and more specifically, tagmemics. Every utterance has multiple contexts, and it's really important to keep track of which context is "working" at any particular spot.

        skynight is correct that Perl 5 contains some fundamental semantic traps.

        It does, but only when you're used to different semantics. Perl was my second programming language, but different enough from BASIC to assume no rule would be the same. I never fell into any of its well known traps. (I tend to get bitten by the more obscure semantics.)

        In the same way, HOP says it changes the way you code Perl because it assumes you learned Perl after learning C, or you learned Perl from someone who was familiar with C. For me, this was never true. I'm trained by the logic provided in perldocs, which means that I don't code like a C coder. In fact, I'm almost halfway through HOP and although it's a terrific book that I am glad I bought, I still hope to find new ways of coding, because to me, the Perlish way of things feels most natural already.

        length(@foo) returning 2 for a 13 element @foo has never surprised me. It was through contact with many beginners (here and in EFnet #perlhelp) that I discovered that this really is a trap for many.

        Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

      It might be worth mentioning that I've been writing Perl in various capacities for about nine years now. I'm not new to this. In fact, I was writing Perl long before I was writing Python. I consider this relevant because if someone who has been using the language as long as I have can bump up against such issues, then I shudder to consider the degree of perplexity under which the newbie frequently labors. There is a very good reason that in many circles Perl is derisively cast as a "write only language". There are some things that it just makes needlessly complicated or unpredictable.

      The problem with that length() statement is that people have gotten used to the "Do What I Mean" principle of Perl. The array is doing what it thinks you mean by returning the number of its elements. The length() function, however, is selfish and doesn't really care what you mean. :p
Re: Some Insights from a Traveler Between Languages
by chromatic (Archbishop) on Apr 23, 2005 at 19:39 UTC

    Would you have such conceptual trouble if the syntax were instead:

    store_into( array_foo, list_of_values( 1, 2, 3 ) );
    store_into( scalar_bar, array_reference( list_of_values( 1, 2, 3 ) ) );
    store_into( scalar_baz, last_expression_of_list( 1; 2; 3; ) );
    store_into( array_biz, array_reference( list_of_values( 1, 2, 3 ) ) );

    I'm not suggesting that's great syntax or that it's even clear syntax, but I am suggesting that that's effectively the way you have to think about the code if you want to understand what it's doing. (That also explains why your interpretation of the third case is incorrect.)

    Is it inconsistent? I don't think so. All four are doing different things and all four look very different. As you suggest, it's similar to the difference between Python tuples (immutable) and lists (mutable). They may look similar if you don't pay attention to the brackets, but they look different because they behave differently.

Re: Some Insights from a Traveler Between Languages
by tlm (Prior) on Apr 23, 2005 at 19:55 UTC

    Recently there was an exchange on some of the points you raise.

    Even though tye and merlyn make very good points to the contrary in the thread I just cited, I still think that Perl should 1) provide a list operator as counterpart to scalar, 2) revoke the rule that "a list cannot exist in a scalar context" (after all, it is possible for a scalar to exist in a list context); and 3) define the value of a list in a scalar context as the length of the list. The latter would not prevent a function from responding idiosyncratically in scalar context; e.g.:

    my $date = localtime;
    my $n = list localtime;
    print "$date\n";
    print "$n\n";
    __END__
    Sat Apr 23 14:52:09 2005
    9
    tye pointed out in the thread cited above that most uses of this list function would be to get the number of items a function would return in list context, and therefore that it would be better to call it count instead. I agree that most uses of list would be for counting, but I still think that it should be called list. My reason for this is largely formal. Perl imposes no restrictions on how a programmer (or even perl itself) may exploit context information. For example, Perl does not proscribe functions that have side effects only when called in a list context. Since Perl allows programmers complete freedom in how they exploit context information, it is only fair that it also provide ways for programmers, who routinely use code written by others, to control the context in which functions evaluate without having to resort to tricks like =()=.
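For readers who have not met the `=()=` trick mentioned above, here is a minimal sketch of how it is used in existing Perl 5 to get the count that the proposed `list` operator would return. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Scalar context: localtime returns a formatted date string.
my $date = localtime();

# Assigning to an empty list forces list context on the right-hand side;
# the list assignment, evaluated in scalar context, then yields the number
# of elements that were on its right-hand side.
my $count = () = localtime();

print "$date\n";
print "$count\n";   # 9
```

`localtime` returns nine values in list context, which is where the 9 in the example output comes from.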

    (Before I get thoroughly roasted for all this heresy I'm quick to add that I am aware that any formal merits of the scheme above may not be sufficient to warrant the refactoring of perl that would be required to implement it.)

    As a bonus, I think it would be easier to teach context rules, and Perl gurus like tye and merlyn would not have to spend so much time getting blockheads like me to understand that "lists can't exist in scalar context."

    the lowliest monk

      Fixed in 6. :-)
      I have two reasonably short potential equivalents to your list() function. Which of these do you want? Remember that it should work the same for all arguments, so the fact that we're using localtime here is irrelevant.
      my ($n) = localtime;
      or
      my $n = (localtime)[-1];
      Context is given by the left hand side of an assignment, but its interpretation is done by the right hand side.

      update: also note that the current interpretation of scalar context by a list means that the last element is returned, which would mean that your list() function would act like:

      (1,2,3,4) != list(1,2,3,4);

        Neither. As I showed in my original post, after

        my $n = list localtime;
        $n should end up with 9.

        the lowliest monk

Re: Some Insights from a Traveler Between Languages
by jonadab (Parson) on Apr 24, 2005 at 12:43 UTC
    @biz = [1, 2, 3];
    In the fourth case, you end up with an array of just one element, and that element is a reference to the array that was on the RHS. Ugh... Not having mentally transitioned from Python, I fell for the trap of the fourth case.

    In other words, you used the punctuation you would have used in the other language, but it meant something rather different in Perl. This is unavoidable when going from language to language, unless the syntax is so *completely* different that it doesn't overlap at all (e.g., moving from elisp to Perl does not cause this problem in my experience, because the syntax in elisp is impossible to confuse with Perl's). I assert that you would have the same type of problem suddenly moving from Perl to Python, or Perl to Ruby, or Ruby to Python, or Python to C++, or C++ to Inform, or cetera.

    So, what is Perl's justification for having these latter two cases?

    The third case is arguable; it's there for convenience, and it's mightily convenient, and I use it with quite significant frequency, but yes, needing to do an explicit length would not be the end of the universe.

    The fourth case, OTOH, is another matter; I do not see how it would be possible for Perl to get by without the ability to assign references. Doing away with that would make Perl quite a lot less useful than it is. We're not talking about syntactic sugar here; it is absolutely *necessary* to be able to assign a reference to an anonymous array into a variable. The fact that the syntax used to do that doesn't mean the same thing in all programming languages is just a symptom of the fact that not all programming languages are identical.

    So, why is length(@foo) perfectly acceptable syntax? Furthermore, since it is valid syntax, why does it do something so thoroughly dumbfounding?

    Yes, I agree that length should have been overloaded. Even though there's a shorter way to take the length of an array, being able to do it explicitly would not have been a bad thing.
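    A small sketch of the behaviour being complained about, next to the usual ways of getting an element count. Warnings are switched off around the length(@foo) call because some versions of perl warn about it at compile time.

```perl
use strict;
use warnings;

my @foo = (1) x 12;    # twelve elements

# length() imposes scalar context, so @foo becomes 12 and the result
# is the string length of "12".
my $surprise = do { no warnings; length(@foo) };

my $count    = scalar(@foo);   # explicit element count
my $also     = $#foo + 1;      # last index plus one

print "length(\@foo): $surprise\n";   # 2
print "scalar(\@foo): $count\n";      # 12
print "\$#foo + 1:    $also\n";       # 12
```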

    The Perl6 approach to fixing this is to do away with length altogether; instead you will specify $string.bytes if that is what you want, or, if your characters might not all be a single byte long, $string.graphemes or $string.codepoints or some other Unicode-related thing. For arrays, I think there will be @array.elems or somesuch. Note that this still won't solve the problem of different languages having different syntax; indeed, I expect moving back and forth between Perl5 and Perl6 to be rather painful at times.


    "In adjectives, with the addition of inflectional endings, a changeable long vowel (Qamets or Tsere) in an open, propretonic syllable will reduce to Vocal Shewa. This type of change occurs when the open, pretonic syllable of the masculine singular adjective becomes propretonic with the addition of inflectional endings."  — Pratico & Van Pelt, BBHG, p68

      Well, here's the thing. I don't have a huge problem with different languages having similar syntactic constructs with different semantic interpretations, though I do find it irksome and perhaps even unnecessary. However, I do consider it problematic that two syntactically very similar constructs in a language have two very different semantic interpretations while operating on the same kind of stuff (arrays/lists). This is just a land mine waiting for someone to step on it. I like that Python skirts the issue by only having [] for working with lists, as opposed to Perl which has () on top of that.

      In a language like Perl, it's not clear to me that the distinction between "array" and "array ref" is a useful one. For example, the fact that you can return a reference to a local (er, my) variable inside a function is indicative to me that Perl doesn't seem to have the concept of auto variables. Presumably the things getting placed onto the function call stack are not actual variables, but reference counting pointers. I could be wrong, as I have never looked at Perl's internal code, but given its behavior I can't imagine it working differently. As such, why do we even bother with the variable/reference distinction? What does that buy us over Python's strategy of allowing you only to store references? I think Perl's way of dealing with stuff is a legacy of C-style thinking that needlessly complicates things. Do we really need to care about the distinction between variables and references? We don't even have guarantees about when garbage collection occurs, so what's the point?

      Also, you're totally confusing the issue with point four. Perl could survive just fine if the scenario I described were a syntax error. All you'd have to do is wrap your RHS in parentheses, making it explicit that your intention is to assign to an array with an array reference being the first and only element. This doesn't kill any Perl constructs. It just makes things a lot safer by requiring a little more syntax.

      Perl is a fantastic language for rapid prototyping. It makes things that would be tedious in other languages very easy to bang out quickly so you can stay focused on your nascent ideas and ignore the potentially messy implementation details, if just for the moment. To this end, Perl ought also to try to prevent users from making careless mistakes that cause them to bog down in debugging. Having dangerous syntactic constructs that can easily lead to mental traps undermines the goals of rapid prototyping. Every time a programmer has to deal with the vagaries of a language, he's missing the rapid prototyping boat. The whole point of choosing a language like Perl for rapid prototyping is that it should be as transparent as possible to your endeavors to rapidly craft a prototype.

        In a language like Perl, it's not clear to me that the distinction between "array" and "array ref" is a useful one.

        IMO it's a pretty useful distinction. For instance:

        my @array;
        my $arrayref = [];
        my @copy     = @array;
        my $refcopy  = $arrayref;

        So now what happens to the thing $arrayref references when we modify $refcopy->[0]? It changes. What happens to @array when we modify $copy[0]? Nothing.

        The point is that you can determine many things about manipulating an object by its sigil. For instance, a copy of a ref is cheap; a copy of an array is not. Modifying the contents of a ref will change the results of all things using the same ref. Modifying the contents of an array will only change that array or things that reference it.
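        A runnable sketch of that difference, with some initial values filled in purely for illustration:

```perl
use strict;
use warnings;

my @array    = (1, 2, 3);
my $arrayref = [1, 2, 3];

my @copy    = @array;      # copies every element into a new array
my $refcopy = $arrayref;   # copies only the reference; same underlying array

$copy[0]      = 'changed';
$refcopy->[0] = 'changed';

print "$array[0]\n";        # 1, the original array is untouched
print "$arrayref->[0]\n";   # changed, both refs see the modification
```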

        Maybe I'm too close to the trees, but I see a big difference between them and good reasons to have both. Sure, you can provide all the same effects with only references (provided you have a way to clone/copy an array), but there is a lot to be said for making them visually distinct. I mean, personally I find

        my @copy = @array;
        to be preferable to
        my $copy=$array->copy;
        the former says loudly that a new thing is being created where the latter could be doing anything.

        ---
        demerphq

        In a language like Perl, it's not clear to me that the distinction between "array" and "array ref" is a useful one.

        This is the core of the issue right here.

        For example, the fact that you can return a reference to a local (er, my) variable inside a function is indicative to me that Perl doesn't seem to have the concept of auto variables.

        It's indicative of the fact that Perl does have the concepts of lexical scope, anonymous objects, and, in particular, lexical closures. my variables are not local in the same sense that e.g. Inform means by "local variable". (They are closer to it than Perl's local variables, which are dynamically scoped, but that is neither here nor there ATM.) In particular, it is an easy mistake to make (and one that I made once upon a time, when I was very new to Perl) to misunderstand Perl's lexical variables, thinking either that they are persistently scoped to a given function or block ("static" in C parlance) or to go wrong in the other direction, failing to account for references and assuming immediate destruction at the end of the block. But neither of these semantics would be nearly as useful as the one Perl has (although the former could be a useful *additional* option, and I think we may be getting that in Perl6; I cannot think of any use for the latter).
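        A brief sketch of those semantics. The function names make_list and make_counter are made up for this example.

```perl
use strict;
use warnings;

# Each call creates a fresh @items; returning a reference keeps that
# particular array alive after the sub's scope has ended.
sub make_list {
    my @items = (1, 2, 3);
    return \@items;
}

# A lexical closure: the anonymous sub keeps its own $count alive.
sub make_counter {
    my $count = 0;
    return sub { return ++$count };
}

my $first  = make_list();
my $second = make_list();
push @$first, 4;           # does not affect $second

my $counter = make_counter();
$counter->();
$counter->();
my $third = $counter->();

print scalar(@$first), "\n";    # 4
print scalar(@$second), "\n";   # 3
print "$third\n";               # 3
```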

        Presumably the things getting placed onto the function call stack are not actual variables, but reference counting pointers

        The function call stack is not part of Perl (as a language). It is part of perl (the interpreter), an implementation detail that could change between minor versions without having any significant impact on existing Perl code.

        Reference counting is another matter; yes, Perl5 does reference counting. (Perl6 will have real GC, so I am told, likely of the mark-and-sweep variety.) However, that rears its head in other situations, such as when you have cyclic data structures. In the case you're talking about (assuming I correctly understand what it is you are talking about), the same thing would happen if Perl5 had mark-and-sweep garbage collection today. You can do the same thing in Scheme, for instance. If you can't do it in Python, that is because Python deliberately steers you to thinking according to the OO paradigm; Perl embraces various paradigms: OO, FP, contextual, ... this is the essential paradigmatic flexibility that makes Perl the language that it is. TMTOWTDI, and for various situations or problems one of them may be more helpful than another.
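        A sketch of how reference counting shows up with cyclic data in Perl 5, assuming the core Scalar::Util module is available. The Node class and the peer key are invented for the illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;

{
    package Node;
    sub new     { return bless {}, shift }
    sub DESTROY { $destroyed++ }
}

{
    # Two nodes that point at each other: when the lexicals go out of
    # scope, each node is still referenced by the other, so neither
    # reference count reaches zero.
    my $x = Node->new;
    my $y = Node->new;
    $x->{peer} = $y;
    $y->{peer} = $x;
}
my $after_cycle = $destroyed;

{
    # Same shape, but one link is weakened so it no longer counts
    # towards the reference count; both nodes are freed at scope exit.
    my $x = Node->new;
    my $y = Node->new;
    $x->{peer} = $y;
    $y->{peer} = $x;
    weaken($y->{peer});
}
my $after_weak = $destroyed;

print "destroyed after strong cycle: $after_cycle\n";   # 0
print "destroyed after weak cycle:   $after_weak\n";    # 2
```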

        Do we really need to care about the distinction between variables and references?

        Yes, absolutely -- or, at least, we have to have a distinction between references and the things that they reference. The question of what constitutes a "variable" is another whole thread, and one that I suspect would be a tangent here, not the real issue.


Re: Some Insights from a Traveler Between Languages
by johnnywang (Priest) on Apr 24, 2005 at 01:14 UTC
    In terms of syntax, when I played with Python and Ruby, the way I remembered which brackets to use was to think of everything Python/Ruby uses as a Perl reference. So in both languages, arrays are [], and dictionaries/hashes are {}. Did both actually borrow that from Perl?
Re: Some Insights from a Traveler Between Languages
by tlm (Prior) on Apr 24, 2005 at 02:26 UTC

    One more thought on this: the first edition of Programming Perl used the term LIST in some places, such as the documentation of builtin functions, but it did not use the term list context; it used array context instead. (A relic of this usage remains in the name of the builtin wantarray.) To me this suggests that Larry himself was unclear, or at least ambivalent, about how list context should be defined. If this was so, then it's no wonder that Perl still harbors some "semantic traps" around this topic.

    the lowliest monk