(RFC) Arrays: A Tutorial/Reference

by jdporter (Canon)
on Jan 12, 2007 at 16:02 UTC ( #594413=perlmeditation )

(I prefer /msges to replies, whenever practical.)

An array is a type of Perl variable. An array variable is an ordered collection of any number (zero or more) of elements. Each element in an array has an index, which is a non-negative integer. Perl arrays are (like nearly everything else in the language) dynamic:

  1. they grow as necessary, without any need for explicit memory management;
  2. they are heterogeneous, or generic, which is to say, an array doesn't know or enforce the type of its elements.

The values of Perl array elements can only be scalars. This may sound like a limitation, if you think of scalars only as comprising numbers and strings; but since scalars can be references to the compound variable types (array and hash), arbitrarily complex data structures are possible. Other scalar types, such as filehandles and the special undef value, are also naturally allowed.

So, given a data structure like that, what kinds of things would you want to do with it? That is, what operations should be able to act on it? You might conceive different sets of operations, or interfaces, depending on how you expect to use an array in your program:

  1. as a monolithic whole;
  2. as a stack or queue — that is, only working with its ends;
  3. as a random access table of scalars — that is, working with all of its elemental parts.
Perl arrays can be used in all those ways, and more.

Here are the fundamental Perl array operations:

  • Initialize
  • Clear
  • Get count of elements
  • Get the highest index
  • Get list of element values
  • Add new elements at the end
  • Remove an element from the end
  • Add new elements at the beginning
  • Remove an element from the beginning
  • Access one element at an arbitrary index
  • Access multiple elements at arbitrary indices
  • Insert/Delete/Replace items in the middle of an array

This tutorial focuses specifically on the array variable type. There are many things you can do in Perl with lists which will also work on arrays; for example, you can iterate over their contents using foreach. Those things are not discussed here. Also: What is the difference between a list and an array?

Initialize an array

Simple assignment does the job:

@array = ( 1, 2, 3 );
@array = function_generating_a_list();
@array = @another_array;
The key points are that
  1. the assignment to an array gives list context to the right hand side;
  2. the right side can be any expression which results in a list of zero or more scalar values.
The values are inserted in the array in the same order as they occur in the list, beginning with array index zero. For example, after executing
@array = ( 'a', 'b', 'c' );
element 0 will contain 'a', element 1 will contain 'b', and so on.

Whenever an array is assigned to en masse like this, any contents it may have had before the assignment are removed!
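As a minimal sketch of the points above: the helper letters() is hypothetical and exists only to stand in for "any expression that returns a list". The example shows that list assignment replaces the old contents and that assigning one array to another copies the values.

```perl
use strict;
use warnings;

# letters() is a hypothetical helper; any list-returning sub would do.
sub letters { return ( 'a', 'b', 'c' ) }

my @array = ( 1, 2, 3, 4, 5 );   # five elements to start with
@array = letters();              # list assignment discards the old contents

my @copy = @array;               # copies the element values
$copy[0] = 'z';                  # changing the copy leaves @array alone

print "@array\n";   # a b c
print "@copy\n";    # z b c
```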

Clear an array

Simply assign a zero-length list:

@array = ();
Assigning a value such as undef, 0, or '' will not work! Rather, it will leave the array containing one element, with that one value. That is,
@array = 0;
# and
@array = ( 0 );
are functionally identical.
Note that omitting the parentheses is bad style, if your goal is actually to assign the one-element list (0) to the array.
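A short sketch, with illustrative variable names, of the difference between assigning the empty list and assigning a single scalar value:

```perl
use strict;
use warnings;

my @array = ( 1, 2, 3 );

@array = ();
my $after_clear = @array;     # 0 elements

@array = 0;
my $after_zero = @array;      # 1 element, whose value is 0

@array = '';
my $after_empty = @array;     # 1 element, whose value is the empty string

print "$after_clear $after_zero $after_empty\n";   # 0 1 1
```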

Get count of elements

To get the "length" or "size" of an array, simply use it in a scalar context. For example, you can "assign" the array to a scalar variable:

$count = @array;
and the scalar variable will afterwards contain the count of elements in the array. Other scalar contexts work as well:
print "# Elements: " . @array . "\n";
(Yes, print gives its arguments list context, but the dot (string concatenation) operator takes precedence.)

You can always force scalar context on an array by using the function named scalar:

print "# Elements: ", scalar(@array), "\n";
Note that this is a get-only property; you cannot change the length of the array by assigning a scalar to the array variable. For example, @array=0 does not empty the array (as stated in the previous section, Clear an array).
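The following sketch collects a few everyday scalar contexts in one place. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );

my $count  = @array;                       # scalar assignment: 3
my $text   = "# Elements: " . @array;      # concatenation is a scalar context
my $is_big = @array > 2 ? 'yes' : 'no';    # so is numeric comparison: 3 > 2

# In a boolean test, an array is true exactly when it is non-empty.
my $state = @array ? 'non-empty' : 'empty';

print "$count | $text | $is_big | $state\n";
# 3 | # Elements: 3 | yes | non-empty
```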

Get the highest index

Often, you want to know what is the highest index in an array — that is, the index of its last element. Perl provides a special syntax for obtaining this value:

$highest_index = $#array;
This is useful, for example, when you want to create a list of all the indices in an array:
foreach ( 0 .. $#array ) {
    # $_ is set to each index number, in turn, from first (0) to last ($#array)
}

Unlike scalar(@array), $#array is a settable property. When you assign to an array's $#array form, you cause its length (number of elements) to grow or shrink accordingly. If the length increases, the new elements will be uninitialized (that is, they'll be undef). If the length decreases, elements will be dropped from the end.
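A small sketch of growing and then shrinking an array through $#array (the variable names are illustrative):

```perl
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );

$#array = 4;                               # grow: highest index is now 4
my $grown_count  = @array;                 # 5
my $new_is_undef = !defined $array[4];     # true: new elements are undef

$#array = 1;                               # shrink: drop everything after index 1
my $shrunk_count = @array;                 # 2

print "$grown_count $shrunk_count @array\n";   # 5 2 a b
```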

Clear an array - Round 2

Given that $#array is assignable, you can clear an array by assigning -1 to its $#array form. (Why -1? Well, that's what you see in $#array if @array is empty.) Generally, this is not considered good style, but it's acceptable.

Another way to clear an array is undef @array. This technique should be used with caution, because it frees up some memory used internally to hold the elements. In most cases, this isn't worth the processing time. About the only situation in which you'd want to do this is if @array has a huge number of elements, and @array will be re-used after being cleared but will not hold a huge number of elements again.

Beware: As mentioned above in Clear an array, assigning @array = undef does not clear an array. Unlike the case with scalars, @a=undef and undef(@a) are not equivalent!
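To make the comparison concrete, here is a sketch of the three forms side by side. It only counts elements afterwards; it does not attempt to measure the memory behaviour described above.

```perl
use strict;
use warnings;

my @empty;
my $highest_of_empty = $#empty;    # -1 for an empty array

my @a = ( 1, 2, 3 );
$#a = -1;                          # clears @a
my $n_a = @a;                      # 0

my @b = ( 1, 2, 3 );
undef @b;                          # also clears @b
my $n_b = @b;                      # 0

my @c = ( 1, 2, 3 );
@c = undef;                        # NOT a clear: one element, which is undef
my $n_c = @c;                      # 1

print "$highest_of_empty $n_a $n_b $n_c\n";   # -1 0 0 1
```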

Get list of element values

To get the entire list of values stored in an array at any given time, simply use it in a list context:

print "Here are your things: ", @array, "\n";
This is useful for iterating over the list of values stored in an array, one at a time:
foreach ( @array ) { ...
This works because in the foreach control construct, the stuff inside the parentheses is expected to be a list — or, more precisely, an expression which will be evaluated in list context and is expected to result in a list of (zero or more) scalar values.

Quiz: What's the difference between these two lines of code:

$x = @array;
@x = @array;

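Answer: The first line evaluates @array in scalar context, so $x receives the number of elements. The second evaluates it in list context, so @x receives a copy of the element values.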

Remove an element from the end

The function to remove a single element from the end of an array is pop. Given the code:

@array = ( 'a', 'b', 'c' );
$x = pop @array;
$x will contain 'c' and @array will be left with two elements, 'a' and 'b'.

Note: By "end", we mean the end of the array with the highest index.

Add new elements at the end

Use the push function to add a number of (scalar) values to the end of an array:

push @array, 8, 10 .. 15;
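A sketch of what that push does to a concrete array, followed by the matching pop. The starting contents are an assumption made for illustration.

```perl
use strict;
use warnings;

my @array = ( 1, 2, 3 );

push @array, 8, 10 .. 15;       # the range expands to 10, 11, 12, 13, 14, 15
my $after_push = "@array";      # 1 2 3 8 10 11 12 13 14 15

# push returns the new number of elements in the array.
my $new_length = push @array, 'last';   # 11

my $top = pop @array;           # 'last': push and pop work on the same end

print "$after_push\n";
print "$new_length $top\n";     # 11 last
```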

Remove an element from the beginning

The shift function removes one value from the beginning of the array. That is, it removes (and returns) the value in element zero, and shifts all the rest of the elements down one, with the effect that the number of elements is decreased by one. Given the code:

@array = ( 'a', 'b', 'c' );
$x = shift @array;
$x will contain 'a' and @array will be left with two elements, 'b' and 'c'. (You can see that shift is just like pop, but acts on the other end of the array.)

Add new elements at the beginning

In an analogous way, unshift acts on the beginning of the array as push acts on the end. Given:

@array = ( 1, 2 );
unshift @array, 'y', 'z';
@array will contain ( 'y', 'z', 1, 2 )
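Putting the four functions together, here is a sketch of an array used as a first-in, first-out queue. The element names are made up for the example.

```perl
use strict;
use warnings;

my @queue;

push @queue, 'first', 'second';     # enqueue at the end
push @queue, 'third';

my $served = shift @queue;          # dequeue from the beginning: 'first'

# unshift keeps its arguments in the order given.
unshift @queue, 'urgent-1', 'urgent-2';

my @empty;
my $nothing = shift @empty;         # shift on an empty array returns undef

print "$served\n";                  # first
print "@queue\n";                   # urgent-1 urgent-2 second third
```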

Access one element at an arbitrary index

The first element of an array is accessed at index 0:

$first_elem = $array[0];
Why the $ sigil? Remember that the elements of an array can only be scalar values. The $ makes sense here because we are accessing a single, scalar element out of the array. The thing inside the square brackets does not have to be an integer literal; it can be any expression which results in a number. (If the resulting number is not an integer, it will be truncated to an integer, that is, rounded toward zero.)

Change the value of the last element:

$array[ $#array ] += 5;
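A sketch of single-element access with a variable, an expression, and a non-integer index (all values are illustrative):

```perl
use strict;
use warnings;

my @array = ( 10, 20, 30, 40 );
my $i = 1;

my $second = $array[$i];          # 20
my $third  = $array[ $i + 1 ];    # 30: any numeric expression works
my $trunc  = $array[ 2.9 ];       # 30: 2.9 is truncated to index 2

$array[ $#array ] += 5;           # last element becomes 45

my $beyond = $array[10];          # undef: reading past the end does not grow the array
my $count  = @array;              # still 4

print "$second $third $trunc $array[3] $count\n";   # 20 30 30 45 4
```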

Access multiple elements at arbitrary indices

By analogy, if you want to access multiple elements at once, you would use the @ sigil instead of the $. In addition, you would provide a list of index values within the square brackets, rather than just one.

( $first, $third, $fifth ) = @array[0,2,4];
Jargon alert: this syntax for accessing multiple elements of an array at once is called an array slice.

Never forget that with an array slice the index expression is a list: it will be evaluated in list context, and can return any number (including zero) of index numbers. However many numbers are in the list of indices, that's how many elements will be included in the slice.

Beware, though: an array slice may look like an array, due to the @ sigil, but it is not. For example,

$n = @array[0..$#array];
will not yield the number of items in the slice! In scalar context, a slice yields its last element instead.

Set the second, third, and fourth elements in an array:

@array[1..3] = ( 'x', 'y', 'z' );
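The slice behaviours described in this section can be sketched as follows. The `() =` counting idiom is one common way to count the items in a list; it is not something this tutorial prescribes.

```perl
use strict;
use warnings;

my @array = ( 'a' .. 'f' );

my ( $first, $third, $fifth ) = @array[ 0, 2, 4 ];   # a c e

my @idx    = ( 5, 0 );
my @picked = @array[@idx];            # the index list can come from an array: f a

my $n     = @array[ 0 .. $#array ];          # 'f': the last element, not a count
my $count = () = @array[ 0 .. $#array ];     # 6: force list context, then count

@array[ 1 .. 3 ] = ( 'x', 'y', 'z' );        # slices are assignable

print "$first $third $fifth | @picked | $n | $count\n";   # a c e | f a | f | 6
print "@array\n";                                         # a x y z e f
```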

Sidebar: More about indices

We said earlier that array indices are non-negative integers. While this is strictly true at some level, perl conveniently lets you index elements from the end of the array using negative indices. -1 refers to the last element, -2 to the next-to-last element, and so on. To oversimplify a bit, -1 acts like an alias for $#array... but only in the context of indexing @array!

So the following are equivalent:

$array[ -1 ]
$array[ $#array ]
But beware:
@array[ 0 .. $#array ]
cannot be written as:
@array[ 0 .. -1 ]
because in this situation the -1 is an argument of the .. range operator, which has no idea what "highest index number" is actually wanted.
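A brief sketch of negative indices, including what the 0 .. -1 range actually produces:

```perl
use strict;
use warnings;

my @array = ( 'a', 'b', 'c', 'd' );

my $last         = $array[-1];              # d
my $next_to_last = $array[-2];              # c
my @last_two     = @array[ -2, -1 ];        # negative indices work in slices too

my @all  = @array[ 0 .. $#array ];          # a b c d
my @none = @array[ 0 .. -1 ];               # the range 0 .. -1 is empty, so is the slice

print "$last $next_to_last | @last_two | ", scalar(@all), " ", scalar(@none), "\n";
# d c | c d | 4 0
```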

Insert/Delete/Replace items in the middle of an array

It is possible to insert items into the middle of an array and remove items from the middle of an array. The function which enables this is called splice. It can insert items anywhere in an array (including the ends), and it can remove (and return) any sub-sequence of items from an array. In fact, it can do both of these at once: remove some sub-sequence of items and put another list of values in their place. splice always returns the list of removed values, if any.

The second argument of splice is an array index, and as such, everything we've said about indices applies to it.

The queue-like array functions could have been implemented in terms of splice, as follows:

unshift @a, @b;
# could be written as
splice @a, 0, 0, @b;

push @a, @b;
# could be written as
splice @a, $#a+1, 0, @b; # we have to index to a position PAST the end of array!

$b = shift @a;
# could be written as
$b = splice @a, 0, 1;

$b = pop @a;
# could be written as
$b = splice @a, -1, 1;
(Beware that in scalar context splice returns the last of the list of values removed; shift and pop always return the one value removed.)
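As a quick check of these equivalences, and of the scalar-context caveat, here is a sketch using throwaway arrays (all names are illustrative):

```perl
use strict;
use warnings;

my @extra = ( 'x', 'y' );

# unshift versus splice at index 0
my @u1 = ( 1, 2, 3 );  unshift @u1, @extra;
my @u2 = ( 1, 2, 3 );  splice  @u2, 0, 0, @extra;

# push versus splice at one past the highest index
my @p1 = ( 1, 2, 3 );  push   @p1, @extra;
my @p2 = ( 1, 2, 3 );  splice @p2, $#p2 + 1, 0, @extra;

# In scalar context, splice returns the LAST value removed.
my @src  = ( 'a', 'b', 'c', 'd' );
my $last = splice @src, 0, 2;      # removes 'a' and 'b', returns 'b'

print "@u1 | @u2\n";       # x y 1 2 3 | x y 1 2 3
print "@p1 | @p2\n";       # 1 2 3 x y | 1 2 3 x y
print "$last | @src\n";    # b | c d
```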

Remove 3 items, beginning with the 3rd:

@b = splice @a, 2, 3;
Insert some new values after the 2nd item (that is, at index 2), without deleting any:
splice @a, 2, 0, @b;
Replace the 4th and 5th items with three other values:
splice @a,             # array to modify
       3,              # starting with 4th item
       2,              # remove (replace) two items
       'x', 'y', 'z';  # arbitrary list of new values to insert
And while we're at it: Clear an array - Round 3:
@a = ();
# could be written as
splice @a, 0;
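To show the effect of each of these calls, here is a sketch that runs them in sequence on a seven-letter array. The starting contents are an assumption made for illustration.

```perl
use strict;
use warnings;

my @a = ( 'a' .. 'g' );

my @b = splice @a, 2, 3;          # remove 3 items starting at index 2
my $step1 = "@a / @b";            # a b f g / c d e

splice @a, 2, 0, @b;              # insert at index 2, removing nothing
my $step2 = "@a";                 # a b c d e f g

splice @a, 3, 2, 'x', 'y', 'z';   # replace items 4 and 5 with three values
my $step3 = "@a";                 # a b c x y z f g

splice @a, 0;                     # remove everything from index 0 onward
my $step4 = @a;                   # 0

print "$step1\n$step2\n$step3\n$step4\n";
```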

Any Questions?

The Perl FAQ has a section on Arrays.

Related Resources


What about wantarray?

Despite its name, wantarray has nothing to do with arrays. It is misnamed. It should have been named something like is_list_context. It is used inside subroutines to detect whether the sub is being called in list, scalar, or void context. It returns true, false, and undef in those cases, respectively.
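A sketch of the three return values. The sub context() and the variable $seen are hypothetical names; the sub records what it detected so that the void-context call can be observed as well.

```perl
use strict;
use warnings;

my $seen;   # records the context detected by the most recent call

sub context {
    my $w = wantarray;
    $seen = !defined $w ? 'void'
          : $w          ? 'list'
          :               'scalar';
    return $seen;
}

my @in_list   = context();   my $list   = $seen;   # list
my $in_scalar = context();   my $scalar = $seen;   # scalar
context();                   my $void   = $seen;   # void

print "$list $scalar $void\n";   # list scalar void
```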


Other possible topics:

  • tying arrays; the Tie::Array module
  • delete and how it doesn't work on arrays
  • exists and how it DOES work on arrays
  • Various related Perl FAQ entries
  • Array-related modules, such as those in the Array:: family
  • Traps/gotchas, such as deleting from an array while iterating over it
  • multidimensional arrays

PS: This RFC has been converted into an actual tutorial, Arrays: A Tutorial/Reference.

Re: (RFC) Arrays: A Tutorial/Reference
by chromatic (Archbishop) on Jan 12, 2007 at 19:26 UTC
    Note that omitting the parentheses is bad style, if your goal is actually to assign the one-element list (0) to the array.

    Nit: it doesn't really matter, for one-element lists. There's no difference between a one-element list with and without parentheses. Parentheses don't create lists--they group comma-separated expressions.

    The reason you need parentheses in:

    my @fib_nums = 1, 1, 2, 3, 5;

    ... is to disambiguate precedence, not to construct a list.

      Parentheses don't create lists

      I guess they do. At least they force list context:

      qwurx [shmem] ~ > perl -le '$_ = "foo bar baz"; $c = /\w+/g; print $c'
      1
      qwurx [shmem] ~ > perl -le '$_ = "foo bar baz"; $c = () = /\w+/g; print $c'
      3

      <update>

      The first line only prints out whether the match succeeded. The second form forces the match to be done in list context. The resulting list is then evaluated in scalar context. Or is that also a kind of disambiguation (of the match operator)?

      qwurx [shmem] ~ > perl -le '$_ = "foo bar baz"; $c = (/\w+/g); print $c'
      1

      Hmm.. here they obviously don't force list context?

      </update>

      --shmem

      _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                    /\_¯/(q    /
      ----------------------------  \__(m.====·.(_("always off the crowd"))."·
      ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
        I guess they do.

        I hate being cruel... but read the parser.

        Or is that also a kind of disambiguation (of the match operator)?

        In one sense, those parentheses only mark the empty list, but it's probably more accurate to say that those do create a list. That's only true for the empty list however, as parsers have a very difficult time identifying invisible, non-existent characters.

        Hmm.. here they obviously don't force list context?

        Correct. Why would you expect them to do so? They're immaterial to the expression, just as in my $x = ( 2 + 2 );.

        If parentheses did create lists, what would you expect this code to do?

        my $x = ( 1 + 2 ) * 3;

        Perl doesn't have a strict leftmost-applicative evaluation order, so the parentheses are necessary to disambiguate precedence.

        That's the exact reason why the parentheses are necessary to group lists, but do not create lists. In:

        my @fib_values = 1, 1, 2, 3;

        ... the expression my @fib_values = 1 is a complete expression to the parser as it is. Now it may be completely unambiguous to you that the entire expression is a list assignment, but there are plenty of more complicated assignment forms that involve mixed expressions such that Perl will give you a warning that it may have guessed wrong in this case.

        Note also that you don't need parentheses when passing an argument list to a function or especially a built-in operator... again, unless you need to disambiguate something.

      Indeed. But personally I consider it a matter of good style always to enclose the RHS of an array assignment in parentheses for the sake of consistency, since they are necessary in some situations. And you can't exactly say it's a bad habit to get into...

      One exception I make, in my own programming style, is when the RHS is a single function call, e.g. @a = stat;

Re: (RFC) Arrays: A Tutorial/Reference
by kyle (Abbot) on Jan 13, 2007 at 04:51 UTC

    Arrays are a pretty basic concept in Perl and any other language. Someone who needs this tutorial may also need help with concepts like scalar context and when it's implied.

    I usually don't like something like this:

    $count = @array;

    It works, and it's clear, but mostly because of the variable names. Sometimes I've seen something more like:

    $stuff = @things;

    In that case, I think someone wrote a bug. It's clear what's meant if the scalar context is made explicit like so:

    $stuff = scalar @things;

    Otherwise, I have to read (perhaps a lot) more code to see how the variable is used and whether that makes sense, given its origin. When the scalar context is explicit, I can tell right away that the programmer really meant what was written.

      Someone who needs this tutorial may also need help with concepts like scalar context and when it's implied.

      Indeed. My intention was (and is) to link to a tutorial on context, at such time as one is available.

      It's clear what's meant if the scalar context is made explicit like so:

      $stuff = scalar @things;

      Perhaps you missed where I wrote:

      You can always force scalar context on an array by using the function named scalar:
      print "# Elements: ", scalar(@array), "\n";

      I should mention that the lack of comprehensive examples was intentional, and is why I included "Reference" in the title of the tutorial.

      A word spoken in Mind will reach its own level, in the objective world, by its own weight
Re: (RFC) Arrays: A Tutorial/Reference
by bsdz (Friar) on Jan 13, 2007 at 11:32 UTC
Re: (RFC) Arrays: A Tutorial/Reference
by kyle (Abbot) on Jan 13, 2007 at 14:24 UTC

    Good article, just a few suggestions...

    Your list of "fundamental Perl array operations" should be linked to the sections later in the node, like an index. See General-Purpose Linking.

    In your "Clear an array" section, you don't mention this:

    @array = ()

    ...but it does show up at the end of the section on splice.

    At your quiz, you say "highlight text to see it", but then use a <spoiler> tag.

    Your description of pop could be as detailed as your description of shift. Maybe put shift first and refer back to it.

    I'm not sure "array slice" counts as jargon.

    I think the whole section on splice would be better with examples. Showing equivalences to other operations such as shift is good, but those are available in the splice documentation that you link, and it might be nice if a novice could come learn about splice without reading everything else. The kind of examples I'm talking about are similar to what you write for unshift and friends.

    @array = ( 'a', 'b', 'c', 'd', 'e' ); @s = splice @array, 1, 2;

    @s will contain ( 'b', 'c' ) and @array will contain ( 'a', 'd', 'e' ).

      In your "Clear an array" section, you don't mention this:

      @array = ()

      Please look again.

      Keep in mind that there is the Clear an array section, and then there are Clear an array - Round 2 and Clear an array - Round 3. Perhaps you were looking in one of the latter two.

      At your quiz, you say "highlight text to see it", but then use a <spoiler> tag.

      That's because I've configured spoiler tags to render as <div>, in my Display Settings. Probably I shouldn't say anything, and just let the user figure out how to deal with spoilers.

      Your description of pop could be as detailed as your description of shift. Maybe put shift first and refer back to it.

      I wrote it this way because I think of pop as being somewhat more fundamental, or intuitive, than shift. Clearly it could have been written the other way. A choice had to be made. :-)

      I'm not sure "array slice" counts as jargon.

      I think it's programming jargon, and frankly I've never heard the term used in relation to programming in any language other than Perl, though obviously it could be. The point was to introduce the reader to this term, which, if never encountered before would seem like jargon, and so ease the introduction by admitting that it's jargon.

      Thank you for all your excellent comments.

      A word spoken in Mind will reach its own level, in the objective world, by its own weight
Re: (RFC) Arrays: A Tutorial/Reference
by rir (Vicar) on Jan 15, 2007 at 04:11 UTC
    Perl's arrays are homogeneous. A beginner may not need to know much about Perl's typing, but we should not give him things that will need to be unlearned to advance. The original post clarifies its statement that arrays are heterogeneous, but I feel the correction is too distant: the reader may not make the connection or may not read so far.

    Nice post.

    Be well,
    rir
