(RFC) Arrays: A Tutorial/Reference
by jdporter (Canon) on Jan 12, 2007 at 16:02 UTC
(I prefer /msges to replies, whenever practical.)
An array is a type of Perl variable. An array variable is an ordered collection of any number (zero or more) of elements. Each element in an array has an index, which is a non-negative integer. Perl arrays are (like nearly everything else in the language) dynamic: they grow and shrink as needed, with no explicit memory management required.
The values of Perl array elements can only be scalars. This may sound like a limitation, if you think of scalars only as comprising numbers and strings; but since scalars can be references to the compound variable types (array and hash), arbitrarily complex data structures are possible. Other scalar types, such as filehandles and the special undef value, are also naturally allowed.
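As a minimal sketch of the point about references, the following builds one array whose elements are a number, a string, an array reference, a hash reference, and undef. The variable names and values are illustrative assumptions, not taken from the original post.

```perl
use strict;
use warnings;

# Every element is a scalar, but a scalar may be a reference to an array or hash.
my @mixed = ( 42, 'text', [ 1, 2, 3 ], { name => 'example' }, undef );

my $inner = $mixed[2][1];       # second element of the nested array
my $name  = $mixed[3]{name};    # value stored in the nested hash
my $count = @mixed;             # undef still occupies a slot

print "inner: $inner, name: $name, count: $count\n";
```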
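So, given a data structure like that, what kinds of things would you want to do with it? That is, what operations should be able to act on it? You might conceive different sets of operations, or interfaces, depending on how you expect to use an array in your program: as a single whole, as a stack or queue that grows and shrinks at its ends, or as a table whose elements are accessed by index.
The sections below cover the fundamental Perl array operations: initializing and clearing an array, getting its element count and highest index, getting the list of its values, adding and removing elements at either end, accessing one or several elements by index, and inserting, deleting, or replacing items in the middle.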
This tutorial focuses specifically on the array variable type. There are many things you can do in Perl with lists which will also work on arrays; for example, you can iterate over their contents using foreach. Those things are not discussed here. See also the FAQ entry "What is the difference between a list and an array?"
Initialize an array
Simple assignment of a list does the job, for example @array = ( 'a', 'b', 'c' );
The key points are that element 0 will contain 'a', element 1 will contain 'b', and so on; and that whenever an array is assigned to en masse like this, any contents it may have had before the assignment are removed!
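Both key points can be checked with a short sketch. The second assignment and its values are illustrative assumptions chosen to show that earlier contents do not survive.

```perl
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );   # element 0 is 'a', element 1 is 'b', and so on
my $first  = $array[0];
my $second = $array[1];

@array = ( 1, 2 );               # en masse assignment replaces all previous contents
my $count = @array;

print "first: $first, second: $second, now: @array ($count elements)\n";
```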
Clear an array
Simply assign a zero-length list, as in @array = ();
Assigning a value such as undef, 0, or '' will not work! Rather, it will leave the array containing one element, with that one value. That is, @array = 0; and @array = ( 0 ); are functionally identical.
Note that omitting the parentheses is bad style, if your goal is actually to assign the one-element list (0) to the array.
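The difference between the two assignments is easy to see by counting elements afterwards. This is a small sketch with made-up variable names.

```perl
use strict;
use warnings;

my @cleared = ( 'a', 'b', 'c' );
@cleared = ();                    # zero-length list: the array is now empty
my $count_cleared = @cleared;

my @not_cleared = ( 'a', 'b', 'c' );
@not_cleared = ( 0 );             # one-element list: the array holds the single value 0
my $count_not_cleared = @not_cleared;

print "after (): $count_cleared, after (0): $count_not_cleared\n";
```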
Get count of elements
To get the "length" or "size" of an array, simply use it in a scalar context. For example, you can "assign" the array to a scalar variable, as in $count = @array;
and the scalar variable will afterwards contain the count of elements in the array. Other scalar contexts work as well, such as string concatenation: print "# of elements: " . @array . "\n";
(Yes, print gives its arguments list context, but the dot (string concatenation) operator imposes scalar context on each of its operands.)
You can always force scalar context on an array by using the function named scalar, as in print "# of elements: ", scalar(@array), "\n";
Note that this is a get-only property; you cannot change the length of the array by assigning a scalar to the array variable. For example, @array=0 does not empty the array (as stated in the previous section, Clear an array).
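The three ways of getting the count described above can be sketched side by side. The variable names are illustrative.

```perl
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );

my $count   = @array;                         # scalar assignment gives the element count
my $message = "# of elements: " . @array;     # concatenation imposes scalar context
my @list    = ( 'count:', scalar(@array) );   # scalar() forces scalar context inside a list

print "$count\n$message\n@list\n";
```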
Get the highest index
Often you want to know the highest index in an array, that is, the index of its last element. Perl provides a special syntax for obtaining this value: $#array
This is useful, for example, when you want to create a list of all the indices in an array: 0 .. $#array
Unlike scalar(@array), $#array is a settable property. When you assign to an array's $#array form, you cause its length (number of elements) to grow or shrink accordingly. If the length increases, the new elements will be uninitialized (that is, they'll be undef). If the length decreases, elements will be dropped from the end.
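A short sketch of reading and assigning $#array, with illustrative values, shows both the growing and the shrinking case.

```perl
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );

my $highest = $#array;              # index of the last element
my @indices = ( 0 .. $#array );     # list of all the indices

$#array = 4;                        # grow: elements 3 and 4 are created as undef
my $grown_count  = @array;
my $new_is_undef = !defined $array[4];

$#array = 1;                        # shrink: elements are dropped from the end
my @after_shrink = @array;

print "highest: $highest, indices: @indices, grown: $grown_count, shrunk: @after_shrink\n";
```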
Clear an array - Round 2
Given that $#array is assignable, you can clear an array by assigning -1 to its $#array form. (Why -1? Well, that's what you see in $#array if @array is empty.) Generally, this is not considered good style, but it's acceptable.
Another way to clear an array is undef @array. This technique should be used with caution, because it frees up some memory used internally to hold the elements. In most cases, this isn't worth the processing time. About the only situation in which you'd want to do this is if @array has a huge number of elements, and @array will be re-used after being cleared but will not hold a huge number of elements again.
Beware: As mentioned above in Clear an array, assigning @array = undef does not clear an array. Unlike the case with scalars, @a=undef and undef(@a) are not equivalent!
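The two additional clearing techniques, and the pitfall, can be compared by counting elements afterwards. This is a sketch with hypothetical variable names.

```perl
use strict;
use warnings;

my @by_index = ( 'a', 'b', 'c' );
$#by_index = -1;                    # highest index -1 means no elements
my $count_by_index = @by_index;

my @by_undef = ( 'a', 'b', 'c' );
undef @by_undef;                    # empties the array and releases its internal storage
my $count_by_undef = @by_undef;

my @not_cleared = ( 'a', 'b', 'c' );
@not_cleared = undef;               # NOT a clear: one element, whose value is undef
my $count_not_cleared = @not_cleared;

print "$count_by_index $count_by_undef $count_not_cleared\n";
```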
Get list of element values
To get the entire list of values stored in an array at any given time, simply use it in a list context, for example my @copy = @array;
This is useful for iterating over the list of values stored in an array, one at a time, as in foreach my $element ( @array ) { ... }
This works because in the foreach control construct, the stuff inside the parentheses is expected to be a list — or, more precisely, an expression which will be evaluated in list context and is expected to result in a list of (zero or more) scalar values.
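Both uses of list context can be sketched as follows. The @seen array is an illustrative helper that collects what the loop visits.

```perl
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );

my @copy = @array;                  # list context: all the values are copied

my @seen;
foreach my $element ( @array ) {    # the parenthesized expression is evaluated in list context
    push @seen, "Element: $element";
}

print "$_\n" for @seen;
```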
Quiz: What's the difference between these two lines of code:
Remove an element from the end
The function to remove a single element from the end of an array is pop. After the code @array = ( 'a', 'b', 'c' ); $x = pop @array;
$x will contain 'c' and @array will be left with two elements, 'a' and 'b'.
Note: By "end", we mean the end of the array with the highest index.
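As a runnable sketch of the pop example described above:

```perl
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );
my $x = pop @array;                 # removes and returns the element with the highest index

print "popped: $x, remaining: @array\n";
```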
Add new elements at the end
Use the push function to add one or more (scalar) values to the end of an array, for example push @array, 'd', 'e';
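A small sketch of push with illustrative values. It also shows that an array passed as an argument is flattened into the list of values to append, and that push returns the new element count.

```perl
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );
push @array, 'd', 'e';              # append two values

my @more = ( 'f', 'g' );
push @array, @more;                 # @more is flattened into the list of values

my $new_length = push @array, 'h';  # push returns the new number of elements

print "@array ($new_length elements)\n";
```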
Remove an element from the beginning
The shift function removes one value from the beginning of the array. That is, it removes (and returns) the value in element zero, and shifts all the rest of the elements down one, with the effect that the number of elements is decreased by one. After the code @array = ( 'a', 'b', 'c' ); $x = shift @array;
$x will contain 'a' and @array will be left with two elements, 'b' and 'c'. (You can see that shift is just like pop, but acts on the other end of the array.)
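As a runnable sketch of the shift example described above:

```perl
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );
my $x = shift @array;               # removes and returns element zero

print "shifted: $x, remaining: @array, new element zero: $array[0]\n";
```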
Add new elements at the beginning
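Use the unshift function to add one or more (scalar) values to the beginning of an array. After the code @array = ( 1, 2 ); unshift @array, 'y', 'z';
@array will contain ( 'y', 'z', 1, 2 )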
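A minimal sketch that produces the result stated above. Note that the new values keep their relative order.

```perl
use strict;
use warnings;

my @array = ( 1, 2 );
unshift @array, 'y', 'z';           # 'y' becomes element 0, 'z' element 1

print "@array\n";
```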
Access one element at an arbitrary index
The first element of an array is accessed at index 0, as in $first = $array[0];
Why the $ sigil? Remember that the elements of an array can only be scalar values. The $ makes sense here because we are accessing a single, scalar element out of the array. The thing inside the square brackets does not have to be an integer literal; it can be any expression which results in a number. (If the resulting number is not an integer, it will be truncated to an integer, that is, rounded toward zero.)
Change the value of the last element: $array[ $#array ] = 'z';
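The points about single-element access can be sketched as follows. The index variable and the replacement value 'z' are illustrative assumptions.

```perl
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );

my $first = $array[0];              # first element is at index 0

my $i = 1;
my $by_expr   = $array[ $i + 1 ];   # any numeric expression works as an index
my $truncated = $array[ 1.7 ];      # 1.7 is truncated to 1

$array[ $#array ] = 'z';            # change the value of the last element

print "$first $by_expr $truncated | @array\n";
```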
Access multiple elements at arbitrary indices
By analogy, if you want to access multiple elements at once, you would use the @ sigil instead of the $. In addition, you would provide a list of index values within the square brackets, rather than just one.
Jargon alert: this syntax for accessing multiple elements of an array at once is called an array slice.
Never forget that with an array slice the index expression is a list: it will be evaluated in list context, and can return any number (including zero) of index numbers. However many numbers are in the list of indices, that's how many elements will be included in the slice.
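Beware, though: an array slice may look like an array, due to the @ sigil, but it is not. For example, $count = @array[ 1, 2, 3 ];
will not yield the number of items in the slice! (In scalar context, a slice yields its last element.)
Set the second, third, and fourth elements in an array: @array[ 1 .. 3 ] = ( 'x', 'y', 'z' );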
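A short sketch of array slices, with illustrative values, covers reading, an empty index list, the scalar-context pitfall, and assignment to a slice.

```perl
use strict;
use warnings;

my @array = ( 'a' .. 'f' );

my @picked = @array[ 0, 2, 4 ];         # three indices, three elements

my @no_indices = ();
my @none = @array[ @no_indices ];       # empty index list gives an empty slice

my $not_a_count = @array[ 1, 2, 3 ];    # last element of the slice, not its size

@array[ 1 .. 3 ] = ( 'x', 'y', 'z' );   # set the second, third, and fourth elements

print "@picked | $not_a_count | @array\n";
```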
Insert/Delete/Replace items in the middle of an array
It is possible to insert items into the middle of an array and remove items from the middle of an array. The function which enables this is called splice. It can insert items anywhere in an array (including the ends), and it can remove (and return) any sub-sequence of items from an array. In fact, it can do both of these at once: remove some sub-sequence of items and put another list of values in their place. splice always returns the list of removed values, if any.
The second argument of splice is an array index, and as such, everything we've said about indices applies to it.
(Beware that in scalar context splice returns the last of the list of values removed; shift and pop always return the one value removed.)
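Remove 3 items, beginning with the 3rd: @removed = splice @array, 2, 3;
Insert some new values after the 3rd, without deleting any: splice @array, 3, 0, 'x', 'y';
Replace the 4th and 5th items with three other values: splice @array, 3, 2, 'x', 'y', 'z';
And while we're at it: Clear an array - Round 3: splice @array;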
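The four splice uses above, plus the scalar-context caveat, can be sketched with a fresh array for each case. The array names and values are illustrative.

```perl
use strict;
use warnings;

my @a = ( 'a' .. 'g' );
my @removed = splice @a, 2, 3;                   # remove 3 items, beginning with the 3rd

my @b = ( 'a' .. 'e' );
splice @b, 3, 0, 'x', 'y';                       # insert after the 3rd, deleting nothing

my @c = ( 'a' .. 'f' );
my @replaced = splice @c, 3, 2, 'x', 'y', 'z';   # replace the 4th and 5th items

my @d = ( 'a' .. 'c' );
my $last_removed = splice @d, 0, 2;              # scalar context: last value removed
splice @d;                                       # no offset or length: remove everything

print "@removed | @a | @b | @replaced | @c | $last_removed\n";
```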
The Perl FAQ has a section on Arrays.
What about wantarray?
Despite its name, wantarray has nothing to do with arrays. It is misnamed. It should have been named something like is_list_context. It is used inside subroutines to detect whether the sub is being called in list, scalar, or void context. It returns true, false, and undef in those cases, respectively.
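A minimal sketch of context detection with wantarray follows. The sub name context_name and its scalar-reference argument are hypothetical, introduced only so that the void-context result can be observed.

```perl
use strict;
use warnings;

# Stores the name of the calling context in the scalar referenced by the argument.
sub context_name {
    my ($out_ref) = @_;
    $$out_ref = wantarray          ? 'list'      # true in list context
              : defined(wantarray) ? 'scalar'    # false but defined in scalar context
              :                      'void';     # undef in void context
    return;
}

my ( $in_list, $in_scalar, $in_void );
my @ignored_list   = context_name( \$in_list );     # called in list context
my $ignored_scalar = context_name( \$in_scalar );   # called in scalar context
context_name( \$in_void );                          # called in void context

print "$in_list $in_scalar $in_void\n";
```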
PS: This RFC has been converted into an actual tutorial, Arrays: A Tutorial/Reference.