
RFC: object-oriented two-dimensional ragged array tutorial

by paulymer (Novice)
on Sep 10, 2013 at 00:00 UTC

This is my attempt at demonstrating how to implement and call a 2D ragged array in Moose. A 2D ragged array is a two-dimensional array whose rows are not required to be of equal length. This is a structure I needed for a personal project, and I found very little on the web to guide my efforts. Here is one conversation I found dealing with the problem, but it only served to increase my confusion. So, now that I've worked through the details of how to do it, I thought I'd write up my findings and some examples for the benefit of others who might need something similar.

A few considerations that may be important:

  • I am using perl v5.12.3; no guarantees for earlier versions.
  • This solution uses only two modules: Moose and Moose::Util::TypeConstraints. There are other solutions available.
  • I have only implemented a few methods that serve to illustrate how to access each level of the array. This should be a sufficient guide for extending the functionality of this class.
  • I have not tested out sub-classing. If anyone else wants to try this out, I would be interested to learn how well this solution allows for sub-classing.
  • The best way to understand what is happening is to download the code and start tinkering (and read a lot of documentation). Good luck!

The primary issue I came across in my first attempt was that I couldn't access the builtin 'Array' methods (e.g. push, get, elements) in the inner array. I attempted to work around the problem by pushing new elements onto the data structures contained in the class using perl's builtin 'push' function and array references. Obviously, this wasn't an acceptable hack because it circumvented Moose's type validation system. I was able to add my own type validation, but that added a lot of unnecessary work and also introduced the possibility of bugs. But most of all, it just didn't feel right, and that didn't sit well with me.
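
To make the problem concrete, here is a minimal sketch of the difference between a delegated push and perl's builtin push on the underlying reference. The 'IntList' class and its method names are hypothetical and not part of my project; the sketch assumes a Moose 2.x release, where native-trait delegations check the member type.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package IntList;    # hypothetical class, for illustration only
    use Moose;

    has 'items' => (
        traits  => ['Array'],
        is      => 'ro',
        isa     => 'ArrayRef[Int]',
        default => sub { [] },
        handles => { addItem => 'push', allItems => 'elements' },
    );
}

my $list = IntList->new;

# Delegated push: the new member is checked against 'Int'.
$list->addItem(1);
my $ok = eval { $list->addItem('not a number'); 1 };
print "delegated push rejected the bad value\n" if !$ok;

# Builtin push on the underlying array reference: no check happens,
# so the invalid value ends up inside the object.
push @{ $list->items }, 'not a number';
print join( ', ', $list->allItems ), "\n";
```

The second push succeeds silently, which is exactly why reaching into the array reference is not an acceptable substitute for the delegated methods.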

The solution to the problem was to create a new class for the inner array, which I called 'Row'.

    {
        package Row;
        use Moose;
        use Moose::Util::TypeConstraints;

        has 'row' => (
            traits   => ['Array'],
            is       => 'ro',
            isa      => 'ArrayRef[Cell]',
            required => 1,
            default  => sub { [] },
            handles  => {
                pushCell => 'push',
                getCell  => 'get',
                allCells => 'elements',
            },
        );
    }

In the class definition for 'Row', I added the handlers for the delegated functions that were needed to operate on the inner array. Once this was implemented, I just needed to write a wrapper method in the 'Grid' class that would do the right thing when pushing a new element onto the end of a row.

    sub addCell {
        my ($self, $index, @cells) = @_;
        # start a new, empty row if there is no row at $index yet
        $self->addRow( [] ) if !defined $self->getRow($index);
        $self->getRow($index)->pushCell(@cells);
    }

This was only part of the solution though. The implementation to this point worked only if the caller passed data structures to the constructor or accessor methods that exactly matched the Moose data types. This meant that in order for the caller to push a new row onto the grid, the argument to $grid->addRow() had to be a 'Row' class instance. I didn't want to burden the caller with this restriction.

So this is where we come to the 'coerce' statements that were so cryptic in the documentation I had read previously. It turns out that coercion simply creates a map between the data structure sent by the caller and the data type held by the class. If the two are compatible, the class will automatically convert the caller's data structure to the data type required by the class. Simple. Unfortunately, you wouldn't think it's so simple by some of the examples I found. Hopefully, the examples given in this article will clarify the principle.
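
Stripped of the grid details, the principle can be sketched in a few lines. The 'Tag' and 'Post' classes below are hypothetical stand-ins, not part of the grid code; the sketch only assumes Moose and Moose::Util::TypeConstraints.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Tag;    # hypothetical class, for illustration only
    use Moose;

    has 'name' => ( is => 'ro', isa => 'Str', required => 1 );
}

{
    package Post;    # hypothetical class, for illustration only
    use Moose;
    use Moose::Util::TypeConstraints;

    # The map: a plain string sent by the caller becomes a 'Tag' object.
    # The coercion is declared before the attribute that relies on it.
    coerce 'Tag'
        => from 'Str'
        => via { Tag->new( name => $_ ) };

    has 'tag' => ( is => 'rw', isa => 'Tag', coerce => 1 );
}

# The caller passes a string; the class stores a 'Tag' instance.
my $post = Post->new( tag => 'perl' );
print ref( $post->tag ), ': ', $post->tag->name, "\n";
```

The caller never constructs a 'Tag' by hand, yet the attribute still only ever holds a validated 'Tag' object. The 'Grid' class applies the same idea to rows and cells.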

    {
        package Grid;
        use Moose;
        use Moose::Util::TypeConstraints;

        # coerce an 'ArrayRef[ArrayRef[Cell]]' struct provided by the caller into an 'ArrayRef[Row]' type
        subtype 'A::Row' => as 'ArrayRef[Row]';
        coerce 'A::Row'
            => from 'ArrayRef[ArrayRef[Cell]]'
            => via { [ map { Row->new( row => $_ ) } @$_ ] };

        # coerce an 'ArrayRef[Cell]' struct provided by the caller into a 'Row' type
        coerce 'Row'
            => from 'ArrayRef[Cell]'
            => via { Row->new( row => $_ ) };

        has 'grid' => (
            traits   => ['Array'],
            is       => 'ro',
            isa      => 'A::Row',
            required => 1,
            default  => sub { [] },
            coerce   => 1,
            handles  => {
                addRow  => 'push',
                getRow  => 'get',
                allRows => 'elements',
            },
        );
    }

I was also initially confused why it was necessary to define a new sub-type with a statement such as subtype 'A::Row' => as 'ArrayRef[Row]';. Why couldn't I simply state the following: coerce 'ArrayRef[Row]' => from 'ArrayRef[ArrayRef[Cell]]' => via ... and remove the 'subtype' statement altogether? When I tried to use the latter expression, I got a compilation error saying that I couldn't use '[' in the 'ArrayRef[Row]' argument. That explains why we need to define a new subtype 'A::Row' to be an 'ArrayRef[Row]' type.

The 'coerce' statements in the 'Grid' class definition provide a simple interface to the class by allowing the caller to pass in intuitive data structures while at the same time passing the class's type validation system. The end result is that you get a very robust class with strong data validation administered 100% by the class, yet a simple and intuitive interface for the client. For example:

  • $grid->addCell($row, $cell); — caller sends in an individual 'Cell' instance object
  • $grid->addRow( [$cell1, $cell2, $cell3] ); — caller sends in an array of 'Cell' instances
  • my $grid = Grid->new( grid => [ [$cellA0, $cellB0, $cellC0], [$cellA1, $cellB1, $cellC1], [$cellA2], ] );
    — caller sends in a 2D array of 'Cell' instances to the class constructor
Additional examples can be found at the bottom of the script below in the Test Cases section.

This is not to say there are no traps for the unwary user. When passing data into the class, the caller needs to pass data structures by reference. This means that $grid->addRow( [$cellX, $cellY, $cellZ] ) will work properly, whereas @row = ($cellX, $cellY, $cellZ); $grid->addRow( @row ) will not (notice the square brackets denoting an anonymous array in the former expression). Another potential trap in this particular example is failing to instantiate a new 'Cell' object for every element in the grid. If the same object is reused, each element will refer to that one object, which is likely not the intended use. Also, in this particular example, the 'Grid' class can only accept 'Cell' objects as elements of the 2D ragged array. Pushing any other data type onto the 'Grid' will fail. Creating a polymorphic 'Grid' could be a useful tool.
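
Both traps come from plain Perl semantics rather than from Moose, so they can be sketched without any classes. In the sketch below, 'count_args' is a hypothetical helper and the hash references merely stand in for 'Cell' objects.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments a call received.
sub count_args { return scalar @_ }

my @row = ( 'X', 'Y', 'Z' );
my $as_list = count_args(@row);       # the array flattens into separate arguments
my $as_ref  = count_args( \@row );    # a single argument: a reference to the array
print "list: $as_list, reference: $as_ref\n";

# Shared-object trap: hash references stand in for 'Cell' objects here.
my $shared   = { name => 'A0' };
my @aliased  = ( $shared, $shared );                 # the same object, twice
my @distinct = map { +{ name => 'A0' } } 1 .. 2;     # a fresh object per element

$aliased[0]{name}  = 'changed';
$distinct[0]{name} = 'changed';
print "aliased:  $aliased[1]{name}\n";     # also changed, because it is the same object
print "distinct: $distinct[1]{name}\n";    # unaffected
```

A method call such as $grid->addRow( @row ) flattens its arguments in the same way, so the class receives three separate values instead of one row.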

This is an overview of what I have learned while attempting to build higher-level data structures in Moose. I hope this article is enlightening to others, particularly beginners new to Moose, who may have also experienced similar challenges.

    #!/usr/bin/perl -w
    use Modern::Perl '2011';

    {
        package Cell;
        use Moose;

        has 'name' => (
            is      => 'rw',
            isa     => 'Str',
            default => '',
        );

        no Moose;
        __PACKAGE__->meta->make_immutable;
    }

    {
        package Row;
        use Moose;
        use Moose::Util::TypeConstraints;

        has 'row' => (
            traits   => ['Array'],
            is       => 'ro',
            isa      => 'ArrayRef[Cell]',
            required => 1,
            default  => sub { [] },
            handles  => {
                pushCell => 'push',
                getCell  => 'get',
                allCells => 'elements',
            },
        );

        sub toString {
            my $self = shift;
            join "\t", map($_->name, $self->allCells);
        }

        no Moose;
        no Moose::Util::TypeConstraints;
        __PACKAGE__->meta->make_immutable;
    }

    {
        package Grid;
        use Moose;
        use Moose::Util::TypeConstraints;

        # coerce an 'ArrayRef[ArrayRef[Cell]]' struct provided by the caller into an 'ArrayRef[Row]' type
        subtype 'A::Row' => as 'ArrayRef[Row]';
        coerce 'A::Row'
            => from 'ArrayRef[ArrayRef[Cell]]'
            => via { [ map { Row->new( row => $_ ) } @$_ ] };

        # coerce an 'ArrayRef[Cell]' struct provided by the caller into a 'Row' type
        coerce 'Row'
            => from 'ArrayRef[Cell]'
            => via { Row->new( row => $_ ) };

        has 'grid' => (
            traits   => ['Array'],
            is       => 'ro',
            isa      => 'A::Row',
            required => 1,
            default  => sub { [] },
            coerce   => 1,
            handles  => {
                addRow  => 'push',
                getRow  => 'get',
                allRows => 'elements',
            },
        );

        sub addCell {
            my ($self, $index, @cells) = @_;
            $self->addRow( [] ) if !defined $self->getRow($index);
            $self->getRow($index)->pushCell(@cells);
        }

        sub toString {
            my $self = shift;
            join "\n", map { join "\t", map($_->name, $_->allCells) } $self->allRows;
        }

        no Moose;
        no Moose::Util::TypeConstraints;
        __PACKAGE__->meta->make_immutable;
    }

    use strict;

    ##################
    ### Test Cases ###
    ##################

    # testing Row class

    #CASE #1: create a new instance by first building a row (reference to an array), and then passing this to the constructor
    print "BEGIN row1 test\n";
    my $row;
    for my $y ("A" .. "E") {
        my $mycell = Cell->new( name => "${y}0" );
        push @$row, $mycell;
    }
    my $row1 = Row->new( row => $row );
    print $row1->toString . "\n";
    my $mycell = Cell->new( name => "F0" );
    $row1->pushCell($mycell);
    print $row1->toString . "\n";
    #$row1->pushCell( q/Whammi/ );
    #print $row1->toString . "\n";
    print "END row1 test\n\n";

    #CASE #2: create a new empty instance of a row, then add new elements individually.
    print "BEGIN row2 test\n";
    my $row2 = Row->new;
    for my $y ("A" .. "E") {
        my $mycell = Cell->new( name => "${y}0" );
        $row2->pushCell($mycell);
    }
    print $row2->toString . "\n";
    #$row2->pushCell( q/Whammi/ );
    #print $row2->toString . "\n";
    print "END row2 test\n\n";

    # testing Grid class

    #CASE #1A: create a new Grid instance by first building an array of Row structs, and then passing this to the constructor (as a reference)
    print "BEGIN grid1A test\n";
    my $struct1A;
    #my @struct1A;    # both of these methods work
    for my $x (0 .. 4) {
        my $rowRef;
        for my $y ("A" .. "E") {
            my $mycell = Cell->new( name => "$y$x" );
            push @$rowRef, $mycell;
        }
        my $row = Row->new( row => $rowRef );
        push @$struct1A, $row;
        #push @struct1A, $row;
    }
    my $grid1A = Grid->new( grid => $struct1A );
    #my $grid1A = Grid->new( grid => \@struct1A );
    print $grid1A->toString . "\n";
    print "\n";
    my $mycell1A = Cell->new( name => "F3" );
    $grid1A->addCell(3, $mycell1A);
    print $grid1A->toString . "\n";
    #$grid1A->addCell(3, q/Whammi/);
    #print $grid1A->toString . "\n";
    print "END grid1A test\n\n";

    #CASE #1B: create a new Grid instance by first building a 2d-array of Cell structs, and then passing this to the constructor (as a reference) :: coercion needs to be working for this to succeed
    print "BEGIN grid1B test\n";
    #my $struct1B;
    my @struct1B;    # both methods work
    for my $x (0 .. 4) {
        for my $y ("A" .. "E") {
            my $mycell = Cell->new( name => "$y$x" );
            #push @{$$struct1B[$x]}, $mycell;    # yikes! that's pretty scary! Builds an array-of-arrays-of-Cells as a reference
            push @{$struct1B[$x]}, $mycell;
        }
    }
    #my $grid1B = Grid->new( grid => $struct1B );
    my $grid1B = Grid->new( grid => \@struct1B );
    print $grid1B->toString . "\n";
    print "\n";
    my $mycell1B = Cell->new( name => "F3" );
    $grid1B->addCell(3, $mycell1B);
    print $grid1B->toString . "\n";
    #$grid1B->addCell(3, q/Whammi/);
    #print $grid1B->toString . "\n";
    print "END grid1B test\n\n";

    #CASE #2A: create a new empty Grid instance, then add new rows individually as Row structs.
    print "BEGIN grid2A test\n";
    my $grid2A = Grid->new;
    for my $x (0 .. 4) {
        #my $rowRef;
        my @row;    # both of these methods work as well
        for my $y ("A" .. "E") {
            my $mycell = Cell->new( name => "$y$x" );
            #push @$rowRef, $mycell;
            push @row, $mycell;
        }
        #my $row = Row->new( row => $rowRef );
        my $row = Row->new( row => \@row );
        $grid2A->addRow( $row );
    }
    print $grid2A->toString . "\n";
    print "\n";
    my $mycell2A = Cell->new( name => "F3" );
    $grid2A->addCell(3, $mycell2A);
    print $grid2A->toString . "\n";
    #$grid2A->addCell(3, q/Whammi/);
    #print $grid2A->toString . "\n";
    print "END grid2A test\n\n";

    #CASE #2B: create a new empty Grid instance, then add new rows individually as an array of Cell structs :: coercion needs to be working for this to succeed
    print "BEGIN grid2B test\n";
    my $grid2B = Grid->new;
    for my $x (0 .. 4) {
        #my $rowRef;
        my @row;    # both of these methods work
        for my $y ("A" .. "E") {
            my $mycell = Cell->new( name => "$y$x" );
            #push @$rowRef, $mycell;
            push @row, $mycell;
        }
        $grid2B->addRow( \@row );
        #$grid2B->addRow( $rowRef );
    }
    print $grid2B->toString . "\n";
    print "\n";
    my $mycell2B = Cell->new( name => "F3" );
    $grid2B->addCell(3, $mycell2B);
    print $grid2B->toString . "\n";
    #$grid2B->addCell(3, q/Whammi/);
    #print $grid2B->toString . "\n";
    print "END grid2B test\n\n";

    #CASE #3: create a new empty Grid instance, then add new rows individually as an array of Cell structs :: coercion needs to be working for this to succeed
    #this differs from CASE 2B in that I am constructing an anonymous array in one expression, rather than pushing the elements one-by-one onto the anon array.
    print "BEGIN grid3 test\n";
    my $grid3 = Grid->new;
    my $o1 = Cell->new( name => 'X1' );
    my $o2 = Cell->new( name => 'Y1' );
    my $o3 = Cell->new( name => 'Z1' );
    $grid3->addRow([$o1, $o2, $o3]);
    print $grid3->toString . "\n";
    my $o4 = Cell->new( name => 'W1' );
    $grid3->addCell(0, $o4);
    print $grid3->toString . "\n";
    #$grid3->addRow( qw/Whammi1 Whammi2/ );
    #print $grid3->toString . "\n";
    print "END grid3 test\n\n";

    #CASE #4: last case; create a new empty Grid instance, then add each cell one-by-one
    print "BEGIN grid4 test\n";
    my $grid4 = Grid->new;
    for my $x (0 .. 4) {
        for my $y ("A" .. "E") {
            my $mycell = Cell->new( name => "$y$x" );
            $grid4->addCell($x, $mycell);
        }
    }
    print $grid4->toString . "\n";
    print "\n";
    my $mycell4A = Cell->new( name => "F3" );
    $grid4->addCell(3, $mycell4A);
    print $grid4->toString . "\n";
    print "\n";
    my $mycell4B = Cell->new( name => "A5" );
    $grid4->addCell(5, $mycell4B);
    print $grid4->toString . "\n";
    print "END grid4 test\n\n";

And here is the output to the script...

    BEGIN row1 test
    A0 B0 C0 D0 E0
    A0 B0 C0 D0 E0 F0
    END row1 test

    BEGIN row2 test
    A0 B0 C0 D0 E0
    END row2 test

    BEGIN grid1A test
    A0 B0 C0 D0 E0
    A1 B1 C1 D1 E1
    A2 B2 C2 D2 E2
    A3 B3 C3 D3 E3 F3
    A4 B4 C4 D4 E4
    END grid1A test

    BEGIN grid1B test
    A0 B0 C0 D0 E0
    A1 B1 C1 D1 E1
    A2 B2 C2 D2 E2
    A3 B3 C3 D3 E3 F3
    A4 B4 C4 D4 E4
    END grid1B test

    BEGIN grid2A test
    A0 B0 C0 D0 E0
    A1 B1 C1 D1 E1
    A2 B2 C2 D2 E2
    A3 B3 C3 D3 E3 F3
    A4 B4 C4 D4 E4
    END grid2A test

    BEGIN grid2B test
    A0 B0 C0 D0 E0
    A1 B1 C1 D1 E1
    A2 B2 C2 D2 E2
    A3 B3 C3 D3 E3 F3
    A4 B4 C4 D4 E4
    END grid2B test

    BEGIN grid3 test
    X1 Y1 Z1
    X1 Y1 Z1 W1
    END grid3 test

    BEGIN grid4 test
    A0 B0 C0 D0 E0
    A1 B1 C1 D1 E1
    A2 B2 C2 D2 E2
    A3 B3 C3 D3 E3 F3
    A4 B4 C4 D4 E4
    A5
    END grid4 test

keywords: multi-dimensional array, ragged array, examples, test code, class implementation

Replies are listed 'Best First'.
Re: RFC: object-oriented two-dimensional ragged array tutorial
by Jenda (Abbot) on Sep 11, 2013 at 14:01 UTC

    The question is ... why?

    I mean, what's

        my @AoA;
        $AoA[0] = [1,2,3];
        $AoA[1] = [4,8];
        $AoA[0][3] = 99;
        ...

    if not a ragged array? I can set individual elements, I can query individual elements, I can set or get rows, ...

    May I see some code that would benefit from your maze of classes?

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.

      Yes, you have given an example of a ragged array. And yes, perl is great for building and manipulating them as you have shown. The difference between the example you give here and the examples I give in my tutorial is a nice contrast between programming in traditional perl5 and OOP with Moose. If you are working with simple scalars such as numbers or strings as elements of your data structures, then Moose would definitely be overkill. I can see your point, that my examples could be considered overkill since I have simply wrapped a 'Cell' class around a scalar string. If that was all I really needed to do, then yes, I went way overboard. I merely used this simple 'Cell' class as an example. Create any complex class with multiple attributes, multiple methods, and add dependencies between those attributes and methods, then tack on inheritance by sub-classing this complex class, and you might begin to see where this tutorial could be useful.

        Your argument is that you've used a simple, if ultimately unnecessary example, to demonstrate some of the more complex parts of OO; because you didn't want to detract from the techniques you were demonstrating with the detail of the classes you used to demonstrate them.

        That is a fine argument.

        However, without a convincing example of when it is necessary, (or beneficial in some tangible way), to use the techniques you are demonstrating, it leaves the demonstration looking like a solution looking for a problem.

        So, the challenge to you is: show a realistic example of using those techniques that cannot be trivially and beneficially replaced with simpler, non-OO code.


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: RFC: object-oriented two-dimensional ragged array tutorial
by remiah (Hermit) on Sep 11, 2013 at 06:14 UTC

    Hello paulymer.

    Your first attempt and this article really fascinate me. I know tobyinc showed us a great example, but I hesitated to introduce so many extension modules.

    I have seen some fancy examples of coercion, but this is the first time I have seen a practical example of coercion and array traits. Cheers.
