PerlMonks  

RFR: Inside-out classes - a Class::InsideOut primer

by radiantmatrix (Parson)
on Mar 15, 2007 at 20:24 UTC ( [id://605062]=perlmeditation )

Fellow monks: this is the first draft of a primer on inside-out classes using Class::InsideOut. It's mostly meant as a "get started guide". I'd appreciate any feedback on the value, content, phrasing, etc. I'm sure it needs work, but I'm not sure exactly where.


This document is a primer on using inside-out objects via Class::InsideOut. It is intended as a complement to, not a replacement of, the documentation provided with that module.

Who should read this primer?

This primer intends to address Perl programmers who are already familiar with using and creating object classes in Perl, and who wish to learn about and implement the inside-out class pattern using the Class::InsideOut module.

Introduction to the inside-out class pattern

Anyone familiar with object class development in Perl is familiar with the traditional pattern for a Perl object class: a single, blessed hash reference. Each object instance is its own hash reference with the same data structure.

This approach is simple, and it works well; however, it has certain limitations. For example, anyone instantiating an object can bypass the accessors you so carefully constructed and muck about with the data in the object – after all, the object is just a hash reference like any other.

Besides that, how often have you accidentally wound up with a hard-to-find bug in your object class because you've mistyped the name of a hash key? The interpreter certainly doesn't warn you: it just auto-vivifies a new attribute with the mistyped name.
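To make the mistyped-key problem concrete, here is a minimal sketch using a traditional hash-based class. The package name Traditional::Person and the deliberately misspelled key are invented for illustration only.

```perl
use strict;
use warnings;

# A traditional hash-based object: the attributes are just hash keys.
package Traditional::Person;

sub new {
    my ($class, %args) = @_;
    return bless { phone => $args{phone} }, $class;
}

sub phone { my $self = shift; return $self->{phone} }

# A mutator with a typo: 'phnoe' instead of 'phone'.
# Perl raises no error or warning; it quietly creates a new key.
sub set_phone {
    my ($self, $value) = @_;
    $self->{phnoe} = $value;
    return;
}

package main;

my $person = Traditional::Person->new( phone => '555-1212' );
$person->set_phone('555-1313');

print "phone is still: ", $person->phone, "\n";
print "keys now present: ", join( ', ', sort keys %$person ), "\n";
```

Neither "use strict" nor "use warnings" objects here, because any string is a legal hash key.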

The inside-out class pattern addresses these issues (and some other more esoteric ones) by figuratively turning the idea of a class “inside-out”. Whereas a traditional object instance is a blessed reference to a hash whose keys are the names of attributes, an inside-out object instance is a common key to several class-local hashes that represent the attributes. That is, an object class defines a hash for each attribute, and an instance is the key that points to one element in that hash. By way of example:

    # Accessing an attribute inside a traditional object
    $self->{ attribute } = $value;

    # Accessing an attribute inside an inside-out object
    $attribute{ id $self } = $value;

This may be a little hard to wrap your head around at first, but as you continue this primer, it will become more apparent.
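For readers who want to see the pattern with no helper module at all, the following is a hand-rolled sketch that uses only the core Scalar::Util module. It is not how Class::InsideOut is implemented internally; the package name Sketch::Person and its two attributes are assumptions made for the example.

```perl
use strict;
use warnings;

package Sketch::Person;
use Scalar::Util qw(refaddr);

# One lexical hash per attribute, shared by every instance of the class.
my %given_name;
my %phone;

sub new {
    my ($class, %args) = @_;
    # The object itself is an empty blessed scalar; it carries no data.
    my $self = bless \( my $anon ), $class;
    $given_name{ refaddr $self } = $args{given_name};
    $phone{ refaddr $self }      = $args{phone};
    return $self;
}

# Read-only accessor
sub given_name { my $self = shift; return $given_name{ refaddr $self } }

# Combined accessor/mutator
sub phone {
    my $self = shift;
    $phone{ refaddr $self } = shift if @_;
    return $phone{ refaddr $self };
}

# The data lives outside the object, so it has to be removed by hand.
sub DESTROY {
    my $self = shift;
    delete $given_name{ refaddr $self };
    delete $phone{ refaddr $self };
    return;
}

package main;

my $alice = Sketch::Person->new( given_name => 'Alice', phone => '555-1212' );
my $bob   = Sketch::Person->new( given_name => 'Bob',   phone => '555-9999' );

$alice->phone('555-1313');
print $alice->given_name, ": ", $alice->phone, "\n";
print $bob->given_name,   ": ", $bob->phone,   "\n";
```

Because the attribute hashes are lexical variables, misspelling one of their names is a compile-time error under "use strict", unlike a misspelled hash key.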

Building your first inside-out class

When building any class, one has to first decide what attributes need to be present, and what methods will act on those attributes. For an inside-out class, we'll also need to decide which of the attributes are public, read-only, or private.

A public attribute can be both read and modified by any program or module that instantiates the object (any instantiator).

A read-only attribute can be read by any instantiator, but modified only internally by the class (and its children, which we'll get to later).

A private attribute is not available to the instantiator, only to the class itself (and, again, its children).

Let's create a class to represent a person in an organization. We'll need to know the person's given name, family name, date of birth, phone number, Social Security Number, and position within the organization. Some of these attributes shouldn't be changed by the instantiator, and Social Security Number should be kept private – it's only used internally as a unique identifier.

Now, let's look at this class, declared with the inside-out pattern and using Class::InsideOut:

    package My::Person;
    use strict;
    use warnings;
    use Class::InsideOut qw[public readonly private register id];

    # declare the attributes
    readonly given_name  => my %given_name;
    readonly family_name => my %family_name;
    public   birthday    => my %birthday;
    public   phone       => my %phone;
    private  ssn         => my %ssn;
    public   position    => my %position;

    # object constructor
    sub new {
        my $class = shift;
        # given a class name, register returns a new object blessed into it
        my $self = register( $class );

        $given_name{ id $self }  = shift;
        $family_name{ id $self } = shift;
        $birthday{ id $self }    = shift;
        $phone{ id $self }       = shift;
        $ssn{ id $self }         = shift;
        $position{ id $self }    = shift;

        return $self;
    }

    1;

The first item of interest is that the attributes are declared in the class' package scope, rather than per-instance. Notice that attribute setting/getting inside methods appears “inside-out”.

Using an inside-out class

From the instantiators' point of view, using an inside-out class is no different than using a regular class:

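    use My::Person;

    # arguments: given name, family name, birthday, phone,
    # SSN (a placeholder value here), position
    my $manager = My::Person->new(
        'Random', 'Hacker', '12/18/1987', '555-1212', '000-00-0000', 'CEO'
    );

    print "Old phone is: ", $manager->phone(), "\n";
    $manager->phone('555-1313');    # set new phone number
    print "New phone is: ", $manager->phone(), "\n";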

Note that by declaring the attribute 'phone' as public, Class::InsideOut automatically creates a method that works as an accessor when called without parameters, and as a mutator when called with a parameter.

The major difference instantiators will notice is that the following code will not work in an inside-out class (it would in a traditional class):

$manager->{phone} = '555-1313';

{{what is the behavior of this??}}
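xdg's reply further down this thread answers the question: when the object is a blessed anonymous scalar (which, per that reply, is what register returns when handed a class name), the assignment dies. The sketch below reproduces that with plain core Perl, using a hand-blessed scalar reference as a stand-in for the registered object; the class name Sketch::Person is only a label.

```perl
use strict;
use warnings;

# Stand-in for an object whose underlying reference is an anonymous scalar.
my $manager = bless \( my $anon ), 'Sketch::Person';

# Treating the scalar reference as a hash reference is a runtime error.
my $ok = eval { $manager->{phone} = '555-1313'; 1 };
print "assignment ", ( $ok ? "succeeded\n" : "failed: $@" );
```

If the object were built on an anonymous hash instead, the assignment would succeed but, as the same reply explains, it would have nothing to do with the value returned by the phone accessor.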

Defining accessors and mutators

As we've seen above, there's no need to explicitly define accessors and mutators, because Class::InsideOut does it for you. The syntax for this definition is:

protection accessor_name => my %attribute_variable;

An accessor method will be created for any public or readonly attribute; in the case of a public attribute, the accessor will function as a mutator when it is passed a parameter. For example:

    # $obj->public_attrib() accessor, $obj->public_attrib($value) mutator
    public public_attrib => my %public_attrib;

    # $obj->read_attrib() accessor, no mutator
    readonly read_attrib => my %read_attrib;

    # no accessor or mutator
    private priv_attrib => my %priv_attrib;

Notice that throughout this document, the accessor name and the name of the attribute variable are the same. While this is not strictly required, it makes things so much clearer that it's considered a best practice.

Defining other types of methods

Creating methods that act on data inside an object is not entirely dissimilar from the traditional class pattern approach:

    sub sendPhoneToDirectory {
        # sends the phone number to the company directory, using SSN as
        # the key
        my $self  = shift;
        my $dir   = My::Directory->new();
        my $entry = $dir->getEntryBySSN( $ssn{ id $self } );
        $entry->phone( $phone{ id $self } );
        $entry->commit;
    }

The only real difference is how attributes are addressed. Instead of using '$self' as a hash reference and the attribute name as a key, we use the attribute hash and the 'id' of '$self' (not the object itself, but its unique identifier) as the key.

Advanced accessor and mutator behavior

While Class::InsideOut's ability to automatically generate the accessors/mutators for read-only and public attributes is certainly convenient, it does raise a fairly obvious question: what happens if I want my accessor/mutator to do a little more than just get or set the value of the attribute?

It's generally not advisable to have excessive amounts of logic around getting or setting a value, but there are plenty of places where a little bit of logic is appropriate. The most obvious example for a mutator is the desire to validate the proposed value before setting the attribute. For instance, it wouldn't do to allow a floating-point value to be assigned to an attribute that's supposed to be an integer. For accessors, the most obvious example would be dereferencing a data structure before returning it.

Fortunately, Class::InsideOut has this facility. When declaring attributes, two hooks are available. The first, for hooking into the accessor behavior, is called 'get_hook'; its companion, 'set_hook', hooks into mutator behavior, when that's relevant.

These hooks are code references that will be executed either before setting or after getting the attribute, but before the accessor or mutator returns. In both cases the '$_' variable is locally set to the value in the attribute (in the case of an accessor) or the value to be assigned (in the case of a mutator). The two hooks treat return values differently: the return value of a set_hook is ignored, while the return value of a get_hook is what the accessor returns.

An attribute can have either a set_hook or a get_hook, or both.

Building a set_hook

A set_hook is a subroutine used to validate or modify the value passed to an attribute mutator before setting the attribute. Inside a set_hook, the variable '$_' is set up to be the value passed to the mutator. Return values are ignored – this means that one changes '$_' if one wants to modify the passed value, and dies if one wants to fail (e.g. if bad data were passed to the mutator).

By way of example, imagine that our My::Person class wants to store the birthday as an integer similar to what would be returned by Perl's 'time' function. However, we want to accept MM/DD/YYYY as input. We'll use the POSIX mktime() function to actually make the conversion. Because this is an example, we'll be assuming that anything that looks like a date, is; in a production environment, you'd want your validation to be much more robust. Here's the sub:

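    use POSIX qw(mktime);

    sub _set_birthday {
        # remember, $_ will contain the value passed to the mutator
        die "$_ is not a valid date" unless m{(\d{2})/(\d{2})/(\d{4})};
        # mktime takes sec, min, hour, day of month (1-31), month (0-11), year - 1900
        $_ = mktime( 0, 0, 0, $2, $1 - 1, $3 - 1900 );    # pack it!
    }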

Now, we have to register this subroutine as the set_hook for the 'birthday' attribute. This is done at declaration:

public birthday => my %birthday, { set_hook => \&_set_birthday };

Now, any time an instantiator calls the 'birthday' mutator, the value will be converted. If the value doesn't look like a date, the mutator dies. Note that any code reference will be executed, so there's no need to use a named sub: the 'set_hook' could just as easily be an anonymous sub.
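The hook can be exercised on its own, outside any class. In the sketch below, apply_set_hook is a hypothetical helper (not part of Class::InsideOut) that only mimics the behaviour described above: it puts the proposed value in '$_', runs the hook, and keeps whatever the hook left in '$_'. Note that POSIX mktime expects a 1-based day of the month and a 0-based month, so the day is passed through unchanged.

```perl
use strict;
use warnings;
use POSIX qw(mktime);

# A set_hook for MM/DD/YYYY input: validates $_ and replaces it with
# epoch seconds for local midnight on that date.
sub _set_birthday {
    die "$_ is not a valid date" unless m{(\d{2})/(\d{2})/(\d{4})};
    $_ = mktime( 0, 0, 0, $2, $1 - 1, $3 - 1900 );
}

# Hypothetical stand-in for what a mutator does around its set_hook.
sub apply_set_hook {
    my ($hook, $value) = @_;
    local $_ = $value;    # the hook sees and may change this copy
    $hook->();            # return value deliberately ignored
    return $_;            # this is what would be stored
}

my $stored = apply_set_hook( \&_set_birthday, '12/18/1987' );
print "stored value: $stored\n";

# Bad input makes the hook die, so nothing would be stored.
my $ok = eval { apply_set_hook( \&_set_birthday, 'next Tuesday' ); 1 };
print "bad input rejected: $@" unless $ok;
```

The exact number stored depends on the local time zone, which is why the sketch prints it rather than assuming a value.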

Building a get_hook

A get_hook is a subroutine used to modify the value returned by an attribute accessor. Inside a get_hook, the variable '$_' holds a copy of the value stored in the attribute, so changing it does not affect what is stored. The return value from get_hook will be returned by the accessor.

Let's continue our birthday example from the 'set_hook' discussion. Recall that the birthday is stored as a time code in the same form as would be generated by Perl's 'time' function. However, since we accept MM/DD/YYYY as input to the mutator, we should provide the same format as output from the accessor. We'll use the POSIX strftime() function to do the heavy lifting. Here's the sub:

    use POSIX qw(strftime);

    sub _get_birthday {
        # remember, $_ will contain the value of the attribute
        strftime '%m/%d/%Y', localtime( $_ );
    }

As with a 'set_hook', we register the get_hook at declaration time. Here's what the declaration looks like after adding the get_hook:

    public birthday => my %birthday, {
        set_hook => \&_set_birthday,    # was already here
        get_hook => \&_get_birthday,
    };

Now, when the birthday accessor is called, it will return the date in the MM/DD/YYYY format. As with set_hook, an anonymous sub could be used instead of a separate, named sub.
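The get side can be tried in isolation as well. Here apply_get_hook is again a hypothetical helper, written only to mimic the behaviour described in this section: the hook receives a copy of the stored value in '$_', and its return value is what the caller gets.

```perl
use strict;
use warnings;
use POSIX qw(mktime strftime);

# A get_hook: formats the stored epoch value as MM/DD/YYYY.
sub _get_birthday {
    return strftime( '%m/%d/%Y', localtime($_) );
}

# Hypothetical stand-in for what an accessor does around its get_hook.
sub apply_get_hook {
    my ($hook, $stored) = @_;
    local $_ = $stored;    # a copy: changes to $_ never reach the stored value
    return $hook->();      # the hook's return value goes back to the caller
}

# 18 December 1987 at local midnight, as a set_hook might have stored it
my $stored  = mktime( 0, 0, 0, 18, 11, 87 );
my $display = apply_get_hook( \&_get_birthday, $stored );

print "accessor would return: $display\n";
```

Since mktime and localtime both work in the local time zone, the round trip gives back the original calendar date regardless of where it runs.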

Further reading

Hopefully, this text has been helpful in getting you started creating your own inside-out modules using Class::InsideOut. It probably won't be long before you find yourself wanting to do more advanced things using inside-out objects, like serialization, inheriting from traditional classes, and writing thread-safe modules. To address these topics, please refer to the excellent Class::InsideOut::Manual::Advanced.

See also

Updates:

  • 20070315 : Typo fixed (an, -> (and, ; also removed extraneous apostrophe in "its" ; thanks liverpole
  • 20070315 : Added UnixReview column to See Also, per merlyn's shameless self-promotion ;-)
  • 20070316 : Fixed compliment->complement typo, thanks bart
  • 20070316 : Fixed stray ';' and factual error regarding get_hook, thanks xdg
  • 20070316 : Added xdg's presentation

<radiant.matrix>
Ramblings and references
The Code that can be seen is not the true Code
I haven't found a problem yet that can't be solved by a well-placed trebuchet

Replies are listed 'Best First'.
Re: RFR: Inside-out classes - a Class::InsideOut primer
by xdg (Monsignor) on Mar 16, 2007 at 13:04 UTC

    ++ Very nice start to a tutorial -- and on a topic that continues to need it, as recent criticism suggests.

    I have a few corrections and comments.

    sub new { my $self = shift; register( $self );

    It should be:

    sub new { my $class = shift; my $self = register( $class );

    That's the longhand approach for clarity of what the arguments mean. my $self = register( shift ) would be shorthand.

    When provided with a single, non-reference argument, register treats it as a class name and will return a reference to an anonymous scalar blessed into the given class. Thus, considering this question:

    $manager->{phone} = '555-1313'; # {{what is the behavior of this??}}

    If you use the new() example I give above, this will die with an error as $manager is a reference to a scalar, not a hash.

    Note that if you use an anonymous hash reference (using an alternate style of register):

    sub new { my $class = shift; my $self = register( {}, $class ); # register does the bless, too

    Then $manager->{phone} = '555-1313' will work as usual, but will have nothing to do with the value retrieved by $manager->phone().

    Aren't inside-out objects fun?

    public birthday => my %birthday, { set_hook =>; \&_set_birthday };

    Extra semicolon. set_hook => \&_set_birthday

    Return values from get_hook are ignored; the accessor expects that get_hook will simply modify '$_'.

    This is incorrect. The behavior isn't symmetric with set_hook. For get_hook, $_ is a copy of the attribute -- you can modify it without affecting what is stored. The return value of the get_hook is returned from the accessor. Your example just happens to work because the last thing you do is $_ = strftime and that is what gets returned. You could also have done this:

    sub _get_birthday {
        strftime '%m/%d/%Y', localtime( $_ );    # but NOT just localtime()
    }

    But again, your tutorial has a nice build-up from simple to more advanced concepts, around a well-defined example (date storage). With some editing and clarifications from feedback here and elsewhere in the thread, it will probably be something I'm happy to either link to in the Class::InsideOut documentation or possibly even add as Class::InsideOut::Tutorial.

    One thing you might want to clarify a bit more is the id() function and its importance in generating the unique ID for an object.

    I also wouldn't describe Class::Std as "deprecated" -- I'm not a big fan, but it's certainly in use. It is very good for dealing with complex class hierarchies (as one would expect of a module by TheDamian) but I think it errs by not being more fundamentally robust.

    My YAPC::NA 2006 presentation is another potential reference you could add.

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

      Hm, some very good points. First off, thanks for pointing out the stray semi-colon -- it's an artifact from cleaning up the original HTML (there used to be an &gt; there). Second, thanks for the clarification on get_hook, I must have simply mis-read the documentation: my bad.

      I'm of two minds about including the longer form of register... on the one hand, it would be more accurate, on the other, it does seem a little more confusing. It's clear that what I've got is probably inadequate, but I will have to give some thought to how much of that behavior I want to explain. That, and how best to convey it.

      As for the importance of the id function, would you care to expand? I'm certainly not an expert on inside-out objects, and I've just been using it as a bit of a mantra -- when I need the ID of an object, I do it this way -- I'd appreciate more insight into why I'm doing it. I can then distill that insight down (hopefully) into primer-appropriate materials.

      I call Class::Std "deprecated" on the basis that I would discourage its use to anyone who didn't really know what they were doing. I hadn't considered that others would take "deprecated" in different contexts; I will have to think about how to rephrase that.

      Thanks for the link to your presentation, I will look it over and consider it for mention.

      <radiant.matrix>
      Ramblings and references
      The Code that can be seen is not the true Code
      I haven't found a problem yet that can't be solved by a well-placed trebuchet
        As for the importance of the id function, would you care to expand? ... I'd appreciate more insight into why I'm doing it

        The real answer: Class::InsideOut expects/requires that the unique key for each object be its memory address (to ensure safety for threading and pseudoforks), so it provides the alias id() to Scalar::Util::refaddr.

        To explain it in the primer, it may be sufficient to say that each object needs to have a unique key and Class::InsideOut requires the use of the id() function to generate that unique key.
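Since id() is described here as an alias to Scalar::Util::refaddr, the idea of a per-object unique key can be shown with the core function alone. This is a minimal sketch; the class name and the %phone hash are assumptions for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $first  = bless \( my $anon1 ), 'Sketch::Person';
my $second = bless \( my $anon2 ), 'Sketch::Person';

# Each live object has its own address, so the address works as a hash key.
my %phone;
$phone{ refaddr $first }  = '555-1212';
$phone{ refaddr $second } = '555-9999';

printf "first:  key %d -> %s\n", refaddr($first),  $phone{ refaddr $first };
printf "second: key %d -> %s\n", refaddr($second), $phone{ refaddr $second };

# A second reference to the same object yields the same key.
my $alias = $first;
print "alias shares the key\n" if refaddr($alias) == refaddr($first);
```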

        Regarding Class::Std:

        I would discourage its use to anyone who didn't really know what they were doing.

        I think this is almost exactly what you should say in place of "deprecated".

        -xdg

        Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

      With some editing and clarifications from feedback here and elsewhere in the thread, it will probably be something I'm happy to either link to in the Class::InsideOut documentation or possibly even add as Class::InsideOut::Tutorial.

      In Class::InsideOut::Manual::About, it says,

      All other implementation details, including constructors, initializers and class inheritance management are left to the user

      This suggests to me that maybe a Class::InsideOut::Manual::BestPractices would be helpful. After all, most prospective users of C::IO are probably familiar with using classic hashref-based classes, and it's easy to just stick with what you know ... unless there are some easy-to-follow guidelines that say, ``Look, if you just need to write some simple classes with some inheritance, your best bet might be to use modules X, Y, Z, and do it like so.''.

      By the way, for double-word score, include some references to relevant chapters and sections in Damian's PBP (detailing where C::IO BP differ from PBP being the most important, IMO).

      IMO, a doc like that would make it a no-brainer to start off your InsideOut-OOP life with C::IO.

Re: RFR: Inside-out classes - a Class::InsideOut primer
by liverpole (Monsignor) on Mar 15, 2007 at 20:43 UTC
    Very nice, radiantmatrix++.

    Your tutorial is both very easy-to-read and quite comprehensive.

    Especially helpful is the use of clear, plausible examples.

    Very nice work!


    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re: RFR: Inside-out classes - a Class::InsideOut primer
by bart (Canon) on Mar 16, 2007 at 11:29 UTC
    Your introduction on Inside-Out Objects isn't right. You seem mostly concerned with preventing people from bypassing the API and mucking directly with the internals. But this is Perl: mucking with internals is allowed, though frowned upon. After all, isn't a common catchphrase for Perl "stay out of my living room because it's the polite thing to do, not because I have a shotgun"?

    But the main reason for wanting them is one you don't mention, and it's not such an "esoteric" one as you claim: it is to be able to subclass any class without having to be aware of any hidden fields stored in the parent class, and without any risk of trampling all over them. After all, the list of those fields may change (grow) over time.

    Compared to this, the direct field access and the possibility of mistyping are just minor issues.

      I disagree about the main reason for wanting them. They may have been created to solve inheritance issues, but based on how often I'm asked about how to create private and read-only attributes, that seems to be the primary reason many people would want inside-out objects.

      I'm reluctant to address the more esoteric concerns (like subclassing issues) in a primer -- the goal of a primer is to get someone started using a particular skill or technique, and it should not assume too advanced a prerequisite.

      Besides that, your characterization that I'm "mostly concerned with" the private/read-only capabilities is ridiculous: I mention that issue once in the introduction, and refer to it obliquely once more in discussing class design. However, that latter is necessary because Class::InsideOut does require the developer to consider whether one's attributes are public, private, or read-only.

      You are, of course, entitled to your opinions, but that doesn't make mine wrong.

      <radiant.matrix>
      Ramblings and references
      The Code that can be seen is not the true Code
      I haven't found a problem yet that can't be solved by a well-placed trebuchet
        I disagree about the main reason for wanting them. They may have been created to solve inheritance issues, but based on how often I'm asked about how to create private and read-only attributes, that seems to be the primary reason many people would want inside-out objects.
        Oh I don't know. I would guess that people who want private and read-only attributes are just used to "strict" languages. I have hardly ever found them useful, since I'm of the opinion that you shouldn't access attributes from outside the object at all (i.e. you always want accessors or other methods to set the useful attributes anyway), and if someone does violate the encapsulation, it's his problem. In other words, from "the outside", the actual storage mechanism of attributes is not usually relevant, since you normally never access them directly anyway.

        When I use inside-out objects attributes it's when I know I'm going to have many differing inheriting classes. Especially mixing pure-perl and perl/XS code. That's where the safety is relevant - you don't want to overwrite some super/subclasses' internal data - which can be easy to do by accident if you don't know all the internals of your superclass, and if you're subclassing an XS class you usually don't even have a hash-based implementation anyway.

        Class::InsideOut does require the developer to consider whether one's attributes are public, private, or read-only

        Well, perhaps it's clearer to say that Class::InsideOut emphasizes the decision through a choice of syntax.

        Class::InsideOut tries to require as few things as possible:

        1. Objects: All new objects must be passed to (or created by) the register() function
        2. Properties: All data structures for object properties must be passed to the property() function.
        3. Direct Access: The key to directly access the property for an object in a property hash must be generated using refaddr() (from Scalar::Util) or the shorter alias id().

        The public(), private() and readonly() functions just call property() with appropriate options.

        I agree that showing it in practice rather than explaining all this is more appropriate for a primer

        -xdg

        Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

        I disagree about the main reason for wanting them. They may have been created to solve inheritance issues, but based on how often I'm asked about how to create private and read-only attributes, that seems to be the primary reason many people would want inside-out objects.
        Then these people are excited about their new shotgun.

        Disallowing people to access the internals gains you nothing. It just makes debugging harder, as Data::Dumper doesn't see anything inside the objects.

      Exactly! An inside-out class can become a base class of any Perl class (inside-out or not) without as much as a glance at its implementation. That is the big deal about inside-out. Everything else is coincidental.

      For a class author this means you can now publish absolute general-purpose classes. Every other class can simply inherit your methods and data. That is impossible with any of the traditional class implementations. It makes inheritance in Perl what it is in dedicated OO languages.

      The tutorial is an excellent introduction to the use of Class::InsideOut, but as a general introduction to inside-out classes it misses the point.

      Anno

      you contradicted yourself. preventing people from messing with the internals means they do not have to be aware of any hidden fields that are stored in the parent class. if they cannot mess with the internals, they do not need to be aware. it is not worrying about something that you are not able to do.

      anyone who designs an api, they may set expectations. if the designer wishes you not to muck around with internals, that is his choice. he can promote that if he pleases. mistyping may not be a minor issue for some. typos create all kinds of bugs.

      timtowtdi. appreciate all aspects of what people do and the reasons they do it. there are more opinions than facts in programming and software engineering.

        there are more opinions than facts in programming and software engineering.

        Ain't that the truth. In life too I guess, but nowhere more evident than in programming.

        And there is a strange phenomenon in CS/programming whereby the more plaudits an individual receives for their opinions, the more likely they (the individual) tend to view their own opinions as facts.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: RFR: Inside-out classes - a Class::InsideOut primer
by merlyn (Sage) on Mar 15, 2007 at 23:14 UTC

      You're shameless. But that's ok, because it's a very good column! (Added to See Also section)

      <radiant.matrix>
      Ramblings and references
      The Code that can be seen is not the true Code
      I haven't found a problem yet that can't be solved by a well-placed trebuchet
Re: RFR: Inside-out classes - a Class::InsideOut primer
by wulvrine (Friar) on Mar 16, 2007 at 14:07 UTC
    Great Job ++!

    I have read about inside out classes but have always been confused by them.
    Your tutorial and explanation seriously helped improve my knowledge of the concept
    to where I think I could start using them. The short, clear examples really
    drove the subject home.
    Very well done, and thanks!

    s&&VALKYRIE &&& print $_^q|!4 =+;' *|
Re: RFR: Inside-out classes - a Class::InsideOut primer
by rvosa (Curate) on Mar 17, 2007 at 16:23 UTC
    Great tutorial! I have to say, I am also an avid user of inside-out objects - but I agree with others that it's more an inheritance thing than a typo thing. If it was only about the typo issue you could also use locked hashes (well, erm...) or using array refs with constants as indices or something.

    I don't know if you feel like adding it, but there are also some problems with inside-out objects (in general, not with Class::InsideOut, I mean): you have to be more careful about cleanup of the instance data when objects go out of scope, and sometimes it's nice if you can run a hash object through Data::Dumper to debug it.
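Both caveats can be seen in a small hand-rolled sketch using only core modules. Sketch::Phone and its live_count helper are invented for the example and are not part of Class::InsideOut.

```perl
use strict;
use warnings;

package Sketch::Phone;
use Scalar::Util qw(refaddr);

my %number;    # inside-out storage, keyed by object address

sub new {
    my ($class, $value) = @_;
    my $self = bless \( my $anon ), $class;
    $number{ refaddr $self } = $value;
    return $self;
}

# Without this, entries in %number would outlive their objects.
sub DESTROY {
    my $self = shift;
    delete $number{ refaddr $self };
    return;
}

# Helper for the demonstration: how many objects still have stored data.
sub live_count { return scalar keys %number }

package main;
use Data::Dumper;

my $obj  = Sketch::Phone->new('555-1212');
my $dump = Dumper($obj);
print $dump;    # shows an empty blessed scalar, not the phone number

my $alive = Sketch::Phone->live_count;
print "entries while alive: $alive\n";

undef $obj;     # last reference gone, so DESTROY runs
print "entries after destruction: ", Sketch::Phone->live_count, "\n";
```

Commenting out DESTROY in this sketch leaves the entry behind after the object is gone, which is the leak the paragraph above warns about.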

    And the privacy is still relative - I'm not sure how things work under the hood, but wouldn't you be able to step into the package namespace from elsewhere and clobber the hashes holding the instance data?
      And the privacy is still relative - I'm not sure how things work under the hood, but wouldn't you be able to step into the package namespace from elsewhere and clobber the hashes holding the instance data?

      Not if you use lexical hashes for data storage as is usually recommended. Excluding padwalker trickery, those are only accessible from their lexical scope, which can be made sufficiently small.
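A short core-Perl sketch of that point follows; Sketch::Secret and its two methods are made up for the example. The lexical hash is invisible to code that goes looking in the package's symbol table.

```perl
use strict;
use warnings;

package Sketch::Secret;

{
    # Lexical: visible only to the subs defined inside this block.
    my %ssn;

    sub set_ssn { my ($class, $key, $value) = @_; $ssn{$key} = $value; return }
    sub has_ssn { my ($class, $key) = @_; return exists $ssn{$key} }
}

package main;

Sketch::Secret->set_ssn( 'k1', '000-00-0000' );

# The package variable of the same name is a different, empty hash.
my $visible;
{
    no strict 'refs';
    $visible = scalar keys %{'Sketch::Secret::ssn'};
}

print "entries reachable via the symbol table: $visible\n";
print "entry exists inside the class: ",
    ( Sketch::Secret->has_ssn('k1') ? 'yes' : 'no' ), "\n";
```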

      Anno
