Re^3: OO automatic accessor generation

by merlyn (Sage)
on Nov 11, 2009 at 16:58 UTC


in reply to Re^2: OO automatic accessor generation
in thread OO automatic accessor generation

I'll disagree. There are two levels of "learning" here. If you want to get stuff done, and you need to use objects and Perl, Moose is the perfect solution. If you want to learn the guts of Perl, and you know that Perl has some cool low-level technology for method dispatch, then yes, Moose is for later, not for now.

-- Randal L. Schwartz, Perl hacker

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.


Re^4: OO automatic accessor generation
by WizardOfUz (Friar) on Nov 11, 2009 at 18:37 UTC

    I (respectfully) disagree with your disagreement.

    Do you honestly think, looking at the OP's code, that he would benefit more from using Moose right now than from re-reading perlobj, perlboot, perltoot and perltooc?

    Maybe I'm an old fart, but I think that writing your own accessor factory is a rite of passage for every aspiring Perl hacker ...
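For readers who have not written one, the kind of accessor factory meant here can be sketched in a few lines of plain Perl. This is only a minimal illustration, not the OP's code: the DataTable class name is borrowed from elsewhere in this thread, and the attribute list and the bare-bones constructor are assumptions made for the example.

```perl
use strict;
use warnings;

package DataTable;

# Hypothetical hand-rolled accessor factory: installs one read/write
# accessor per attribute name into the current package.
for my $attr (qw(tablename columns indices)) {
    no strict 'refs';    # needed to assign to a glob by name
    *{ __PACKAGE__ . "::$attr" } = sub {
        my $self = shift;
        $self->{$attr} = shift if @_;    # act as a setter when given a value
        return $self->{$attr};           # always return the current value
    };
}

# Bare-bones constructor (an assumption): stores whatever it is given.
sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

package main;

my $dt = DataTable->new( tablename => 'users' );
$dt->columns( [ 'id', 'name' ] );
print $dt->tablename, ': ', join( ', ', @{ $dt->columns } ), "\n";
```

Each generated sub is a closure over $attr, which is the whole trick. Note that nothing here checks types or required values.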

      Do you honestly think, looking at the OP's code, that he would benefit more from using Moose right now than from re-reading perlobj, perlboot, perltoot and perltooc?

      Unless the OP wants to write a new implementation of an object system, I do. (I suspect most people using objects in Perl 5 do so because they have some project to finish, not because they want to understand how an object system works or the syntax for Perl 5 metaprogramming.)

        Well, I assumed that the OP wants to learn OOP in Perl, not to finish some project on a deadline. And I (still) believe that there are people out there who want to understand the basics of a programming language before trying to "get something done" with it.

        I suspect most people using objects in Perl 5 do so because they have some project to finish, not because they want to understand how an object system works

        At this point, people who are just getting things done, aren't they likely going to make bad decisions? This is even more likely when you don't understand what you are working with. So they do something stupid and get away with it because the effects are not obvious or do not crop up that often.

        Not understanding what you are working on will eventually come back to bite you in some way or another. Looking at all the software that I have had to work with over the years, this has been consistent. Few people bother to understand what they are trying to do, and they eventually screw it up, leaving a mess for someone else to clean up.

      Depends on the coder's goal.

      package DataTable;
      use Moose;

      has 'tablename' => (is => 'rw', isa => 'Str');
      has 'columns'   => (is => 'rw', isa => 'ArrayRef[Str]');
      has 'indices'   => (is => 'rw', isa => 'HashRef[Str]');
      has 'datatypes' => (is => 'rw', isa => 'ArrayRef[Str]');
      has 'lengths'   => (is => 'rw', isa => 'ArrayRef[Int]');
      has 'decimals'  => (is => 'rw', isa => 'ArrayRef[Int]');
      has 'signed'    => (is => 'rw', isa => 'ArrayRef[Int]');
      has 'allownull' => (is => 'rw', isa => 'ArrayRef[Int]');
      has 'default'   => (is => 'rw', isa => 'ArrayRef[Int]');
      has 'usequote'  => (is => 'rw', isa => 'ArrayRef[Int]');

      1;

      That's a lot of simplification right there. And I wouldn't be surprised to find that there might be a shortcut to creating a bunch of attributes with the same definition.


      ___________
      Eric Hodges

        package DataTable;
        use Moose;

        has 'tablename' => (is => 'rw', isa => 'Str');
        has 'columns'   => (is => 'rw', isa => 'ArrayRef[Str]');
        has 'indices'   => (is => 'rw', isa => 'HashRef[Str]');
        has 'datatypes' => (is => 'rw', isa => 'ArrayRef[Str]');
        has [qw/ lengths decimals signed allownull default usequote /]
            => (is => 'rw', isa => 'ArrayRef[Int]');

        1;


        ___________
        Eric Hodges

        I wouldn't call the code you posted a simplification. It is shorter, yes. But it is definitely not simple.

        Even the most experienced Perl hacker would have a hard time understanding it without ever having looked at the Moose documentation before. The code posted by the OP (and my little exercise) on the other hand, should be understandable even to Perl beginners who just skimmed over the Camel book.

        Just to prove my point: Can you tell me, without looking at the Moose documentation, what the ArrayRef accessors return in list context? A single array reference? Or the content of the associated array-typed attribute? And if they return an array reference, is that reference a reference to the actual attribute slot or just a reference to a copy of it? And what about scalar context?

        Please don't get me wrong. I have nothing against Moose. I just don't think it should be the starting point for Perl beginners.

      Maybe I'm an old fart, but I think that writing your own accessor factory is a rite of passage for every aspiring Perl hacker ...

      Sure it is, but Moose is so much more than an accessor factory, and when it comes to getting real work done, why add an object system to your maintenance burden?

      You can take eric256's example one step further and make 'tablename' required (meaning you *must* supply a value to the constructor), and set default values for all the other attributes. I also made the grouped attributes "lazy", which means that Moose will not initialize the slot until it absolutely has to.

      package DataTable;
      use Moose;

      has 'tablename' => (is => 'rw', isa => 'Str', required => 1);
      has 'columns'   => (is => 'rw', isa => 'ArrayRef[Str]', default => sub { [] });
      has 'indices'   => (is => 'rw', isa => 'HashRef[Str]', default => sub { +{} });
      has 'datatypes' => (is => 'rw', isa => 'ArrayRef[Str]', default => sub { [] });
      has [qw/ lengths decimals signed allownull default usequote /]
          => (is => 'rw', isa => 'ArrayRef[Int]', lazy => 1, default => sub { [] });

      1;
      This is not much more code for Moose, but adding it to your example, or the OP's example, would start to get really hairy.

      And then to take it even one more step further using Moose::Meta::Attribute::Native you can add some delegated behavior to the 'columns' attribute like so:

      package DataTable;
      use Moose;

      has 'tablename' => (is => 'rw', isa => 'Str', required => 1);
      has 'columns' => (
          traits  => ['Array'],
          is      => 'rw',
          isa     => 'ArrayRef[Str]',
          default => sub { [] },
          handles => {
              'add_column'           => 'push',
              'remove_column_at_idx' => 'delete',
              # ... and many more
          }
      );

      # rest of the class snipped off for brevity ...
      This can then be used like this ...
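      # tablename is required by the class above; 'example' is a placeholder value
      my $dt = DataTable->new( tablename => 'example' );
      $dt->add_column( 'foo' );
      $dt->add_column( 'bar' );
      # columns is now [ foo, bar ]
      $dt->remove_column_at_idx( 0 );
      # columns is now [ bar ]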
      Honestly, would you want to write this into your own accessor generator? Would you want to maintain it? Moose has *thousands* of tests to make sure stuff like this Just Works.

      -stvn
        Honestly, would you want to write this into your own accessor generator?

        Perhaps a more interesting question is why would you want (public) accessors--generated or manual--to every attribute?

        There are two generalised justifications for this:

        1. Public accessors to prevent direct, external, and unchecked access to object attributes.

          It is (correctly) perceived that allowing this creates the danger that the internal state of the object cannot be guaranteed to be cohesively correct. And that by adding accessors, the values set into attributes can be range and type checked as they cross the public/private boundary, thereby maintaining a coherent internal state.

          But this is wrong in 3 ways:

          1. It makes objects little more than glorified structs--expensively.

            The external code is still setting the values of internal attributes, but now it bears the cost of at least one function call--the accessor itself, but more often at least two or three: another to perform generalised range checking; and another to perform type checking.

            But what benefit does the external code get from this internal checking? The answer is none! Because what can those internal checks do if they detect errors? Nothing but fail.

            And by whatever mechanism the internal code raises that failure--return code, exception etc--the external code can only respond to that runtime error by itself failing. It can't turn round and say, "Sorry I miscalculated that value, I'll try again. How's this?". Or, "Whoops, I meant to pass you this string not that float!".

            The only correction is a compile(edit) time correction to the code to ensure that it doesn't attempt to pass in invalid values. I.e. a correction to either the algorithm, or better range and type checking of its inputs.

            And once those corrections are made, and it can no longer pass invalid values to the accessors--their internal checking is duplicate and expensively redundant. And it has no possibility of correcting the other two problems below.

          2. It creates external dependencies upon the internal implementation of the objects.

            By exposing the internal attributes--even though indirectly--the external code is still dependent upon the internal implementation. The indirection through the accessor would allow the internal implementation to accommodate some simple changes--the renaming of the actual attribute; a change of its storage representation. But beyond the disconnect between the public interface and private implementation that it creates, it hasn't removed the dependency for any substantial change to the internal implementation.

            If, for example, the acceptable range of that attribute changes, modifying the internal checks simply pushes the problem back to the calling code, and that will again require modification in order to correct the situation. And once again, once those changes have been made, the internal ones are expensively redundant.

          3. It doesn't fix the internal coherency problem.

            Don't take either of the above to mean that I'm suggesting that direct access to internal attributes is acceptable! The underlying problem is classes should not expose the ability to modify internal attributes individually! Directly or indirectly.

            It is very rare that the internal state of an object can be comprehensively checked for internal coherency, by the checking of the value of a single attribute. For the grouping of attributes into an object to make any sense, there have to be dependencies between those attributes. Otherwise the grouping of those attributes into an object is fallacious. It is making connections between unrelated attributes. It's bad OO. Pseud-OO!

            Therefore, the only situation in which external code should pass in attribute values is in a constructor, where all non-derivable internal values are passed in a single call so that they can all be checked for type and range individually, and cross-checked for coherency against each other. And any derivable attributes can be coherently set at the same time.

            As a simple (and bad but common), example, it doesn't make sense to set the X value of a point without also setting the Y value at the same time. A point with an X value set and an unset Y value isn't a coherent point. And setting the Y value to some default value makes even less sense. It only makes sense to set both the X & Y values together.

            In previous discussions I've had the argument put that changing the X value of a point individually makes sense in the case of translation. The counter argument is that you should not translate a point through an X (or Y) value accessor, but through a translateX() method.

            There is also an argument that says that points don't move! So any translate method should create a new point by deriving it from the position of an existing point and one or more deltas. But that's a whole different discussion.

        2. Private accessors.

          This justification suggests that future maintenance can be simplified, by insulating the internal implementation from itself, through the use of private accessors. But all the arguments above still apply. All you've done is move the external code internally. You've changed the ownership of the code; nothing else.

          The methods calling these accessors still need to verify and cross-check their parameters for coherency. And they still need to implement correct algorithms. And once both of these necessary goals have been achieved, any further re-checking of derived attribute values (and all internal attribute values should always be derived!), is redundant.

          If the parameters are verified correct; and the algorithms to derive attribute values from those parameters are correct; then there is no possibility of those derived values being invalid. Further checking is redundant and unnecessarily costly.
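The point example above can be sketched in plain Perl. This is a minimal illustration under assumptions, not code from the thread: the class name Point, the coords() reader and the translate() method (a two-delta variant of the translateX() idea, returning a new point) are all hypothetical names chosen for the example.

```perl
use strict;
use warnings;

package Point;
use Carp qw(croak);

# Both coordinates must arrive together in one call; there are no setters.
sub new {
    my ( $class, %args ) = @_;
    for my $key (qw(x y)) {
        croak "Point->new requires '$key'" unless defined $args{$key};
    }
    return bless { 'x' => $args{'x'}, 'y' => $args{'y'} }, $class;
}

# Read-only view of the state, returned as a list (x, y).
sub coords {
    my $self = shift;
    return ( $self->{'x'}, $self->{'y'} );
}

# Translation derives a new point from this one plus two deltas,
# rather than mutating the existing object.
sub translate {
    my ( $self, $dx, $dy ) = @_;
    return Point->new(
        'x' => $self->{'x'} + $dx,
        'y' => $self->{'y'} + $dy,
    );
}

package main;

my $p = Point->new( x => 1, y => 2 );
my $q = $p->translate( 3, 0 );
printf "p = (%d, %d), q = (%d, %d)\n", $p->coords, $q->coords;
```

A half-specified point such as Point->new( x => 1 ) dies in the constructor, so an incoherent object never exists.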

        There is merit in having private accessors that perform type and range checking during the internal development of a class, but they should be of the form of simple assertions, because no runtime recovery is possible.

        Once the internal development is complete and verified, those accessors are a redundant runtime burden. They should be disabled for production use. That is, the verified correct external methods should be able to now access the internal attributes directly avoiding the costs of the accessor indirection and redundant re-checking. Of course, this optimisation should be automated by the compiler via compile-time option--not manually by the programmer in the source code.
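In Perl 5, something close to this compile-time switch can be approximated with a constant, because perl folds constant conditions at compile time and can drop the guarded check. The sketch below is an illustration only: the Counter class, its methods and the COUNTER_DEBUG environment variable are hypothetical names, and it removes the check but not the method call itself.

```perl
use strict;
use warnings;

package Counter;
use Carp qw(confess);

# Hypothetical switch: run with COUNTER_DEBUG=1 to enable the assertions.
# DEBUG is fixed at compile time, so when it is false the
# "... if DEBUG && ..." condition is a constant and can be folded away.
use constant DEBUG => $ENV{COUNTER_DEBUG} ? 1 : 0;

sub new {
    my $class = shift;
    return bless { count => 0 }, $class;
}

sub count { return $_[0]{count} }

# Private setter: the check is an assertion for development,
# not error handling, since no runtime recovery is possible.
sub _set_count {
    my ( $self, $n ) = @_;
    confess "count must be a non-negative integer"
        if DEBUG && $n !~ /\A\d+\z/;
    $self->{count} = $n;
    return;
}

# Public method: derives the new state from the current state.
sub increment {
    my $self = shift;
    $self->_set_count( $self->{count} + 1 );
    return $self->{count};
}

package main;

my $c = Counter->new;
$c->increment for 1 .. 3;
print "count = ", $c->count, "\n";
```

Whether that is enough depends on how hot the path is; inlining the private accessor away entirely would need something beyond this sketch.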

        Object methods should perform actions that affect the internal state of the object on behalf of the programmer by deriving new values of that state by applying algorithms to its current state and its input parameters. There should be no possibility that externally supplied parameters, (checked or not), directly become internal attribute values without derivation! Checking is not derivation.

        The only exception to this are constructors. But these should require that all non-derivable attributes be passed in a single call so that a complete, correct and coherent object can be constructed without resorting to arbitrary defaults.

        Or require that no attributes be passed. In which case a coherent and correct--but identity--object should be returned.
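A constructor along these lines might look like the following sketch. The Range class, its min/max attributes and the choice of an empty range at zero as the "identity" object are all assumptions made for illustration; the point is only that every non-derivable value arrives in one call and is cross-checked there.

```perl
use strict;
use warnings;

package Range;
use Carp qw(croak);

sub new {
    my ( $class, %args ) = @_;

    # No arguments at all: return the (assumed) identity object.
    return bless { min => 0, max => 0 }, $class unless %args;

    # Otherwise every non-derivable attribute is required and checked.
    for my $key (qw(min max)) {
        croak "Range->new requires '$key'" unless defined $args{$key};
        croak "'$key' must be a number"
            unless $args{$key} =~ /\A-?\d+(?:\.\d+)?\z/;
    }

    # Cross-check: only possible because both values arrive together.
    croak "min must not exceed max" if $args{min} > $args{max};

    return bless { min => $args{min}, max => $args{max} }, $class;
}

# Derived value: computed from the state, never set from outside.
sub width { return $_[0]{max} - $_[0]{min} }

package main;

my $r = Range->new( min => 2, max => 10 );
print "width = ", $r->width, "\n";

my $bad = eval { Range->new( min => 5, max => 1 ) };
print "rejected: $@" unless $bad;
```

With individual setters for min and max, the min <= max check would have to tolerate a temporarily incoherent object between the two calls.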

        Type and range checking should only be necessary (and is only beneficial) at the public boundaries of a class. Other than during (internal) development.

        And there in a nutshell is my problem with Moose. It does little to address the fundamental problem with most Perl OO. That it is pseud-OO. (Also true of most OO code in other languages in my experience.)

        Badly defined and poorly structured. That it is variously glorified structs; or loose groupings of often mostly-unrelated attributes; strung together with badly defined interfaces that mix dependency-creating, too-low-level methods and architecture-dictating, too-high-level methods. And often all of these in the same class.

        Most OO code is badly defined, badly architected and badly implemented. What Moose does is allow its users to create their pseud-OO very easily and quickly and reliably :) That isn't Moose's fault per se. Guns may not kill people, but they are certainly an enabler.

        Good OO is hard. And learning (or teaching) the difference between good OO and pseud-OO is even harder. There is little that Moose could do to address that. An object system that could enforce good OO is probably as intractable a problem as natural language processing.

        What Moose might be able to do are some simple things. Like forcing (or at least defaulting) accessors to private, and allowing a compile-time switch to convert accesses from indirect to direct for performance-critical/production code.

        I'm also dubious of the benefits of (and in most cases outside of initial testing, the need for), deep introspection; but that's a discussion we've had before. And given that introspection is pretty much the core foundation of Moose (MOP), I don't expect to make much headway in that argument with you :) However, there is maybe some merit in my suggesting that Moose might also allow one or more compile-time switches to disable the creation/costs of the methods and data required to provide for deep introspection, for those classes that do not use it?


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        Moose has *thousands* of tests to make sure stuff like this Just Works.

        Does it work with Apache::Reload (yet/again)? I'm just asking out of (sincere) curiosity ...
