Often Overlooked OO Programming Guidelines
by Ovid (Cardinal) on Dec 29, 2003 at 19:42 UTC
Often Overlooked Object Oriented Programming Guidelines
The following is not about how to write OO code in Perl. There are plenty of nodes covering that topic. Instead, this is a general list of tips that I like to keep in mind when I'm writing OO code. It's not exhaustive, but it does cover a number of areas that I see many people (including myself) get wrong or overlook.
This constructor is not unusual, but it suggests a useless use of OO. A good example of this is Acme::Playmate (er, maybe not the best example). The module consists of a constructor. That's it. And here's the documented usage:
Regardless of whether or not you feel this is a useful module, there's nothing OO about it. In fact, with the exception of methods this module inherits from UNIVERSAL::, it has no methods other than the constructor. All it does is return a data structure that just happens to be blessed (the jokes are obvious; we don't need to go there).
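As a rough illustration of the kind of constructor-only class under discussion, here's a hypothetical sketch (the `Contact::Record` class and `contact_record` function are invented for this example and are not Acme::Playmate's actual code). Note how the caller ends up treating the "object" as a plain hash, which a simple function could have returned just as well:

```perl
use strict;
use warnings;

# Hypothetical class whose only method is its constructor.
{
    package Contact::Record;

    sub new {
        my ($class, %args) = @_;
        my $self = {
            name  => $args{name},
            email => $args{email},
        };
        return bless $self, $class;
    }
}

# With no methods to call, callers poke at the hash directly.
my $contact = Contact::Record->new(
    name  => 'Alice',
    email => 'alice@example.com',
);
print "$contact->{name} <$contact->{email}>\n";

# A plain function returning a hash reference does the same job
# without pretending to be OO.
sub contact_record {
    my %args = @_;
    return { name => $args{name}, email => $args{email} };
}

my $plain = contact_record(name => 'Alice', email => 'alice@example.com');
print "$plain->{name} <$plain->{email}>\n";
```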
Of course, this is merely an Acme:: module, so discussing how well a joke conforms to good programming practices is probably not warranted, but read through Damian Conway's 10 Rules for When to Use OO to get a good feel for when OO is appropriate.
On the surface, this might appear to simply be an interface that will be used as a base class for a set of classes. However, sometimes people get confused and simply override those methods to return data:
There's really no reason for that. Make it an instance:
Thus, if you need to change how things work internally, you're doing that on only one class rather than hunting through a bunch of useless subclasses.
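To make the contrast concrete, here's a minimal sketch under assumed names (`Employee::Base`, `Employee::Programmer`, and `Employee` are all hypothetical). The first half shows a subclass that exists only to return constant data; the second shows one class whose instances carry that data:

```perl
use strict;
use warnings;

# Anti-pattern: one subclass per set of constant data.
{
    package Employee::Base;
    sub new    { my $class = shift; return bless {}, $class }
    sub title  { die "subclass must override title" }
    sub salary { die "subclass must override salary" }
}
{
    package Employee::Programmer;
    our @ISA = ('Employee::Base');
    sub title  { 'Programmer' }
    sub salary { 50_000 }
}

# Alternative: one class, with the data supplied per instance.
{
    package Employee;
    sub new {
        my ($class, %args) = @_;
        return bless { title => $args{title}, salary => $args{salary} }, $class;
    }
    sub title  { $_[0]{title} }
    sub salary { $_[0]{salary} }
}

my $subclassed = Employee::Programmer->new;
my $instance   = Employee->new(title => 'Programmer', salary => 50_000);
printf "%s earns %d\n", $instance->title, $instance->salary;
```

Both objects answer the same questions, but adding a new kind of employee in the second version means passing different arguments rather than writing another package.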
That seems all fine and dandy. Now, imagine that you have that in 20 places in your code, but in the manager class, someone changes name to full_name. Because the code using the office object was forced to walk through the object hierarchy to get at the data it actually needs, you've created fragile code. Now the manager class must support a name method to be backwards compatible (and we get to start on our big ball of mud), or every reference to it must be changed -- and we've created far too many of those.
The solution is to do this:
Now, instead of hunting down all of the places where this was accessed, we've limited this call to one spot and made maintenance much easier. This can, however, lead to code bloat. Make sure you understand the tradeoffs involved.
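A minimal sketch of the two styles, assuming hypothetical `Office` and `Manager` classes with a `manager_name` delegating method (none of these names come from a real module):

```perl
use strict;
use warnings;

{
    package Manager;
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub name { $_[0]{name} }
}
{
    package Office;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub manager { $_[0]{manager} }

    # The only place that knows how a manager exposes its name.
    sub manager_name {
        my $self = shift;
        return $self->manager->name;
    }
}

my $office = Office->new(manager => Manager->new(name => 'Pat'));

# Fragile: the caller walks the object hierarchy itself.
my $fragile = $office->manager->name;

# Preferred: ask the office directly.
my $robust = $office->manager_name;
print "$robust\n";
```

If `Manager::name` is later renamed, only `Office::manager_name` needs to change. The cost is one more small method on `Office` for each piece of delegated data, which is the bloat tradeoff mentioned above.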
In this case, let's assume there is a Tender::Cash superclass and subclasses along the lines of Tender::CreditCard and Tender::LetsHopeThisDoesntBounce. The credit card and check classes can be used exactly as if they were cash. Their apply() methods are probably different internally, but every method that's available for cash should be available for the subclasses, and the data returned should be identical in form. (This might be a bad example, as a generic Tender interface may be more appropriate.)
Another example is HTML::TokeParser::Simple. This is a drop-in replacement for HTML::TokeParser. You don't need to change the actual code, but you can then use all of the extra nifty features built in.
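The Tender hierarchy described above might be sketched roughly as follows. The `amount` accessor, the signature of `apply()`, and the `authorized` flag are assumptions made for illustration; only the class names and the existence of `apply()` come from the text:

```perl
use strict;
use warnings;

{
    package Tender::Cash;
    sub new {
        my ($class, %args) = @_;
        return bless { amount => $args{amount} }, $class;
    }
    sub amount { $_[0]{amount} }

    # Returns the balance still owed after this tender is applied.
    sub apply {
        my ($self, $balance) = @_;
        return $balance - $self->amount;
    }
}
{
    package Tender::CreditCard;
    our @ISA = ('Tender::Cash');

    # Different internals (a hypothetical authorization step), but the
    # same arguments and the same kind of return value as the parent.
    sub apply {
        my ($self, $balance) = @_;
        $self->{authorized} = 1;    # stand-in for a real authorization
        return $self->SUPER::apply($balance);
    }
}

# Calling code never needs to ask which kind of tender it has.
my $balance = 100;
for my $tender (Tender::Cash->new(amount => 30),
                Tender::CreditCard->new(amount => 70)) {
    $balance = $tender->apply($balance);
}
print "Remaining balance: $balance\n";
```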
This is a common idiom, but it's an example of an anti-pattern. What happens when you want to change that to an array ref? What happens when you want to use inside-out objects? What happens when you want to validate an assignment to this value?
All of these issues and more crop up when you let people reach into the object. One of the major points of OO programming is to allow proper encapsulation of what's going on inside of the object. As soon as you let your defensive programming guard down, you're going to get bug reports. Use proper methods to handle this:
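One possible shape for such a method, as a sketch only (the `Customer` class and its validation rule are hypothetical): a combined accessor/mutator that is the single point of contact for the value, so storage can change and validation can be added without touching callers.

```perl
use strict;
use warnings;

{
    package Customer;
    sub new { my $class = shift; return bless { name => undef }, $class }

    # Combined accessor/mutator: callers never touch $self->{name}.
    sub name {
        my $self = shift;
        if (@_) {
            my $name = shift;
            die "name must be a non-empty string\n"
                unless defined $name && length $name;
            $self->{name} = $name;
        }
        return $self->{name};
    }
}

my $customer = Customer->new;
$customer->name('Ovid');
print $customer->name, "\n";

# Invalid data is rejected at the point of assignment.
my $ok = eval { $customer->name(''); 1 };
print "rejected: $@" unless $ok;
```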
Whoops! Now we have a problem. Not only does every place in the code that might want to log errors have to first check if those errors exist, your log_errors method might erroneously assume that this has been checked. Check the state inside of the method.
Better yet, there's a good chance that you're not concerned about the error log at runtime, so you could simply specify an error log in your constructor (or have the class use a default log), and let the module handle all of that internally.
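Here's a sketch of a `log_errors` method that checks its own state, assuming a hypothetical `Importer` class that keeps pending errors in an array and writes to an in-memory log (a real class would more likely write to a file or a logging module):

```perl
use strict;
use warnings;

{
    package Importer;
    sub new {
        my $class = shift;
        return bless { errors => [], logged => [] }, $class;
    }
    sub add_error { my ($self, $msg) = @_; push @{ $self->{errors} }, $msg }
    sub logged    { $_[0]{logged} }

    # Safe to call at any time: it checks its own state and returns
    # the number of errors it logged.
    sub log_errors {
        my $self = shift;
        return 0 unless @{ $self->{errors} };
        my $count = @{ $self->{errors} };
        push @{ $self->{logged} }, @{ $self->{errors} };
        $self->{errors} = [];
        return $count;
    }
}

my $importer = Importer->new;
$importer->log_errors;                 # nothing pending, nothing happens
$importer->add_error('bad row 3');
$importer->log_errors;
print "$_\n" for @{ $importer->logged };
```

Callers no longer need an `if` around every call to `log_errors`, and the method can't be tripped up by a caller who forgot the check.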
In the above example, there's an error that should be noted, but since a cached copy of data is acceptable, there's no need for the program to deal with this directly. The object notes the problem internally, adopts a fallback remedy and everything is peachy.
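The situation described here (a failed fetch where a cached copy is good enough) could be sketched like this. The `PriceFeed` class, its `fetch` code reference, and its `errors` method are all hypothetical names chosen for the example:

```perl
use strict;
use warnings;

{
    package PriceFeed;
    sub new {
        my ($class, %args) = @_;
        return bless {
            fetch  => $args{fetch},     # code ref that may die
            cache  => $args{cache},     # last known good data
            errors => [],
        }, $class;
    }
    sub errors { @{ $_[0]{errors} } }

    sub prices {
        my $self  = shift;
        my $fresh = eval { $self->{fetch}->() };
        if (defined $fresh) {
            $self->{cache} = $fresh;
            return $fresh;
        }
        # Note the problem internally and fall back to the cache.
        push @{ $self->{errors} }, "fetch failed: " . ($@ || "no data\n");
        return $self->{cache};
    }
}

my $feed = PriceFeed->new(
    fetch => sub { die "connection refused\n" },
    cache => { widget => 9.99 },
);
my $prices = $feed->prices;     # no exception reaches the caller
print "widget: $prices->{widget}\n";
print "noted: $_" for $feed->errors;
```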
Assuming that a corresponding mutator exists, accessors should return a data structure that the mutators will accept. The following must always work:
Failure to do this will cause no end of grief for programmers who assume that the object accepts the data structures that it emits.
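A small sketch of the round trip, using a hypothetical `Playlist` class with a `set_tracks` mutator and a `tracks` accessor (names invented for illustration). Because the mutator takes an array reference, the accessor returns one as well:

```perl
use strict;
use warnings;

{
    package Playlist;
    sub new { my $class = shift; return bless { tracks => [] }, $class }

    # The mutator accepts an array reference...
    sub set_tracks {
        my ($self, $tracks) = @_;
        die "tracks must be an array reference\n"
            unless ref($tracks) eq 'ARRAY';
        $self->{tracks} = [@$tracks];    # copy, so callers can't alias internals
        return $self;
    }

    # ...so the accessor returns an array reference too, not a list.
    sub tracks { [ @{ $_[0]{tracks} } ] }
}

my $playlist = Playlist->new;
$playlist->set_tracks([ 'Intro', 'Outro' ]);

# The round trip must always work.
$playlist->set_tracks( $playlist->tracks );
print join(', ', @{ $playlist->tracks }), "\n";
```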
Here's the YAML dump of a hypothetical product. Remember that, amongst other things, YAML is supposed to be human-readable.
Now here's hypothetical as_string() output that might be used in debugging (though you might want to tailor the method for public display).
That's easier to read and, by doing lookups on the category and bin ids, you can present output that's easier to understand.
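An `as_string()` method along these lines might look like the following sketch. The `Product` fields and the `%CATEGORY` and `%BIN` lookup tables are hypothetical stand-ins; in a real system the lookups might be database queries:

```perl
use strict;
use warnings;

# Hypothetical lookup tables mapping ids to readable names.
my %CATEGORY = (7  => 'Kitchen');
my %BIN      = (42 => 'Aisle 3, Shelf B');

{
    package Product;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }

    # Human-oriented summary: ids are resolved to names on the way out.
    sub as_string {
        my $self = shift;
        return sprintf "%s (#%d)\n  Category: %s\n  Bin:      %s\n  Price:    \$%.2f\n",
            $self->{name}, $self->{id},
            $CATEGORY{ $self->{category_id} } || 'unknown',
            $BIN{ $self->{bin_id} }           || 'unknown',
            $self->{price};
    }
}

my $product = Product->new(
    id => 1001, name => 'Whisk', category_id => 7, bin_id => 42, price => 4.5,
);
print $product->as_string;
```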
One of the strongest objections to OO Perl is the idiomatic object constructor:
Which can then be followed with:
And the tests:
Because blessing a hash reference is the most common method of creating objects in Perl, we lose many of the benefits of strict. However, a proper test suite will catch issues like this and ensure that they don't recur. On a personal note, I've noticed that since I've begun testing, I sometimes forget to use strict, but my code has not been suffering for it. In fact, sometimes it's better, because I frequently write code for which strict would be a hassle. That's another example of where the rules get broken, but they're broken because the programmer knows when to break them.
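To show the sort of problem meant here, the sketch below uses a hypothetical `Person` class with a deliberately misspelled hash key. It compiles cleanly under strict, since hash keys are just strings, and only a test that checks the returned value reveals the bug. Test::More is a core module; the class and method names are invented for the example:

```perl
use strict;
use warnings;
use Test::More;

{
    package Person;
    sub new {
        my ($class, %args) = @_;
        return bless { name => $args{name} }, $class;
    }

    # Typo in the hash key: strict has no objection.
    sub buggy_name { $_[0]{naem} }

    sub name { $_[0]{name} }
}

my $person = Person->new(name => 'Ovid');

is $person->name, 'Ovid', 'name() returns the value given to new()';

# A test expecting 'Ovid' from buggy_name() would fail, which is
# exactly how the typo gets caught. Here we just confirm the symptom.
is $person->buggy_name, undef, 'the misspelled key silently yields undef';

done_testing;
```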
Yet another fascinating thing about tests is the freedom they give you. If you have a comprehensive test suite, you can start taking liberties with your code in a way that you haven't before. Are you having performance problems because you're using an accessor in the bottom of a nested loop? If the object is a blessed hashref, you might get quite a performance boost by just "reaching inside" and grabbing the data you need directly. While many will tell you this is a no-no, the reason they mention this is for maintainability. However, a good test suite will protect you against many of the maintainability problems you may face (though it still won't make fixing your encapsulation violations any easier once you are bitten).
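As a sketch of that tradeoff (the `Reading` class is hypothetical, and how much time the shortcut saves depends on your workload, so benchmark before committing to it), here are both styles side by side with a check that keeps them honest:

```perl
use strict;
use warnings;

{
    package Reading;
    sub new   { my ($class, %args) = @_; return bless { value => $args{value} }, $class }
    sub value { $_[0]{value} }
}

my @readings = map { Reading->new(value => $_) } 1 .. 1000;

# Through the accessor: one method call per element.
my $sum_accessor = 0;
$sum_accessor += $_->value for @readings;

# Reaching inside: skips the method call, but is tied to the hash layout.
my $sum_direct = 0;
$sum_direct += $_->{value} for @readings;

# A check like this flags the shortcut if the internals ever change.
die "direct access out of sync with accessor\n"
    unless $sum_direct == $sum_accessor;
print "sum: $sum_direct\n";
```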
That last paragraph might sound a bit curious. Is Ovid really telling people it's OK to violate encapsulation, particularly after he pointed out the evils of it?
Yes, I am saying that. I'm not recommending that, but one thing that often gets lost in the shuffle when "paradigm" flame wars begin is that programming is a series of compromises. Rare indeed is the programmer who has claimed that she's never compromised the integrity of her code for performance, cost, or deadline pressures. We want to have a perfect system that people will "ooh" and "aah" over, but when you see the boss coming down the hall with a worried look, you realize that the latest nasty hack is going to make its way into production. Tests, therefore, are your friend. Tests will tell you if the nasty little hack works. Tests will tell you when the nasty little hack breaks.
Test, damn you!
Many Perl programmers, including myself, learned Perl's OO syntax without knowing much about object-oriented programming. It's worth picking up a book or two and reading about OO theory to pick up some of the tricks that, upon reflection, seem so obvious. Let the object do the work for you. Hide its internals carefully and don't force the programmer to worry about the object's state. All of the guidelines above can be broken, but knowing about them and why you want to follow them will tell you when it's OK to break them.
Update: I really should have called this "Often Overlooked Object Oriented Observations". Then we could refer to this node as "'O'x5".