PerlMonks  

The fallacy of the *requirement* for read-only instance variables.

by BrowserUk (Pope)
on Apr 17, 2011 at 12:43 UTC (#899808, perlmeditation)

For a (public) read-only instance variable (ROIA) to have a value, it must be assigned one. This may happen in one of two ways.

  • It is initialised at instantiation time with a value passed in by the caller.

    In this case, whenever the caller might read the ROIA (back) from the object instance, it could just as easily, and more efficiently, obtain that same value from the same source as it got it from when it instantiated the object in the first place.

    Now, some will argue: "But what if the caller passes the object to some other piece of code that doesn't have access to the value used during initialisation?". And the answer is that if the caller can pass the object to that other code, it can equally well pass the value to that other code directly rather than via the object.

    And then: "What if the caller has many values that it wants to pass to the other code? Isn't encapsulating them into a data-only object and passing its handle to that other code better than passing a bunch of discrete variables?". And the answer is, how is that better than putting the variables into a simple data structure like a hash or array and passing a reference to that to the other code?

  • It is initialised with a value derived from the values passed in by the caller.

    In this case, the ROIA is calculated or otherwise derived from the other initialisation parameters, and then stored so that it needn't be recalculated every time the caller reads it. In effect, the ROIA is acting as a cache.

    The problem here is that it assumes that the caller will call twice or more. But why would he? Why wouldn't he retrieve the value once and store it locally if he will need to reference it multiple times? The sole purpose of caching the value internally is to save the recalculation expense. But if the caller stores it locally, he also avoids the method call expense.

    But more to the point, the initial calculation is only done if the caller actually needs the value. And if the caller takes the responsibility for caching the value, if he needs to, then the cache space is only allocated if it is needed also.

There is one legitimate case when a ROIA makes sense. For that to be the case, the ROIA must have several properties:

  1. The value stored in the ROIA must be expensive to calculate.
  2. It must be required multiple times.
  3. It must be required both internally to the instance and externally to it.
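
A minimal sketch of a value that meets all three properties, written in Python purely for illustration. The `Document` class, its `digest` attribute and the iterated hash standing in for an "expensive" calculation are all hypothetical, not taken from any real module.

```python
import hashlib


class Document:
    """Hypothetical class: `digest` is costly, reused internally and read externally."""

    def __init__(self, text: str) -> None:
        self._text = text
        # Stand-in for an expensive calculation: many rounds of hashing, done once.
        self._digest = hashlib.pbkdf2_hmac(
            "sha256", text.encode(), b"salt", 50_000
        ).hex()

    @property
    def digest(self) -> str:
        # Read-only: no setter is defined, so assignment raises AttributeError.
        return self._digest

    def same_content(self, other: "Document") -> bool:
        # Internal use of the stored value, with no recalculation.
        return self._digest == other._digest


doc = Document("hello")
copy = Document("hello")
print(doc.same_content(copy), doc.digest == copy.digest)
```

Here the stored value is used by the class itself in `same_content` and is also readable by the caller, which is the combination the list describes.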

This combination of circumstances is far, far rarer than the prominence ROIAs are given in texts, documentation and existing codebases would suggest. And in many cases, maybe even most cases of existing usage, that combination of properties is a strong indication of bad OO.

It indicates either that:

  • The class is deficient in that it forces the caller to perform algorithms that should be provided by the class as a method.
  • Or the value has been wrongly encapsulated into a catch-all object. (Sometimes referred to as a God-object.)
  • Or the class should not be a class, but a simple data structure.
  • Or it is a (truly) premature optimisation based upon "what-if-itis": second-guessing what the class user might require.

But in most cases, it is simply the class author thinking: the user has passed me this value, so I'll stick it in an instance variable just in case he wants it back at some point. Forgetting that if the caller gave it to you in the first place, if he needs it again, he can re-access the same place he got it from when he passed it to you.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re: The fallacy of the *requirement* for read-only instance variables.
by moritz (Cardinal) on Apr 17, 2011 at 13:54 UTC

    When I use an object, I usually don't care if a method which receives no argument and returns a value is an attribute accessor or not. Neither should I -- that's the beauty of OO, as a user of a class I don't have to care.

    Consider Date::Simple, one of my favorite CPAN modules. It has several ways to create an instance of that class: you can pass in a date in YYYY-MM-DD format, pass in year, month and day, create one as the result of an arithmetic operation, or just get it from Date::Simple::today().

    On the other hand, I can retrieve several pieces of information from such an object: day, month, year, day of year, day of week and so on. Some of them might be read-only accessors of instance variables, others probably aren't - but I don't have to care.

    If one of those methods is an accessor, I'm pretty sure it doesn't meet all three of your criteria for being "legitimate", and yet I'm pretty sure it doesn't indicate any of the four fallacies you listed: it doesn't force me as the caller to perform any extra action, a Date is surely not a catch-all object, it should surely be a class (otherwise arithmetic with it would be very inconvenient), and I can't see a case of premature optimization either.

    So why are these methods legitimate, even if you argue that some of them probably aren't, just because they are ROIAs?

    Because they hide implementation details I couldn't care less about. If Date::Simple calculations returned hashes with some data points (say year, month, day), I'd be forced to remember which data is stored in the hash, and which must be calculated from these values (for example day of week). Such a non-uniform interface would put cognitive load on the user - something that should be avoided, even if it means a few extra method calls for obtaining some data.

    These accessors also give you a consistent interface when the internal representation changes, thus decoupling API from implementation - another plus.
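
    A small sketch of that uniform interface, in Python for illustration only. `SimpleDate` is a hypothetical class and does not reproduce the Date::Simple API; the point is that the caller cannot tell which accessor reads a stored value and which one calculates.

```python
import datetime


class SimpleDate:
    """Hypothetical date class, not the Date::Simple API."""

    def __init__(self, year: int, month: int, day: int) -> None:
        self._year, self._month, self._day = year, month, day

    @property
    def year(self) -> int:
        # Plain accessor for a stored value.
        return self._year

    @property
    def day_of_week(self) -> int:
        # Derived on each call; 0 is Monday, following datetime.date.weekday().
        return datetime.date(self._year, self._month, self._day).weekday()


d = SimpleDate(2011, 4, 17)
print(d.year, d.day_of_week)
```

    If the class later stored the day of week, or stopped storing the year, code written against `d.year` and `d.day_of_week` would not need to change.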

    > Now, some will argue: "But what if the caller passes the object to some other piece of code that doesn't have access to the value used during initialisation?". And the answer is, that if the caller can pass the object to that other code, it can equally well pass the value to that other code directly rather than via the object.

    > And then: "What if the caller has many values that it wants to pass to the other code? Isn't encapsulating them into a data-only object and passing its handle to that other code better than passing a bunch of discrete variables?". And the answer is, how is that better than putting the variables into a simple data structure like a hash or array and passing a reference to that to the other code?

    There's also a middle ground: a routine might need to return both some "dumb" data, and an "active" object. If it makes sense for the abstraction in question, it might make sense carrying that "dumb" data in the "active" object, instead of placing the burden on the caller to deal with both of those separately.

    > For a (public) read-only instance variable (ROIA) to have a value, it must be assigned one. This may happen in one of two ways.

    There is a third case: a value might be derived both from values passed by the caller and from "impure" (in the functional programming sense) sources like IO and randomness. In this case it might be impossible to obtain the same answer twice, even if the computation isn't expensive. Thus storing data in an attribute might not be just caching, but required for consistency.
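
    A minimal sketch of this third case, in Python for illustration. The `Session` class and its `token` attribute are hypothetical; the random source is the standard `secrets` module.

```python
import secrets


class Session:
    """Hypothetical class whose token comes from a random source."""

    def __init__(self, user: str) -> None:
        self._user = user
        # Generated once; calling token_hex again would yield a different value.
        self._token = secrets.token_hex(16)

    @property
    def token(self) -> str:
        # Always returns the value fixed at construction time.
        return self._token


s = Session("alice")
print(s.token == s.token)
```

    The calculation is cheap, yet the value cannot be recomputed by the caller or by the object, so it has to be stored somewhere.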

    Finally if the object needs a value for further calculations, it can just as well make it available for the caller - yet modifying it might invalidate previously calculated data.

      > Some of them might be read-only accessors of instance variables,

      Why would a date class have read-only instance vars?

      However that date object is initialised, why should I not be able to adjust that date by setting one of the values? Why would I not be able to get next year's birthday from this year's by adding 1 to the year component?

        > Why would a date class have read-only instance vars?

        Because that way you can use a Date like a value type (that is, share one object in different contexts without the risk that some other context might change it).

        Immutable objects also have the nice property of being sharable among threads without need for any locking.
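
        A sketch of a date used as a value type, in Python for illustration. `Birthday` is a hypothetical frozen dataclass; "next year's birthday" is obtained by deriving a new object rather than by modifying the existing one.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Birthday:
    """Hypothetical immutable date; assigning to a field raises FrozenInstanceError."""

    year: int
    month: int
    day: int


this_year = Birthday(2011, 4, 17)
# A new object is derived; the original is untouched.
next_year = replace(this_year, year=this_year.year + 1)
print(this_year, next_year)
```

        Any other code holding a reference to `this_year` still sees the same date, which is what makes sharing it safe.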

Re: The fallacy of the *requirement* for read-only instance variables.
by sundialsvc4 (Monsignor) on Apr 17, 2011 at 23:58 UTC

    Here is one rule of thumb that I advise:   stick to actual requirements.

    The author of a class might be tempted to put things into the class which do not have an actual, present requirement to be there at this time.   It might be tempting to think that the author is doing us a favor.   In reality, probably not.

    “Forever after,” code is going to be written that is dependent upon the API that has been set forth by this object.   That code will, quite necessarily, be dependent upon everything about it ... the methods and their parameter lists; the acceptable ordinal values and/or character strings; the list goes on.   Going forward, that code will probably be impossible to change, or perhaps even to fully catalog.   It is therefore very highly desirable that this API be as small as possible, and also that it should possess no gratuitous features.   “Verily, you will have to support it ... all of it ... forever.   Choose Wisely.™”

    IBM (had | has) a term for things like these interface-changes:   HIPER = “Highly Pervasive Change.”

    Obviously, obviously, the design of every software system consists of educated guesses.   My rule of thumb is simply, that you should make the best possible guesses concerning what you guess you must guess about.   If on the other hand you guess that you don’t have to make a particular decision yet ... defer that decision until you actually do.

    And, yeah, (gasp!) you might be dead-wrong.   (Oooh, I hate it when that happens!   But it happens a lot.)   Use your best judgment about what choices you deem that you must make now, and try to minimize that number of choices.   You are not actually making a real-world object... you are making a simulacrum of one.   You do not need to make a real-world object.   Endeavor to make exactly what you need, and no more.

Re: The fallacy of the *requirement* for read-only instance variables.
by GotToBTru (Chaplain) on Apr 19, 2011 at 21:12 UTC

    Consider a unique object id. It need not be expensive to calculate and it doesn't matter in the slightest how often it will be used. It might not be used by the object itself. But it must be immutable once defined.

    For instance: a "dogs" class; the unique id is a sequential integer. When I instantiate Scruffy, nobody else knows or cares how many dog objects I have created in the past, so the value must come from the class itself. The pet_store class will need the id to complete the sale, the obedience_school class wants to know who will be showing up, the vet class needs to keep track of rabies shots, and the pet_cemetery class will want to know who's in crypt #53. Or, none of these if I'm the breeder, own a chain of disreputable restaurants and just need to keep track of tonight's special.

      Okay, but unless your program is going to run for the life of the dog, and the combined lives of all the dogs 'registered', you are going to have to store that unique id somewhere between runs of this program. And it must be accessible to other programs that might need to deal with it. And that means that on subsequent runs you are going to have to read it back from some persistent storage and then set it.

      Unless you can set it (at least once) it never has a value. And the only way to set it without using a setter, is to initialise it using a constant. Are you going to hard code all your unique ids into (all) your program(s)?


      Okay, I added a comment on the root of this thread after it had gone on a while. Let me clarify a bit.

      The premise of this thread was "under what conditions is it correct to use a read-only instance variable?" You said there were 3 conditions:
      1. The value stored in the ROIA must be expensive to calculate.
      2. It must be required multiple times.
      3. It must be required both internally to the instance; and externally to it.

      I'm suggesting that a unique id would be a counter example. Does it need to be read-only? Absolutely. The unique id is a property of a particular instance of a class that makes it distinguishable from all other instances. Read-only is relative of course. The object itself must be able to modify the value but nothing external should be able to. The value would be set at the time the object is instantiated. Is an instance variable an appropriate place to store this? Absolutely. The value must reside in the instance of the object. Two different instances will have different unique ids. And there is no need to supply a method, the values are static once set. Simply inspect the value.

      I used the example of a "dog" class. It is easy to imagine another example: breed. The breed of dog would be set in the object at the time of creation and won't ever change during the life of the object.

      The details of setting the unique id need not be expensive at all (increment a class variable by 1 for instance), but only the class itself has the knowledge to set it. Breed would probably be passed as a parameter to the constructor but again, not an expensive calculation any way you look at it. Also, the use of a read-only instance variable is dictated by the application - a value that can't be changed and is tightly associated with the instantiated object. I don't see how it possibly matters how many times it is used.
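
      A minimal sketch of what I have in mind, in Python for illustration. The `Dog` class and its attribute names are hypothetical: the id comes from a class-level counter, the breed from the constructor, and neither can be assigned from outside.

```python
import itertools


class Dog:
    """Hypothetical class: the id comes from the class, the breed from the caller."""

    _next_id = itertools.count(1)  # class-level counter shared by all instances

    def __init__(self, name: str, breed: str) -> None:
        self.name = name
        self._breed = breed
        self._id = next(Dog._next_id)  # cheap: increment a class variable

    @property
    def id(self) -> int:
        # No setter is defined, so external assignment raises AttributeError.
        return self._id

    @property
    def breed(self) -> str:
        return self._breed


scruffy = Dog("Scruffy", "terrier")
rex = Dog("Rex", "collie")
print(scruffy.id, rex.id, scruffy.breed)
```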

      I don't see myself how a unique id would be used internally but I haven't worked out the implementation details. I mention this only for completeness. It was the third point and I can't speak to it.

      I hope this clears up what I was saying.

        And my response remains the same.

        Outside of the myopia of a for-arguments-sake-only, do-nothing-and-no-details example, this does not stand up to scrutiny.

        For your example of dogs, which live longer than the program runs, your unique ids have to persist longer than one run of the program, else they serve no purpose. The object handle is already a unique identifier for programmatic purposes.

        And once you persist your objects (with identifiers) to disk, you need to be able to re-constitute them by instantiating instances that will be given their ids as read from disk. But, you also need to be able to instantiate new instances which will get their ID from the class.

        Now you need to be able to set the ID to be either the next increment of the monotonically rising counter, or the value read back from disk.

        Therefore, you need a setter, even if that setter is programmed to work only once per instance (i.e. write-once, not read-only).
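
        A sketch of such a write-once setter, in Python for illustration. The `Dog` class and the `create` and `restore` constructors are hypothetical names; the stored id passed to `restore` stands in for a value read back from disk.

```python
import itertools


class Dog:
    """Hypothetical class with a write-once id."""

    _counter = itertools.count(1)  # source of ids for brand-new instances

    def __init__(self, name: str) -> None:
        self.name = name
        self._id = None

    @property
    def id(self):
        return self._id

    @id.setter
    def id(self, value: int) -> None:
        # Write-once: a second assignment is rejected.
        if self._id is not None:
            raise AttributeError("id is already set")
        self._id = value

    @classmethod
    def create(cls, name: str) -> "Dog":
        dog = cls(name)
        dog.id = next(cls._counter)  # new instance: id from the class
        return dog

    @classmethod
    def restore(cls, name: str, stored_id: int) -> "Dog":
        dog = cls(name)
        dog.id = stored_id  # reconstituted instance: id as read from storage
        return dog


fresh = Dog.create("Scruffy")
loaded = Dog.restore("Rex", 53)
print(fresh.id, loaded.id)
```

        A real implementation would also have to keep the counter ahead of every restored id; that bookkeeping is left out of the sketch.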

Re: The fallacy of the *requirement* for read-only instance variables.
by John M. Dlugosz (Monsignor) on Apr 22, 2011 at 06:42 UTC
    > And the answer is, that if the caller can pass the object to that other code, it can equally well pass the value to that other code directly rather than via the object.

    I don't think so. The plumbing might already be set up.

      > The plumbing might already be set up.

      Obviously you have to compromise with legacy interfaces, no matter how broken.

