
Re: Re: Yet Another Perl Object Model (Inside Out Objects)

by demerphq (Chancellor)
on Dec 15, 2002 at 02:27 UTC (#219943)

in reply to Re: Yet Another Perl Object Model (Inside Out Objects)
in thread Yet Another Perl Object Model (Inside Out Objects)

Why do I serialize objects? I suppose because to a certain extent I am a visual person. During the development phase of writing a module I usually dump out data structures and print tables of information. If a data structure is complex, then seeing the output from a dumper can really help to see what's going (wr)on(g).

Also, for more serious uses, persistence is a not uncommon requirement, and serialization can be the simplest way to achieve it. Config files are a simple (and sometimes abused) example, but it is better to look at something like MLDBM.

And in my case, writing, breaking, and fixing a number of different serialization modules (aka dumpers) has been both a hobby and a heavy-duty course in Perl esoterica. Some people do regexes or crazy ties; me, I do dumpers. I hope to have a super duper improved version of Data::BFDump ready after the holidays. It should make dumping inside out objects a _lot_ easier.


--- demerphq
my friends call me, usually because I'm late....

Re: Re: Re: Yet Another Perl Object Model (Inside Out Objects)
by BrowserUk (Pope) on Dec 15, 2002 at 03:52 UTC

    I probably should have waited for a few others to respond to my question before replying, but it struck me that serialising objects from the outside is actually a rather strange requirement, and can only be done in Perl because of the peculiar nature (relatively speaking) of the way OO is implemented in Perl.

    In most OO languages, if serialisation is required, it is done by requesting the object to serialise itself and return that to the caller. Which is what I added to my version of the Quote class like this.

    sub toString {
        my $self = shift;
        sprintf '[%s:' . '%s; ' x 2 . '%s]',
            $self, $phrase{$self}, $author{$self}, $approved{$self};
    }

    and to its subclass QuotePlus like this:

    sub toString {
        my $self = shift;
        sprintf '[%s:' . '%s;' x 1 . '%s]',
            $self, $date{$self}, $self->SUPER::toString();
    }

    Of course, I haven't written the inverse fromString method yet, but I don't actually see the need for it. The toString method would be used mostly for debugging and perhaps for inclusion in error messages. I don't see myself needing to recreate instances of a class from their serialised form.
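For readers who want to run the two toString methods, here is a minimal, self-contained sketch. The constructors, the DESTROY methods, and the sample attribute values are assumptions added for illustration; only the attribute hashes and the toString bodies follow the post.

```perl
use strict;
use warnings;

{
    package Quote;
    # Inside-out storage: one lexical hash per attribute, keyed by the object.
    my (%phrase, %author, %approved);

    sub new {    # hypothetical constructor, not shown in the post
        my ($class, %args) = @_;
        my $anon;
        my $self = bless \$anon, $class;    # the object itself holds no data
        $phrase{$self}   = $args{phrase};
        $author{$self}   = $args{author};
        $approved{$self} = $args{approved};
        return $self;
    }

    sub toString {
        my $self = shift;
        return sprintf '[%s:' . '%s; ' x 2 . '%s]',
            $self, $phrase{$self}, $author{$self}, $approved{$self};
    }

    sub DESTROY {    # remove the attributes when the object goes away
        my $self = shift;
        delete $_->{$self} for (\%phrase, \%author, \%approved);
    }
}

{
    package QuotePlus;
    our @ISA = ('Quote');
    my %date;

    sub new {
        my ($class, %args) = @_;
        my $self = $class->SUPER::new(%args);
        $date{$self} = $args{date};
        return $self;
    }

    sub toString {
        my $self = shift;
        # Emit this class's attribute, then embed the parent's string.
        return sprintf '[%s:' . '%s;' x 1 . '%s]',
            $self, $date{$self}, $self->SUPER::toString();
    }

    sub DESTROY {
        my $self = shift;
        delete $date{$self};
        $self->SUPER::DESTROY();
    }
}

my $quote = QuotePlus->new(
    phrase   => 'hello',
    author   => 'anon',
    approved => 1,
    date     => '2002-12-15',
);
my $string = $quote->toString;
print "$string\n";
```

The printed string nests the Quote representation inside the QuotePlus one, and each level starts with the stringified object reference, so the exact text varies from run to run.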

    I also added what I would see as a class method rather than an instance method, to dump all instance data for a class. For this I did use Data::Dumper like this.

    # QuotePlus (sub)class
    sub _dump {
        warn "_dump should only be called as a class method.\nIe. QuotePlus::_dump()"
            and return if ref +shift;
        Data::Dumper->Dump( [ \%date ], [ 'date' ] ) . Quote::_dump();
    }
    ...
    # Quote class
    ...
    sub _dump {
        warn "_dump should only be called as a class method.\nIe. Quote::_dump()"
            and return if ref +shift;
        Data::Dumper->Dump( [ \%phrase, \%author, \%approved ], [ qw/phrase author approved/ ] );
    }

    Barring this from being called as an instance method makes sense to me as what is returned is class specific rather than instance specific.

    In terms of persistence, I'm torn between whether a class should know how to persist itself, or whether this should be done externally, or by inheritance from a named class.

    It seems attractive at first to see this as a responsibility of the class itself, but then the persistence mechanism becomes fixed and requires modification if you change your database, for example.

    I can see the merit in doing this from outside the class at the application level, in terms of "serialise & save", as this allows different applications to use the same classes with different persistent storage.

    However, I see a problem with this if I have two subclasses of a base class and I want to persist all instances of one. Dumping one of the subclasses (via a class dump method) and saving it is also going to save instances of the base class that were created via the second subclass (if you're using the Inside Out method).

    However, doing it one at a time through an instance method requires me to iterate over every instance I have created. Possibly safer, but painful nonetheless.

    The third method I see, is to have every class inherit (either directly or indirectly) from a Persistance class and it would provide each instance with a Persist method and a virtual class method PersistAll that a class could override to cause it to persist all of its instances. This allows the Persistance class to be swapped out for a new one whenever you change your storage medium, database etc.
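A rough sketch of that third option follows. It assumes a toString method on each class, an in-memory array standing in for the real storage medium, and a weak-reference registry so the class can find its live instances. None of these details come from the post, and the base class name is spelled as the post spells it.

```perl
use strict;
use warnings;

{
    package Persistance;    # hypothetical base class, named as in the post
    my @saved;              # stand-in for a file, database, etc.

    # Instance method: store whatever the object reports about itself.
    sub Persist {
        my $self = shift;
        push @saved, $self->toString;
        return scalar @saved;
    }

    # "Virtual" class method: classes that track their instances override it.
    sub PersistAll {
        my $class = shift;
        die "$class does not implement PersistAll\n";
    }

    sub saved { return @saved }    # helper for inspecting the stand-in store
}

{
    package Quote;
    use Scalar::Util qw(weaken);
    our @ISA = ('Persistance');
    my (%phrase, %instances);

    sub new {
        my ($class, $text) = @_;
        my $anon;
        my $self = bless \$anon, $class;
        $phrase{$self} = $text;
        # A weak reference, so the registry does not keep objects alive.
        weaken($instances{$self} = $self);
        return $self;
    }

    sub toString { my $self = shift; return "Quote($phrase{$self})" }

    # Persist only live instances that belong to the invoking class.
    sub PersistAll {
        my $class = shift;
        my @live = grep { defined $_ && $_->isa($class) } values %instances;
        $_->Persist for @live;
        return scalar @live;
    }

    sub DESTROY {
        my $self = shift;
        delete $phrase{$self};
        delete $instances{$self};
    }
}

my $first  = Quote->new('one');
my $second = Quote->new('two');
my $count  = Quote->PersistAll;
my @saved  = sort { $a cmp $b } Persistance::saved();
print "$count saved: @saved\n";
```

Swapping the storage medium would then mean replacing only the Persistance package, and the isa check in PersistAll is one way to avoid saving instances that were created through a sibling subclass.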

    I note for the readers that I have done very little in the way of OO in Perl, and have never had to write this type of library class in other OO languages, always having just made use of existing classes for this sort of thing. I will say that I have always been disappointed in those I have used in Java, C++ and even in my brief forays into Smalltalk many years ago.

    Examine what is said, not who speaks.

      To answer your original question:

      A question: Why do you serialise objects?

      Three reasons:

      1. Debugging - quick 'n' dirty view of the current object's state.
      2. Persistence
      3. Copying - sometimes a quick freeze/thaw cycle is the simplest way

      As to your other comments, I tend towards having serialisation be the responsibility of the class, since it gives you more flexibility.

      This doesn't mean that you have to write a brittle serialisation method for each class - there is nothing stopping you using Storable within your class's freeze/thaw methods (note: "using" not "inheriting from" - inheritance is overrated :-)

      This gives you flexibility (you can change your serialisation method on a class by class basis if necessary) and simplicity (common functionality sits in the serialisation class).

      The "problem" with inside-out objects is that the "simple" case - dump all of the attributes - becomes hard. You can't just throw $self at Storable::freeze in your class's "freeze" routine.
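One way around this, sketched below, is for the class's own freeze method to gather its attributes into an ordinary hash and hand that to Storable, with a matching thaw class method to rebuild an object. The method names, accessors, and attribute set are illustrative assumptions. Storable::freeze and Storable::thaw are the real functions and are called fully qualified so they do not clash with the methods.

```perl
use strict;
use warnings;
use Storable ();

{
    package Quote;
    my (%phrase, %author);

    sub new {
        my ($class, %args) = @_;
        my $anon;
        my $self = bless \$anon, $class;
        $phrase{$self} = $args{phrase};
        $author{$self} = $args{author};
        return $self;
    }

    sub phrase { return $phrase{ $_[0] } }
    sub author { return $author{ $_[0] } }

    # $self is only an empty scalar ref, so collect the attributes first.
    sub freeze {
        my $self = shift;
        return Storable::freeze(
            { phrase => $phrase{$self}, author => $author{$self} } );
    }

    # Class method: rebuild a new object from the frozen attribute hash.
    sub thaw {
        my ($class, $frozen) = @_;
        my $data = Storable::thaw($frozen);
        return $class->new(%$data);
    }

    sub DESTROY {
        my $self = shift;
        delete $phrase{$self};
        delete $author{$self};
    }
}

my $original = Quote->new(phrase => 'hello', author => 'anon');
my $copy     = Quote->thaw($original->freeze);
print $copy->phrase, ' by ', $copy->author, "\n";
```

The same freeze/thaw pair also covers the copying case from the list above, since the thawed object is a distinct instance with equal attributes.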

      but it struck me that serialising objects from the outside is actually a rather strange requirement, and can only be done in perl because of the peculiar nature (relatively speaking) of the way OO is implemented in perl.

      Well, I don't think it's that weird. After all, dumping core is a form of serialization, and that is hardly an unusual activity on a computer. :-) Also, many debuggers (I'm thinking of the VB and VC debuggers, and I've seen many open source tools that do similar things) have the ability to show allocated structures in a visual way. So this task is not an uncommon one.

      However, that aside, I will grant you that being able to dump data in such a straightforward way is a by-product of both Perl's approach to variables and its approach to OO.

      But I hardly think it's a bad thing. For instance, assuming that a number of the more subtle problems in things like Data::Dumper and friends are either not an issue or resolved, dumping provides a simple, portable way to inspect data structures regardless of their origin. Also, the use of the language itself to represent its data, while admittedly a potential security risk, has many advantages. It takes minimal understanding for the programmer to read; in fact, reading the output of a Dumper may even be enlightening in terms of understanding Perl's syntax. It means that no special tools are required to regenerate the object, and it means the generated file is as platform independent as Perl itself. Furthermore, it can be used to facilitate code generators, which is a common design approach.
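As a small illustration of using the language itself as the data format, the sketch below dumps a nested structure with Data::Dumper and rebuilds it with a string eval. The structure and the variable name are made up for the example, and the eval is only reasonable for text you generated yourself, which is the security risk mentioned above.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sort hash keys so that repeated dumps of equal data are identical.
$Data::Dumper::Sortkeys = 1;

# A hypothetical nested structure to serialise.
my $tree = {
    name     => 'root',
    children => [ { name => 'leaf', children => [] } ],
};

# Name the dumped variable 'copy' so the text assigns to $copy when evaluated.
my $text = Data::Dumper->Dump([ $tree ], [ 'copy' ]);
print $text;

# The dump is plain Perl source, so eval regenerates the structure.
my $copy;
eval $text;
die "could not rebuild structure: $@" if $@;

print "round trip ok\n"
    if Data::Dumper->Dump([ $copy ], [ 'copy' ]) eq $text;
```

No parser beyond perl itself is involved, which is the "no special tools" advantage; the cost is that evaluating a dump runs arbitrary code.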

      Compare all this against objects serializing themselves on request. Let's say I throw together a composite data structure like a treap. Now I've finished my implementation of the data structure and someone asks me to find a way to save the generated structure. In this case I have to write a whole bunch more code to both read it from and write it to disk. If I make an error in my implementation (and as a beginner I just might do that), then things don't work out so well. Sure, there are workarounds in other languages where Perl's approach doesn't work, but I'm guessing they aren't as straightforward to implement and are subject to higher levels of programmer error.

      Ultimately, I think that Dumper is one of Perl's many forms of introspection. And in my experience, the more introspection available, the easier my life as a programmer becomes. Especially when things start getting weird...

      --- demerphq
      my friends call me, usually because I'm late....

        I have no problem with the idea of dumping state as a means of debugging.

        I also have no problem with the idea of dumping structure as a mechanism for learning, reference, or debugging.

        Where it all goes wrong (for me) is the idea of using dumping as a mechanism for saving and restoring state, which is where I would argue that "dumping" transitions to "serialisation". This is doubly bad (IMO) if the form that the serialisation takes is that of generating evalable entities.

        To me, the purposes of serialisation (as opposed to dumping) are twofold: persistence and interchange.

        The problems I see with doing either of these from outside the objects are:

        1. Persistence.

          If an object is anything other than a simple, one-level, non-composite object (composite through either is-a or has-a), then storing a complete object with all its superclass attributes, and those of any object instances it has as attributes, is illogical.

          If I extend an existing class to add some new attributes, when I come to Persist that new class, I don't want to have to modify the existing stored instances nor modify existing applications that use those instances.

          Assuming an RDBMS is used for storage, it is better to create a new table for the new attributes with a foreign key on the existing table, then create a view joining the two tables for use by the new class, than to modify the existing table. If I ask the object to persist itself, it calls its superclass to persist itself. The superclass instance saves itself in its original table and returns its primary key ID. Then the subclass instance saves its attributes, including the returned key(s), in its own table and the job is done.

          If I attempt this from the outside, then I get one almighty morass of mixed-up data and either need to store it as a single entity, which doesn't make sense if there is any chance of two subclass instances sharing a superclass instance, or need to parse the returned monolith to separate out the parts.

          Add to this the idea that two Employee instances may share the same Address, or two Urls the same IP, etc.

        2. Interchange.

          Even interchanging evalable data structures between two Perl applications or sites can be fraught with problems. Whilst Storable can nfreeze to, and thaw from, network byte order, what if the source is 64-bit and the destination 32-bit...?

          Or if one site is dependent upon a downlevel version of Perl that doesn't understand something generated by a newer level.

          As soon as you need to interchange the data between different languages, evalable serialisation makes no sense at all. Better again to have the objects themselves produce a standardized interchange format. The obvious one these days is XML. Each object knows its own structure and can easily produce this. Subclasses just embed the XML returned from their superclasses into their own representation. The receiving application can simply ignore those attributes it is not interested in.
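A hedged sketch of that idea: each class emits XML for its own attributes, and the derived class wraps the XML returned by its parent. The element names, the toXML method name, and the hand-rolled escaping helper are assumptions made for the example. A real application would more likely use an XML module.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text content.
sub xml_escape {
    my $text = shift;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

{
    package Quote;
    my (%phrase, %author);

    sub new {
        my ($class, %args) = @_;
        my $anon;
        my $self = bless \$anon, $class;
        $phrase{$self} = $args{phrase};
        $author{$self} = $args{author};
        return $self;
    }

    # Each class serialises only the attributes it owns.
    sub toXML {
        my $self = shift;
        return sprintf '<quote><phrase>%s</phrase><author>%s</author></quote>',
            main::xml_escape($phrase{$self}), main::xml_escape($author{$self});
    }
}

{
    package QuotePlus;
    our @ISA = ('Quote');
    my %date;

    sub new {
        my ($class, %args) = @_;
        my $self = $class->SUPER::new(%args);
        $date{$self} = $args{date};
        return $self;
    }

    # Emit this class's attribute and embed the parent's XML unchanged.
    sub toXML {
        my $self = shift;
        return sprintf '<quoteplus><date>%s</date>%s</quoteplus>',
            main::xml_escape($date{$self}), $self->SUPER::toXML();
    }
}

my $quote = QuotePlus->new(phrase => 'R&D', author => 'anon', date => '2002-12-15');
my $xml   = $quote->toXML;
print "$xml\n";
```

A consumer that only understands the quote element can ignore the surrounding quoteplus wrapper and its date. DESTROY methods are left out here for brevity, so the attribute hashes are never cleaned up in this sketch.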

        That's why I think that serialisation should be done from the inside, not externally. Once an instance has that ability, dumping can simply use this. I'd rather read an XML representation of a deeply nested structure than the output from Data::Dumper, I think.
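The persistence scheme from point 1 can be sketched in the same inside-out style. Here in-memory hashes stand in for the two database tables, and the table names, the insert_row helper, and the Persist signatures are all hypothetical. The point is only the call chain: the subclass asks its superclass to save itself, gets back a primary key, and stores that key alongside its own attributes.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for two tables: table name => { primary key => row }.
my %table   = (quote => {}, quote_plus => {});
my %next_id = (quote => 1,  quote_plus => 1);

# Insert a row into the named table and return its new primary key.
sub insert_row {
    my ($name, $row) = @_;
    my $id = $next_id{$name}++;
    $table{$name}{$id} = { %$row };
    return $id;
}

{
    package Quote;
    my %phrase;

    sub new {
        my ($class, %args) = @_;
        my $anon;
        my $self = bless \$anon, $class;
        $phrase{$self} = $args{phrase};
        return $self;
    }

    # Save only this class's attributes in its own table.
    sub Persist {
        my $self = shift;
        return main::insert_row(quote => { phrase => $phrase{$self} });
    }
}

{
    package QuotePlus;
    our @ISA = ('Quote');
    my %date;

    sub new {
        my ($class, %args) = @_;
        my $self = $class->SUPER::new(%args);
        $date{$self} = $args{date};
        return $self;
    }

    # Let the superclass save itself first, then store the returned key
    # as a foreign key next to the new attributes.
    sub Persist {
        my $self     = shift;
        my $quote_id = $self->SUPER::Persist();
        return main::insert_row(
            quote_plus => { date => $date{$self}, quote_id => $quote_id } );
    }
}

my $quote   = QuotePlus->new(phrase => 'hello', date => '2002-12-15');
my $plus_id = $quote->Persist;
print "quote_plus row $plus_id refers to quote row ",
    $table{quote_plus}{$plus_id}{quote_id}, "\n";
```

The existing quote table and anything that reads it are untouched by the addition of QuotePlus, which is the property the argument above is after.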

        From the pure dumpability point of view, I actually think that Abigail's version of IO objects is better than the standard hash model, in as much as I can build in a class method that dumps the entire class's instances, and if the class is subclassed two or more times, the bless'd ref keys in the attribute hashes tell me which superclass was involved in the creation of each instance, which I have already found useful as a debugging aid.
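To see the effect described here, the sketch below creates instances through two hypothetical subclasses (QuotePlus and QuoteMinus, with empty bodies) and dumps the base class's attribute hash. Because the keys are the stringified blessed references, each key names the class the instance was blessed into.

```perl
use strict;
use warnings;
use Data::Dumper;

{
    package Quote;
    my %phrase;

    sub new {
        my ($class, $text) = @_;
        my $anon;
        my $self = bless \$anon, $class;
        $phrase{$self} = $text;    # key is the stringified blessed ref
        return $self;
    }

    # Class-level dump of every instance's attribute, sorted by key.
    sub _dump {
        local $Data::Dumper::Sortkeys = 1;
        return Data::Dumper->Dump([ \%phrase ], [ 'phrase' ]);
    }

    sub DESTROY { delete $phrase{ $_[0] } }
}

# Two hypothetical subclasses that add nothing of their own.
{ package QuotePlus;  our @ISA = ('Quote'); }
{ package QuoteMinus; our @ISA = ('Quote'); }

my $plus  = QuotePlus->new('first');
my $minus = QuoteMinus->new('second');

my $dump = Quote::_dump();
print $dump;
```

The keys in the printed hash take the form ClassName=SCALAR(0x...), so a QuotePlus instance and a QuoteMinus instance can be told apart even though both live in the same base-class hash.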

        Again this requires adding to and invoking from the class, but I think that it is a small price to pay in terms of development of reusable code.

        Examine what is said, not who speaks.
