design suggestions for object integration wanted

by rvosa (Curate)
on Sep 19, 2006 at 07:08 UTC (#573648=perlquestion)
rvosa has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I am looking to integrate a number of objects that need to notify each other of their state changes. I am looking for design pattern suggestions on how to proceed.
The problem space is that of a branch of biology called 'phylogenetics'. The main objects in this problem space are:
  • Matrices: Matrices are containers that hold biological data.

    The rows in the matrix are biological entities (usually species), the columns are comparative data points ("characters").

    For example, you could have a matrix with three rows - one for Homo sapiens, one for Pan paniscus (the pygmy chimpanzee) and one for Pan troglodytes (the common chimpanzee) - if the matrix contains a single character ("has opposable big toes, yes/no"), the matrix would look like this:
    Homo_sapiens    0
    Pan_paniscus    1
    Pan_troglodytes 1
    In its simplest form, you could implement this as a two-dimensional array, and it is probably instructive (at least for me) to think of it that way, although requirements of type safety mean there'd almost certainly be an abstraction layer that keeps an eye on what goes into the matrix (character matrices can contain dna sequences, binary character states, "continuous characters", i.e. floating point values, and a bunch of more esoteric data types).

    The typical kinds of operations you'd want to do on a matrix object are things like adding and removing rows and columns, renaming rows, annotating columns.

  • Trees: a tree is a graph (directed and acyclic, usually) representing the inferred relationships between the rows in a matrix.

    A tree describing the relationships between the species in the matrix described here might show the chimpanzees as more closely related to one another than either is to humans, based on the distribution of "character states" (the exact inference depends on the assumed direction of evolutionary change, e.g. did humans lose opposable big toes in the course of their evolution, or did chimpanzees gain them?).

    Tree objects usually are recursive data structures of some form.

    Things one might want to do with a tree include performing several calculations on the tree shape, changing the shape, removing branches, renaming entities in the tree.

  • Taxa: the (often incorrectly applied) shorthand term for the intersection between trees and matrices.

    'Homo_sapiens', both in the tree and the matrix, is a "taxon". In its simplest form, you can think of this as a name constituting a unique primary/foreign key in the course of an analysis.

    For example, when we use 'Homo_sapiens' in the matrix, we refer to the same thing as when we use it in a tree.
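The three objects above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the package names My::Taxon and My::Matrix, their methods and the table of valid states are all hypothetical, and none of this is the Bio::Phylo or Bio::NEXUS API. The point it shows is that a matrix row which stores a reference to the taxon object (rather than a copy of its name) picks up a rename for free, while the abstraction layer rejects states of the wrong type.

```perl
use strict;
use warnings;

# Hypothetical classes for illustration; not the Bio::Phylo or Bio::NEXUS API.
package My::Taxon;
sub new      { my ( $class, $name ) = @_; return bless { name => $name }, $class }
sub name     { my ($self) = @_; return $self->{name} }
sub set_name { my ( $self, $name ) = @_; $self->{name} = $name; return $self }

package My::Matrix;

# Allowed character states per data type (assumed and much simplified).
my %VALID = (
    binary     => qr/\A[01?]\z/,
    dna        => qr/\A[ACGTN?-]\z/i,
    continuous => qr/\A-?\d+(?:\.\d+)?\z/,
);

sub new {
    my ( $class, $type ) = @_;
    die "unknown data type\n" unless defined $type && $VALID{$type};
    return bless { type => $type, rows => [] }, $class;
}

# Each row keeps a reference to the taxon object, not a copy of its name.
sub add_row {
    my ( $self, $taxon, @states ) = @_;
    my $pattern = $VALID{ $self->{type} };
    for my $state (@states) {
        die "invalid state: $state\n" unless $state =~ $pattern;
    }
    push @{ $self->{rows} }, { taxon => $taxon, states => [@states] };
    return $self;
}

sub to_string {
    my ($self) = @_;
    my $out = '';
    for my $row ( @{ $self->{rows} } ) {
        $out .= $row->{taxon}->name . " @{ $row->{states} }\n";
    }
    return $out;
}

package main;

my $human    = My::Taxon->new('Homo_sapiens');
my $bonobo   = My::Taxon->new('Pan_paniscus');
my $chimp    = My::Taxon->new('Pan_troglodytes');
my $matrix   = My::Matrix->new('binary');
$matrix->add_row( $human, 0 )->add_row( $bonobo, 1 )->add_row( $chimp, 1 );

my $before = $matrix->to_string;
$bonobo->set_name('Pan_paniscus_fixed');    # rename the taxon once...
my $after = $matrix->to_string;             # ...and the matrix row follows
print $before, $after;
```

Sharing the taxon object covers renames, but not structural changes such as deleting a taxon, which is where some form of notification is still needed.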

These objects interact with one another during the course of a typical phylogenetic analysis, which might consist of:
  1. Collecting data
  2. Shoe-horning it into one or more matrices
  3. Inferring one or more trees from the matrices
  4. Analyzing the fit of the data on the trees. This might include things like removing outliers, fixing typos in taxon names, and other things in desperate need of a mechanism to preserve referential integrity between trees, taxa and matrices.
There are a number of packages on CPAN that deal with this kind of research: Bio::NEXUS, Bio::Phylo (I wrote Bio::Phylo, and I'm working with the authors of Bio::NEXUS) and bioperl (the 500-pound gorilla with which we want to stay compatible).

If you care to look at these packages, you'll notice that there are implementations of the tree, matrix and taxon objects I described, but no larger framework that deals coherently with the relationships between them. One part of the integration I am trying to achieve is a situation where changes to a taxon object cascade through to the matrix and tree objects that refer to this taxon. Likewise, changes in the matrix should be reflected in the tree and (sometimes) vice versa.
The problem I have is that I can't quite conceive of the right architecture to keep referential integrity between the different objects I'm dealing with. I am thinking of something like the Observer pattern, but I'm not sure if that's entirely appropriate (observing and handling go in multiple directions between the objects; I fear spaghetti). I am very eager to hear your suggestions and comments.


Re: design suggestions for object integration wanted
by GrandFather (Sage) on Sep 19, 2006 at 10:13 UTC

    Is it useful to provide a supervisor for each taxon object that acts to pass edit messages of various types to the various matrices and trees that the taxon object is associated with? The supervisor forwards the messages to each of the objects interested in the taxon object (including the taxon object and the message originator), which then take appropriate action.

    DWIM is Perl's answer to Gödel
      Yes, we have been thinking about that. Actually, we envision a larger 'character data and tree(s)' object ("CDAT"). The idea is that the CDAT object has slots for trees, matrices and taxa, that each specify to the CDAT object which other objects (by UID) they're watching and what actions to take, e.g.:
      my $matrix_id = $cdat->add_matrix( $matrix );
      my $tree_id   = $cdat->add_tree( $tree );

      $cdat->add_handler(
          'listener'   => $tree_id,
          'observable' => $matrix_id,
          'handler'    => \&handler,
      );

      sub handler {
          my ( $listener, $observable, $method, @args ) = @_;
          # $method is called on $observable,
          # but supervisor $cdat creates an
          # indirection layer to notify listeners
          print "$method '@args'";    # prints "set_name 'New matrix'"
      }

      $matrix->set_name( 'New matrix' );
      ...where the $tree, $taxon and $matrix objects have a way of notifying the $cdat object to trigger the handlers specified for their respective ids when they change state.

      In the example, the $tree is watching the $matrix, and so when a method call is placed on the matrix, the tree can check what has changed and act accordingly.
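For concreteness, here is one way the supervisor behind those calls might look. It is a sketch only, and every name in it is an assumption: My::CDAT, its add_object and notify methods, and the stand-in class My::Named are invented for illustration, and UIDs are simply memory addresses. An observable reports each state change to the CDAT object, which looks up the handlers registered for that observable's id and calls them with the listener as first argument.

```perl
use strict;
use warnings;

# Hypothetical mediator, modelled loosely on the calls in the example above.
package My::CDAT;
use Scalar::Util qw(refaddr weaken);

sub new {
    my ($class) = @_;
    return bless { objects => {}, handlers => {} }, $class;
}

# Register an object and hand back a UID for it (here: its memory address).
sub add_object {
    my ( $self, $object ) = @_;
    my $id = refaddr($object);
    $self->{objects}{$id} = $object;
    $object->{cdat} = $self;     # lets the object report its changes
    weaken( $object->{cdat} );   # avoid a circular reference
    return $id;
}

sub add_handler {
    my ( $self, %args ) = @_;
    push @{ $self->{handlers}{ $args{observable} } },
        { listener => $args{listener}, handler => $args{handler} };
    return $self;
}

# Called by an observable after it has changed state.
sub notify {
    my ( $self, $observable, $method, @args ) = @_;
    my $handlers = $self->{handlers}{ refaddr($observable) } || [];
    for my $entry (@$handlers) {
        my $listener = $self->{objects}{ $entry->{listener} };
        $entry->{handler}->( $listener, $observable, $method, @args );
    }
    return;
}

package My::Named;    # stand-in for a tree or a matrix

sub new { my ( $class, $name ) = @_; return bless { name => $name }, $class }

sub set_name {
    my ( $self, $name ) = @_;
    $self->{name} = $name;
    $self->{cdat}->notify( $self, 'set_name', $name ) if $self->{cdat};
    return $self;
}

package main;

my $cdat      = My::CDAT->new;
my $matrix    = My::Named->new('Old matrix');
my $tree      = My::Named->new('Tree');
my $matrix_id = $cdat->add_object($matrix);
my $tree_id   = $cdat->add_object($tree);

my @log;
$cdat->add_handler(
    listener   => $tree_id,
    observable => $matrix_id,
    handler    => sub {
        my ( $listener, $observable, $method, @args ) = @_;
        push @log, "$method '@args'";
    },
);

$matrix->set_name('New matrix');
print "$_\n" for @log;
```

Keeping the handler table in the supervisor means trees and matrices never hold references to each other, only ids.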

      I foresee problems with deep recursion when subsequent actions are triggered on the tree, which might in turn be watched by the matrix. How would you deal with that?

        I've not thought it through in detail, but I envisage a forward_message method on your CDAT object that disallows recursive calls with the same message object. Consider:

        sub forward_message {
            my ($self, $message) = @_;

            return undef if exists $self->{activeMessages}{$message->id_hash};
            $self->{activeMessages}{$message->id_hash} = 1;

            # forward messages to all objects. Objects determine applicability

            delete $self->{activeMessages}{$message->id_hash};
            return 1;
        }

        Note that both CDAT and the message are objects and retain state. Messages may trigger the immediate processing of more messages, but reprocessing "identical" messages is prohibited. The id_hash determines what is considered to be "identical".
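A small runnable variation on that guard, to show it stopping a loop. The assumptions are mine: the My::Message and My::CDAT packages are hypothetical, id_hash is taken to be just "action:taxon", and listeners are plain code references. It also uses local on the hash element instead of an explicit delete, so the in-flight flag is cleared even if a listener dies.

```perl
use strict;
use warnings;

# Hypothetical classes; id_hash here is simply "action:taxon" (an assumption).
package My::Message;
sub new     { my ( $class, %args ) = @_; return bless { %args }, $class }
sub id_hash { my ($self) = @_; return join ':', $self->{action}, $self->{taxon} }

package My::CDAT;

sub new {
    my ($class) = @_;
    return bless { activeMessages => {}, listeners => [] }, $class;
}

sub add_listener {
    my ( $self, $code ) = @_;
    push @{ $self->{listeners} }, $code;
    return $self;
}

sub forward_message {
    my ( $self, $message ) = @_;
    my $key = $message->id_hash;
    return if $self->{activeMessages}{$key};    # identical message in flight
    local $self->{activeMessages}{$key} = 1;    # undone on any exit, even die
    $_->( $self, $message ) for @{ $self->{listeners} };
    return 1;
}

package main;

my $cdat = My::CDAT->new;
my @seen;

# Both listeners re-send whatever they receive, which would recurse
# without end if the guard were missing.
for my $who (qw(matrix tree)) {
    $cdat->add_listener(
        sub {
            my ( $hub, $message ) = @_;
            push @seen, "$who got " . $message->id_hash;
            $hub->forward_message($message);    # refused: same id_hash is active
        }
    );
}

my $rename = My::Message->new( action => 'rename', taxon => 'Homo_sapiens' );
my $first  = $cdat->forward_message($rename);
my $second = $cdat->forward_message($rename);   # accepted again once the first is done
print "$_\n" for @seen;
```

Each top-level call reaches every listener exactly once, and the same message can be sent again later because the flag only lives for the duration of the dispatch.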

        The idea of dynamic dispatch appeals to me for this application so I'd likely have a chunk of code in the target objects' message handlers that looks like:

        sub message_handler {
            my ($self, $message) = @_;
            my $handler = 'handler_' . $message->{action};

            return undef if ! $self->can ($handler);
            return $self->$handler ($message);
        }

        sub handler_addTaxonName {
            my ($self, $message) = @_;
            my $newTaxon = $message->{newTaxon};
            ...
            my $balance = $self->create_message ('rebalance', -taxon => $newTaxon, ...);
            $self->{CDAT}->forward_message ($balance);
            return 1;
        }

        sub handler_rebalance {
            my ($self, $message) = @_;
            my $newTaxon = $message->{newTaxon};
            ...
        }

Re: design suggestions for object integration wanted
by exussum0 (Vicar) on Sep 19, 2006 at 16:17 UTC
    The Observer/Listener pattern is perfect for this. The only kind of error it invites, beyond ordinary programming mistakes, is runaway recursion: events that trigger one another can overflow the stack. But you can do that in non-observer code too.
      Could you expand on that? Thanks!
Re: design suggestions for object integration wanted
by BrowserUk (Pope) on Sep 20, 2006 at 12:30 UTC

    Surprisingly, or perhaps not, the problem you describe here and "Challenge: 2D random layout of variable-sized rectangular units" are the same problem--and it's a tough one.

    The hardest part is arranging for the branches of the directed graph to morph whilst maintaining the intersections between it and the matrices without requiring wholesale re-building of the entire graph.

    I think the solution is to add an extra level of indirection. That is, instead of storing a reference to the taxon in the nodes of the graph and matrix, store a reference to a reference to a taxon. This allows either structure to be manipulated without forcing a full-scan reconstruction.
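    A bare-bones sketch of what that indirection could look like in Perl. The plain hashes standing in for the taxon, the tree node and the matrix row are hypothetical, as are all the key names; this is one reading of "a reference to a reference", not a worked-out design.

```perl
use strict;
use warnings;

# Plain hashes stand in for taxon, tree node and matrix row (all hypothetical).
my $taxon = { name => 'Pan_paniscus' };

# One shared "slot" holds the reference to the taxon...
my $slot = \$taxon;

# ...and both structures store the slot, not the taxon itself.
my $tree_node  = { taxon => $slot, children => [] };
my $matrix_row = { taxon => $slot, states   => [1] };

# Swapping the taxon behind the slot is a single assignment; neither the
# tree nor the matrix has to be scanned or rebuilt.
${$slot} = { name => 'Pan_paniscus_corrected' };

my $name_in_tree   = ${ $tree_node->{taxon} }->{name};
my $name_in_matrix = ${ $matrix_row->{taxon} }->{name};
print "$name_in_tree\n$name_in_matrix\n";
```

    The cost is visible in the last lines: every access to a taxon now needs an extra dereference.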

    That extra level of indirection has penalties though. Beyond the performance/complexity impacts, the biggest is that it is really very hard to visualise, which makes coding it very hard indeed--for me anyway.

    I only mention it because the idea of the extra level of indirection might gel in your mind and give you ideas for a solution.

    I've been trying to finish my attempt at my challenge, but life is being particularly interventionist at the moment, and it really requires an uninterrupted, sustained period of coding time to write down the solution that is half-formed in my mind.

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: design suggestions for object integration wanted
by ruoso (Curate) on Sep 20, 2006 at 09:32 UTC

    I would suggest taking a look at the Game of Life. It's a game, I know (and I'm still braindamaged by TheDamian's talk), but look at how it works. Not that you would implement it that way, but it might give you good insights...

    BTW, I couldn't find the modules TheDamian used in that talk... I think the name was DFA::Automata... Does anyone have a pointer to it?

Re: design suggestions for object integration wanted
by tbone1 (Monsignor) on Sep 20, 2006 at 11:56 UTC
    Hm, I am standing at a man's crossroads. Do I bring up shared memory, or do I try to reduce the evil in the world? I inherited something like that solution once, and it worked, but Lord, it was not pretty. (Then again, neither was the problem it was addressing.)

    tbone1, YAPS (Yet Another Perl Schlub)
    And remember, if he succeeds, so what.
    - Chick McGee

Node Type: perlquestion [id://573648]
Approved by Corion
Front-paged by Tanktalus