
BerntB's scratchpad

by BerntB (Deacon)
on Sep 09, 2005 at 18:50 UTC (#490705=scratchpad)

I must say Thank you! to the proofreaders, on and off PerlMonks. If this still isn't understandable, the fault is mine. (-: And, to demonstrate the confidence I feel in my pedagogical talents, I'll do the proofreaders the favor of not naming them! :-)


This module's goal is to be like the love child of an XML editor and a spreadsheet. It aims to make it possible to write complex data definitions with code, so calculations are done while editing.

A data definition (like a DTD in XML) defines valid data structures; an editor then reads that definition and is used to enter data.

Instead of using text tags (elements) in the definition, this module uses objects with code. Data in the document is defined as parts-in-parts (i.e. the objects are built up hierarchically), which is the same basic structure as XML, which has elements-in-elements.
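
As a rough illustration of the parts-in-parts idea, here is a minimal sketch of a tree of objects. It is written in Python for illustration and is not the module's API; Part, add and walk, as well as the ship example, are made-up names.

```python
# Illustrative sketch only: Part, add and walk are hypothetical names,
# not the module's actual API.

class Part:
    """A node in the build hierarchy: a part that can contain sub-parts."""

    def __init__(self, name, **variables):
        self.name = name
        self.variables = dict(variables)
        self.parent = None
        self.children = []

    def add(self, child):
        """Attach a sub-part and return it, so it can be extended further."""
        child.parent = self
        self.children.append(child)
        return child

    def walk(self, depth=0):
        """Yield (depth, part) for this part and everything below it."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Parts inside parts, like elements inside elements in XML.
ship = Part("ship")
hull = ship.add(Part("hull", volume=100))
hull.add(Part("engine", volume=30))
hull.add(Part("power_plant", volume=20))

for depth, part in ship.walk():
    print("  " * depth + part.name)
```

The difference from an XML element is that each node is an object, so it can carry code as well as data.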

After writing a specification, it is compiled and loaded as a module with an application framework by e.g. the editor. The editor is a simple CGI program. It is written to be easily extended so it can be specialized for different data definitions.

When should this module be used?

Here are a few examples of when this kind of hierarchical modeling might be applicable.

(a) Since code evaluates whether entered data is well-formed and valid, this could, in some cases, be better than XML. DTDs/Schemas can't e.g. compare names against a database of customers or check total sums of values entered in different parts of the document. After editing, compliant XML can be generated.

(b) To implement the rules for designing a complex item in a game, say a robot or spaceship in a war game. Different parts of the game item depend on other parts, like parameters for hull volume (do they fit together?), power plant (energy needs?), engines, etc. While designing the game item, capability metrics such as cost and acceleration are continuously displayed.

(c) If the game in (b) is later played on a computer, the same model could be used wherever the complex game items are used. (E.g. code to decide how damage is distributed after a hit or what the item could do after a subsystem is damaged.)
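
To make (a) and (b) a bit more concrete, here is a hedged sketch of rules that span several parts of a design, which is the kind of check a DTD cannot express. It is Python for illustration only; the variable names (hull_volume, volume, power_output, power_need) and the check_design function are assumptions, not rules or API from the module.

```python
# Hypothetical sketch: the variables and check_design are assumptions
# for illustration, not rules or API from the module.

def total(part, key):
    """Sum a numeric variable over a part and all of its sub-parts."""
    own = part.get("vars", {}).get(key, 0)
    return own + sum(total(child, key) for child in part.get("parts", []))


def check_design(ship):
    """Return a list of rule violations (empty if the design is valid)."""
    problems = []
    hull = ship["vars"]["hull_volume"]
    used = sum(total(p, "volume") for p in ship["parts"])
    if used > hull:
        problems.append(f"parts need volume {used}, hull holds {hull}")
    supplied = total(ship, "power_output")
    needed = total(ship, "power_need")
    if needed > supplied:
        problems.append(f"power need {needed} exceeds output {supplied}")
    return problems


ship = {
    "vars": {"hull_volume": 100},
    "parts": [
        {"vars": {"volume": 40, "power_output": 50}},   # power plant
        {"vars": {"volume": 30, "power_need": 35}},     # engine
        {"vars": {"volume": 20, "power_need": 25},      # weapons bay
         "parts": [{"vars": {"volume": 15}}]},          # ammunition inside it
    ],
}

for problem in check_design(ship):
    print(problem)
```

Both rules look at values spread over different parts of the tree, so they have to be written as code that walks the structure.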

Basic idea

The module is about the "M" in the MVC (Model-View-Controller) pattern. Data is described as parts-in-parts, like a tree of objects; similar to XML turned 90 degrees.

This is not about simulation (there are no timing methods or any kind of action representation); it is meant for cases where there are lots of complex rules for how things should fit together.

There are two phases, as in XML: first a DTD defines a document model, and in a second phase documents are written and tested to be compliant with the DTD. The definitions here are very unlike XML, but the basic structure is almost the same.

These are the practical steps of using the module:

  • First, write a Model by specifying object definitions, rules and code. Load the model into the runtime and test.
  • Phase two is defining data that follows the model, using e.g. the CGI-based editor.
  • After a user specifies his data using the model, the object tree might be used for generating a specification -- or be used for simulating behaviour.


Spreadsheets are often used to model complex structures which need calculations, but they make it hard to model the structure. An XML editor supports structure, but doesn't support calculations while entering data.

This module should, for some cases of parts-in-parts problems, have a much higher complexity limit than spreadsheets.

A trivial translation to XML would be to dump the objects as elements named after their class, containing their sub-objects, with the variables in the objects as attributes. (Making it two-way would be harder, since objects can have class-defined attributes changed.)
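
A minimal sketch of that one-way dump, using Python's standard xml.etree.ElementTree. The Part, Ship and Engine classes and the to_element function are hypothetical stand-ins, not the module's own classes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch: Part, Ship, Engine and to_element are illustrative
# names, not the module's own classes or functions.

class Part:
    def __init__(self, **variables):
        self.variables = variables
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child


class Ship(Part):
    pass


class Engine(Part):
    pass


def to_element(part):
    """Dump a part as an element named after its class.

    Variables become attributes and sub-objects become child elements.
    """
    # XML attribute values are strings, so every variable is stringified.
    attrs = {name: str(value) for name, value in part.variables.items()}
    element = ET.Element(type(part).__name__, attrs)
    for child in part.children:
        element.append(to_element(child))
    return element


ship = Ship(name="Scout")
ship.add(Engine(thrust=3))
xml_text = ET.tostring(to_element(ship), encoding="unicode")
print(xml_text)
```

The stringification step hints at why the reverse direction is harder: the type of each variable is lost in the XML, as is any per-object change to class-defined attributes.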

The objects in the tree inherit in an OO fashion, to share variables and code. (I.e. a build hierarchy and a separate inheritance hierarchy, like most graphic class libraries.)
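
The two hierarchies can be sketched as follows. This is illustrative Python with made-up class and variable names, not the module's implementation: the children list is the build hierarchy, while the class chain is the inheritance hierarchy that supplies shared defaults.

```python
# Hypothetical sketch; class names and variables are made up for illustration.

class Part:
    defaults = {"mass": 0}

    def __init__(self, **overrides):
        self.overrides = overrides
        self.children = []      # build hierarchy: what this part contains

    def add(self, child):
        self.children.append(child)
        return child

    def get(self, name):
        """Look up a variable: object override first, then class defaults."""
        if name in self.overrides:
            return self.overrides[name]
        # Walk the inheritance chain from the most specific class upwards.
        for cls in type(self).__mro__:
            class_defaults = vars(cls).get("defaults", {})
            if name in class_defaults:
                return class_defaults[name]
        raise KeyError(name)


class Engine(Part):             # inheritance hierarchy: what kind of part it is
    defaults = {"mass": 10, "thrust": 5}


class IonEngine(Engine):
    defaults = {"thrust": 2}


ship = Part()
ion = ship.add(IonEngine())
tuned = ship.add(IonEngine(thrust=3))
print(ion.get("thrust"), ion.get("mass"), tuned.get("thrust"), ship.get("mass"))
```

Here an IonEngine sits inside the ship in the build hierarchy, but gets its mass from Engine through the inheritance hierarchy.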

The code in the objects has an API to add/delete objects, handle typed variables, persistence, decorators, sending questions back to the user, etc.

The generated specifications are loaded into a runtime, which has a simple-to-use API. The API is written to be suitable to stub for RPC.

There is a simple CGI application to test new specifications. This should mostly be enough (with some changes to the HTML) for use when entering valid data in the second phase.

How is it used in practice?

First -- install

Install the modules and the CGI application.

Second -- write and test

Write and test the specification of your data. This is done in these steps:
  1. Write a minimal configuration file with simple object specifications and inheritance rules.
  2. Generate a data file from the configuration file(s) with the compiler.
  3. Load the data file into the runtime and test, typically using Test::More and the CGI interface.
  4. If everything is ok, the design part (the "DTD") is done! Note that this is under the GPL, so you must publish your model.
  5. If not done, add functionality: more object types, limitations on what objects are allowed where, groups of predefined objects ("Instances") that are added as a unit, code to override default behaviour, hints to the user interface on what to show, etc.
  6. Go back to Step 2.

Third -- edit data

Use the data modeller (the "DTD") from the second step to write data compliant with the model.

To use the program for internal data models (no GUI), call the objects or the runtime API. Otherwise, you can use the CGI editor.

You might want to modify the CGI HTML templates for your specific objects or do more ambitious UI work with another editor. Remember, it is GPL -- you must publish.

Fourth -- use the model with the data?

Now you have data following the specifications you wrote. Either generate documents from this -- or use the models as programs.

Some features

  • The runtime is a small application framework, with persistence, etc.
  • Variables have inheritance, default values, and flags for whether normal user calls may modify them.
  • There is support for automatically updated sums of variables in other objects.
  • Small trees of objects can be defined and added as objects themselves. (They can also override standard defaults for objects.)
  • Decorators that can filter method calls to objects.
  • Well commented and pod:ed.
  • A standard way for objects to send questions with parameters back to the user interface.
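
Two of the features above, automatically updated sums and decorators that filter method calls, can be sketched roughly like this. It is Python for illustration, the names Part, total and ReadOnly are hypothetical, and the module may well implement both differently (here the sum is simply recomputed on every call).

```python
# Hypothetical sketch: Part, total and ReadOnly are illustrative names,
# not taken from the module.

class Part:
    def __init__(self, **variables):
        self.variables = variables
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

    def set(self, name, value):
        self.variables[name] = value

    def total(self, name):
        """Sum a variable over this part and all sub-parts, on every call."""
        own = self.variables.get(name, 0)
        return own + sum(child.total(name) for child in self.children)


class ReadOnly:
    """Decorator object: forwards calls to the wrapped part, rejects set()."""

    def __init__(self, part):
        self._part = part

    def set(self, name, value):
        raise PermissionError(f"{name} may not be modified through this wrapper")

    def __getattr__(self, attr):
        # Only called for attributes not found on the wrapper itself.
        return getattr(self._part, attr)


ship = Part(cost=100)
engine = ship.add(Part(cost=40))
print(ship.total("cost"))       # 140
engine.set("cost", 55)
print(ship.total("cost"))       # 155, the sum follows the change

guarded = ReadOnly(engine)
print(guarded.total("cost"))    # 55, read access passes through the decorator
```

Calling guarded.set("cost", 1) raises PermissionError, which is the filtering part: the wrapper decides which method calls reach the underlying object.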

Plans, next version

Right now, the basic engine is almost done! This is what is left to do for an alpha version:
  • A big example with new test files. My present one is copyrighted and can't be uploaded.
  • I am writing the web interface now and worry. These points might break the project:
    • Speed issues, even with mod_perl.
    • Usability; is the CGI interface acceptable?
  • Programming documentation. I have the APIs pod:ed, but more is needed.

I am very happy to have come this far, but I have started to think about the next version -- how it really should have been done! These are my ideas. (Send an email with suggestions/requests.)

  • Implement an RPC interface (SOAP, other?) used by the CGI. The speed should skyrocket for complex data specifications. (Might be necessary for the first version.)
  • Make the CGI look better. Some alternative version?
  • More examples.
  • More integrated support for using Test::More (et al) while writing models.
  • Integrate with XML; a Schema spec could do half the job?
  • Smarter sum functions and not only numeric.


Under work.


I thought about modelling a structure in a popular board game for a long time. It was quite hard -- no one has done it before -- because of the complexity of the rules, with lots of exceptions.

I got the idea to use an object hierarchy to describe the design as parts-in-parts and rules for how to build the subparts in a tree. Then let the objects have code to modify their behaviour dynamically. The objects can have classes and inheritance. The end result would be specified inherited default behaviour and lots of ways to override it.

Well, it looked easy before I started... :-)

I thought of this in the terms of "Hmm.. object definitions to code. If I make it a network of objects instead of a tree, with more types of interconnects... UML-to-code?!"

The realization that I was writing the backend of an intelligent XML editor came late. Which is a pity, since I might have saved half the work if I had integrated with e.g. XML Schemas, etc.

But I got it (more or less) 90% done. Now there is only the remaining 90% of documentation and some default interface.


This is put under the GPL. I allow any use as long as you publish changes, new user interfaces and the data model. This is because code is generated and loaded into the runtime. I want your data published, since I believe examples are important for this module.

Credit would be appreciated, since I am looking for work while writing this.

If you can't publish your data, I will grudgingly accept money (size to be decided, ask if important) to send a non-GPL version; I encourage that half of that payment is sent to the Perl Foundation or FSF and a proof of payment for that is sent to me. ( is also fine.) I will use the money for books and sushi, but if I receive enough I will pay living costs with it and work on this application.

Please note that this is milder than GPL, since it is really ok if you link in some other software package, as long as you publish the interface to it. If some other license fits my needs better, let me know and I'll look at it.
