I have a relatively large data structure made up from interconnected object "nodes" that represent all the physical devices in a network spanning many sites of various sizes.
These nodes are updated all the time by daemons collecting information such as ping status, response times, SNMP interface data and syslog data, as well as operator data that changes the node structure by adding, moving and removing devices and connections.
The data structure allows spanning tree loops (because the physical network does) and allows me to do complex searches like determining the overall status of groups of sites, finding common root sources etc. very fast.
At the end of the pipeline sits an HTML/SVG engine that uses HTTP::Server::Simple::CGI to present everything as a web site with full drag/drop support, AJAX menus and the works. A typical request takes only a few milliseconds to serve.
So, what's the problem? As long as the requests are few and the queries simple, this is good enough, but the current solution does not scale. At all. The problem is that my web server process is single threaded, because I cannot for the life of me figure out a GOOD way to make it multithreaded:
Reason 1) DBI/MySQL does not work well with multithreading, so live updates would be tricky. The current design is to use triggers on the large tables that get updated a lot. These triggers leave "clues" in a dedicated table, telling the server process which records have changed and need to be refreshed. This means updates get picked up without having to search for them, but it's not ideal. Some data, like syslog messages, is simply too large to keep in memory and must be fetched from the database every time it's needed.
Reason 2) Copying the entire data structure is very slow and difficult, because it's very deep and very recursive. It is certainly too slow to do for each and every HTTP request. Besides, any changes made to the structure of a copy would somehow have to find their way back to the master and get distributed to the other copies.
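To make Reason 2 concrete, here is a minimal sketch using the core Storable module. The two-node structure and its field names are hypothetical stand-ins, far simpler than the real node objects. It only shows that a deep copy can keep a loop intact, and that a change made to the copy stays in the copy.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Two hypothetical nodes that point at each other, like a redundant link.
my $sw1 = { name => 'sw1', status => 'up', peers => [] };
my $sw2 = { name => 'sw2', status => 'up', peers => [] };
push @{ $sw1->{peers} }, $sw2;
push @{ $sw2->{peers} }, $sw1;

my $master = { nodes => { sw1 => $sw1, sw2 => $sw2 } };

# dclone walks every reference and reproduces the loop inside the copy.
my $copy = dclone($master);

# The copy is fully detached: this change never reaches the master.
$copy->{nodes}{sw1}{status} = 'down';
```

After this runs, the master still reports sw1 as up, which is exactly the write-back problem: every change made on a copy needs an explicit path back to the master and on to the other copies.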
I imagine there must be a way to have something like a dozen copies of the same data structure, each running in its own process. Then, if an update comes in, somehow replicate that change to all of the others. If the HTTP bit scaled well enough, the ping/SNMP/etc. daemons could be rewritten to post their updates using HTTP and I could do away with the whole insane trigger/polling system.
Has anyone done something like this before? What tools or techniques did you use? Some kind of n-way realtime data replication, serializing stuff and sending it via sockets or whatever?
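To illustrate the kind of thing I mean by "serializing stuff and sending it via sockets", here is a rough sketch using only core modules (Socket, IO::Handle, Storable). Everything in it is an assumption made for illustration: the helper names send_msg, recv_msg, spawn_workers and broadcast, the update format (node, field, value), and the single-node data set. It is not a proposed design, just one parent pushing a small change to forked workers that each hold their own copy.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;
use Storable qw(nfreeze thaw);

# Write one length-prefixed, Storable-serialized message to a socket.
sub send_msg {
    my ($fh, $data) = @_;
    my $frozen = nfreeze($data);
    print {$fh} pack('N', length $frozen), $frozen;
    $fh->flush;
}

# Read one message back; returns undef on end of stream.
sub recv_msg {
    my ($fh) = @_;
    my $got = read($fh, my $header, 4);
    return undef unless defined $got && $got == 4;
    my $len = unpack('N', $header);
    $got = read($fh, my $frozen, $len);
    return undef unless defined $got && $got == $len;
    return thaw($frozen);
}

# Fork $count workers; each one inherits its own copy of %$nodes.
sub spawn_workers {
    my ($count, $nodes) = @_;
    my @workers;
    for (1 .. $count) {
        socketpair(my $parent_end, my $child_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
            or die "socketpair: $!";
        my $pid = fork() // die "fork: $!";
        if ($pid == 0) {
            close $parent_end;
            close $_->{fh} for @workers;    # drop handles inherited from earlier siblings
            # Apply each incoming update to the local copy, then acknowledge it.
            while (my $update = recv_msg($child_end)) {
                $nodes->{ $update->{node} }{ $update->{field} } = $update->{value};
                send_msg($child_end, { pid => $$, node => $nodes->{ $update->{node} } });
            }
            exit 0;
        }
        close $child_end;
        push @workers, { pid => $pid, fh => $parent_end };
    }
    return @workers;
}

# Send one update to every worker and collect one acknowledgement from each.
sub broadcast {
    my ($workers, $update) = @_;
    send_msg($_->{fh}, $update) for @$workers;
    return map { recv_msg($_->{fh}) } @$workers;
}

my @workers = spawn_workers(3, { sw1 => { status => 'up' } });
my @acks    = broadcast(\@workers, { node => 'sw1', field => 'status', value => 'down' });

close $_->{fh}       for @workers;    # EOF tells each worker to exit
waitpid $_->{pid}, 0 for @workers;
```

Only the small change travels over the socket, never the whole structure. What the sketch does not address is ordering when several writers send updates at once, or structural changes such as moving a device between sites, and those are the parts I have no good answer for.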
This is a pet system I develop in my spare time and rely heavily on at work. The current system gets the job done so it doesn't matter if I have to spend a year rewriting version 2 from scratch to get it perfect.
Update: DBI, not CGI stupid.
-- Time flies when you don't know what you're doing