RFC: Data::Sync

A couple of weeks ago I released Data::Sync. The basic idea behind the module is to construct a simple open source metadirectory. This makes 'copy from one database to another' applications very quick and easy to develop. An example is set out below:

You have two databases as follows:

Database1:

NAME VARCHAR(30), POSTALADDRESS VARCHAR(50), TELEPHONE VARCHAR(15), CUSTOMERNO NUM

Database2:

FULLNAME VARCHAR(30), ADDRESS VARCHAR(100), HOMEPHONE VARCHAR(20), OFFICEPHONE VARCHAR(20)

and you want to copy name, address and phone number for those entries from database1 whose customer number is less than 1000. The code would look something like this (assuming SQLite for convenience):

use strict;
use Data::Sync;
use DBI;

my $db1 = DBI->connect("DBI:SQLite:dbname=db1");
my $db2 = DBI->connect("DBI:SQLite:dbname=db2");
my $sync = Data::Sync->new();

$sync->source($db1,{select=>"SELECT NAME,POSTALADDRESS,TELEPHONE FROM sourcetable WHERE CUSTOMERNO < 1000"});
$sync->target($db2,{table=>"targettable",index=>"FULLNAME"});
$sync->mappings(NAME=>"FULLNAME",
        POSTALADDRESS=>"ADDRESS",
        TELEPHONE=>"HOMEPHONE");
$sync->run();

This will copy all records matching the select statement from the source database to the target database. If the entry exists, it will be updated - if it doesn't exist it will be created.

Nothing is done until the 'run' method is called - that calls the read, mappings, buildattributes, transforms, and write methods in turn (see below for transforms and buildattributes).
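To make the update-or-create behaviour concrete, here is a minimal in-memory sketch of the map and write steps. It is not Data::Sync's implementation: the hash standing in for the target table, the sample rows, and the map_record and write_record helpers are all hypothetical, and only the field names come from the example above.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-ins for the source rows and the target table.
my @source_rows = (
    { NAME => 'Jane Doe', POSTALADDRESS => '1 High St', TELEPHONE => '555 0100' },
    { NAME => 'John Roe', POSTALADDRESS => '2 Low Rd',  TELEPHONE => '555 0101' },
);
my %target = (
    'Jane Doe' => { FULLNAME => 'Jane Doe', ADDRESS => 'old address', HOMEPHONE => '' },
);

# Source field name => target field name, as in the mappings() call.
my %mappings = (NAME => 'FULLNAME', POSTALADDRESS => 'ADDRESS', TELEPHONE => 'HOMEPHONE');
my $index    = 'FULLNAME';

# Rename each mapped source field to its target name; unmapped fields are dropped.
sub map_record {
    my ($row) = @_;
    return { map { ($mappings{$_} => $row->{$_}) } grep { exists $mappings{$_} } keys %$row };
}

# Update the record if its index value is already present, otherwise create it.
sub write_record {
    my ($record) = @_;
    my $key    = $record->{$index};
    my $action = exists $target{$key} ? 'updated' : 'created';
    $target{$key} = { %{ $target{$key} || {} }, %$record };
    return $action;
}

my @actions = map { write_record(map_record($_)) } @source_rows;
print join(', ', @actions), "\n";   # prints "updated, created"
```

The index field plays the role of a lookup key here: a record whose FULLNAME is already present is overwritten field by field, and anything else is added.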

But perhaps the name attribute is formatted differently between the databases. Perhaps db1 has "Firstname Lastname", whereas db2 has "Lastname, Firstname". You can overcome this with a transformation:

$sync->transformations(FULLNAME=>'s/(\w+)\s+(\w+)/$2, $1/');

(You can pass a single-quoted string representation of a regex, a coderef, or a string naming an existing predefined transformation such as 'stripnewlines' or 'stripspaces' - see the perldocs. Note that 'transforms' acts on the MAPPED names of the attributes, i.e. the names in the target, not the names in the source.)
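As a rough illustration of what a coderef transformation keyed by the target-side name might do, the sketch below applies one to a single record in plain Perl. The %record and %transforms hashes and the loop are hypothetical stand-ins, not the module's internals.

```perl
use strict;
use warnings;

# Hypothetical record, already mapped to the target field names.
my %record = (FULLNAME => 'Jane Doe', HOMEPHONE => '555 0100');

# A coderef doing the name reformatting: "Firstname Lastname" -> "Lastname, Firstname".
my %transforms = (
    FULLNAME => sub {
        my ($value) = @_;
        $value =~ s/^(\w+)\s+(\w+)$/$2, $1/;
        return $value;
    },
);

# Apply each transformation to the target-side field it is keyed by.
for my $field (keys %transforms) {
    next unless exists $record{$field};
    $record{$field} = $transforms{$field}->($record{$field});
}

print "$record{FULLNAME}\n";   # prints "Doe, Jane"
```

Note that a pattern this simple only handles two-word names; middle names or hyphenated surnames would need a more careful regex or a coderef with real parsing.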

You can also specify an LDAP handle rather than a DBI handle for source or target, in which case the syntax changes to reflect the difference. The perldocs detail this (and it's more specialised, so I've omitted discussion here), but there's one point that's worth illustrating:

Writing to an LDAP directory requires a DN (Distinguished Name) to identify the record, and an objectclass. When syncing from DBI to LDAP, you're unlikely to have either, so the 'buildattributes' method allows you to create them:

$sync->buildattributes(dn=>'cn=%NAME%, ou=container,dc=testorg,dc=net',
            objectclass=>'organizationalPerson');

Once built, these attributes can be transformed like any other, so more complex processing can be done.
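The %NAME% placeholder style can be pictured as a simple template substitution. The following is a sketch under that assumption only; expand_template is a hypothetical helper written for illustration, not a function exported by Data::Sync.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each %FIELD% placeholder with the record's value.
# Placeholders with no matching field expand to an empty string.
sub expand_template {
    my ($template, $record) = @_;
    (my $result = $template) =~ s/%(\w+)%/defined $record->{$1} ? $record->{$1} : ''/ge;
    return $result;
}

my %record = (NAME => 'Jane Doe');
my $dn = expand_template('cn=%NAME%,ou=container,dc=testorg,dc=net', \%record);
print "$dn\n";   # prints "cn=Jane Doe,ou=container,dc=testorg,dc=net"
```

A real DN builder would also need to escape characters that are special in DNs (commas, plus signs and so on) in the substituted values; this sketch does not.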

Rationale

This is all analogous to a number of commercial products: Critical Path IMD, Maxware DSE, and IBM Metamerge, among others. Implementing these types of systems is how I normally earn my keep. I figured that, given my fondness for Perl, I was going to end up implementing something like this eventually, so I decided to do it while I have free time between projects - neatly avoiding implementing it in a project and hence being unable to release or reuse it. The upshot is that, unlike many CPAN modules, this code, although thoroughly tested with Test, is not in live use. AFAIK anyway - someone might have adopted it in the last couple of weeks, I suppose. There are undoubtedly criticisms that can be raised against the code as well as the syntax etc. All comments are welcome.

I'm currently at work on a GUI for job definition, and a standalone "run this saved job" script to allow sync jobs to be created and used without writing any Perl code. At this point, though, I'm polling for opinions and feedback: is there any functionality people would like to see included, or any other comments you may have?

--------------------------------------------------------------

$perlquestion=~s/Can I/How do I/g;
