http://www.perlmonks.org?node_id=422156

Concerning code factorization, efficiency and tied variables:

In the wake of "Are lvalue methods (as properties) a good idea?", I've revisited an old node of mine, "To Validate Data In Lvalue Subs". One thing I realize now is that I didn't make clear how independent the Constrained class is from the notions of closures and lvalue subs. Constrained can be applied to any scalar variable, including one that holds a reference to anything.

What good is that? A tied interface is slower than simple assignment for a reason; it's doing something. The 40% (or whatever) extra overhead for assigning or fetching through the tied interface is trivial if you have work to do inside it. What work? Whatever is ubiquitous enough to factor this way! I think that that is more than "syntactic sugar", and it's sweeter, too.

For reference, here is a slightly updated and renamed version of Constrained:

    package Tie::Constrained;

    use Errno qw/EINVAL EDOM ERANGE/;

    sub TIESCALAR {
        my $class = shift;
        # Resolve validate() and invalid() through the class so that a
        # subclass override is found; a plain \&validate or invalid()
        # would always bind to this package.
        my $self = { test => defined $_[0] ? $_[0] : $class->can('validate') };
        $self->{test}($_[1]) or $class->can('invalid')->(EINVAL)
            if defined $_[1];
        $self->{val} = $_[1];
        bless $self, $class;
    }

    sub STORE {
        my ($self, $try) = @_;
        # Test first; the stored value changes only if the test passes.
        $self->{test}($try) or $self->can('invalid')->(EINVAL);
        $self->{val} = $try;
    }

    sub FETCH { $_[0]->{val}; }

    sub DESTROY {}

    sub validate { 1 }

    sub invalid {
        $! = shift;
        die sprintf("Constraint violation: %s by %s::%s in %s line %s.\n",
            $!, map { qq($_) } (caller 1)[0,3,1,2] );
    }

    1;
    __DATA__
    Usage:
        use Tie::Constrained;
        tie my $var, Tie::Constrained => \&mytest, $initval;
    Both arguments are optional, but the default validator function
    always says yes. mytest() should be designed to return true for
    valid data and false for data to reject.

That's much like the older code, but I've removed the stringency from FETCH() and added a dummy default validator. Those changes were to make subclassing easier. A subclass would typically override invalid() to get different error handling or validate() to have a common class-wide default test.

Data validation is the job the Tie::Constrained class does. That's an example of something you may want to do many times, the same way each time, every time a mutator is applied to your variable.
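
As a rough sketch of that in use: the class is inlined here in condensed form so the snippet runs standalone, and the validator $is_count and the variable $count are made up for illustration.

```perl
use strict;
use warnings;

# Condensed copy of the Tie::Constrained code above, inlined so this runs standalone.
package Tie::Constrained;
use Errno qw/EINVAL/;

sub TIESCALAR {
    my $class = shift;
    my $self = { test => defined $_[0] ? $_[0] : \&validate };
    $self->{test}($_[1]) or invalid(EINVAL) if defined $_[1];
    $self->{val} = $_[1];
    bless $self, $class;
}
sub STORE {
    my ($self, $try) = @_;
    $self->{test}($try) or invalid(EINVAL);
    $self->{val} = $try;
}
sub FETCH    { $_[0]->{val} }
sub DESTROY  {}
sub validate { 1 }
sub invalid {
    $! = shift;
    die sprintf("Constraint violation: %s by %s::%s in %s line %s.\n",
        $!, map { qq($_) } (caller 1)[0,3,1,2]);
}

package main;

# Hypothetical validator: accept only non-negative integers.
my $is_count = sub { defined $_[0] && $_[0] =~ /^\d+\z/ };

tie my $count, 'Tie::Constrained', $is_count, 0;

$count = 5;       # passes the test, so it is stored
$count += 10;     # FETCH, add, then STORE of 15: still valid

# The bad assignment dies inside STORE, before the stored value changes.
my $ok    = eval { $count = -1; 1 };
my $error = $@;

print "rejected: $error" unless $ok;
print "count is still $count\n";
```

The same test runs on every assignment, whatever operator caused it, and the caller only has to decide what to do with the exception.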

Off the top of my head, I can think of three basic ways to code those tests.

  1. Paste in a call to a validator function (a) after each mutation or, (b) at least, just before each use where the validity matters.
  2. Bless the variable into a class which overloads mutators to validate.
  3. Tie to Tie::Constrained.

The first and most obvious one is dismal in practice. The mutation is done to your variable before you get to validate. You can't apply it to third-party code. Paste errors may gum you up. Maintenance is a nightmare. The limited checking of 1b) does nothing to inform you where the bad data crept in. The code has flashing neon signs saying "Factor me!"
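
A small illustration of the ordering problem with the first approach. The validator is_count() is a hypothetical stand-in for whatever test you would paste in.

```perl
use strict;
use warnings;

# Hypothetical validator: true only for non-negative integers.
sub is_count { defined $_[0] && $_[0] =~ /^\d+\z/ }

my $count = 5;

$count -= 10;                     # the mutation has already happened ...
my $valid = is_count($count);     # ... by the time the pasted-in check runs

# The variable now holds the bad value; the check can only report it.
print "count is now $count, valid: ", ($valid ? "yes" : "no"), "\n";
```

Restoring the old value would take yet more pasted code around every mutation, which is exactly the repetition that wants factoring out.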

The second is better, but it has problems of its own. The proliferation of classes confuses development. Overriding core functions and overloading core operators confuses everybody. The class packages represent a lot of perhaps tricky code to write. Factorization is pretty good, but nothing like . . .

Three. Once a variable is tied to its very own automatic validity check, every mutator will be checked before the variable is modified. That is true of old code and new, third-party code, perl modules, all without the code needing to know anything about it. No infrastructure at all. No special coding beyond the tie call.
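
Here is a sketch of the third-party case. Third::Party::halve() is an invented stand-in for code that modifies its argument in place and knows nothing about the tie; the class is again inlined in condensed form so the snippet runs on its own.

```perl
use strict;
use warnings;

# Condensed copy of the Tie::Constrained code above, inlined so this runs standalone.
package Tie::Constrained;
use Errno qw/EINVAL/;

sub TIESCALAR {
    my $class = shift;
    my $self = { test => defined $_[0] ? $_[0] : \&validate };
    $self->{test}($_[1]) or invalid(EINVAL) if defined $_[1];
    $self->{val} = $_[1];
    bless $self, $class;
}
sub STORE {
    my ($self, $try) = @_;
    $self->{test}($try) or invalid(EINVAL);
    $self->{val} = $try;
}
sub FETCH    { $_[0]->{val} }
sub DESTROY  {}
sub validate { 1 }
sub invalid {
    $! = shift;
    die sprintf("Constraint violation: %s by %s::%s in %s line %s.\n",
        $!, map { qq($_) } (caller 1)[0,3,1,2]);
}

# Stand-in for third-party code: it changes its argument through @_ aliasing.
package Third::Party;
sub halve { $_[0] /= 2 }

package main;

# Hypothetical constraint: the variable must always hold an even integer.
tie my $even, 'Tie::Constrained',
    sub { $_[0] =~ /^-?\d+\z/ && $_[0] % 2 == 0 }, 4;

Third::Party::halve($even);                     # 4 -> 2, allowed

# 2 -> 1 would break the constraint, so STORE dies inside halve().
my $ok  = eval { Third::Party::halve($even); 1 };
my $err = $@;

print "rejected: $err" unless $ok;
print "even is still $even\n";
```

The tie travels with the variable, so the check fires inside halve() even though that code was written without any knowledge of it.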

That is code factoring with a vengeance. I also consider it a particularly sparse and clean kind of OO code, where the object is the aggregate of variable, test and exception.

I'm considering doing a little more tuning and much pod writing to prepare a distribution for CPAN. I'll welcome your comments.

After Compline,
Zaxo

Re: Tie Me Up, Tie Me Down
by diotalevi (Canon) on Jan 16, 2005 at 16:55 UTC

    I have two API changes to request and a request to consider prior work in this space.

    Common validation code in Regexp::Common and Params::Validate already pepper my code. Is there a chance you could draw on these modules when building your common library of constraints? I'd like to be able to use similar validation-facing code without having to keep too many variations of how to do this in my head.

    Accept more than a single code reference for validation. Consider { 'Is an integer' => sub { ... }, 'Is prime' => sub { ... } } as a list of named conditions that must be fulfilled. I could have put all that into my single passed-in function, but it's also nice to just document the properties that will be checked by name and leave them all separate.

    I'd like some sugar for tying multiple variables. How about presenting a tie-or-die function so I can constrain multiple variables without lots of effort?

        constrain_this(
            \ ( my $x ) => {
                'Property 1' => sub { ... },
                'Property 2' => sub { ... },
            },
            \ ( my $y ) => {
                'Property 1' => ...
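
One way such a request could be sketched on top of the posted class. Both constrain_this() and the helper all_of() are hypothetical here, not part of Tie::Constrained, and the property names are invented. In this sketch the names only document the checks; reporting which property failed would need a different invalid().

```perl
use strict;
use warnings;

# Condensed copy of the Tie::Constrained code from the root node, inlined to run standalone.
package Tie::Constrained;
use Errno qw/EINVAL/;

sub TIESCALAR {
    my $class = shift;
    my $self = { test => defined $_[0] ? $_[0] : \&validate };
    $self->{test}($_[1]) or invalid(EINVAL) if defined $_[1];
    $self->{val} = $_[1];
    bless $self, $class;
}
sub STORE {
    my ($self, $try) = @_;
    $self->{test}($try) or invalid(EINVAL);
    $self->{val} = $try;
}
sub FETCH    { $_[0]->{val} }
sub DESTROY  {}
sub validate { 1 }
sub invalid {
    $! = shift;
    die sprintf("Constraint violation: %s by %s::%s in %s line %s.\n",
        $!, map { qq($_) } (caller 1)[0,3,1,2]);
}

package main;

# Hypothetical helper: fold a hash of named tests into a single validator.
# Tests run in sorted name order and the first failure rejects the value.
sub all_of {
    my ($conds) = @_;
    return sub {
        my ($value) = @_;
        for my $name (sort keys %$conds) {
            return 0 unless $conds->{$name}->($value);
        }
        return 1;
    };
}

# Hypothetical tie-or-die: takes (scalar ref => { name => test, ... }) pairs.
sub constrain_this {
    while (my ($ref, $conds) = splice @_, 0, 2) {
        tie $$ref, 'Tie::Constrained', all_of($conds)
            or die "Cannot tie variable: $!";
    }
}

constrain_this(
    \( my $x ) => {
        'Is an integer' => sub { $_[0] =~ /^-?\d+\z/ },
        'Is positive'   => sub { $_[0] > 0 },
    },
    \( my $y ) => {
        'Is short' => sub { length($_[0]) <= 8 },
    },
);

$x = 42;          # an integer and positive: stored
$y = 'short';     # five characters: stored

my $ok = eval { $x = -7; 1 };   # fails 'Is positive', so STORE dies
print "x is still $x\n" unless $ok;
```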

      Regexp::Common is a very good partner for this tied technique - I had it in mind when I wrote Tie::Constrained. Here's an example of a subclass, Tie::Constrained::URI, where a tied variable is by default restricted to be a well-formed URI.

          package Tie::Constrained::URI;
          use vars qw/@ISA/;
          use Regexp::Common 'URI';
          use Tie::Constrained;
          @ISA = qw/Tie::Constrained/;
          sub validate { $_[0] =~ /^$RE{URI}$/ }
          1;
          __DATA__
          Usage:
              use Tie::Constrained::URI;
              my $href_ctl = tie my $href, 'Tie::Constrained::URI';
              tie my $ftpref, Tie::Constrained::URI => sub { $_[0] =~ /^$RE{URI}{FTP}$/ };
          $href_ctl can be used to change the test on the fly.
              $href_ctl->{'test'} = sub { $_[0] =~ /^$RE{URI}{HTTP}$/ }
                  if $href =~ /^$RE{URI}{HTTP}$/;

      As it stands, I think that Params::Validate compatibility is for the future, but it's a very good idea. I'll probably need to rename my validate class method again.

      Many of the simpler things Params::Validate does for objects can be imitated with scalar Tie::Constrained as it is:

      tie my $obj, Tie::Constrained => sub { !$_[0] or $_[0]->isa('My::Frobnicator'); };
      The first term is so that $obj can be undefined to break circular references.

      Thank you much for the suggestions, I'll take them very seriously.

      After Compline,
      Zaxo