Variation on the "Higher Order Perl" Iterator Pattern

by rjray (Chaplain)
on Feb 08, 2008 at 01:09 UTC (#666887=snippet)
Description:

This came up for some work being done at $JOB, which then went through a complete re-design, making this code no longer necessary. But I thought it was an interesting variation on an iterator.

It probably isn't novel to have this style of iterator wrap multiple sources and present them as a single stream. What makes this different is that the sources are read in rotation (round-robin): each call takes one row from the next source in line. In this case the sources were n DBI statement handles, representing data that was split among n different MySQL hosts due to the sheer size of the dataset. For certain (mostly aesthetic) reasons, they wanted the processing stage to interleave the result-sets rather than process them in sequence.

This assumes the wrapped objects are DBI statement handles, and that the desired return format is the sort of hash-ref structure returned by the fetchrow_hashref() method. You can write next() as you see fit for your encapsulated objects.

For more on clever Perl-ish iterators, see chapter 4 ("Iterators") of Higher-Order Perl by Mark-Jason Dominus (ISBN 9781558607019).

# Usage: my $iter = DbiSthIterator->new($sth1, $sth2, ...);
#
#        while (my $row = $iter->next()) {
#            ...undef signals the exhaustion of the iterator
#        }

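package DbiSthIterator;

use strict;
use warnings;

# The object is a blessed array ref holding the queue of live handles.
sub new {
    my ($class, @sth) = @_;
    return bless \@sth, $class;
}

# Fetch one row from the handle at the front of the queue, then rotate
# that handle to the back. Returns undef once every handle is exhausted.
sub next {
    my $self = shift;

    while (my $sth = shift(@$self)) {
        if (my $row = $sth->fetchrow_hashref()) {
            push(@$self, $sth);
            return $row;
        }
        # An exhausted $sth is left out of the queue
    }

    return undef;
}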

1;
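
For anyone who wants to see the rotation without a database handy, here is a rough sketch of the same idea in the closure style that Higher-Order Perl uses. It is not the code from $JOB: array_iter() and interleave() are names made up for this illustration, and the sources are plain arrays standing in for statement handles. It also assumes a source never yields undef as a real value, since undef is the exhaustion signal.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a list in a closure that hands back one
# element per call, and undef once the list is used up.
sub array_iter {
    my @items = @_;
    return sub { return shift @items };
}

# Hypothetical helper: round-robin over any number of closure iterators.
sub interleave {
    my @queue = @_;
    return sub {
        while (my $it = shift @queue) {
            my $item = $it->();
            if (defined $item) {
                push @queue, $it;    # still live, so back of the queue
                return $item;
            }
            # exhausted: drop it and try the next source
        }
        return;    # every source is exhausted
    };
}

my $iter = interleave(
    array_iter(qw(a1 a2 a3)),
    array_iter(qw(b1)),
    array_iter(qw(c1 c2)),
);

my @seen;
while (defined(my $item = $iter->())) {
    push @seen, $item;
}
print "@seen\n";    # prints: a1 b1 c1 a2 c2 a3
```

Sources of unequal length just fall out of the rotation as they run dry, which is the same behavior as dropping an exhausted $sth in the class above.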
Replies are listed 'Best First'.
Re: Variation on the "Higher Order Perl" Iterator Pattern
by Limbic~Region (Chancellor) on Feb 08, 2008 at 13:54 UTC
      I wrote RFC: DBIx::Iterator because I needed to join over 20 tables in a database, but the schema was unknown...no one was quite sure how the tables were supposed to be joined exactly. So I did a lot of experimenting with joining and unjoining tables. In a normal SQL statement though, if you want to remove the table in a join, you have to remove the appropriate SELECT columns, the table in the FROM clause, and the appropriate parts of the WHERE clause. With the iterators I used, I could remove a join by commenting out one line (saving lots of time in the "experimenting" phase). So in a way, it was over 20 iterators joined together (though all the same type).

      Not a lot, at least not any more than any other given structure. That may change in the future, as we're in the midst of some hard-core refactoring. We have several petabytes of data in MySQL tables, some of which have to be split across physical drives (if not physical machines) due to sheer volume.

      --rjray
