Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Pivoting parts of a database table into an HTML table

by Anonymous Monk
on Oct 22, 2012 at 08:00 UTC ( #1000281=perlquestion: print w/ replies, xml ) Need Help??
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

O Monks, hallowed be your woodiness.

For a variety of reasons, I'm collecting words alongside their date of encounter into a database. Then I group them by their woodiness or tinniness. The database table has three columns and looks like this:

30 Sep (Sun) | woody | pert 30 Sep (Sun) | tinny | newspaper 01 Oct (Mon) | woody | ocelot

Now, to create a more compact view, I'd like to transform the one-word-per-row raw data into an HTML table grouped by date like this:

date | woody words | tinny words -------------+--------------------------+------------- 28 Sep (Fri) caribou litterbin 29 Sep (Sat) wasp, yowling, gorn 30 Sep (Sun) intercourse, bound, pert newspaper 01 Oct (Mon) ocelot, concubine antelope 02 Oct (Tue) vole, sausage 03 Oct (Wed) recidivist, tit

However, I'm having quite a bit of trouble with the loop logic. I can manage the grouping to date, but the special cases of days where only one type of words appear don't quite work and either the words go into the wrong column or the column is left without (empty) markup.

(I actually created a nice generic function that was supposed to solve this sort of task, but it ended up lacking in a few ways. Its interface borders on the silly (expects 4+ subrefs.))

Attached is sample data along with an iterator that emulates a database query handle.

sub make_generator { my ($day, $type, $word) = @_; return sub { my $line = <DATA>; chomp $line; return unless $line; ($$day, $$type, $$word) = split(/\s+\|\s+/, $line); return 1; }; } my $gen = make_generator(\my ($day, $type, $word)); while ($gen->()) { print "$day: encountered '$word', a $type word\n"; } __DATA__ 28 Sep (Fri) | woody | caribou 28 Sep (Fri) | tinny | litterbin 29 Sep (Sat) | woody | wasp 29 Sep (Sat) | woody | yowling 29 Sep (Sat) | woody | gorn 30 Sep (Sun) | woody | intercourse 30 Sep (Sun) | woody | bound 30 Sep (Sun) | woody | pert 30 Sep (Sun) | tinny | newspaper 01 Oct (Mon) | woody | ocelot 01 Oct (Mon) | woody | concubine 01 Oct (Mon) | tinny | antelope 02 Oct (Tue) | woody | vole 02 Oct (Tue) | woody | sausage 03 Oct (Wed) | tinny | recidivist 03 Oct (Wed) | tinny | tit

Comment on Pivoting parts of a database table into an HTML table
Select or Download Code
Re: Pivoting parts of a database table into an HTML table
by Anonymous Monk on Oct 22, 2012 at 08:14 UTC

    I suppose I need to post my half-solution, too. (My original post was getting long.) It's supposed to replace the rather common looping code:

    my $lastid = 0; while ($sth->fetch) { print $row; if ($lastid != $id) { print "</tr><tr>"; } }

    with this:

    alternate( sub { $sth->fetch }, sub { print $row }, sub { $id }, # which variable to monitor sub { print "</tr><tr>" } # what to do when variable changes );

    Currently it has at least an off-by-one error (the words go into the wrong columns), but I can't spot the error.

    sub alternate { my $iter = shift; my $loop = shift; # main loop callback my @conds; while (@_ > 0) { # sub to get value, sub to call if value changed push @conds, [ shift, shift ]; } my @memory; return unless $iter->(); # starting values @memory = map { $_->[0]->() } @conds; do { $loop->(); while (my ($idx, $cond) = each @conds) { my $new = $cond->[0]->(); next if $memory[$idx] eq $new; $cond->[1]->(); # callback $memory[$idx] = $new; # reset subsequent values $memory[$_] = $conds[$_][0]->() for ($idx + 1 .. $#conds); last; } } while ($iter->()); } my $gen = make_generator(\my ($day, $type, $word)); alternate( $gen, sub { print "$word, " }, sub { $day }, sub { print "</td></tr>\n<tr><td>$day</td><td>" }, sub { $type }, sub { print "</td><td>" } );
Re: Pivoting parts of a database table into an HTML table
by Athanasius (Monsignor) on Oct 22, 2012 at 13:20 UTC

    Here is a script which collates the data and produces the desired compact view:

    #! perl use strict; use warnings; use feature 'switch'; { my %h; populate (\%h); stringify (\%h); print_view(\%h); } sub populate { my ($h) = @_; while (<DATA>) { my ($date, $type, $word) = split /\s+\|\s+/; chomp $word; unless (exists $h->{$date}) { $h->{$date}{woody_words} = []; $h->{$date}{tinny_words} = []; } given ($type) { when ('woody') { push @{ $h->{$date}{woody_words} }, $word +; } when ('tinny') { push @{ $h->{$date}{tinny_words} }, $word +; } default { warn "Datum '$word' of unknown type '$typ +e'"; } } } } sub stringify { my ($h) = @_; for (keys %$h) { $h->{$_}{woody_str} = join(', ', sort @{ $h->{$_}{woody_words} + }); $h->{$_}{tinny_str} = join(', ', sort @{ $h->{$_}{tinny_words} + }); } } sub print_view { my ($h) = @_; my $max = 0; for (keys %$h) { my $woody_str = $h->{$_}{woody_str}; my $new_length = length $woody_str; $max = $new_length if defined $woody_str && $new_length > $max +; } printf " date | %-*s | tinny words\n", $max, 'woody words'; printf "-------------+-%s-+-------------\n", '-' x $max; printf "%s %-*s %s\n", $_, $max, $h->{$_}{woody_str}, $h->{$_}{ +tinny_str} for sort { cmp_dates() } keys %$h; } { my %months; BEGIN { %months = (Jan => 1, Feb => 2, Mar => 3, Apr => 4, May => 5, Jun => 6, Jul => 7, Aug => 8, Sep => 9, Oct => 10, Nov => 11, Dec => 12); } sub cmp_dates { my ($day_a, $mon_a) = $a =~ /^(\d{1,2}) (\w{3})/; my ($day_b, $mon_b) = $b =~ /^(\d{1,2}) (\w{3})/; return ($months{$mon_a} < $months{$mon_b}) ? -1 : ($months{$mon_a} > $months{$mon_b}) ? +1 : ($day_a < $day_b) ? -1 : ($day_a > $day_b) ? +1 : 0; } } __DATA__ 28 Sep (Fri) | woody | caribou 28 Sep (Fri) | tinny | litterbin 29 Sep (Sat) | woody | wasp 29 Sep (Sat) | woody | yowling 29 Sep (Sat) | woody | gorn 30 Sep (Sun) | woody | intercourse 30 Sep (Sun) | woody | bound 30 Sep (Sun) | woody | pert 30 Sep (Sun) | tinny | newspaper 01 Oct (Mon) | woody | ocelot 01 Oct (Mon) | woody | concubine 01 Oct (Mon) | tinny | antelope 02 Oct (Tue) | woody | vole 02 Oct (Tue) | woody | sausage 03 Oct (Wed) | tinny | recidivist 03 Oct (Wed) | tinny | tit

    Output:

    23:09 >perl 345_SoPW.pl date | woody words | tinny words -------------+--------------------------+------------- 28 Sep (Fri) caribou litterbin 29 Sep (Sat) gorn, wasp, yowling 30 Sep (Sun) bound, intercourse, pert newspaper 01 Oct (Mon) concubine, ocelot antelope 02 Oct (Tue) sausage, vole 03 Oct (Wed) recidivist, tit 23:10 >

    Of course, the really interesting question is: How do you distinguish words which are ‘woody’ from those which are ‘tinny’?  ;-)

    Hope this helps,

    Updates: Minor code improvements.

    Athanasius <°(((><contra mundum

      Hmm... looks like I forgot to mention that using the iterator would be nice, rather than slurping it all into a hash. The iterator also guarantees the order (dates in order, woody words before tinny), being from a database and all. Which would have much reduced the sorting need I hope.

      Anyway, I've yet to study your code properly, but right now I'm thinking of wrapping my iterator into a second iterator that buffers a day's worth of words. (Dominus's book seems to have had an effect on me.) It's not the nicest answer but one of the easier ones I guess.

        Athanasius's solution is the traditional perl solution for the problem you presented. In other words put the data in a hash (to organize the pivot) and then post process the output.

        If you are trying to leverage some of the databases functionality as you indicated in your second post then you may want to consider $hash_ref = $sth->fetchall_hashref(); rather than the database sort.

        If the goal is to use chapter 4 of Dominus's book to minimize memory overhead then you really just have a recursive problem where you need to identify your base state. (a change of date) You might find the following useful in that case. The data must be pre-sorted by date for this to work. Width formatting is not possible in this output since the data will be printed prior to a test of all rows. That shouldn't be a problem for your suggested goal of web output.

        TIMTOWTDI!

        Update1: changed for to while to honor the iterator concept

        Update2: I just noticed I didn't pick up the last line (no base state test when the query ends. (fixed))

        Update3: fetchrow_hashref is a better fit and can be used with sort

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1000281]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chilling in the Monastery: (8)
As of 2014-09-23 03:21 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (210 votes), past polls