Cody Pendant has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks,

I've been working on an outliner idea, a cgi-based way of organising documents into sections, subsections and so on.

(Note: this is not a work project, just a good way to get myself involved in some interesting perl code, so I'm not looking for a magic bullet or a module, I want to ask for your thoughts)

Here's my idea: the sections, (using a tied hash) will be "numbered" using the "Latin" or "alphadeximal" format I was asking about in another thread. Does AA..ZZ have a name?

So the first section would be section AA, the second section, section AB.

If AA has sub-sections, they are AAAA, AAAB, AAAC and so on.

If they in turn have subsections, they are AAAAAA, AAAAAB, and so on.

This has the advantage that you can sort all sections into their right order, that you can determine the level by length, gives you unlimited sub-levels and a large number of sections before you run out of "digits". I figured this system would be easier than using numbers for those reasons. You can also get all the contents of section AA by grepping the fact that they begin "AA" and so on.

So, the next subroutine I need to write, having got display out of the way, is a renumbering routine for when a section is deleted or inserted.

If I insert a section after AA, that's easy, because it's top level.

A new section after AA is section AB, so everything AB and after in alphabetical order, gets its first two chars incremented by one. The old AB becomes AC and so on.

(And can I just say how useful it is that you can assign to and even increment a substring:

$string = 'ABAA'; substr($string,0,2)++;
produces 'ACAA')

Now I realise I've got to figure out the algorithm that will push a sub-section, and all its sub-subsections, down one level. I've just been staring at it too long.

I've got as far as "sections which begin with the character string, or a character string after that character string in alpha order, are of equal or greater length, and come after it in alpha order" -- will a "@need_updating = grep {conditions} keys(%hash)" for those conditions produce the right list of the affected sections?

Any thoughts gratefully received.

($_='jjjuuusssttt annootthheer pppeeerrrlll haaaccckkeer')=~y/a-z//s;print;

Replies are listed 'Best First'.
(jeffa) Re: Outliner Algorithm Ideas?
by jeffa (Bishop) on May 10, 2002 at 03:32 UTC
    "And can I just say how useful it is that you can assign to and even increment a substring..."

    Amen brother! But have your tried to DEcrement a string?

    $ perl -le '$_="ABAA";substr($_,0,2)--;print'
    Whoops! Need to do some more magic to get that work, because i do believe you will need to decrement your sections when you delete. If you don't need to delete or don't care about moving AE to AD and AD to AC because AC was deleted, then never mind. :)

    Here is what i have so far - adding sections is a breeze, but deleting is always tougher. This is by no means a perfect solution, just 'food for thought':

    use strict; my $hash = { AA => { AA => {}, AB => {}, AC => { AA => { AA => {}, }, }, }, }; add($hash); add($hash->{AB}); add($hash->{AB}->{AA}); print_me($hash); add($hash); add($hash->{AC}) for (0..5); add($hash->{AC}->{AC}); print_me($hash); remove($hash->{AC},'AC'); print_me($hash); remove($hash,'AB'); print_me($hash); sub add { my $ref = shift; my $new = (sort keys %$ref)[-1]; if ($new) { $ref->{++$new} = {}; } else { $ref->{'AA'} = {}; } } sub remove { my $ref = shift; my $sec = shift; delete $ref->{$sec}; for (sort keys %$ref) { # skip higher section next if ord(substr($_,1,1)) < ord(substr($sec,1,1)); my $new = substr($_,0,1) . chr(ord(substr($_,1,1))-1); my %new = %{$ref->{$_}}; $ref->{$new} = \%new; delete $ref->{$_}; } } sub print_me { my $ref = shift; my $ind = shift || ''; for (sort keys %$ref) { print "$ind$_\n"; print_me($ref->{$_},$ind."\t") if ref $ref->{$_}; } }
    Too bad you can't just change the actual name of a key, life would be easier (thanks to Zaxo and stephen for confirming this for me). Any monks that know how to improve my remove subroutine, please share!! Corrections welcome too. :)

    Final thoughts - use a relational database! But since this is just for fun and education, i wish you God speed! ;)

    Had some warnings - code fixed to prevent these, text changed to remove mention of them.


    (the triplet paradiddle with high-hat)
Re: Outliner Algorithm Ideas?
by stephen (Priest) on May 10, 2002 at 02:50 UTC

    I would suggest that you store the items in an arrayref of hashrefs containing arrayrefs containing hashrefs, like so:

    # Represents # AA Sections are cool # AB They are extremely cool # ABAA That's right, really cool # ABAB Again, really cool my $sections = [ {content => 'Sections are cool' }, {content => 'They are extremely cool', subsecs => [ { content => "That's right, really cool" +}, { content => "Again, really cool" }, ] } ];
    Then, you can print the sections out with something vaguely resembling this
    recurse_section( $sections ); sub recurse_section { my ($sections, @vaportrail) = @_; push @vaportrail, 'AA'; foreach my $section (@$sections) { print @vaportrail, " ", $section->{'content'}, "\n"; recurse_section($section->{subsecs}, @vaportrail) if defined $section->{subsecs}; $vaportrail[ $#vaportrail ]++ } }
    That way, your model (the section data) doesn't get entangled with the view (what the numbering scheme looks like). Then promoting a section is as easy as a little data manipulation:
    # Add a supersection above "They are extremely cool" my $moving_section = $sections->[1]; $sections->[1] = { content => 'And moving sections is easy too!', subs +ecs => [ $moving_section ] }; recurse_section( $sections );

    You can tie your hashtable to a subroutine that translates the ABAAAC business to a list of array indices. That I'll leave to you. :)


Re: Outliner Algorithm Ideas?
by graff (Chancellor) on May 10, 2002 at 04:18 UTC
    The tricky part here is not pushing a section and its subsections down a level -- this just amounts to splicing in a couple extra "digits" on a set of these keys, and that's easy -- e.g.:
    sub pushSection { my ( $hashref, @secNames ) = @_; my $base = (sort @secNames)[0]; $baselen = length $base; my $newBgn = "AA"; my $grex = join( '|', @secNames ); @pushees = sort grep( /^($grex)/, keys %$hashref ); foreach $i (0 .. $#pushees) { $_ = $pushees[$i]; my $txt = $hashref->{$_}; delete $hashref->{$_}; s/^($grex)(.*)/$base$newBgn$2/; $hashref->{$_} = $txt; $newBgn++ if ( $i < @pushees && length($pushees[$i+1]) == $bas +elen ); } } chomp( @sections = <DATA> ); foreach (@sections) { next unless (/\S/); # not sure why I needed this $outline{$_} = "some text for section $_"; } print "\nbefore:\n"; map {print "$_\t$outline{$_}\n"} sort keys %outline; pushSection( \%outline, "AAAB", "AAAC" ); print "\nafter:\n"; map {print "$_\t$outline{$_}\n"} sort keys %outline; __DATA__ AA AAAA AAAAAA AAAB AAAC AAACAA AAAD AAAE
    (There is one subtlety I'm missing there -- see below -- but there's nothing all that tough involved.)

    The tricky part is providing the suitable method to determine what range of sibling-level sections should be included in this sort of push. E.g. in the example provided above, I'm pushing "AAAB" and "AAAC" down together, while the sibling sections around them ("AAAA" and "AAAD") stay where they are; "AAAD" should probably be "renamed" to "AAAC", etc. (this is the subtlety that I'm not covering yet):

    #before :: after AA :: AA AAAA :: AAAA AAAAAA :: AAAAAA AAAB :: AAABAA # need to insert "AAAB" above this line AAAC :: AAABAB AAACAA :: AAABABAA AAAD :: AAAD # need to rename this to "AAAC" AAAE :: AAAE # need to rename this to "AAAD"
    But it's just as likely that on another occasion, starting with the same structure, the push would need to include AAAB, AAAC, and AAAD (and AAAE is now renamed to AAAC); and so on. Once you have a way of controlling (or discovering) the scope of a push, the actual mechanics of it won't be all that tough.

    The part about "inserting" a parent node and "renaming" subsequent sections after a push has left fewer siblings "prior to" a given node is something I would tend to do as a "sanity pass" after the push -- go through the sorted set of section names at each level to look for gaps, and "decrement" subsequent names at the affected level to eliminate such gaps.

    Update: fixed wording in last paragraph. Damn shame that decrementing strings is not so simple (thanks, jeffa)... but not surprising. What would be the "right" result for '$string--' when the initial value is "AA"? (Of course, this "shouldn't happen" in this sort of outline system, but it's the sort of detail that has to be dealt with.)