http://www.perlmonks.org?node_id=1062969


in reply to Need help with filling a complicated data structure

While you already have two good suggestions, for some reason I couldn't stop wondering how I would have done this, other than by doing it. I am intrigued by the difference in style. There are so many ways to do things...

Compared to the others, this is verbose, and I am undecided whether this helps or hinders understanding. Perhaps it depends on what one knows and how one thinks. I wouldn't offer it, except that it does produce the 'inlist' attributes.

use strict; use warnings; use Data::Dumper::Concise; # Lists in the input are prefixed by indications of their level # and type: # # * is an unordered list # # is an ordered list # #3 a number indicates the value # # Lists are contained in 'inlist' structures. # # An inlist structure is an array of two elements: Type and # Content. # # Type is 'o' for an ordered list or 'u' for an unordered list. # # Content is an array of items in the list. # # The elements of the Contents array are either text or array # refs. # # If the list item has no attributes, it is present as text. # # If the list item has attributes, it is present as reference to # an array of two elements: the text of the element and a # reference to a hash of attributes. # # # Lists may be nested to arbitrary depth. A nested list is contained i +n # the value of the 'inlist' attribute of its parent list item. # my $lists = [ new_inlist() ]; # Stack is an array of pointers to inlist structures. Each inlist # structure contains the elements of a list. The stack grows as # items are added to more deeply nested lists. # # # # Bottom of stack is the current level 1 list. # # Every other element of the stack is a pointer to the inlist # structure containing the current list at some level of nesting. # # The top of stack is a pointer to the inlist structure for the # most deeply nested, current list. # # my $stack = [ $lists->[-1] ]; while(my $line = <DATA>) { chomp($line); if($line =~ m/^\s*$/) { unless( @$stack == 1 and # level 1 @{$stack->[-1]->[1]} == 0 # with no contents ) { push(@$lists, new_inlist()); $stack = [ $lists->[-1] ]; } } else { parse_line_and_add_to_list($stack, $line); } } print Dumper($lists); exit(0); # new_inlist returns an empty inlist data structure # sub new_inlist { return([ undef, [] ]); } # parse_line_and_add_to_list parses an input line into level, # type, value and text, then adds an item to the appropriate list, # according to level, creating sub-lists as necessary. # sub parse_line_and_add_to_list { my ($stack, $line) = @_; my ($pre, $value, $text) = $line =~ m/^([*#]*)([^ ]*)? (.*)/; my $level = length($pre); my $type_marker = substr($pre, -1); my $type = { '*' => 'u', '#' => 'o', }->{$type_marker}; die "unknown list type marker $type_marker" unless($type); while($level < @$stack) { pop(@$stack); } while($level > @$stack) { start_sub_list($stack, $type); } my $item = length($value) ? [ $text, { value => $value } ] : $text; my $inlist = $stack->[-1]; $inlist->[0] = $type unless(defined($inlist->[0])); die "inconsistent type on list element" unless($inlist->[0] eq $type); push(@{$inlist->[1]}, $item); } # start_sub_list adds a sub-list to the last item in the list at # the top of the stack. sub start_sub_list { my ($stack, $type) = @_; # Top of stack points to the innermost inlist structure my (undef, $contents) = @{$stack->[-1]}; my $last_item = $contents->[-1]; $last_item = [ $last_item, {} ] unless(ref($last_item) eq 'ARRAY'); my $attributes = $last_item->[1]; die "Attempt to initialize sub-list on an item with a sub-list" if(exists($attributes->{inlist})); $attributes->{inlist} = [ $type, [] ]; $contents->[-1] = $last_item; push(@$stack, $attributes->{inlist}); } __DATA__ * list 1 unordered item 1 * list 1 unordered item 2 *# list 1 unordered item 2 ordered item 1 *# list 1 unordered item 2 ordered item 2 *# list 1 unordered item 2 ordered item 3 * list 1 unordered item 3 ** list 1 unordered item unordered item 1 ** list 1 unordered item unordered item 2 ** list 1 unordered item unordered item 3 **# list 1 unordered item unordered item 3 ordered item 1 **# list 1 unordered item unordered item 3 ordered item 2 **# list 1 unordered item unordered item 3 ordered item 3 # list 2 ordered item 1 #3 list 2 ordered item 2 # list 2 ordered item 3 #* list 2 ordered item 3 unordered item 1 #* list 2 ordered item 3 unordered item 2 #* list 2 ordered item 3 unordered item 3

Replies are listed 'Best First'.
Re^2: Need help with filling a complicated data structure
by Lady_Aleena (Priest) on Nov 17, 2013 at 16:48 UTC

    ig, this looks like a lot of work, but I am not sure how to incorporate it into the story subroutine. Please see Re^2: Need help with filling a complicated data structure for more, because I can't figure out how to incorporate any of these solves sadly. My weekend has been a wash when it comes to anything I write working.

    No matter how hysterical I get, my problems are not time sensitive. So, relax, have a cookie, and a very nice day!
    Lady Aleena

      As you iterate over the lines in a section, you recognize the lines which represent list items, but pass each, separately, to sub item, never producing the data structure you described and never calling sub list.

      What you need to do is detect the entire set of lines that represents a list and pass the set to one of the routines that have been suggested, then pass the resulting data structure to your list sub.

      A start might be to redo the loop over lines in a section to facilitate processing groups of lines and then process all the lines of a list together. Maybe something like the following, untested suggestion might work for you.

      for(my $lineno = 0; $lineno < @$section; $lineno++) { my $line = $section->[$lineno]; if($line =~ m/^\[*#]/) { # This line is the start of a list my $start = $lineno; # Where is the end? my $end = $lineno; $end++ while($section->[$end+1] =~ /^[\*#]/); # Get all the lines that are part of the list my @list_lines = @{$section}[$start..$end]; # Put one of the suggested answers to your initial # request into a subroutine and pass it the set of lines # that represents a single list, getting back the data # structure you requested in your original post. my $internal = parse_list(@list_lines); # Extract the list type and contents from the data # structure my ($type, $list) = @$internal; my $opt = '???'; # Where should $opt come from? # And pass these to the list sub to produce the list. list($tab, $type, $list, $opt); # All the lines of the list have been dealt with. Move # the line number (index into @$section) to the end of # the list then carry on to process the rest of the # section. $lineno = $end; } else { # This line is something other than a list line $line = convert_string($section->[$lineno], $line_magic); line($tab, $line), next if $line =~ /^</; line($tab, "<$line>"), next if $line =~ /^[bh]r$/; $doc_magic->{$1}->(), next if $line =~ /^&\s+(.*)/; blockquote($tab, $1), next if $line =~ /^bq\s(.*)/; row($tab + 1, $1, row_line($2)), next if $line =~ /^\|\s(.+)\s +\|\|(.+)$/; heading($tab, $1, $2, { id => idify($2) }), next if $line = +~ /^([1-6])\s+(.*)/; paragraph($tab, $line, { class => 'author' }), next if $line = +~ /^by /; paragraph($tab, $line); } }