http://www.perlmonks.org?node_id=604767

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am having a problem closing nested section tags at the right level. Can somebody help me out? I am new to this group.

Input:

<h1>Heading level 1
<h2>Heading level 2
<h3>Heading level 3
<h2>Heading level 2
<h1>Heading level 1
paragraph here
paragraph here

Output:

<section1>
<head>Heading level 1</head>
<section2>
<head>Heading level 2</head>
<section3>
<head>Heading level 3</head>
</section3></section2>
<section2>
<head>Heading level 2</head>
</section2>
</section1>
<section1>
<head>Heading level 1</head>
paragraph here
paragraph here
</section1>

Your help will be very useful for me.

Replies are listed 'Best First'.
Re: nested level section closing
by GrandFather (Saint) on Mar 14, 2007 at 11:14 UTC

    Use a tool designed for the job. It looks like you may be generating XML, perhaps by parsing some lightly marked up, sorta HTML, source. So let's go with that for a moment and see what can be done:

    use strict;
    use warnings;
    use XML::Twig;

    my $twig = new XML::Twig (pretty_print => 'record');
    my $root = new XML::Twig::Elt ('root');

    $twig->set_root ($root);

    my @pendingSections;

    while (<DATA>) {
        chomp;
        if (/<h(\d+)>(.*)/i) {
            # deal with a header line
            # First pop any pending sections that need to be closed
            pop @pendingSections while @pendingSections > $1;

            while (@pendingSections < $1) {
                # Add section elements out to current header level
                my $level = @pendingSections + 1;
                my $elt = new XML::Twig::Elt ("section$level");

                if ($level == 1) {
                    $elt->paste (last_child => $root);
                } else {
                    $elt->paste (last_child => $pendingSections[-1]);
                }
                push @pendingSections, $elt;
            }

            my $headElt = new XML::Twig::Elt ("head", $2);
            $headElt->paste (last_child => $pendingSections[-1]);
        } else {
            # It's a paragraph
            if (@pendingSections) {
                $pendingSections[-1]->suffix ("$_\n");
            } else {
                $root->suffix ("$_\n");
            }
        }
    }

    $twig->print ();

    __DATA__
    <h1>Heading level 1
    <h2>Heading level 2
    <h3>Heading level 3
    <h2>Heading level 2
    <h1>Heading level 1
    paragraph here
    paragraph here

    Prints:

    <root>
      <section1>
        <head>Heading level 1</head>
        <section2>
          <head>Heading level 2</head>
          <section3>
            <head>Heading level 3</head>
          </section3>
          <head>Heading level 2</head>
        </section2>
        <head>Heading level 1</head>paragraph here
    paragraph here
      </section1>
    </root>

    which isn't quite what you showed (especially the root element), but does do the parsing/rendering magic it seems you may be after.
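    For comparison, here is a minimal sketch that uses no modules and is not taken from any reply in this thread. The sub name nest_sections is hypothetical. It assumes one heading or paragraph per input line, and it closes every open section whose level is greater than or equal to the new heading's level, which is what gives sibling headings their own sections as in the requested output. It does not escape text for XML, and it does not create intermediate sections when a heading level is skipped (for example h1 followed directly by h3).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turns heading lines into
# nested <sectionN> markup using a stack of open section levels.
sub nest_sections {
    my @lines = @_;
    my @open;        # levels of the sections that are currently open
    my $out = '';

    for my $line (@lines) {
        if ($line =~ /^<h(\d+)>\s*(.*?)\s*$/i) {
            my ($level, $title) = ($1, $2);

            # Close siblings and deeper sections before opening a new one
            while (@open && $open[-1] >= $level) {
                $out .= '</section' . pop(@open) . ">\n";
            }

            push @open, $level;
            $out .= "<section$level>\n<head>$title</head>\n";
        }
        else {
            # Paragraph text passes through unchanged
            $out .= "$line\n";
        }
    }

    # Close whatever is still open at the end of the input
    $out .= '</section' . pop(@open) . ">\n" while @open;

    return $out;
}

print nest_sections(
    '<h1>Heading level 1',
    '<h2>Heading level 2',
    '<h3>Heading level 3',
    '<h2>Heading level 2',
    '<h1>Heading level 1',
    'paragraph here',
    'paragraph here',
);
```

    The only real decision is the >= comparison when popping the stack: using > instead keeps a same-level section open, so a second heading at that level lands inside the existing section rather than starting a new one.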


    DWIM is Perl's answer to Gödel
Re: nested level section closing
by shmem (Chancellor) on Mar 14, 2007 at 11:50 UTC
    "Your help will be very useful for me."
    Not enough information to help, really. See "I know what I mean. Why don't you?"

    • What transformation rules do you follow to get the output from the input?
    • Where is the code that does the transformation?
    • What output is expected?

    Please post more information.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: nested level section closing
by ww (Archbishop) on Mar 14, 2007 at 12:10 UTC
    AM --
    As you can perhaps see from the answers above, you've handed us a puzzle, not a comprehensible question.

    What is your question? Where is your code?

    Please see "How do I post a question effectively?" for invaluable hints (and, hint: that should be "mandatory reading!") for SoPW.

Re: nested level section closing
by Samy_rio (Vicar) on Mar 14, 2007 at 11:50 UTC

    Hi Anonymous Monk, try it like this, using regular expressions:

    use strict;
    use warnings;

    my $input = do{local $/; <DATA>};

    $input =~ s/(<(h\d+>)[^\n]+)/$1<\/$2/gsi;
    $input =~ s/(<h(\d+)[^>]*>)(.*?)(?=(<h(\d+)[^>]*>))/"$1".&section_close($2,$3,$5)/egsi;
    $input =~ s/(<(h\d+>)){2,}/$1/gsi;

    ######Last level
    if ($input =~/<h./si)
    {
        if ($input =~/(<h\d+[^>]*>)(.*)(<h(\d+)[^>]*>)(.*)$/si)
        {
            $input =~s/(<h\d+[^>]*>)(.*)(<h(\d+)[^>]*>)(.*)$/"$1$2$3".&section_close($4,$5,1)/egsi;
        }
        else
        {
            $input =~s/(.*)(<h(\d+)[^>]*>)(.*)$/"$1$2".&section_close($3,$4,1)/egsi;
        }
    }

    ############Heading replacement
    $input =~ s/(<h(\d+)>)((?:(?!<\/h\2>).)*)<\/h\2>/$1<head>$3<\/head>/gsi;
    $input =~ s/(<\/?)h(\d+>)/$1section$2/gsi;

    print $input;

    sub section_close
    {
        my ($csect_no,$aft_txt,$asect_no)=@_;
        my $tag_close;
        if ($csect_no == $asect_no)
        {
            $tag_close="$aft_txt<\/h$csect_no>\n"
        }
        if ($csect_no < $asect_no)
        {
            my $j = $asect_no - $csect_no;
            my $i = $csect_no;
            my $temp = "";
            while ($j > 1)
            {
                my $k = $i + 1;
                $temp = $temp."<h$k>\n";
                $i++;
                $j--;
            }
            $tag_close = "<h$csect_no>".$aft_txt.$temp;
        }
        if ($csect_no > $asect_no) #head separation
        {
            my $temp = "";
            my $i = $asect_no;
            for ($i = $asect_no; $i <= $csect_no; $i++)
            {
                $temp = "<\/h$i>\n".$temp;
            }
            $tag_close = $aft_txt.$temp;
        }
        return $tag_close;
    }

    __DATA__
    <h1>Heading level 1
    <h2>Heading level 2
    <h3>Heading level 3
    <h2>Heading level 2
    <h1>Heading level 1
    paragraph here
    paragraph here

    Output:
    -------
    <section1><head>Heading level 1</head>
    <section2><head>Heading level 2</head>
    <section3><head>Heading level 3</head>
    </section3>
    </section2>
    <section2><head>Heading level 2</head>
    </section2>
    </section1>
    <section1><head>Heading level 1</head>
    paragraph here
    paragraph here
    </section1>

    Regards,
    Velusamy R.


    eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';

Re: nested level section closing
by marto (Cardinal) on Mar 14, 2007 at 10:25 UTC