Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

Hi, I'm converting a set of html list tags to output a JSON string. The ugly, painfully-derived code below will produce the required data structure, but only if hard-coded to the correct depth.
It's well beyond my abilities to get this conversion to work recursively but, starting with the @path data below, how would the experts approach this?
1/ Html list example (from nested menu structure):

<li><a href="News-and-Media.html">News and Media</a></li> <li><a href="Partners.html">Partners</a> <ul> <li><a href="Partners_Subpage1.html">Subpage1</a> <ul> <li><a href="Partners_Subpage1_Page1.html">Page1</a> <ul> <li><a href="Partners_Subpage1_Page1_First.html">First</a></li> <li><a href="Partners_Subpage1_Page1_Second.html">Second</a></li> </ul> </li> <li><a href="Partners_Subpage1_Page2.html">Page2</a></li> <li><a href="Partners_Subpage1_Page3.html">Page3</a></li> </ul> </li> <li><a href="Partners_Subpage2.html">Subpage1</a> <ul> <li><a href="Partners_Subpage2_Page1.html">Page1</a> <ul> <li><a href="Partners_Subpage2_Page1_First2.html">First</a></li> <li><a href="Partners_Subpage2_Page1_Second2.html">Second</a></li> </ul> </li> </ul> </li> </ul> </li> etc etc


2/ Required result - JSON format, TITLE: {URL:PROPERTY} nested objects

{ "Site-Map.html":{ "Home":{ "Home.html":"" }, "About":{ "About.html":"" }, "Products":{ "Products.html":"" }, "Solutions":{ "Solutions.html":"" }, "Services":{ "Services.html":"pass","Subpage1":{"Services_Subpage1.htm +l":"" },"Subpage2":{"Services_Subpage2.html":"" } }, "Support":{ "Support.html":"pass" }, "News-and-Media":{ "News-and-Media.html":"" }, "Partners":{ "Partners.html":"","Subpage1":{ "Partners_Subpage1.html": +"","Page1":{ "Partners_Subpage1_Page1.html":"","First":{ "Partners_Su +bpage1_Page1_First.html":"" },"Second":{ "Partners_Subpage1_Page1_Sec +ond.html":"" } },"Page2":{ "Partners_Subpage1_Page2.html":"" },"Page3 +":{ "Partners_Subpage1_Page3.html":"" } },"Subpage2":{ "Partners_Subp +age2.html":"","Page1":{ "Partners_Subpage2_Page1.html":"","First2":{ +"Partners_Subpage2_Page1_First2.html":"" },"Second2":{ "Partners_Subp +age2_Page1_Second2.html":"" } } } }, "Blog2":{ "http://www.etc.uk/index.html":"" }, "Contact-Us":{ "Contact-Us.html":"" }, "Privacy":{ "Privacy.html":"hid" }, "Terms":{ "Terms.html":"hid" }, "Site-Map":{ "Site-Map.html":"hid" } } }


3/ Existing code:

use strict; use warnings; my %PAGES = (); my @path = ( ###get list input from html file into array [["Home"],"\"Home.html\":\"\""], [["About"],"\"About.html\":\"\""], [["Products"],"\"Products.html\":\"\""], [["Solutions"],"\"Solutions.html\":\"\""], [["Services"],"\"Services.html\":\"pass\""], [["Services","Subpage1"],"\"Services_Subpage1.html\":\"\""], [["Services","Subpage2"],"\"Services_Subpage2.html\":\"\""], [["Support"],"\"Support.html\":\"pass\""], [["News-and-Media"],"\"News-and-Media.html\":\"\""], [["Partners"],"\"Partners.html\":\"\""], [["Partners","Subpage1"],"\"Partners_Subpage1.html\":\"\""], [["Partners","Subpage1","Page1"],"\"Partners_Subpage1_Page1.html\":\"\ +""], [["Partners","Subpage1","Page1","First"],"\"Partners_Subpage1_Page1_Fi +rst.html\":\"\""], [["Partners","Subpage1","Page1","Second"],"\"Partners_Subpage1_Page1_S +econd.html\":\"\""], [["Partners","Subpage1","Page2"],"\"Partners_Subpage1_Page2.html\":\"\ +""], [["Partners","Subpage1","Page3"],"\"Partners_Subpage1_Page3.html\":\"\ +""], [["Partners","Subpage2"],"\"Partners_Subpage2.html\":\"\""], [["Partners","Subpage2","Page1"],"\"Partners_Subpage2_Page1.html\":\"\ +""], [["Partners","Subpage2","Page1","First2"],"\"Partners_Subpage2_Page1_F +irst2.html\":\"\""], [["Partners","Subpage2","Page1","Second2"],"\"Partners_Subpage2_Page1_ +Second2.html\":\"\""], [["Blog2"],"\"http://www.etc.uk/index.html\":\"\""], [["Contact-Us"],"\"Contact-Us.html\":\"\""], [["Privacy"],"\"Privacy.html\":\"hid\""], [["Terms"],"\"Terms.html\":\"hid\""], [["Site-Map"],"\"Site-Map.html\":\"hid\""] ); my $out = "{ \"Site-Map.html\":{ "; my %PAGES = (); my @SET = (); my @temp = (); my $c = 0; for my $i(0..$#path){ if( !$PAGES{$path[$i][0][0]} ){$PAGES{$path[$i][0][0]} = [$i];} if( $#{$path[$i][0]} == 0 ){ push @{ $PAGES{$path[$i][0][0]} },"\"$path[$i][0][0]\":{ ".$path[$i][1 +]; #%hh } elsif( $#{$path[$i][0]} == 1 ){ $PAGES{$path[$i][0][0]}[2]{$path[$i][0][1]} = [ $i,"\"$path[$i][0][1]\ +":{ ".$path[$i][1] ]; } elsif( $#{$path[$i][0]} == 2 ){ $PAGES{$path[$i][0][0]}[2]{$path[$i][0][1]}[2]{$path[$i][0][2]} = [ $i +,"\"$path[$i][0][2]\":{ ".$path[$i][1] ]; } elsif( $#{$path[$i][0]} == 3 ){ $PAGES{$path[$i][0][0]}[2]{$path[$i][0][1]}[2]{$path[$i][0][2]}[2]{$pa +th[$i][0][3]} = [ $i,"\"$path[$i][0][3]\":{ ".$path[$i][1] ]; } else { #descend to madness } } foreach my $s( sort { $PAGES{$a}[0] <=> $PAGES{$b}[0] } keys %PAGES ){ $out.= $PAGES{$s}[1]; if( $PAGES{$s}[2] ){ my @s1 = (); foreach my $t( sort { $PAGES{$s}[2]{$a}[0] <=> $PAGES{$s}[2]{$b}[0] } +keys %{ $PAGES{$s}[2] } ){ push @s1,$PAGES{$s}[2]{$t}[1]; if( $PAGES{$s}[2]{$t}[2] ){ my @s2 = (); my $o2 = ""; foreach my $u( sort { $PAGES{$s}[2]{$t}[2]{$a}[0] <=> $PAGES{$s}[2]{$t +}[2]{$b}[0] } keys %{ $PAGES{$s}[2]{$t}[2] } ){ push @s2,$PAGES{$s}[2]{$t}[2]{$u}[1]; if( $PAGES{$s}[2]{$t}[2]{$u}[2] ){ my @s3 = (); my $o3 = ""; foreach my $v( sort { $PAGES{$s}[2]{$t}[2]{$u}[2]{$a}[0] <=> $PAGES{$s +}[2]{$t}[2]{$u}[2]{$b}[0] } keys %{ $PAGES{$s}[2]{$t}[2]{$u}[2] } ){ push @s3,$PAGES{$s}[2]{$t}[2]{$u}[2]{$v}[1]; } if($s3[0]){$o3 = join " },",@s3;$o3.= " }";$s2[-1].= ",".$o3;} } } if($s2[0]){$o2 = join " },",@s2;$o2.= " }";$s1[-1].= ",".$o2;} } } if($s1[0]){$out.= ",".(join " },",@s1)." } },";} } else { $out.= " },"; } } $out =~ s/,$//; $out.= " } }"; print $out;

In reply to recursive alternative for nested array to JSON script? by davepilbeam

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others about the Monastery: (7)
As of 2024-03-19 11:48 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found