PerlMonks  


So here is the XML::Twig version (warning: not tested, I can't test it this week; let's see how many bugs you find in there!)

#!/bin/perl -w
use strict;
use XML::Twig;

my $MAIN_INDEX   = "links_main.html"; # main index, linked to categories
my $INDEX_SUFFIX = "_links.html";     # used to generate the various files per category
my $MAIN_TITLE   = "My links";        # title for the main index
my $INDEX_TITLE  = "Links for %s";    # printf format for low level index titles

my $twig= XML::Twig->new;
$twig->parsefile( './links.xml');           # load the xml doc in memory
my @link= $twig->root->children( 'link');   # the link elements are children of the root

# first let's get the categories
my %category;
$category{$_->att( 'category')}++ foreach (@link);

# put the categories in an array, sorted by number of links in descending order
# (ties are broken by name, so the link sort below yields the same order)
my @category= sort { $category{$b} <=> $category{$a} || $a cmp $b } keys %category;

# generate the main link page
open( MAIN, '>', $MAIN_INDEX) or die "$0 cannot open $MAIN_INDEX: $!";
# I know I coulda used CGI.pm...
print MAIN qq{<html><head><title>$MAIN_TITLE</title></head>
<body><h1>$MAIN_TITLE</h1>
<ul>
};
foreach my $category (@category)
  { printf MAIN qq{<li><a href="%s">%s<small> (%s links)</small></a></li>\n},
           category_file( $category), $category, $category{$category};
  }
print MAIN qq{</ul></body></html>};
close MAIN;

# now let's create the categories
# it will be easier if we sort the links by category,
# in the same order as the @category list
# Hi [merlyn]!
@link= map  { $_->[2] }
       sort { $b->[0] <=> $a->[0] || $a->[1] cmp $b->[1] }
       map  { my $c= $_->att( 'category'); [ $category{$c}, $c, $_ ] }
       @link;

foreach my $category (@category)
  { my $category_file= category_file( $category);
    open( INDEX, '>', $category_file) or die "$0 cannot open $category_file: $!";
    my $title= sprintf $INDEX_TITLE, $category;
    print INDEX qq{<html><head><title>$title</title></head>
<body><h1>$title</h1>
<ul>
};
    # as the links are ordered we know the links for the
    # current category are at the beginning of @link
    while( @link && $link[0]->att( 'category') eq $category)
      { my $link= shift @link;
        printf INDEX qq{<li><a href="%s">%s</a> %s</li>\n},
               $link->att( 'url'), $link->att( 'name'), $link->att( 'description');
      }
    print INDEX qq{</ul>
<hr><p align="center"><a href="$MAIN_INDEX">$MAIN_TITLE</a></p></body></html>};
    close INDEX;
  }

sub category_file
  { my $category= shift;
    return lc( $category) . $INDEX_SUFFIX;
  }
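
One thing the script does not do is escape the attribute values before writing them into the HTML. An XML parser hands attribute values back with entities already expanded, so a description containing & or < would be printed raw. A minimal sketch of a helper for that, assuming a hypothetical name html_escape (it is not part of XML::Twig):

```perl
use strict;
use warnings;

# hypothetical helper, not part of XML::Twig: escape the characters that
# are special in HTML text and in double-quoted attribute values
sub html_escape
  { my $text= shift;
    return '' unless defined $text;
    my %entity= ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text=~ s/([&<>"])/$entity{$1}/g;    # single pass, so nothing is escaped twice
    return $text;
  }

print html_escape( 'Q&A for <perl> "monks"'), "\n";
# prints: Q&amp;A for &lt;perl&gt; &quot;monks&quot;
```

In the script it would wrap each value passed to printf, for instance html_escape( $link->att( 'description')).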

This design does not really allow for a different way of sorting the categories, because the per-category loop relies on the links and the categories being sorted in the same order. You would also need to modify it slightly if you want next/previous index links.
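
If you do need a different category order or next/previous links, one option is to group the links per category first instead of consuming one sorted list. A rough sketch, using plain hashes in place of twig elements; the sample data and the names %links_in, @by_count, @by_name and neighbours are made up for the illustration:

```perl
use strict;
use warnings;

# made-up sample data standing in for the <link> elements
my @link= ( { category => 'Perl', name => 'Site A', url => 'http://example.org/a' },
            { category => 'XML',  name => 'Site B', url => 'http://example.org/b' },
            { category => 'Perl', name => 'Site C', url => 'http://example.org/c' },
            { category => 'News', name => 'Site D', url => 'http://example.org/d' },
          );

# group the links: category name => array of links
my %links_in;
push @{ $links_in{ $_->{category} } }, $_ foreach (@link);

# any order works now, as the grouping does not depend on it
my @by_count= sort { @{ $links_in{$b} } <=> @{ $links_in{$a} } || $a cmp $b } keys %links_in;
my @by_name = sort keys %links_in;

# previous and next category for position $i in an ordered list
sub neighbours
  { my( $i, @category)= @_;
    my $prev= $i > 0          ? $category[$i-1] : undef;
    my $next= $i < $#category ? $category[$i+1] : undef;
    return ( $prev, $next);
  }

foreach my $i (0..$#by_name)
  { my( $prev, $next)= neighbours( $i, @by_name);
    printf "%s: %d link(s), prev: %s, next: %s\n",
           $by_name[$i], scalar @{ $links_in{ $by_name[$i] } },
           defined $prev ? $prev : '-', defined $next ? $next : '-';
  }
```

With the real document the hashes would be the twig elements and $_->{category} would become $_->att( 'category'); the per-category page would then loop over @{ $links_in{$category} } rather than shifting links off a shared list.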

Now the ObNoE (Obligatory Note on Encodings, yes I know it starts like obnoxious ;--). As you seem to have sites from various countries in your link list, I am pretty sure your system will break as soon as you include an accented description. If you have accented characters in a non-UTF-8 encoding in your original XML file (most likely latin1, aka ISO-8859-1, if my memory serves me well; that's what most Western sites use), you will have to add an XML declaration at the top of your document (something like <?xml version="1.0" encoding="ISO-8859-1"?>). This also means that you will not be able to mix encodings (like having a link to a Japanese site with a Shift-JIS encoded description). The output will be UTF-8 encoded. I hope your browser can display it; otherwise you will have to convert everything back to whatever your favourite encoding is, or use the KeepEncoding option when you create the XML::Twig object (if you are using a 1-byte encoding like latin1). Welcome to the beautiful world of XML encoding!
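
To make the byte-level difference concrete, here is a small sketch using the core Encode module, independent of XML::Twig. The same accented word takes one byte per character in latin1 and more in UTF-8, and converting UTF-8 output back is a decode followed by an encode. The sample word and the variable names are just for illustration:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $latin1_bytes= "caf\xE9";                             # "cafe" with e-acute, as latin1: 4 bytes
my $text        = decode( 'ISO-8859-1', $latin1_bytes);  # now a string of 4 characters
my $utf8_bytes  = encode( 'UTF-8', $text);               # the e-acute becomes 2 bytes: 5 in all

printf "latin1: %d bytes, UTF-8: %d bytes\n", length( $latin1_bytes), length( $utf8_bytes);
# prints: latin1: 4 bytes, UTF-8: 5 bytes

# converting UTF-8 output back to latin1
my $back= encode( 'ISO-8859-1', decode( 'UTF-8', $utf8_bytes));
print "round trip ok\n" if $back eq $latin1_bytes;
```

This only works for characters that exist in the target encoding, which is why the 1-byte route is of no help with a Shift-JIS description.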


In reply to Re: XML Manipulation by mirod
in thread XML Manipulation by larsen
