Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation

Re: XML Manipulation

by mirod (Canon)
on May 22, 2001 at 12:17 UTC ( #82185=note: print w/replies, xml ) Need Help??

in reply to XML Manipulation

So here is the XML::Twig version (warning: not tested, I can't this week, let's see how many bugs you find there!)

#!/bin/perl -w use strict; use XML::Twig; my $MAIN__INDEX = "links_main.html"; # main index, linked to categori +es my $INDEX_SUFFIX = "_links.html"; # used to generate the various f +iles per category my $MAIN_TITLE = "My links"; # Title for the main index my $INDEX_TITLE = "Links for %s"; # printf format for low level in +dex titles my $twig= new XML::Twig); $twig->parsefile( './links.xml'); # load the xml doc in memory my @link= $twig->children( 'link'); # first lets get the categories my %categories; $category{$_->att( 'category')++} foreach (@link); # put the categories in an array, sorted by number of links in descend +ing order my @category= sort { $category{$b} <=> $category{$a} } keys %category # generate the main link page open( MAIN, ">$MAIN_INDEX") or die "$0 cannot open $MAIN_INDEX: $!"; # I know I coulda used print MAIN qq{<html><head><title>$MAIN_TITLE</title></head> <body><h1>$MAIN_TITLE</h1> <ul>}; foreach my $category (@category) { print MAIN qq{<a href="%s"><li>%s<small> (%s links})</small></a></ +li>}, category_file( $category), $category, $category{$category; } print MAIN qq{</ul></body></html>}; close MAIN; # now let's create the categories # it will be easier if we sort he links by category, # in the same order as the @category list # Hi [merlyn]! @links= map {$_->[1] } sort { {$b->[0] <=> $a->[0] } map { [ $category{$_->att( 'category')}, $_ ] } @link; foreach my $category (@category) { my $category_file= category_file( $category); open( INDEX, ">$category_file") or die "$0 cannot open $category_file: $!"; my $title= sprintf $INDEX_TITLE, $category; print INDEX qq{<html><head><title>$title</title></head> <body><h1>$title</h1> <ul>}; # as the links are ordered we know the links for the # current category are at the beginning of @link my $link= shift @link; while( $link->att( 'category') eq $category) { printf INDEX qq{<li><a href="%s">%s</a> %desc</li>\n", $link->( 'url'), link->( 'name'), $link->att( 'description'); $link= shift @link; } print INDEX qq{ <hr><p align="center"><a href="$MAIN_INDEX">$MAIN_ +TITLE</a></p></body></html>}; close INDEX; } sub category_file { my $category= shift; return lc( $category) . $INDEX_SUFFIX; }

This design does not really allow for a different way of sorting the categories, you would also need to modify it slightly if you want to have next/previous index links.

Now the ObNoE (Obligatory Note on Encodings, yes I know it starts like obnoxious ;--). As you seem to have sites from various countries in your link list, I am pretty sure your system will break as soon as you include an accented description: if you have accented characters in a non-UTF-8 encoding (most likely latin1, aka ISO-8859-1 if my memory serves me well, that's what most Western sites use) in your original XML file you will have to add an XML declaration at the top of your document (something like <?xml version="1.0" encoding="ISO-8859-1"?>). This also means that you will not be able to mix encodings (like getting a link to a Japanese site with a shift-JIS encoded description). The output will be UTF-8 encoded, I hope your browser can display it, otherwise you will have to convert everything back to whatever your favourite encoding is, or use the KeepEncoding option when you ceate the XML:Twig object (if you are using a 1-byte encoding like latin1). Welcome to the beautiful world of XML encoding!

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://82185]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (3)
As of 2020-12-02 01:02 GMT
Find Nodes?
    Voting Booth?
    How often do you use taint mode?

    Results (26 votes). Check out past polls.