Category: | Web stuff |
Author/Contact Info | Briac Pilpré |
Description: | As strange as it seems, I couldn't find any code here that cleanly modifies the HREF attributes of A start tags in an HTML page. So here's one that I whipped up quickly to answer a question on fr.comp.lang.perl. It could surely be extended to cover other links (<script>, <img src="...">, etc.), but you get the idea... The only (slight) caveats are that the 'a' start tag is always lowercased and the order of the attributes is lost, but that should not matter at all. To use this script, set the $new_link variable, then call the script with the URL of the page to be modified. Every <a href="..."> will have $new_link prepended to its href, and the old URL will be properly escaped. It is probably useless as is, but with a minimum of tweaking you can easily make it do what you want. |
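    #!/usr/bin/perl -w
    use strict;
    use LWP::Simple qw(get);
    use HTML::Parser;
    use HTML::Entities qw(encode_entities);
    use URI::Escape;

    # Prefix prepended to every link; the original URL is appended, URI-escaped.
    my $new_link = "http://www.baz.com/cgi-bin/doubleclick.cgi?url=";

    my $url  = $ARGV[0] or die "usage: $0 http://www.foo.com/bar.html\n";
    my $file = get($url) or die "Cannot get the page '$url'\n";

    # Everything that is not a start tag is printed unchanged.
    my $parser = HTML::Parser->new(
        default_h => [ sub { print shift }, 'text' ],
        start_h   => [ \&modify_link, 'tagname, attr, text' ],
    );
    $parser->parse($file);
    $parser->eof;    # flush any text the parser is still buffering

    sub modify_link {
        my ( $tagname, $attr, $text ) = @_;

        # Leave other tags, and anchors without an href (e.g. <a name="...">), untouched.
        if ( $tagname ne 'a' or !defined $attr->{href} ) {
            print $text;
            return;
        }

        $attr->{href} = $new_link . uri_escape( $attr->{href} );

        # Attribute values arrive entity-decoded, so re-encode them on output.
        print '<a',
          ( map { ' ' . $_ . '="' . encode_entities( $attr->{$_}, '<>&"' ) . '"' } keys %$attr ),
          '>';
    }
    __END__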
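The description mentions extending this to other links such as <script> and <img src="...">. Here is a rough sketch of how that might look, not a tested drop-in replacement. The %link_attr table and the rewrite_links name are my own assumptions for illustration, and the redirector URL is the same placeholder as in the script. It also asks HTML::Parser for 'attrseq', so the attributes come back in their original order, and it returns the rewritten HTML as a string instead of printing it.

```perl
use strict;
use warnings;
use HTML::Parser ();
use HTML::Entities qw(encode_entities);
use URI::Escape qw(uri_escape);

# Placeholder redirector prefix, as in the original script.
my $new_link = "http://www.baz.com/cgi-bin/doubleclick.cgi?url=";

# Hypothetical table: tag name => attribute that carries the URL.
my %link_attr = ( a => 'href', img => 'src', script => 'src' );

# Return a copy of $html in which the URL attribute of every listed tag
# is routed through $new_link.
sub rewrite_links {
    my ($html) = @_;
    my $out = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Text, end tags, comments, etc. are copied through unchanged.
        default_h => [ sub { $out .= shift }, 'text' ],
        start_h   => [
            sub {
                my ( $tagname, $attr, $attrseq, $text ) = @_;
                my $name = $link_attr{$tagname};

                # Unknown tag, or the URL attribute is absent: keep the source text.
                unless ( defined $name && defined $attr->{$name} ) {
                    $out .= $text;
                    return;
                }

                $attr->{$name} = $new_link . uri_escape( $attr->{$name} );

                # Rebuild the tag in source order, re-encoding the decoded values.
                $out .= "<$tagname"
                  . join( '',
                    map { ' ' . $_ . '="' . encode_entities( $attr->{$_}, '<>&"' ) . '"' }
                      @$attrseq )
                  . '>';
            },
            'tagname, attr, attrseq, text'
        ],
    );

    $parser->parse($html);
    $parser->eof;
    return $out;
}

print rewrite_links('<a href="http://www.foo.com/bar.html">bar</a> <img src="/logo.png">'), "\n";
```

The tag name is still lowercased on output. I have not checked how this behaves with XHTML-style self-closing tags (<img ... />), so treat that case as unhandled.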
Replies are listed 'Best First'.
(crazyinsomniac) Re: HTML Link Modifier
by crazyinsomniac (Prior) on Dec 24, 2001 at 07:53 UTC
Re: HTML Link Modifier
by dmitri (Priest) on Dec 28, 2001 at 00:46 UTC