PerlMonks  

Re: appending a unique marker to each url in a file

by Cubes (Pilgrim)
on Aug 08, 2001 at 08:12 UTC ( [id://102968]=note )


in reply to appending a unique marker to each url in a file

As wonderful as regexes are, sometimes they're more trouble than they're worth. The snippet below will do what you want, regardless of whether your links start with http or not. It won't do the right thing if the href targets aren't quoted, or if you have an <a> tag without an href followed by some other tag with an href before the next link, but this is the 5-minute version.

$pos = 0;
# defined() keeps a marker such as "0" or "" from ending the loop early
while (defined($m = shift @markers)) {
    # locate the beginning of the link
    last if (($pos = index $htmlfile, '<a', $pos) < 0);
    # ...then the start of the link's href
    last if (($pos = index $htmlfile, 'href="', $pos) < 0);
    # skip past the first "
    $pos += 6;
    # ...then the end of the quoted href target
    last if (($pos = index $htmlfile, '"', $pos) < 0);
    # insert the marker just before the closing quote
    substr($htmlfile, $pos, 0) = $m;
}

At the end, $pos will be -1 if you ran out of links before you ran out of markers; the marker that found no link is still in $m, and any others are still in @markers. If $pos is not -1, do one more index looking for <a and/or href=, starting from $pos. If it hits (i.e., does not return -1), you ran out of markers before all of the links were done. If it does return -1, your links and @markers matched up perfectly.
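Here is a rough sketch of that end-of-loop check, wrapped up with the loop so it can be run on its own. The sub name append_markers, the status strings, and the sample page are made up for illustration and are not part of the snippet above; it also assumes double-quoted, lowercase href attributes, same as the 5-minute version.

```perl
use strict;
use warnings;

# Hypothetical helper: inserts one marker before the closing quote of each
# double-quoted href, then reports how the link and marker counts compared.
sub append_markers {
    my ($html, @markers) = @_;
    my $pos = 0;

    while (defined(my $m = shift @markers)) {
        # same three index() steps as the loop in the post
        last if ($pos = index($html, '<a', $pos)) < 0;
        last if ($pos = index($html, 'href="', $pos)) < 0;
        $pos += 6;
        last if ($pos = index($html, '"', $pos)) < 0;
        substr($html, $pos, 0) = $m;
    }

    my $status;
    if ($pos < 0) {
        # a marker was shifted off but no link was found for it
        $status = 'markers left over';
    }
    else {
        # markers ran out; look for one more link past the last insertion
        my $next = index($html, '<a', $pos);
        $next = index($html, 'href="', $next) if $next >= 0;
        $status = $next >= 0 ? 'links left over' : 'matched';
    }
    return ($html, $status);
}

# Sample input, invented for this example
my $page = '<a href="one.html">1</a> <a href="/two">2</a>';
my ($out, $status) = append_markers($page, '?id=a', '?id=b');
print "$out\n$status\n";
```

Calling it with only one marker on the same page should come back as 'links left over', and with three markers as 'markers left over'.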

Update: Whoops, my ending logic was broken (it's fixed now). The final index check has to be done if $pos is not -1, not just if there's anything left in @markers as I originally stated.
