Re: HTML document modification

by BrowserUk (Patriarch)
on May 28, 2004 at 02:04 UTC


in reply to HTML document modification

Maybe you could read the file backwards and replace only the last (first :) occurrence of </body>.

If your html is well formed that should be fairly foolproof.

Perhaps, rather than reading backwards line by line, you could read just the last couple of hundred bytes and then apply the regex:

s[(</body>)(?=(?:\s*</html>)?\s*\Z)][$insert$1]i;

By only replacing the /body tag if there is only whitespace and an (optional) /html tag between it and the EOF, you'd be pretty certain of correctness assuming reasonably well-formed html. That wouldn't handle comments, but they are (probably) fairly rare at that point in the html?
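To see how that lookahead behaves, here is a minimal sketch run against an in-memory string. The helper name insert_before_body_close and the sample markup are made up for illustration; the pattern is the one above, with the /html tag optional and the lookahead closed before the delimiter.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the thread).
# Returns the modified text, or undef when the closing body tag is not
# followed solely by whitespace and an optional </html> before the end.
sub insert_before_body_close {
    my ($html, $insert) = @_;

    # s/// returns the number of substitutions made (false on no match).
    my $count = $html =~ s[(</body>)(?=(?:\s*</html>)?\s*\Z)][$insert$1]i;

    return $count ? $html : undef;
}

# Made-up sample page for the demonstration.
my $page = "<html><body><p>Hi</p>\n</body>\n</html>\n";
my $out  = insert_before_body_close($page, "<!-- footer -->\n");

print defined $out ? $out : "no match\n";
```

A trailing comment between </body> and the end of the text makes the helper return undef, which is the limitation mentioned above.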

If you raised an error in the event that the regex didn't match, any oddities could be fixed up manually.
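Putting the pieces together, a rough sketch of the tail-only approach might look like the following. The sub name append_before_body_close, the 512-byte default window and the error wording are all assumptions; it also assumes $insert is already a byte string in the file's encoding, since the handle is opened raw.

```perl
use strict;
use warnings;

# Hypothetical helper: patches only the tail of the file in place and
# dies if the regex does not match, so odd files can be fixed by hand.
sub append_before_body_close {
    my ($path, $insert, $window) = @_;
    $window ||= 512;    # assumed size of the tail to inspect

    open my $fh, '+<:raw', $path or die "Cannot open $path: $!\n";

    my $size  = -s $fh;
    my $start = $size > $window ? $size - $window : 0;

    # Read everything from $start to the end of the file.
    seek $fh, $start, 0 or die "Cannot seek in $path: $!\n";
    my $tail = do { local $/; <$fh> };
    $tail = '' unless defined $tail;

    $tail =~ s[(</body>)(?=(?:\s*</html>)?\s*\Z)][$insert$1]i
        or die "No trailing </body> found in $path; fix it up manually\n";

    # The new tail is never shorter than the old one, so overwriting
    # from $start needs no truncate.
    seek $fh, $start, 0 or die "Cannot seek in $path: $!\n";
    print {$fh} $tail   or die "Cannot write to $path: $!\n";
    close $fh           or die "Cannot close $path: $!\n";

    return 1;
}
```

Wrapping each call in eval and logging $@ would let a batch run continue past the files that need manual attention.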


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
