How to Deflea a Cat
by Ovid (Cardinal)
on Aug 28, 2003 at 15:54 UTC
In reply to Ovid and Abigail's comments : the TokeParser module is a great idea, the problem is that my remit is to investigate how regexps can be applied to reformatting HTML pages. I have a regexp for a background colour attribute, though the '#' character treats all characters following as a comment!
I'm not sure what you mean by your statement that your "remit is to investigate how regexps can be applied to reformatting HTML pages". If, by that, you mean that someone else has tasked you with this, then they have made a mistake. If someone comes to me and says "Ovid, I need you to deflea my cat. Here, use this shotgun", then I know that person made a mistake that's all too common in business. In short, the mistake is to say "here's a solution, let's see how we can make it fit our problem." That's absolutely the wrong way to go about things.
Mind you, it's an easy thing to do. I suspect that cyanide kills fleas. Therefore, I might ask a friend "how can I use cyanide to deflea my cat?" When that friend tells me to use flea powder, my first instinct shouldn't be "but I've got all of this cyanide handy, how do I use that?" Instead, a better tactic is to revisit the original problem. How do I remove the fleas from my HTML ... er ... cat? If the proposed solution is better than mine, I should be willing to swallow my pride and go with the best solution. Heck, if all politicians believed that, we'd have a much better country :)
Just for giggles, let's look at some valid HTML tags:
Do you like all of those font tags? Most browsers will render all of them identically. That's a great example of why most regular expressions will fail: a pattern that copes with every legal variation is tough to write.
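To make that concrete, here's a rough sketch. The tag spellings and the names `TAGS`, `NAIVE` and `FontColors` are mine and purely illustrative, and it's written in Python with the standard `html.parser` module rather than TokeParser, though the pattern itself would read the same in Perl. It assumes a naive pattern that expects one exact spelling of a font tag and compares it with what a real parser sees.

```python
import re
from html.parser import HTMLParser

# Hypothetical spellings of the same font tag; HTML allows all of them.
TAGS = [
    '<font color="#ff0000">',
    "<font color='#ff0000'>",
    '<FONT COLOR="#ff0000">',
    '<font  color = "#ff0000" >',
    '<font\ncolor="#ff0000">',
    '<font size="2" color="#ff0000">',
]

# A naive pattern that assumes one exact spelling of the tag.
NAIVE = re.compile(r'<font color="(#[0-9a-f]{6})">')


class FontColors(HTMLParser):
    """Collects the color attribute of every font tag it sees."""

    def __init__(self):
        super().__init__()
        self.colors = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names and strips quotes.
        if tag == "font":
            self.colors.extend(value for name, value in attrs if name == "color")


def naive_matches(tags):
    """Return the tags the naive pattern recognizes."""
    return [t for t in tags if NAIVE.search(t)]


def parsed_colors(tags):
    """Return the colors a real parser extracts from the same tags."""
    parser = FontColors()
    for t in tags:
        parser.feed(t)
    parser.close()
    return parser.colors


if __name__ == "__main__":
    print(len(naive_matches(TAGS)), "of", len(TAGS), "matched by the regex")
    print(parsed_colors(TAGS))
```

On this list the naive pattern recognizes only the first tag, while the parser reports the color for all six. You can keep patching the pattern for quotes, case, whitespace and attribute order, but every patch is another chance to get it wrong.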
But just to show you that I'm a good sport about how to deflea your cat, here's a link to Tom Christiansen's article, HTML Hacking with Regular Expressions. Enjoy!
New address of my CGI Course.