Re: Re: How to use Regular Expressions with HTML

by Ovid (Cardinal)
on Aug 16, 2003 at 17:02 UTC ( #284343=note )

in reply to Re: How to use Regular Expressions with HTML
in thread How to use Regular Expressions with HTML

Here are a few other generalizations:

  • Use strict.
  • Don't reinvent the wheel.
  • Don't use goto.
  • Don't optimize up front.
  • OO modules shouldn't export anything.

Those are all great ideas, and Perl programmers would be better off if they lived by them. That said, I've broken every one of those rules and will happily do so again if need be. The important thing is that I understand the reasoning behind the rules and try to live by them.

From what I can see from your post, you hold the same opinion about using regular expressions on HTML that I do, but you spent a lot of time qualifying it. I take the same attitude toward my list of generalizations above, but I'd never get a single post finished if I were forced to spell out all of those qualifications. I toss out the generalizations first and then list exceptions only if needed.

In short, I'm not arguing with you, but for most situations that I encounter, whipping out regular expressions for HTML is a bad idea and encouraging programmers to follow that practice would be an even worse idea.


New address of my CGI Course.

Re: Re: Re: How to use Regular Expressions with HTML
by tomhukins (Curate) on Aug 16, 2003 at 18:48 UTC

    I finished reading Christopher Alexander's The Timeless Way of Building today. Alexander writes about architecture, but his ideas, such as patterns, have been adopted by software developers.

    Your list above (use strict, don't reinvent the wheel, etc.) is basically a list of Perl patterns: practices that should exist in well-written programs.

    Although we learn good habits by following rules, we ultimately derive those rules from observing what we find good. Patterns, or best practice, summarise our experiences and allow us to share them with others.

    In his last chapter, Alexander notes that another place can be without the patterns which apply to it, and yet still be alive: we should follow the spirit of the rules we lay down, not the letter. So paradoxically you learn that you can only make a building live when you are free enough to reject even the very patterns which are helping you once you understand the patterns well.

    Tim Bray uses Perl's regular expressions to parse XML and you use regexps to parse HTML. I don't anticipate doing either any time soon, because the general problems I encounter fit the solution of using existing CPAN modules, and because I don't consider myself knowledgeable enough about such things to break the rules yet.

Re3: How to use Regular Expressions with HTML
by dragonchild (Archbishop) on Aug 18, 2003 at 13:26 UTC
    One should only break the rules when one understands why the rule exists.

    Like Ovid, I have broken every one of those rules, and more. (Personally, I love playing with soft references in production code, but I'm masochistic.) But I will follow those rules in 99.9% of my code. The point is that most programmers shouldn't parse HTML with regexes most of the time. Heck, most shouldn't do it at all. And if you do it, it should be modularized, packaged, and then never messed with again. :-)

    We are the carpenters and bricklayers of the Information Age.

    The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

    Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.
