Today I had to write a regular expression definitely more complex
than I'm used to (I'm a real beginner: for example, this is the
first time I've used the x modifier in REs :)).
The experience showed me that I spend a lot of time trying to get my
RE working, mainly (in my opinion) because I'm not able to debug
it in a good way.
I used some techniques to make my work faster
(well, I should say "not so slow"...):
- x modifier. I think the first debugging trick is always to
avoid debugging, i.e. to write correct code :). And to write a correct
RE, a good trick is to understand what you are writing.
Whitespace and comments are very good for that and, last but not least,
they let you...
- proceed incrementally. I wrote my code incrementally,
commenting out large chunks of the regular expression and debugging
only small pieces.
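The two techniques above can be sketched together. This is only a minimal illustration, not the RE I actually wrote: the date format and the names $year, $month, $day and $date are made up for the example. Each piece is a small qr// object that can be checked on its own before it goes into the larger, commented /x pattern.

```perl
use strict;
use warnings;

# Hypothetical example: match a date such as 2001-07-15.
# Each piece is a small qr// object that can be tested on its own.
my $year  = qr/ \d{4} /x;
my $month = qr/ 0[1-9] | 1[0-2] /x;
my $day   = qr/ 0[1-9] | [12]\d | 3[01] /x;

# Compose the pieces; with x, whitespace and comments are ignored,
# so the pattern can be laid out one piece per line.
my $date = qr/
    \A
    ($year)  -     # four-digit year
    ($month) -     # month 01..12
    ($day)         # day 01..31
    \z
/x;

# Check a small piece first, then the whole expression.
print "month ok\n" if '07' =~ /\A$month\z/;
if ( '2001-07-15' =~ $date ) {
    print "year=$1 month=$2 day=$3\n";
}
```

One thing to watch with x comments: they must not contain the pattern delimiter (here /), because Perl finds the end of the pattern before it looks at the comments.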
I guess there are other tricks or ways to code better and
faster (faster! faster than before...):
- prepare test cases instead of using only the actual data that will
be processed by the RE.
- exploit the isomorphism between REs and state machines.
Have you ever simulated one of your REs by hand (I mean, a "real" RE)
using paper, pencil, coins and a drawing of a state machine?
(And using a computer?)
- I'd really like to discover that there's a way to follow
what's happening at "core level" when Perl tries to
match a RE. This is not ideal from my point of view
(I think this kind of "insight" perturbs what's
happening at core level), but I also think that such
a possibility would have saved me some time this afternoon.
Maybe something like this?
What do you think?
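For the test-case idea in the list above, here is a minimal sketch of what I mean. The pattern (a signed integer), the table @cases and the helper failing_cases are all hypothetical, chosen only to keep the example short; the point is that the expected result is written down next to each input, including inputs that must not match.

```perl
use strict;
use warnings;

# Hypothetical pattern under test: a signed integer.
my $re = qr/ \A [+-]? \d+ \z /x;

# Hand-written cases: [ input, 1 if it should match, 0 otherwise ].
my @cases = (
    [ '42',  1 ],
    [ '-7',  1 ],
    [ '+0',  1 ],
    [ '',    0 ],
    [ '4 2', 0 ],
    [ '--1', 0 ],
    [ '1.5', 0 ],
);

# Return the cases whose actual result differs from the expectation.
sub failing_cases {
    my ($pattern, @table) = @_;
    return grep {
        my ($input, $expected) = @$_;
        ( $input =~ $pattern ? 1 : 0 ) != $expected;
    } @table;
}

my @failed = failing_cases( $re, @cases );
printf "%d of %d cases failed\n", scalar @failed, scalar @cases;
print "  unexpected result for '$_->[0]'\n" for @failed;
```

As for following what happens at "core level": the core re pragma has a debug mode. Putting use re 'debug'; in the scope where the RE is compiled makes Perl print the compiled program and the steps of each match attempt to STDERR, so a failing case from the table can be traced on its own.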