Re: Advance Regular expression questions

by sundialsvc4 (Abbot)
For tasks like these I often pull out bigger guns, like Parse::RecDescent, which not only does a lot of the regular-expression work for you but also lets you describe the whole thing hierarchically. The parser also has a certain amount of built-in “find a way to get there” capability: where a rule offers several alternatives, it tries each in turn until one matches.

To illustrate: at the highest level, the file appears to consist of a list of zero or more groups, each enclosed in parentheses and separated from the next by a single comma. That is the outermost description of the file. The first “inner” description says that each group consists of a list of one or more “tokens” separated by commas, where a “token” is either a number or a quoted string.
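
To make that two-level description concrete, here is a minimal hand-written recursive-descent sketch. It is in Python purely for illustration and is not Parse::RecDescent's API. The sample input format is an assumption (the original file is not shown here): double-quoted strings with no escape sequences, and plain integer or decimal numbers. The names `tokenize`, `Parser` and `parse` are hypothetical.

```python
import re

# Lexical level: regular expressions still do the low-level matching.
# Assumption: strings are double-quoted and contain no escaped quotes.
TOKEN_RE = re.compile(r"""
    \s*(?:
        (?P<number>-?\d+(?:\.\d+)?)
      | "(?P<string>[^"]*)"
      | (?P<punct>[(),])
    )""", re.VERBOSE)


def tokenize(text):
    """Turn the raw text into a list of (kind, value) pairs."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input near offset {pos}")
        kind = m.lastgroup
        tokens.append((kind, m.group(kind)))
        pos = m.end()
    return tokens


class Parser:
    """One method per grammar rule: file, group, token."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return (None, None)  # end of input

    def at(self, punct):
        return self.peek() == ("punct", punct)

    def expect(self, punct):
        if not self.at(punct):
            raise SyntaxError(f"expected {punct!r}, got {self.peek()[1]!r}")
        self.pos += 1

    def parse_file(self):
        # file: zero or more groups, separated by single commas
        groups = []
        if self.peek()[0] is not None:
            groups.append(self.parse_group())
            while self.at(","):
                self.expect(",")
                groups.append(self.parse_group())
        if self.peek()[0] is not None:
            raise SyntaxError(f"trailing input: {self.peek()[1]!r}")
        return groups

    def parse_group(self):
        # group: '(' token (',' token)* ')'
        self.expect("(")
        items = [self.parse_token()]
        while self.at(","):
            self.expect(",")
            items.append(self.parse_token())
        self.expect(")")
        return items

    def parse_token(self):
        # token: a number or a quoted string
        kind, value = self.peek()
        if kind == "number":
            self.pos += 1
            return float(value) if "." in value else int(value)
        if kind == "string":
            self.pos += 1
            return value
        raise SyntaxError(f"expected number or string, got {value!r}")


def parse(text):
    return Parser(tokenize(text)).parse_file()
```

Under those assumptions, `parse('(1, "a, b", 2.5), ("x")')` returns a list of two groups, and the comma inside the quoted string is not mistaken for a separator. Each method mirrors one line of the description, so changing what a group may contain means changing only `parse_group`.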

Where parsers really start to shine, though, is when you want to express some rule about the structure (or the meaning) of one of those groups, especially if that structure may vary. A parser takes as its basic input a grammar: a formal description of what a valid file may consist of, not just physically but structurally. It then tries to match what it is given against what the grammar says to expect. It is much like finding your way through an unfamiliar city using a map.

In my experience, messy file-parsing tasks (done without a parser/grammar) can very quickly devolve into “write-only code.” You might establish that it works correctly now, but you dare not touch it again. In many shops, exactly such programs can be found in abundance: mission-critical, and utterly fossilized.

