http://www.perlmonks.org?node_id=423543


in reply to Comparative satisfiability of regexps.

As long as you mean regular expressions in the CS-theory sense (no backrefs, etc), yes it's perfectly computable, although it takes exponential time (and space) in the general case. No, this is not as general as the undecidable question "are these two algorithms equivalent" that others have mentioned in this thread. The underlying critters we are dealing with are regular languages (not recursive languages), and all decision properties of regular languages (that I can think of, at least) are decidable.

Here's how a theory person would check if two regexes were equivalent. In fact, this is a very good question for a first exam in a computational models/formal languages class.

  1. Convert both regular expressions to equivalent NFAs. This is probably the most annoying part as you need a regex parser. This can be done in linear time.
  2. Convert the NFAs to DFAs. This is the part that takes exponential time in the general case.
  3. Do a product construction on the two DFAs, setting the accept states appropriately so the result accepts the symmetric difference of the two languages. This is done in quadratic time.
  4. Check to see if the resultant DFA accepts the empty language. This is easily done in linear time with a DFS/BFS traversal. If so, then the two regular expressions you started with were equivalent. If not, you can report back any string which this DFA accepts as a counterexample to their equivalence (such a string would be in one language but not the other).
A review of the first few chapters of your favorite automata book might be helpful to fill in the gaps. I have the old classic Intro to Automata Theory, Languages & Computation by Hopcroft/Ullman. It goes through these constructions with an algorithmic feel, which you might appreciate.

Of course, this begs the question, is there a regular language playground module, where I can take a regular language, perform all the closure property constructions, decision problems, and conversion between NFA/DFA/regex representations? And if not, why not? ;)

Update: Fixed ISBN link (for some reason, the ISBN on the back of my edition wasn't recognized.

blokhead