http://www.perlmonks.org?node_id=1014715


in reply to Regex Question

One thing that you need to keep in mind when building regexes like these is greediness. By default, quantifiers such as * and + are greedy: each one grabs as much of the string as it can while still letting the overall pattern match. You can specify that a quantifier should, instead, take as little as possible by appending ? to it (*? or +?), and sometimes (depending on the nature of the string that is to be processed) you must do that.
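To make the difference concrete, here is a minimal sketch. The sample string and variable names are made up for illustration and are not taken from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample input, not from the original question.
my $text = 'name="alpha" kind="beta"';

# Greedy: .* takes as much as it can, so the match runs to the LAST quote.
my ($greedy) = $text =~ /"(.*)"/;

# Non-greedy: .*? stops at the first closing quote that lets the match succeed.
my ($lazy) = $text =~ /"(.*?)"/;

# A negated character class is often a clearer alternative to .*?
my ($class) = $text =~ /"([^"]*)"/;

print "greedy: $greedy\n";   # greedy: alpha" kind="beta
print "lazy:   $lazy\n";     # lazy:   alpha
print "class:  $class\n";    # class:  alpha
```

Whether you want .*? or a negated class like [^"]* depends on your data; the class says exactly what may not appear inside the match, which tends to surprise you less.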

I also strongly advocate that your programming should be suspicious of its input files. If you expect 5 strings, check each time that you have them. In short, whatever ought to be true in a correctly running program given correct data, “be from Missouri ... show me.” Quite frankly, most of the time, I’ve encountered broken data. The supplier of the data didn’t know it was broken. “Inexplicable bugs” turned out to be from that cause. Only the computer itself is in the necessary position to recognize the existence of these issues ... take the slight extra time to make it do so.
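As a rough sketch of what "check each time" might look like: the parse_record helper below, the colon delimiter, and the field count of 5 are all assumptions for illustration, so adapt them to whatever your real records look like.

```perl
use strict;
use warnings;

# Hypothetical helper: split a colon-delimited record and insist on
# exactly five fields. Delimiter and count are illustrative assumptions.
sub parse_record {
    my ($line, $line_no) = @_;
    chomp $line;
    my @fields = split /:/, $line, -1;   # -1 keeps trailing empty fields
    die "line $line_no: expected 5 fields, got " . scalar(@fields) . "\n"
        unless @fields == 5;
    return @fields;
}

# A well-formed record passes through untouched.
my @good = parse_record("a:b:c:d:e\n", 1);
print "ok: @good\n";

# A short record is rejected loudly instead of silently mis-parsed.
my $ok = eval { parse_record("a:b:c\n", 2); 1 };
print "rejected: $@" unless $ok;
```

The same idea applies to regex captures: test that the match succeeded before you use $1 and friends, rather than assuming it did.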