Parsing a Variable Format String
by ozboomer (Friar) on Jul 10, 2008 at 01:34 UTC
ozboomer has asked for the wisdom of the Perl Monks concerning the following question:
I'm trying to process the text output from a program where the output format keeps changing. I've tried using a format string and unpack() but this is getting unwieldy now that I have a few formats to deal with.
Not including the ">" and "<" characters, example data strings are:
For subsequent processing, I'd prefer to end up with the format shown in (iii).
I'm thinking along the lines of using some sort of regex, but I just can't get my head around the silly things. Maybe I can extract the fields one by one, or perhaps I can do it all in one hit; I don't know. Performance isn't a huge priority, as most of the programs that need to do this run as overnight batch jobs.
I guess the sort of thing I'm looking for is something like:
...after which I can glean the following info:
Apologies for the cryptic/variable descriptions -- some of the ways the software formats its output are a little scatter-brained IMO(!)
I'd appreciate any suggestions on how I might attack the problem.
Thanks a heap.