http://www.perlmonks.org?node_id=560717


in reply to Parsing a complex config file

Something earlier in this thread seemed to indicate that you have some control over the syntax and formatting of the config file. Even if that is not the case, you will certainly save yourself a *lot* of time by using a pre-existing data serialization format instead of inventing your own (e.g., YAML, XML, JSON, WDDX).

The benefits of using a pre-established syntax are too numerous to mention here, but the only *disadvantage* is that you don't get the 'personal growth' experience of going through the tedium of the inventing/parsing/debugging cycle yourself. Learning how to write your own parsing code can be an educational experience, but do you really want to go through all that if all you are doing is reading config files?
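To give a feel for how little code an established format needs, here is a minimal sketch using JSON, one of the formats listed above. It relies on JSON::PP, which ships with Perl 5.14 and later. The JSON text itself is an assumption: it is a hand-written rendering of one record from your sample data, not something your tool produces.

```perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

### Hypothetical JSON rendering of one record from the sample data.
my $sJson = <<'END_JSON';
[
  {
    "libname": "foo",
    "pathname": "/path/to/metadata/foo",
    "owner": "someuser",
    "libaclinherit": "no",
    "dynlock": "no",
    "datapath": ["/data/path1", "/data/path2", "/data/path3"]
  }
]
END_JSON

### One library call takes the place of hand-written parsing code.
my $oDomains = decode_json($sJson);

### The result is an ordinary array of hashes.
for my $oDomain (@$oDomains) {
    printf "%s owned by %s, %d data paths\n",
        $oDomain->{libname}, $oDomain->{owner},
        scalar @{ $oDomain->{datapath} };
}
```

The same shape of code applies to the other formats; only the module and the load call change.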

Even if you cannot choose a pre-established syntax, you are probably still better off simply *converting* the "custom" syntax into a pre-existing one. For example, here is some code that converts your sample data into YAML.

    ### begin_: init perl
    use strict;
    use warnings;

    ### p__: standard perl libraries
    use YAML;
    use Data::Dumper;

    ### begin_: get sample data
    my $sRaw = join '', <DATA>;

    ### begin_: convert to YAML
    for ($sRaw) {
        ### p__: scrub the top part
        s/libname=/\n- domain: begin\n  libname: /gms;
        s/pathname=([^\s]+)/\n  pathname: "$1"/gms;
        s/owner=([^\s]+)/\n  owner: "$1"/gms;
        s/libaclinherit=([^\s]+)/\n  libaclinherit: "$1"/gms;
        s/dynlock=([^\s]+)/\n  dynlock: "$1"/gms;
        s/roptions=\x22//gms;

        ### p__: scrub the roption stuff
        ### (the qw() list needs enclosing parentheses; a bare qw() here
        ### is a syntax error as of Perl 5.18)
        for my $sOpt (qw(datapath indexpath workpath metapath)) {
            s/\s+$sOpt=\x28([^\x29]+)\x29/\n  $sOpt: [$1]/gms;
        }

        ### p__: scrub the oddball stuff
        s/\n^\x20{4,}/,/gms;           ### join indented continuation lines
        s/,\x2e{3}//gms;               ### drop the ",..." placeholders
        s/\x22;//gms;                  ### drop the closing ";
        s/\x5d[\x2c\x20]+/\x5d/gms;    ### tidy anything trailing a "]"
        $_ .= "\n";
    }

    ### begin_: display result
    ### p__: show raw converted to yaml
    print $sRaw;
    print "\n---\n";

    ### p__: show yaml converted to perl
    my $oData = YAML::Load($sRaw);
    print Data::Dumper->Dump([$oData], [qw(oDomains)]);

    ### begin_: end_perl
    1;
    __END__
    libname=foo pathname=/path/to/metadata/foo owner=someuser libaclinherit=no dynlock=no roptions="
    datapath=('/data/path1'
              '/data/path2'
              '/data/path3'
              ...)
    indexpath=('/indx/path1'
              '/indx/path2'
              '/indx/path3'
              ...)
    workpath=('/work/path1'
              '/work/path2'
              '/work/path3'
              ...)
    metapath=('/meta/path1'
              '/meta/path2'
              '/meta/path3'
              ...)";
    libname=foo pathname=/path/to/metadata/foo owner=someuser libaclinherit=no dynlock=no roptions="
    datapath=('/data/path1'
              '/data/path2'
              '/data/path3'
              ...)
    indexpath=('/indx/path1'
              '/indx/path2'
              '/indx/path3'
              ...)
    workpath=('/work/path1'
              '/work/path2'
              '/work/path3'
              ...)
    metapath=('/meta/path1'
              '/meta/path2'
              '/meta/path3'
              ...)";

The Raw-To-YAML conversion gives you something like this:

    - domain: begin
      libname: foo
      pathname: "/path/to/metadata/foo"
      owner: "someuser"
      libaclinherit: "no"
      dynlock: "no"
      datapath: ['/data/path1','/data/path2','/data/path3']
      indexpath: ['/indx/path1','/indx/path2','/indx/path3']
      workpath: ['/work/path1','/work/path2','/work/path3']
      metapath: ['/meta/path1','/meta/path2','/meta/path3']

    - domain: begin
      libname: foo
      pathname: "/path/to/metadata/foo"
      owner: "someuser"
      libaclinherit: "no"
      dynlock: "no"
      datapath: ['/data/path1','/data/path2','/data/path3']
      indexpath: ['/indx/path1','/indx/path2','/indx/path3']
      workpath: ['/work/path1','/work/path2','/work/path3']
      metapath: ['/meta/path1','/meta/path2','/meta/path3']

The YAML-To-Perl conversion gives you something like this: (this is all done for you by YAML, no parsing necessary)

    $oDomains = [
      {
        'owner' => 'someuser',
        'indexpath' => [
          '/indx/path1',
          '/indx/path2',
          '/indx/path3'
        ],
        'libaclinherit' => 'no',
        'libname' => 'foo',
        'workpath' => [...]
        ...
    ];

Even if you cannot store the config files as YAML, you can still use simple regex code to convert them. Sure, you will still have to do a little tweaking and debugging to make sure the YAML output is well-formed, but the leverage you get makes the task much simpler, *especially* if your Perl skills are a tad rusty.
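As an aid to that tweaking and debugging, a quick sanity check on the loaded structure can catch conversion slips that still happen to load as valid YAML, such as a path list that came through as a plain string. The following is only a sketch: check_domains is a hypothetical helper, and the key names are simply the ones that appear in your sample data.

```perl
use strict;
use warnings;

### Keys taken from the sample data in this thread.
my @aScalarKeys = qw(libname pathname owner libaclinherit dynlock);
my @aListKeys   = qw(datapath indexpath workpath metapath);

### check_domains (hypothetical helper): returns a list of problem
### descriptions; the list is empty when every record looks right.
sub check_domains {
    my ($oDomains) = @_;
    return ('top level is not a list') unless ref $oDomains eq 'ARRAY';

    my @aProblems;
    my $iRec = 0;
    for my $oDomain (@$oDomains) {
        $iRec++;
        if (ref $oDomain ne 'HASH') {
            push @aProblems, "record $iRec is not a hash";
            next;
        }
        for my $sKey (@aScalarKeys) {
            push @aProblems, "record $iRec: missing or non-scalar $sKey"
                if !defined $oDomain->{$sKey} || ref $oDomain->{$sKey};
        }
        for my $sKey (@aListKeys) {
            push @aProblems, "record $iRec: $sKey is not a list"
                unless ref $oDomain->{$sKey} eq 'ARRAY';
        }
    }
    return @aProblems;
}

### A well-formed record, and a copy with one list flattened to a string.
my $oGood = [ {
    libname   => 'foo',             pathname      => '/path/to/metadata/foo',
    owner     => 'someuser',        libaclinherit => 'no',
    dynlock   => 'no',              datapath      => ['/data/path1'],
    indexpath => ['/indx/path1'],   workpath      => ['/work/path1'],
    metapath  => ['/meta/path1'],
} ];
my $oBad = [ { %{ $oGood->[0] }, datapath => '/data/path1' } ];

print "$_\n" for check_domains($oBad);
```

In the conversion script you would pass it the structure returned by YAML::Load and stop if the returned list is not empty.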


Re^2: Parsing a complex config file
by solitaryrpr (Acolyte) on Jul 12, 2006 at 18:46 UTC
    While I have some general control over what goes in the config file, the structure is pretty much defined for me. Grandfather pegged it. Would that I were that able.