
Configuration file design

by castaway (Parson)
on Jan 04, 2005 at 09:14 UTC ( #419184=perlmeditation )

A script I am writing doesn't have a GUI or much user interaction (it basically just warns of potentially destructive actions and asks the user to hit a key to continue, or ctrl-c to stop). It does have a configuration file, however. Since this is its only user-editable interface, I'd like to keep it simple and fairly self-explanatory.

Which is where I've run into some problems: how do I design it so that I can fit in all the info I'm likely to need, and still keep it simple? How do other people design their configuration files?

To the specific problem: currently my configuration file just contains a bunch of "XXX=YYY" lines, where the variable names are fairly self-explanatory (to users of this program, anyway). The script itself is intended to be a shortcut to installing a complex piece of software, which normally requires the installer to edit/apply a bunch of templates to create the configuration. (For example, some variables appear in multiple configuration files for this software and need to be the same in order for it to function properly; they're usually paths.) Anyfish, one of these configuration files contains a variable FTP_INSTANCES which contains a list of numbers, e.g. FTP_INSTANCES="01 02 03 04". For each of the numbers contained in this variable, I need to be able to configure a file with the number in its name (in this case, my script should produce the files ftp01.cfg ftp02.cfg ftp03.cfg ftp04.cfg), each containing different information about the FTP server it is supposed to collect files from. So, how do I represent this information in my script's configuration file without getting too complex?

I have a feeling I might need to make it like an .ini file:

    [main]
    root=/path/to/root
    filepath=/path/to/files
    ftp_instances=01 02 03
    [ftp01]
    sourcepath=/source/path/on/server
    savepath=/save/files/here
    [ftp02]
    sourcepath=
    savepath=
    [ftp03]
    sourcepath=
    savepath=

I don't find it very pretty though, or intuitive (that the user will need to create as many new ftpXX sections as they need for the values in the ftp_instances variable). Also my config file parser will need to be more complex, since currently I'm just splitting on /=/ and that's it.
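
For comparison, going from a plain split on /=/ to .ini-style sections may need less extra code than it sounds. The following is only a minimal sketch in core Perl, not a substitute for a CPAN module: the name parse_ini, the '#' comment syntax and the wording of the errors are assumptions made for illustration.

```perl
use strict;
use warnings;

# Parse INI-style text into { section => { key => value } }.
# Dies with a line number on anything it does not understand.
sub parse_ini {
    my ($text) = @_;
    my (%config, $section);
    my $line_no = 0;
    for my $line (split /\n/, $text) {
        $line_no++;
        next if $line =~ /^\s*(?:#.*)?$/;          # skip blank lines and # comments
        if ($line =~ /^\s*\[([^\]]+)\]\s*$/) {     # [section] header
            $section = $1;
            die "line $line_no: duplicate section [$section]\n"
                if exists $config{$section};
            $config{$section} = {};
            next;
        }
        # Split on the first '=' only, so values may themselves contain '='.
        my ($key, $value) = split /=/, $line, 2;
        die "line $line_no: expected key=value, got '$line'\n"
            unless defined $value;
        die "line $line_no: '$line' appears before any [section]\n"
            unless defined $section;
        s/^\s+|\s+$//g for $key, $value;           # ignore surrounding whitespace
        die "line $line_no: empty key\n" unless length $key;
        $config{$section}{$key} = $value;
    }
    return \%config;
}

my $config = parse_ini(<<'END');
[main]
root=/path/to/root
ftp_instances=01 02 03

[ftp01]
sourcepath=/source/path/on/server
savepath=/save/files/here
END

print "$config->{ftp01}{savepath}\n";
```

Empty values such as "savepath=" are accepted and stored as empty strings; whether that should instead be an error is a policy decision the sketch leaves open.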

There appears to be a ton of configuration modules on CPAN, and maybe I'm missing a good one. Most seem to concentrate on configurations created by, and read by, programs; I'd like one that's easily writeable by humans and read by programs (with appropriate errors when malformed). Config::General, for example, writes its 'blocks' as XML-like tagged blocks:

    <ftp01>
        sourcepath=
        savepath=
    </ftp01>

Which is/would be more typing for the user, and thus more to get wrong..

Any suggestions, clues?


Update: Removed confusing restriction

Replies are listed 'Best First'.
Re: Configuration file design
by pelagic (Priest) on Jan 04, 2005 at 09:54 UTC
    I ran into a similar problem a while ago and I ended up doing it like this:
    config file:
        [one_server]
        save_dir = /var/spool/bla/files
        [one_server ABC]
        dir = /var/spool/gugus/bla/save
        glob = .
        archive =
        [one_server XYZ]
        dir = /var/spool/kw4711/bla
        glob = .
        archive = kw4711.bla.tar.gz
        [one_server DEF]
        dir = /var/spool/quatsch/bla/save
        glob = .
        archive =
    I then use Config::IniFiles to read the configuration:
        use Config::IniFiles;

        my $cfg = Config::IniFiles->new( -file => "./$p_cfgfile" );
        foreach my $group ($cfg->Groups) {
            ## do something with: $cfg->val($group, 'save_dir');
            foreach my $member ($cfg->GroupMembers($group)) {
                ## do something with: $cfg->val($member, 'dir');
                ## do something with: $cfg->val($member, 'glob');
                ## do something with: $cfg->val($member, 'archive');
            }
        }
    The concept of "groups" and "members" can be used to implement structured configuration as long as it contains only two layers.

Re: Configuration file design
by Corion (Patriarch) on Jan 04, 2005 at 09:42 UTC

    If you want to keep the description "intuitive" (for the users of it, who have enough domain knowledge I assume), I think you will have to put "sensible" logic into your program.

    For example, you should do away with the user-configurable listing of ftp_instances and dynamically generate it from each ftpXX section - if there are three ftpXX sections, you set ftp_instances to 01 02 03.
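
Generating the list from the sections could look roughly like this. It is a sketch under assumptions: the configuration is taken to be already parsed into a hash of sections, and the function name ftp_instances is hypothetical.

```perl
use strict;
use warnings;

# Build the FTP_INSTANCES value from the section names,
# so the user never has to type the list by hand.
sub ftp_instances {
    my ($config) = @_;
    my @ids;
    for my $name (keys %$config) {
        push @ids, $1 if $name =~ /^ftp(\d+)$/;   # keep only the numeric suffix
    }
    return join ' ', sort @ids;
}

my %config = (
    main  => { root => '/path/to/root' },
    ftp02 => { sourcepath => '/b' },
    ftp01 => { sourcepath => '/a' },
    ftp03 => { sourcepath => '/c' },
);
print 'FTP_INSTANCES="', ftp_instances(\%config), qq{"\n};
```

The plain string sort is adequate only while the suffixes are zero-padded to the same width, as they are in the examples in this thread.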

    I can't recommend YAML as a configuration file format, as it is very sensitive to whitespace at the end of the file.

    You could look at how fetchmail "structures" its config file format, but an .ini-style format isn't too bad either. Also consider the possible errors the users can/will make when you phone in the changes they need to make: whitespace (except maybe newlines) shouldn't matter too much, and the script should output very good diagnostic messages when it encounters stuff it doesn't understand (and also output a log file for post mortem analysis). It should have a facility to comment out a single line. Putting whole sections into comments is an interesting thing but unnecessary in my opinion, and it creates too many nonobvious problems if, for example, both the comment start and the comment end are outside of the displayed area.

Re: Configuration file design
by hossman (Prior) on Jan 04, 2005 at 09:39 UTC
    Which is/would be more typing for the user, and thus more to get wrong..

    This isn't always a bad thing ... in fact, it's frequently a good thing.

    By having a format with a little extra markup, you help reduce the number of misunderstandings about what was intended. In simple formats (like KEY=VAL) how does a person represent a value with a newline in it? Or what about a key with an equals sign in it ... or a space?

    The user might try these things, and your program might run just fine without any errors, because what they've given you is a valid file, it just may not be valid in the way they think it is. By having a few mandatory requirements that require a little additional typing, you can help ensure that there isn't any confusion.

    Your INI file approach already does a little of this ... assuming you are validating that the only sections in the file are [main] and [ftpNN] where NN appears in the list of ftp_instances, and that everything in the list has a section -- that way you reduce the risk that they add an [ftp99] section without adding it to the list. But you should also worry about the possibility of duplicate sections (happens a lot when people cut/paste key/val pairs and change the value but forget to change the key).
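
The cross-check described here might be sketched as follows. It assumes the file has already been parsed into a hash of sections (so duplicate sections would have to be caught earlier, in the parser itself); check_sections and the message texts are made up for the example.

```perl
use strict;
use warnings;

# Return a list of human-readable problems; an empty list means the
# sections and the ftp_instances list agree with each other.
sub check_sections {
    my ($config) = @_;
    my @problems;
    my $main = $config->{main} or return ('missing [main] section');
    my %listed = map { ("ftp$_" => 1) } split ' ', ($main->{ftp_instances} // '');

    for my $name (sort keys %$config) {
        next if $name eq 'main';
        if ($name !~ /^ftp\d+$/) {
            push @problems, "unknown section [$name]";
        }
        elsif (!delete $listed{$name}) {
            push @problems, "section [$name] is not listed in ftp_instances";
        }
    }
    # Whatever is still in %listed was named in the list but has no section.
    push @problems, "ftp_instances lists $_ but there is no [$_] section"
        for sort keys %listed;
    return @problems;
}

my %config = (
    main  => { ftp_instances => '01 02' },
    ftp01 => {},
    ftp99 => {},
);
print "$_\n" for check_sections(\%config);
```

Collecting every problem before reporting, rather than dying on the first one, lets the user fix the whole file in one pass.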

    The biggest downside to an INI type solution is that it does require them to duplicate data ... you could avoid that by saying that the set of sections determines the number of FTP hosts, and eliminate the "ftp_instances" list completely -- but that's probably not very practical if you want to have lots of sections for lots of sets of config info.

    Which is where XML style config files come in handy ... the biggest complaint against XML configs is duplicate markup; but nestability, quoting, and DTDs make it a really great choice to eliminate confusion and identify mistakes...

    <config>
      <root>/path/to/root</root>
      <files>/path/to/files</files>
      <ftps>
        <instance id="00">
          <source>/blah/path/blah</source>
          <save>/blah/path/blah</save>
        </instance>
        <instance id="01">
          <source>/blah/path/blah</source>
          <save>/blah/path/blah</save>
        </instance>
        ...
      </ftps>
    </config>

      Sorry, but if you have to make manual changes to such a file, and have a non-XML-competent user, this will become really ugly. You can't "phone in" any changes to such a file, and there is far too much unnecessary fluff that clutters the relevant data - you can't easily put a whole section into comments unless you use the XML comments, which will lead to confusion regarding which section is active or inactive.

      Hmm, sorry, I disagree. The XML looks even worse for a human to edit. Sure, it's readable, but to create by hand? Yuck.

      OTOH you did give me another idea: The one with leaving out the ftp_instances variable, and just getting the user to give information for each ftp instance. I can create the value of ftp_instances dynamically, the IDs too, since they just need to be unique.

      Another way to do it would be to have variables with a list of values for each ftp_instance variable, but I think that's uglier:

          hostnames=one,two,three
          sourcepaths=/path/one,/path/two,/path/three
      (and less usable)

      As for your other concerns, the variables can't contain newlines, the split only splits on the first '=', and variable names can't contain an '=' or a space. (These are unix environment vars, so I'd be quite surprised if that even worked; the config files contain mostly a bunch of "export VAR=X". Anyway, even if it was possible, this software doesn't have any.)

      These won't be idiots running this; after all, they usually install the software by editing all of its config files and getting them lined up with each other. I'm just trying to make the process easier, without making it more complex in the process.

      Duplicating data is not a bad thing, if it provides more clarity.. IMO


Re: Configuration file design
by gwhite (Friar) on Jan 04, 2005 at 11:16 UTC

    ConfigReader::Simple is nice.

        # you can include comments
        ftp1_param foobar
        ftp1_root /local/foobar
        ftp1_string "this is foobar"
        #
        ftp2_param foobar
        ftp2_root /local/foobar
        ftp2_string "this is foobar"

    You can even use multiple files, one for program settings and one for user settings, so they have less chance of screwing something up, or in your case you could create a separate file for each ftp.

Re: Configuration file design
by holli (Abbot) on Jan 04, 2005 at 11:05 UTC
    if the user is perlwise (and only then), my favourite option is to simply put the config-data as perl code in the config-file, like
        "root"          => "/path/to/root",
        "filepath"      => "/path/to/files",
        "ftp_instances" => [
            {
                "sourcepath" => "/source/path/on/server",
                "savepath"   => "/save/files/here",
            },
            {
                "sourcepath" => "/source/path/on/server",
                "savepath"   => "/save/files/here",
            },
        ]
    Then you can simply read the file, do
    eval "%CFG = (" . readfile("cfg") . ");";
    and you're done.

    The benefit is that you can safely bring in all kind of difficult data (newlines etc.), and you can use an idiom you are pretty much used to.

    just my $ 2/100

      eval "%CFG = (" . readfile("cfg") . ");";

      Is there any specific reason for not using my %CFG = do 'cfg'; instead?

      Juerd # { site => '', plp_site => '', do_not_use => 'spamtrap' }

        If any, then because I didn't know it could be done that way ,-)
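
A slightly more defensive version of the do-based loader is sketched below. It rests on a few assumptions: load_perl_config is a hypothetical helper, and the temporary file only stands in for the real config file. Two caveats apply to this approach in general: do FILE executes whatever Perl the file contains, so it only suits trusted users, and since Perl 5.26 the current directory is no longer in @INC, so a relative name has to be written as './cfg' rather than 'cfg'.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Load a config file that consists of a Perl list of key/value pairs.
sub load_perl_config {
    my ($path) = @_;
    my @pairs = do $path;              # value of the last expression in the file
    die "cannot parse $path: $@" if $@;
    die "cannot load $path (unreadable or empty): $!\n"
        unless @pairs && defined $pairs[0];
    die "$path did not produce key/value pairs\n" if @pairs % 2;
    my %config = @pairs;
    return \%config;
}

# Write a small example file; tempfile() returns an absolute path,
# so no @INC lookup is involved.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} <<'END';
"root"          => "/path/to/root",
"ftp_instances" => [
    { "sourcepath" => "/source/path/on/server", "savepath" => "/save/files/here" },
]
END
close $fh or die "close failed: $!";

my $cfg = load_perl_config($path);
print scalar @{ $cfg->{ftp_instances} }, " instance(s)\n";
```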
Re: Configuration file design
by martinvi (Monk) on Jan 04, 2005 at 12:33 UTC

    I'm using the approach with Config::IniFiles. A lot of people are "Windows-wise" and able to edit an ini-file, while directed over the phone.

    Non-printable characters are represented by some escapes like \n for a newline or \s for a blank in the right-hand value. Even my most ... um ... inexperienced users have learned those escapes.

    Since my *nix boxen live within an ADS-environment, I'm looking into something like Bundle::Net::LDAP for (mis-)using the existing ADS/LDAP as a general authentication/configuration-tool. I've no practical experience with that approach yet.

    BTW, one of the goals of evaluating an LDAP-based solution is to restrict the ability to configure to some users -- i.e. teamleaders -- who also have the rights to reset passwords, assign printers and such.
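
Expanding such escapes after reading a value takes only a few lines. This is a sketch, not the code used in the setup described above: \n and \s come from the description, while treating a doubled backslash as a literal backslash and rejecting unknown escapes are assumptions, as is the name unescape_value.

```perl
use strict;
use warnings;

# Escape letter => replacement text.
my %ESCAPE = (n => "\n", s => ' ', '\\' => '\\');

# Expand backslash escapes in a config value; unknown escapes are an error.
sub unescape_value {
    my ($raw) = @_;
    my $value = $raw;
    $value =~ s{\\(.?)}{
        exists $ESCAPE{$1}
            ? $ESCAPE{$1}
            : die "unknown escape '\\$1' in '$raw'\n"
    }ge;
    return $value;
}

print unescape_value('first\nsecond\sline'), "\n";
```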

Re: Configuration file design
by ctilmes (Vicar) on Jan 04, 2005 at 12:17 UTC
    I like YAML:
        root: '/path/to/root'
        filepath: '/path/to/files'
        ftp_instances:
          - sourcepath: '/source/path/on/server'
            savepath: '/save/files/here'
          - sourcepath:
            savepath:
          - sourcepath:
            savepath:
    (Though, as noted, it is sensitive to whitespace, YAML returns nice error messages, including a program usable code, a human usable error message and the line number of any error.)

    Reading that YAML file returns this perl structure:

        $VAR1 = {
            'ftp_instances' => [
                {
                    'savepath'   => '/save/files/here',
                    'sourcepath' => '/source/path/on/server'
                },
                {
                    'savepath'   => '',
                    'sourcepath' => ''
                },
                {
                    'savepath'   => '',
                    'sourcepath' => ''
                }
            ],
            'filepath' => '/path/to/files',
            'root'     => '/path/to/root'
        };

      I prefer YAML, also. It has the high signal-to-noise ratio of traditional config files, while also providing an elegant way of representing hierarchical data -- even quite complex data. And the reference support is cool, too. And I like Ingy's YAML module.

      But then, I tend to see the world monochromatically... I tend to grab one tool (a hammer) and view the rest of the world as a nail...

      And while I'm thinking about it, I'll reference a warning in this thread: 347905
Re: Configuration file design
by dragonchild (Archbishop) on Jan 04, 2005 at 14:05 UTC
    I'm really partial to Config::ApacheFormat. I've found it to be the best config format when it comes to both power and ease-of-use. Plus, it's extremely configurable. :-)

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

Re: Configuration file design
by perrin (Chancellor) on Jan 04, 2005 at 18:21 UTC
    My escalation pattern, based on complexity of the data to be represented, goes like this:
    1. perl config file
    2. Config::ApacheFormat or Config::IniFiles
    3. XML with XML::Simple
    Contrary to what some are saying here, I find XML files to be the easiest solution as soon as you go beyond the basic operation of the others. I always write them by hand and haven't found that to be a problem. It's really a small step from Apache-style configs, but it allows for representation of more complex configuration data.
Re: Configuration file design
by Anonymous Monk on Jan 04, 2005 at 15:16 UTC
    First rule for user-editable config files, especially if the file needs to be edited often, is to be very, very liberal in what you accept. This would rule out an XML, YAML or Perl code solution. It also means that if there are two obvious ways of putting things in the file, you should accept both. And a third way.

    Second rule is that it should be simple, which again would rule out XML, YAML and Perl code.

    I might format the config file like this:

    root     = /path/to/root
    filepath = /path/to/files
    server = ftp1
        sourcepath = /source/path/on/server
        savepath   = /save/files/here
    server = ftp2
        savepath   = /files/go/here
        sourcepath = /some/path/somewhere
    I've been using, and still use, lots of configuration files in different formats (most of them just for personal use). And while I try to make them have more or less the same syntax, I often find that every project benefits from a different syntax.

    Some parsers I've used are simply line based, using hardly anything more than split. Sometimes, I write a Parse::RecDescent grammar.
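
A line-based parser for the format shown above could be sketched like this. The rules are an assumption about how the format is meant to work: every "server = NAME" line starts a new server record, later keys belong to the most recent server, keys before the first server are global, and indentation is purely cosmetic. The name parse_flat_config and the '#' comment syntax are likewise made up for the example.

```perl
use strict;
use warnings;

# Returns (\%global, \@servers); each server is a hash with a 'name' key.
sub parse_flat_config {
    my ($text) = @_;
    my (%global, @servers);
    my $line_no = 0;
    for my $line (split /\n/, $text) {
        $line_no++;
        next if $line =~ /^\s*(?:#.*)?$/;        # blank lines and # comments
        # Liberal about whitespace around the key, the '=' and the value.
        my ($key, $value) = $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/
            or die "line $line_no: expected 'key = value', got '$line'\n";
        if ($key eq 'server') {
            push @servers, { name => $value };   # start a new server record
        }
        elsif (@servers) {
            $servers[-1]{$key} = $value;         # belongs to the latest server
        }
        else {
            $global{$key} = $value;              # seen before any server line
        }
    }
    return (\%global, \@servers);
}

my ($global, $servers) = parse_flat_config(<<'END');
root     = /path/to/root
filepath = /path/to/files
server = ftp1
    sourcepath = /source/path/on/server
    savepath   = /save/files/here
server = ftp2
    savepath   = /files/go/here
    sourcepath = /some/path/somewhere
END

printf "%s: %s -> %s\n", @{$_}{qw(name sourcepath savepath)} for @$servers;
```

One consequence of these rules is that global settings have to come before the first server line; a stricter variant could use the indentation to tell the two apart.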

      I quite like this suggestion, since it still just involves X=Y lines and not sections, and if I give the variables meaningful names (similar to how they are referenced in the documentation), it will probably work quite well.

      I agree, by the way: the config file should be as simple and liberal as possible, and accept a variety of ways of doing things, even if it makes the parsing code complex. (It should be easy for them, not me.) Which is why I'm also ruling out using Perl, XML, YAML and so on.

      Thanks, C.

Re: Configuration file design
by hardburn (Abbot) on Jan 04, 2005 at 14:52 UTC

    I like curly-bracket block formatting, myself:

        main {
            root=/path/to/root
            filepath=/path/to/files
            ftp_instances=01 02 03
        }
        ftp01 {
            sourcepath=/source/path/on/server
            savepath=/save/files/here
        }

    If you've never used Parse::RecDescent before, the above is a great way to learn. The grammar for the above is fairly simple.

    I don't really like the look of XML files, but they do have the advantage that, using XML::Simple, most of the work is already done for you.

    "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

Re: Configuration file design
by demerphq (Chancellor) on Jan 04, 2005 at 17:43 UTC

    One thing I don't get about this is why you need the "ftp_instances" key at all. I'd just have a bit of code that assumed that all INI file sections whose name matches /^ftp/ are valid instances. It's easy enough to loop through the sections as needed, and it means that when somebody decides to delete one of the sections they don't have to remember to update the instances list, and it also means that if the users want to give the instances more descriptive names they can.

    Also, an alternative for things like this is to use a DB for the config data. Of course that still leaves the problem of where to put the DB connection details, but allowing the user to manage their config through a web client or little GUI may tickle their fancy more than using a config file.


      Basically, I was stuck in 'I need to do X' mode, and didn't realise until I started reading my answers that of course I can just construct that value, and not make people enter it at all.

      To confuse things some, this variable (and its associated config files) changed name between two program versions (now it does ftp or scp, and is called transfer)..

      A DB plus a GUI would be lovely, but as I said, this is a script to install a piece of software on various machines (remote ones, usually), so actually having the DB accessible, even an SQLite one, isn't all that practicable.

      . o O(Although a nice little side project would be to extract this information from our Lotus Notes delivery DB, and precreate some of the config file, or have a CGI to create it in a DB, and a button to create the config file from that info, in which case it could well be XML or whatever.. ) O o ... I can dream..


Re: Configuration file design
by pcouderc (Monk) on Jan 04, 2005 at 17:32 UTC
    I suggest XML associated with a good XML editor as a modern way to solve this kind of problem. Pierre Couderc

Node Type: perlmeditation [id://419184]
Approved by ysth
Front-paged by broquaint