PerlMonks  

Converting a Flat-File to a Hash

by Anonymous Monk
on Aug 13, 2006 at 21:40 UTC (#567118, perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I've got a bit of a problem. For security and convenience reasons, I'd like my program to read in some data at start, rather than hard-code it into the program. So far, the best I could come up with was the following:

#!/usr/bin/perl
use strict;
use warnings;

my %data;
my $InputFile = "/the/file/in/question";
open INPUT, "< $InputFile";
foreach $line (<INPUT>) {
    my @array = split /:/, $line;
    $data{$array[1]} = $array[2];
}

contents of /the/file/in/question:

foo:FOO
bar:BAR
baz:BAZ
qux:QUX

Simple and minimalist as it is, I cannot help feeling there is a better way of doing things, which will make better use of the computer's resources. Any ideas?

Re: Converting a Flat-File to a Hash
by bobf (Monsignor) on Aug 13, 2006 at 22:09 UTC

    Check out the Config:: namespace on CPAN. There are quite a few modules that exist for this purpose - pick one that uses the most convenient format for you. I used Config::General a few times, but Config::IniFiles and Config::Simple also look good. I selected Config::General because it allowed me to insert comments and blank lines for whitespace, and also use a block structure to nest levels of config parameters (thereby giving me the ability to construct a HoH or other complex data structure). Config::IniFiles reads INI-style files and Config::General reads INI-like files.

Re: Converting a Flat-File to a Hash
by Cody Pendant (Prior) on Aug 14, 2006 at 00:17 UTC
    Any solution which uses a module won't really be making "better use of the computer's resources", I suppose.

    How about you code your hash as %config=(foo=>'foo',bar=>'bar') etc., and just require it, using "our"? That way you don't have to read the file and convert it to Perl, it's Perl already.
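
    A minimal sketch of that idea, assuming the settings live in their own Perl file that is pulled in with require. The temporary file only stands in for a real, hand-written config file, and the names %config and $path are illustrative rather than taken from the original post.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for a hand-written config file (hypothetical contents).
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'END_CONFIG';
our %config = (
    foo => 'FOO',
    bar => 'BAR',
);
1;  # require needs the file to return a true value
END_CONFIG
close $fh or die "Failed to write $path: $!";

# Declare the package variable here so strict is satisfied,
# then let Perl compile and run the file.
our %config;
require $path;

print "$_ => $config{$_}\n" for sort keys %config;
```

    Because the file is executed as code, anyone who can edit it can run arbitrary Perl, which matters if security was the reason for moving the data out of the program.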



    ($_='kkvvttuu bbooppuuiiffss qqffssmm iibbddllffss')
    =~y~b-v~a-z~s; print
Re: Converting a Flat-File to a Hash
by GrandFather (Cardinal) on Aug 14, 2006 at 00:40 UTC

    The following code is a slightly more robust version. Note in particular that the result of the open is checked and a three parameter open is used. Note too that the line gets chomped and the format checked (at least a little) and bad input lines rejected. Even then a lot more checking ought to be done (duplicate keys anyone?).

    I'd recommend however that you look at some of the configuration options bobf mentioned.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %data;

    #open INPUT, '<', $InputFile or die "Failed to open $InputFile: $!";
    while (<DATA>) {
        chomp;
        next if ! /(\w+)\s*:\s*(\w+)/;
        $data{$1} = $2;
    }

    print "$_ => $data{$_}\n" for keys %data;

    __DATA__
    foo:FOO
    bar:BAR
    bogus data line - comment maybe?
    baz:BAZ
    qux:QUX

    Prints:

    bar => BAR
    baz => BAZ
    qux => QUX
    foo => FOO
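
    One way the extra checking could look, as a rough sketch: it reports malformed lines and duplicate keys instead of silently skipping them. The sub name parse_config and the choice to die on the first problem are assumptions for illustration, not part of the code above.

```perl
use strict;
use warnings;

# Parse "key:value" lines from an open filehandle into a hash reference.
# Blank lines are skipped; anything else that is malformed is fatal.
sub parse_config {
    my ($fh) = @_;
    my %data;
    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line =~ /^\s*$/;
        my ( $key, $value ) = $line =~ /^\s*(\w+)\s*:\s*(.*?)\s*$/
            or die "Line $.: expected key:value, got '$line'\n";
        die "Line $.: duplicate key '$key'\n" if exists $data{$key};
        $data{$key} = $value;
    }
    return \%data;
}

# Read from an in-memory string so the example needs no external file.
my $text = "foo:FOO\nbar:BAR\n\nbaz:BAZ\n";
open my $in, '<', \$text or die "Cannot open in-memory file: $!";
my $data = parse_config($in);
close $in;

print "$_ => $data->{$_}\n" for sort keys %$data;
```

    Dying early is only one policy; collecting the complaints and reporting them all at once would work just as well for a file that people edit by hand.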

    DWIM is Perl's answer to Gödel
Re: Converting a Flat-File to a Hash
by Cody Pendant (Prior) on Aug 14, 2006 at 02:05 UTC
    I just thought of something else. Your
    my @array = split /:/, $line;
    Will run into trouble if one day you need to code:
    errormessage:"couldn't open file: check permissions"
    or
    module:LWP::Simple
    so you should maybe make that
    my @array = split (/:/, $line, 2);
    so you don't have to worry about it.
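
    The difference is easy to see on the second example line. A small sketch (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $line = 'module:LWP::Simple';

# Without a limit, every colon is a separator.
my @unlimited = split /:/, $line;

# With a limit of 2, only the first colon splits; the rest stays in the value.
my @limited = split /:/, $line, 2;

print scalar(@unlimited), " fields: ", join( '|', @unlimited ), "\n";
print scalar(@limited),   " fields: ", join( '|', @limited ),   "\n";
```

    The first split yields four fields (module, LWP, an empty string, Simple); the second yields two (module and LWP::Simple).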


    ($_='kkvvttuu bbooppuuiiffss qqffssmm iibbddllffss')
    =~y~b-v~a-z~s; print
Re: Converting a Flat-File to a Hash
by graff (Chancellor) on Aug 14, 2006 at 02:06 UTC
    I cannot help feeling there is a better way of doing things, which will make better use of the computer's resources.

    Any improvement on the OP would be a matter of robustness (along the lines suggested by GrandFather), or of "standardizing" on a solution that is already available (i.e. using a CPAN module), or merely of style or perceived maintainability (e.g. using fewer lines of Perl code and/or adding commentary/POD to describe the expected input file format, etc).

    All of those are possible, but none of them have any impact that I could imagine on "making better use of the computer's resources". I'm not really sure what you mean by that, but if you mean "make the process more efficient", I don't think any change of the OP code would have noticeable impact -- what you've posted is close enough to being as efficient as possible.

    If there is some other aspect to "use of the computer's resources" that you're thinking of, that might make an interesting discussion.

    BTW, I think, given the sample file contents, your use of $data{$array[1]} = $array[2] would be wrong; the indexes should be 0 and 1 instead. Or better yet, something like this:

    open INPUT, "<", $InputFile;
    my %data = map { chomp; split( /:/, $_, 2 ) } <INPUT>;
    close INPUT;

    That's really minimalist -- maybe a bit too much so for some, but it's really a matter of taste and how much you can trust your data files to be as expected.

    (Updated the split so that it returns at most two elements per line from the input file; this is still vulnerable to serious trouble if the file contains any sort of line that lacks a colon.)
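
    To illustrate the weakness mentioned in the update, here is a hedged sketch of the same map with a guard that drops lines lacking a colon. The in-memory input and the decision to skip such lines silently are assumptions made for the example.

```perl
use strict;
use warnings;

# In-memory stand-in for the data file; the second line has no colon.
my $text = "foo:FOO\nnot a key value line\nbar:BAR\n";
open my $in, '<', \$text or die "Cannot open in-memory file: $!";

# Returning an empty list from the block skips the line, so the
# key/value pairing of the remaining lines stays intact.
my %data = map { chomp; /:/ ? split( /:/, $_, 2 ) : () } <$in>;
close $in;

print "$_ => $data{$_}\n" for sort keys %data;
```

    Without the guard, the colon-less line would contribute a single element to the list, so the pairs after it would be misaligned.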

Re: Converting a Flat-File to a Hash
by GrandFather (Cardinal) on Aug 14, 2006 at 02:27 UTC

    "make better use of the computer's resources" is a pretty meaningless phrase - does it mean make the process as inefficient as possible so more computer resources are used? If it means "make this process execute in minimum time", then don't worry about it: even if the configuration file is megabytes big, the time to slurp the file and process it (you did realise that the foreach slurps the file?) is likely to be negligible. The time wasted diagnosing and fixing input errors due to a complete lack of validation, on the other hand, could cost a heap of time better spent drinking beer.

    Get someone else to do the work for you - use a module, then take the rest of the afternoon off for a beer.
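
    On the slurping point: foreach evaluates <INPUT> in list context, so the whole file is read into a list before the loop body runs, whereas while fetches one line per iteration. A small sketch of the while form, using a lexical filehandle and an in-memory string as stand-ins for the real file:

```perl
use strict;
use warnings;

my $text = "foo:FOO\nbar:BAR\nbaz:BAZ\nqux:QUX\n";
open my $in, '<', \$text or die "Cannot open in-memory file: $!";

my %data;
# Scalar-context readline: only the current line is held in memory.
while ( my $line = <$in> ) {
    chomp $line;
    my ( $key, $value ) = split /:/, $line, 2;
    $data{$key} = $value;
}
close $in;

print "$_ => $data{$_}\n" for sort keys %data;
```

    For a config file of a few lines the difference is academic, which is rather the point made above.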


    DWIM is Perl's answer to Gödel
      Get someone else to do the work for you - use a module...

      I'd say take that advice with a grain of salt... For things that are basically pretty simple -- easy to set up and easy to validate, like the OP's case -- it can be quicker and more reliable to roll your own. A module posted by someone else may have been written to do a slightly different task, and finding that out, and figuring out whether/how it can be shoe-horned into your particular task, might end up being more work with a less satisfying outcome.

      But for the harder things that make you scratch your head and say "I'm not sure how to solve this", definitely go to CPAN and look for help. Even if there is no single module that does exactly what you need, you're likely to learn about how to break the problem down into manageable chunks, and/or find useful references, and so on.

        Many things look simple, few things are simple. Take OP's "simple requirements" for example: "I'd like my program to read in some data at start, rather than hard-code it into the program". Ok, OP is talking about storing some configuration information and the sample data given indicates simple key:value data.

        But how simple is that? The code given breaks in all sorts of ways - empty lines, lines without a colon, values containing colons, duplicate keys, nasty configuration file names (two param open rather than three), missing or otherwise unreadable configuration file and very likely other things as yet unthought of.

        Ok, OP spends a little time up front trawling through CPAN (with a little guidance) and comes up with a tool kit for solving the problem today and, well golly, solving the problem again in a different context tomorrow. Sounds like well spent time to me, and there is still time left this afternoon for a beer.

        Sure, at some point you have to write some code to solve your own specific problem, but the more glue and the less new code, the more likely it is that you don't have to deal with all the edge cases and stuff you've not thought of. In this case half an hour of research and five minutes of coding is likely to save several hours down the track bodging up the holes in the first implementation - and they are likely to be hours with people breathing down your neck as you sort out problems with a live system. Saving those sorts of hours is worth several up-front hours any day!


        DWIM is Perl's answer to Gödel
Re: Converting a Flat-File to a Hash
by mickeyn (Priest) on Aug 14, 2006 at 06:18 UTC
    no Tie suggestions ? :-)

    ok, here's one:

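    #!/usr/bin/perl
    use strict;
    use warnings;
    use Tie::File;

    tie my @array, 'Tie::File', "/the/file/in/question" or die "can't tie file";
    # split in list context, limited to two fields, gives one key/value pair per line
    my %data = map { my ($key, $value) = split /:/, $_, 2; $key => $value } @array;
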
    Enjoy,
    Mickey
