http://www.perlmonks.org?node_id=506212

sk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I have a question on how to parse named parameters. I am trying to build a mini-interpreter, actually a mini code-generator, based on some keywords that users put into a text file. Generating the required code is easy once I can get the parameters. Here is an example input:

option1 = value0 value1 value2 option3 =value3 value4 option2=value5

# please note that the options might not be in order as you can see above. Not all options are required to be used.
I would like to store these in a hash with option1, option2 and option3 as keys with their corresponding values.

I know this problem becomes much easier if the values for the options are quoted, something like option1 = "value0 value1 value2". But it would be great if I could parse the input without forcing the double-quote requirement.
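For comparison, here is a minimal sketch of the quoted variant mentioned above. The input line is a hypothetical example in that quoted style, not something from the original post, and the sketch assumes values never contain a double quote.

```perl
use strict;
use warnings;

# Hypothetical input in the quoted style: every value is wrapped in double quotes.
my $line = 'option1 = "value0 value1 value2" option3 ="value3 value4" option2="value5"';

# In list context, a /g match returns every captured (name, value) pair,
# so the result can be assigned straight into a hash.
my %options = $line =~ /(\w+)\s*=\s*"([^"]*)"/g;

print "$_ => $options{$_}\n" for sort keys %options;
```

With quotes, the end of each value is unambiguous, which is why no look-ahead is needed here.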

My attempt to split on = and then search for keywords made things complicated and messy. The code got ugly, so I am not posting it here. Could you guys help me out with some ideas on how to parse this? If this is too complex, then I will force the double-quote requirement.

Thanks a lot!

cheers

SK

Replies are listed 'Best First'.
Re: Parsing named parameters
by sauoq (Abbot) on Nov 07, 2005 at 02:52 UTC
    #!/usr/bin/perl

    my $line = <DATA>;
    chomp $line;
    my $options = {};

    for (split /(?=\s\S+\s*=)/, $line) {
        my ($option, $value) = split /=/, $_;
        $option =~ s/^\s+//;
        $option =~ s/\s+$//;
        $options->{$option} = $value;
    }

    use Data::Dumper;
    print Dumper $options;

    __DATA__
    option1 = value0 value1 value2 option3 =value3 value4 option2=value5
    __END__
    $VAR1 = {
              'option1' => ' value0 value1 value2',
              'option3' => 'value3 value4',
              'option2' => 'value5'
            };
    -sauoq
    "My two cents aren't worth a dime.";
    
      Thanks very much sauoq,

      Even though I can sort of read positive/negative look-ahead regexes, I still haven't been able to exploit them effectively in my code! The next time I am stuck like this, I should try out that type of solution :)

      Just for other people's benefit, I shall try to explain your solution. Please correct me if I am wrong.

      Input: option1 = value0 value1 value2 option3 =value3 value4 option2=value5

      split /(?=\s\S+\s*=)/, $line
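      Here the regex looks for positions where there is a whitespace character followed by one or more non-space characters, optional whitespace, and then an =. Because the whole pattern is a look-ahead, the match is zero-width: split cuts the string at that position without consuming any characters. Here, it will first split right before " option3 =". We can also modify the regex slightly to /\s+(?=\S+\s*=)/, which consumes the whitespace in front of the option name so that it does not end up at the start of each piece.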
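A small sketch comparing the two split patterns discussed above on the sample input. The variable names are made up for illustration; only the two regexes come from the thread.

```perl
use strict;
use warnings;

my $line = 'option1 = value0 value1 value2 option3 =value3 value4 option2=value5';

# Zero-width split: nothing is consumed, so the whitespace before each
# option name stays at the start of the following piece.
my @zero_width = split /(?=\s\S+\s*=)/, $line;

# Consuming split: the whitespace before each option name is matched by
# \s+ and therefore dropped from the pieces.
my @consuming = split /\s+(?=\S+\s*=)/, $line;

print "zero-width: [$_]\n" for @zero_width;
print "consuming:  [$_]\n" for @consuming;
```

Both versions produce three pieces; they differ only in whether the second and third pieces begin with a space.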

      Thanks very much.

      Also thanks to QM and pg for their suggestions/solutions!

      cheers

      SK

Re: Parsing named parameters
by pg (Canon) on Nov 07, 2005 at 03:23 UTC

    Another way:

    use Data::Dumper;
    use strict;
    use warnings;

    my $params = "option1 = value0 value1 value2 option3 =value3 value4 option2=value5";
    $params =~ s/\s*=\s*/=/g;
    my @params = split /\s+/, $params;

    my %params;
    my $cur_val;

    for my $param (@params) {
        my @pairs = split /=/, $param;
        $cur_val = $pairs[0] if ($#pairs);
        push @{$params{$cur_val}}, $pairs[-1];
    }

    print Dumper(\%params);

    Which gives:

    $VAR1 = {
              'option1' => [
                             'value0',
                             'value1',
                             'value2'
                           ],
              'option3' => [
                             'value3',
                             'value4'
                           ],
              'option2' => [
                             'value5'
                           ]
            };
Re: Parsing named parameters
by QM (Parson) on Nov 07, 2005 at 03:11 UTC
    Probably overkill, but you could use something like Getopt::Declare to spec your options and then parse them. G::D allows you to create a parser -- it's not just for command line arguments.

    Update: Corrected link (thanks sk).

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

Re: Parsing named parameters
by samizdat (Vicar) on Nov 07, 2005 at 14:43 UTC
Re: Parsing named parameters
by robin (Chaplain) on Nov 07, 2005 at 14:58 UTC
    my $data = <DATA>;
    chomp $data;   # otherwise the last value keeps its trailing newline
    my %hash;
    $hash{$1} = $2 while $data =~ s/\s*(\w+)\s*=\s*([^=]+)$//;

    use Data::Dumper;
    print Dumper \%hash;

    __DATA__
    option1 = value0 value1 value2 option3 =value3 value4 option2=value5

    Not the fastest possible solution, but simple and concise.
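To see why the loop above works, this sketch records what is left of the string after each substitution. The @steps array and the hard-coded input are additions for illustration; the regex is the one from the post.

```perl
use strict;
use warnings;

my $data = 'option1 = value0 value1 value2 option3 =value3 value4 option2=value5';
my (%hash, @steps);

# Each pass strips the last "name = values" group: [^=]+$ can only match a
# tail of the string that contains no further "=", so the string is
# consumed from right to left.
while ($data =~ s/\s*(\w+)\s*=\s*([^=]+)$//) {
    $hash{$1} = $2;
    push @steps, "took '$1', remaining: '$data'";
}

print "$_\n" for @steps;
```

The options come off in the order option2, option3, option1, and the regex has to rescan the shrinking string on every pass, which is consistent with the remark that it is not the fastest approach.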

      As dwildesnl suggested, reversing makes it significantly faster: see benchmark below. But don't do the optimisation unless you need it!
      my $data = <DATA>;

      use Benchmark "cmpthese";

      cmpthese(10_000, {
          "reversed" => sub {
              my %hash;
              my $reversed_data = reverse($data);
              $hash{reverse($2)} = reverse($1)
                  while $reversed_data =~ /\s*([^=]+?)\s*=\s*(\w+)/g;
          },
          "non-reversed" => sub {
              my %hash;
              my $data = $data;
              $hash{$1} = $2
                  while $data =~ s/\s*(\w+)\s*=\s*([^=]+)$//;
          },
      });

      __DATA__
      option1 = value0 value1 value2 option3 =value3 value4 option2=value5
      gives results:
                      Rate non-reversed     reversed
      non-reversed  1263/s           --         -89%
      reversed     10989/s         770%           --
        But don't do the optimisation unless you need it!

        What would be the penalty of performing the optimisation if you don't need it?


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Parsing named parameters
by chibiryuu (Beadle) on Nov 07, 2005 at 16:42 UTC
    This one is pretty short:
    my %options = ();
    while (<>) {
        my (undef, @options) = split /(\S+)\s*=/;
        %options = (%options, grep s/^\s*|\s*$//g, @options);
    }
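The snippet above relies on split returning captured separators. Here is a minimal sketch of that behaviour on a single hard-coded line; the @fields name and the separate trimming step are illustrative choices, not the poster's code.

```perl
use strict;
use warnings;

my $line = 'option1 = value0 value1 value2 option3 =value3 value4 option2=value5';

# Because the pattern contains a capture group, split returns the captured
# option names interleaved with the text between them. The first field is
# the (empty) text before the first option name, so it is discarded.
my (undef, @fields) = split /(\S+)\s*=/, $line;

# Trim surrounding whitespace from every field in place.
s/^\s+|\s+$//g for @fields;

# The list now alternates name, value, name, value, so it can seed a hash.
my %options = @fields;

print "$_ => '$options{$_}'\n" for sort keys %options;
```

Trimming in a separate statement avoids depending on grep's filter always being true, at the cost of one extra line.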