PerlMonks  

Parsing named parameters

by sk (Curate)
on Nov 07, 2005 at 02:38 UTC (#506212=perlquestion)

sk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I have a question on how to parse named parameters. I am trying to build a mini-interpreter, actually a mini code-generator, based on some keywords that users put into a text file. Generating the required code is easy once I can get the parameters. Here is an example input:

    option1 = value0 value1 value2 option3 =value3 value4 option2=value5
    # please note that the options might not be in order, as you can see above.
    # Not all options are required to be used.

I would like to store these in a hash with option1, option2 and option3 as keys, each with its corresponding values.

I know this problem becomes much easier if the values for the options are quoted, something like option1 = "value0 value1 value2". But it would be great if I could parse the input without forcing the double-quote requirement.
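For reference, the quoted form mentioned above can be handled with a single global match. This is only a minimal sketch: the input string is a made-up example, and it assumes option names match \w+ and that values never contain an embedded double quote.

```perl
use strict;
use warnings;

# Hypothetical input using the quoted form described above.
my $line = 'option1 = "value0 value1 value2" option3 ="value3 value4" option2="value5"';

# In list context, a /g match returns every captured (name, value) pair,
# so the result can be assigned straight into a hash.
my %options = $line =~ /(\w+)\s*=\s*"([^"]*)"/g;

print "$_ => $options{$_}\n" for sort keys %options;
```

The rest of the thread deals with the harder, unquoted case.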

My attempt to split on = and then search for keywords became complicated and messy. The code got ugly, so I am not posting it here. Could you help me out with some ideas on how to parse this? If this is too complex, I will force the double-quote requirement.

Thanks a lot!

cheers

SK

Replies are listed 'Best First'.
Re: Parsing named parameters
by sauoq (Abbot) on Nov 07, 2005 at 02:52 UTC
    #!/usr/bin/perl
    my $line = <DATA>;
    chomp $line;
    my $options = {};
    for (split /(?=\s\S+\s*=)/, $line) {
        my ($option, $value) = split /=/, $_;
        $option =~ s/^\s+//;
        $option =~ s/\s+$//;
        $options->{$option} = $value;
    }
    use Data::Dumper;
    print Dumper $options;
    __DATA__
    option1 = value0 value1 value2 option3 =value3 value4 option2=value5
    __END__
    $VAR1 = {
              'option1' => ' value0 value1 value2',
              'option3' => 'value3 value4',
              'option2' => 'value5'
            };
    -sauoq
    "My two cents aren't worth a dime.";
    
      Thanks very much sauoq,

      Even though I can sort of read positive/negative look-ahead regexes, I still haven't been able to exploit them effectively in my code! The next time I am stuck like this, I should try out that type of solution :)

      Just for other people's benefit, I shall try to explain your solution. Please correct me if I am wrong.

      Input:
          option1 = value0 value1 value2 option3 =value3 value4 option2=value5

          split /(?=\s\S+\s*=)/, $line

      Here the regex looks for positions that are followed by a whitespace character, then one or more non-space characters, then optional whitespace and an =. Because the look-ahead is zero-width, split cuts the string at that position without consuming any characters. Here, the first split falls right before " option3 =", so the space stays at the start of the next piece. We can also modify this regex slightly to /\s+(?=\S+\s*=)/, which consumes that whitespace as the separator.
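To make the difference between the two patterns concrete, here is a small sketch that runs both on the sample line. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $line = 'option1 = value0 value1 value2 option3 =value3 value4 option2=value5';

# Variant 1: pure look-ahead. The split point is zero-width, so every
# piece after the first keeps its leading space.
my @keep_space = split /(?=\s\S+\s*=)/, $line;

# Variant 2: the whitespace before each option name is part of the
# separator, so it is removed from the pieces.
my @no_space = split /\s+(?=\S+\s*=)/, $line;

print "variant 1: [$_]\n" for @keep_space;
print "variant 2: [$_]\n" for @no_space;
```

Both variants yield three pieces, one per option. With the second one there is no leading whitespace left on the option name to trim, although whitespace around the = still needs handling.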

      Thanks very much.

      Also thanks to QM and pg for their suggestions/solutions!

      cheers

      SK

Re: Parsing named parameters
by pg (Canon) on Nov 07, 2005 at 03:23 UTC

    Another way:

    use Data::Dumper;
    use strict;
    use warnings;

    my $params = "option1 = value0 value1 value2 option3 =value3 value4 option2=value5";
    $params =~ s/\s*=\s*/=/g;
    my @params = split /\s+/, $params;
    my %params;
    my $cur_val;
    for my $param (@params) {
        my @pairs = split /=/, $param;
        $cur_val = $pairs[0] if ($#pairs);
        push @{$params{$cur_val}}, $pairs[-1];
    }
    print Dumper(\%params);

    Which gives:

    $VAR1 = {
              'option1' => [
                             'value0',
                             'value1',
                             'value2'
                           ],
              'option3' => [
                             'value3',
                             'value4'
                           ],
              'option2' => [
                             'value5'
                           ]
            };
Re: Parsing named parameters
by QM (Parson) on Nov 07, 2005 at 03:11 UTC
    Probably overkill, but you could use something like Getopt::Declare to spec your options and then parse them. G::D allows you to create a parser -- it's not just for command line arguments.

    Update: Corrected link (thanks sk).

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

Re: Parsing named parameters
by samizdat (Vicar) on Nov 07, 2005 at 14:43 UTC
Re: Parsing named parameters
by robin (Chaplain) on Nov 07, 2005 at 14:58 UTC
    my $data = <DATA>;
    my %hash;
    $hash{$1} = $2 while $data =~ s/\s*(\w+)\s*=\s*([^=]+)$//;
    use Data::Dumper;
    print Dumper \%hash;
    __DATA__
    option1 = value0 value1 value2 option3 =value3 value4 option2=value5

    Not the fastest possible solution, but simple and concise.

      As dwildesnl suggested, reversing makes it significantly faster: see benchmark below. But don't do the optimisation unless you need it!
      my $data = <DATA>;
      use Benchmark "cmpthese";
      cmpthese(10_000, {
          "reversed" => sub {
              my %hash;
              my $reversed_data = reverse($data);
              $hash{reverse($2)} = reverse($1)
                  while $reversed_data =~ /\s*([^=]+?)\s*=\s*(\w+)/g;
          },
          "non-reversed" => sub {
              my %hash;
              my $data = $data;
              $hash{$1} = $2
                  while $data =~ s/\s*(\w+)\s*=\s*([^=]+)$//;
          },
      });
      __DATA__
      option1 = value0 value1 value2 option3 =value3 value4 option2=value5
      gives results:
                      Rate non-reversed     reversed
      non-reversed  1263/s           --         -89%
      reversed     10989/s         770%           --
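      For comparison, a single forward pass with a look-ahead avoids both the repeated s/// and the reversal. This is only a sketch and not one of the benchmarked variants, so no speed claim is made for it. It assumes option names match \w+ and that values never contain an = sign.

```perl
use strict;
use warnings;

my $data = 'option1 = value0 value1 value2 option3 =value3 value4 option2=value5';

# Each match captures an option name, then lazily captures the value up to
# the point where the next "name =" begins or the string ends.
# In list context, /g returns all (name, value) pairs at once.
my %hash = $data =~ /(\w+)\s*=\s*(.*?)\s*(?=\w+\s*=|$)/g;

print "$_ => '$hash{$_}'\n" for sort keys %hash;
```

      The lazy .*? together with the look-ahead stops each value just before the following option name, so the values come out without surrounding whitespace.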
        But don't do the optimisation unless you need it!

        What would be the penalty of performing the optimisation if you don't need it?


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Parsing named parameters
by chibiryuu (Beadle) on Nov 07, 2005 at 16:42 UTC
    This one is pretty short:
    my %options = ();
    while (<>) {
        my (undef, @options) = split /(\S+)\s*=/;
        %options = (%options, grep s/^\s*|\s*$//g, @options);
    }

Node Type: perlquestion [id://506212]
Approved by Errto
Front-paged by monkfan