Combinations of lists to a hash

by tel2 (Pilgrim)
on Oct 04, 2019 at 11:47 UTC ( [id://11107038]=perlquestion )

tel2 has asked for the wisdom of the Perl Monks concerning the following question:

Beloved Monks,

After hours of messing around with various attempts at this (which I'd rather not bore you with), I seek your help with a script which takes input like this:

Prefix1=A,B:c,d value1
Prefix2=A:b,c:1,2 value2
(That's a space before each value).
And outputs all combinations of the comma separated items within the colon separated groups, and puts the result in a hash like this:
%hash =
{
  'Prefix1=A:c' => 'value1',
  'Prefix1=A:d' => 'value1',
  'Prefix1=B:c' => 'value1',
  'Prefix1=B:d' => 'value1',
  'Prefix2=A:b:1' => 'value2',
  'Prefix2=A:b:2' => 'value2',
  'Prefix2=A:c:1' => 'value2',
  'Prefix2=A:c:2' => 'value2'
};
There could be any number of colon separated groups and comma separated items within each group, so I expect a recursive function would be apt (at least that's what I was trying to do). I'm not wanting the overhead of a module, please.

I've been using a solution which I coded ages ago, but it only handles 2 groups (using a for loop within a for loop), and now I want something which will handle any number of groups.

UPDATE:
Thank you all for your answers, many of which I have voted for.
Some things which I failed to mention (sorry!) when I posted the original are:

#1. Any group can be an '*', e.g.:

Prefix3=A:*:1,2 value3
I notice that this poses problems for the 'glob' solutions, and I think some have suggested solutions to that.  If you have a simple fix, please update your solution with it, otherwise I'll see what I can come up with.  I found that escaping the * like \* in the input is one option, but not ideal for me.

#2. The value (on the right) can be a '|' separated list, e.g.:

Prefix4=A:*:1,2 value4a=10|value4b=20
The above should result in %hash entries like this:
{
  'Prefix4=A:*:1' => {
                   'value4a' => '10',
                   'value4b' => '20'
                     },
  'Prefix4=A:*:2' => {
                   'value4a' => '10',
                   'value4b' => '20'
                     },
}
In fact, all input, whether it has '*' or not, and whether it has single or multiple values, is meant to create a hash of hashes like the above example.

#3. Some day I might want to include the prefixes in the things which can contain lists, e.g.:

Prefix5,Prefix6=A:*:1 value5=10
The result would be:
{
  'Prefix5=A:*:1' => {
                   'value5' => '10',
                     },
  'Prefix6=A:*:1' => {
                   'value5' => '10',
                     },
}
I'm not worried if no one wants to handle that possibility, but if you're willing then great!  (It looks as if AnomalousMonk was already heading that way. (Update: And tybalt89 has now done it.))

#4. I'm just realising now that I think I'm able to simplify the requirements by doing away with the '=' and using ':' in place of it, so that should make the code easier. For example:

Prefix7:A:*:1 value7=10

Sorry I didn't mention any of this originally. The hash of hashes item slipped my mind until I'd posted, then it was 1 AM so I went to bed, and I was hoping I could manage the adjustment to any solutions myself, but at this stage I'm struggling to understand your solutions enough to do so, though I love their concise elegance!

This is for a web application that I've been working on for years, and it is invoked during each page submission, so speed is important.
Thanks again.
Tel2

Replies are listed 'Best First'.
Re: Combinations of lists, etc (updated)
by haukex (Archbishop) on Oct 04, 2019 at 12:17 UTC
    I'm not wanting the overhead of a module, please.

    As opposed to the overhead of "hours of messing around with various attempts at this"? ;-) Anyway, you might be interested in the code in my post here.

    Update: I used this as the motivation to release Algorithm::Odometer::Tiny!

      My hours of messing around was a secondary consideration in this case, because I thought loading a module would be slightly less efficient than in-line code, especially for this regularly hit situation (i.e. every page load on my website).

      I'm deeply honoured and even impressed that you wrote that module as a result of my problem, haukex.  Thank you.  Don't forget to include me in the credits.   My name is spelt: "Some Kiwi Novice @ PerlMonks".

        My hours of messing around was a secondary consideration in this case, because I thought loading a module would be slightly less efficient than in-line code, especially for this regularly hit situation (i.e. every page load on my website).

        Yes, the code could certainly be inlined, hence the ::Tiny. Although depending on how many items you're generating, a speed boost in generating combinations by a factor of roughly 10x can be achieved with XS modules such as Set::Product::XS.

        Don't forget to include me in the credits. My name is spelt: "Some Kiwi Novice @ PerlMonks".

        Done :-)
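The odometer idea mentioned above can be inlined in a few lines. What follows is only a minimal sketch under stated assumptions, not the code of Algorithm::Odometer::Tiny; the helper name `odometer_iter` is hypothetical, and the key format simply mirrors the original question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): returns an iterator that yields
# one combination per call, taking one item from each group. The rightmost
# group changes fastest, like the wheels of an odometer.
sub odometer_iter {
    my @groups = @_;                  # list of array refs
    my @idx    = (0) x @groups;       # current position of each "wheel"
    my $done   = !@groups || grep { !@$_ } @groups;
    return sub {
        return if $done;
        my @combo = map { $groups[$_][ $idx[$_] ] } 0 .. $#groups;
        # Advance the rightmost wheel; carry to the left on overflow.
        for ( my $i = $#idx; $i >= 0; $i-- ) {
            last if ++$idx[$i] < @{ $groups[$i] };
            $idx[$i] = 0;
            $done = 1 if $i == 0;     # leftmost wheel wrapped: finished
        }
        return @combo;
    };
}

my $next = odometer_iter( [qw(A)], [qw(b c)], [qw(1 2)] );
my @keys;
while ( my @combo = $next->() ) {
    push @keys, 'Prefix2=' . join ':', @combo;
}
print "$_\n" for @keys;
```

An iterator like this never holds more than one combination in memory at a time, which only matters when the number of combinations is large; for a handful of keys per line, building the whole list at once is likely just as good.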

Re: Combinations of lists, etc
by LanX (Saint) on Oct 04, 2019 at 12:30 UTC
    No recursion needed.

    You could regex your input to form a glob pattern:

    main::(-e:1):   0
      DB<1> x <Prefix1={A,B}:{c,d}>
    0  'Prefix1=A:c'
    1  'Prefix1=A:d'
    2  'Prefix1=B:c'
    3  'Prefix1=B:d'
      DB<2> @keys = <Prefix1={A,B}:{c,d}>
      DB<3> @hash{@keys} = ('value1') x @keys

    > (which I'd rather not bore you with)

    That's a bad approach, because at first it wasn't clear to me what you were trying to achieve.

    Especially because %hash = {...} is broken Perl. :)

    HTH!

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

    Update

    A demo to build the glob pattern:

      DB<22> ($pre,@comb) = split /=|:/, "Prefix1=A,B:c,d"
      DB<23> p $pattern = "$pre=" . join ":", map { "{$_}" } @comb
    Prefix1={A,B}:{c,d}
      DB<24> x <"$pattern">
    0  'Prefix1=A:c'
    1  'Prefix1=A:d'
    2  'Prefix1=B:c'
    3  'Prefix1=B:d'
      DB<25>

    NB: You should take care to properly escape potential meta characters like * or ?

      Here's another (just saw tybalt89's post) s/// approach to building the globulation string:

      c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
      "my %globize = ('=' => '={', ':' => '}:{', '' => '}');
       ;;
       my $globule = 'Prefix2=A:b,c:1,2';
       $globule =~ s{ ([=:] | \z) }{$globize{$1}}xmsg;
       print qq{'$globule'};
       ;;
       my @globs = glob $globule;
       dd \@globs;
      "
      'Prefix2={A}:{b,c}:{1,2}'
      [
        "Prefix2=A:b:1",
        "Prefix2=A:b:2",
        "Prefix2=A:c:1",
        "Prefix2=A:c:2",
      ]
      Is this any better?   (shrugs)   (haukex's Building Regex Alternations Dynamically technique might be an interesting general approach to building the  s/// pattern, but the curly at the end of the string would present a minor problem.)


      Give a man a fish:  <%-{-{-{-<

        Nice work, AnomalousMonk, thank you!
        I tried that in a script, getting it to read all the lines of data, and it works perfectly!  See below.

        I'm now trying to change it so it handles the 4 updates in my original post.  I see it already handles update #1 re the '*'. How does it do that, given that you're using glob?

        Below is your code as a full script with my attempted changes to make it process all the data, and to handle updates #3 & #4 commented out because they don't work.  Any suggestions on how to get them working?  And how to best fit update #2 in to your code?

        And how is '' => '}' replacing the line ending?  I know \z matches the end, but those empty 'quotes' puzzle me.

        #!/usr/bin/perl
        use strict;
        use warnings;

        my %globize = ('=' => '={', ':' => '}:{', '' => '}');
        #my %globize = ('?What here?' => '{', ':' => '}:{', '' => '}');
        my (@globs, %hash);

        while (<DATA>) {
            my ($globule, $value) = split / /;
            chomp $value;
            $globule =~ s{ ([=:] | \z) }{$globize{$1}}xmsg;
            #$globule =~ s{ (\A | : | \z) }{$globize{$1}}xmsg;
            print qq{'$globule'}."\n";
            @globs = glob $globule;
            $hash{$_} = $value for @globs;
        }

        use Data::Dump;
        dd \%hash;

        __DATA__
        Prefix1:A,B:c,d value1=10
        Prefix2:A:b,c:1,2 value2=20
        Prefix3:A:*:1,2 value3=30
        Prefix4:A:*:1,2 value4a=10|value4b=20
        Prefix5,Prefix6:A:*:1,7 value5=10
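On the question of how `'' => '}'` comes into play: `\z` is a zero-width assertion, so when it matches at the end of the string the capture group `$1` holds the empty string, and `$globize{''}` supplies the closing curly. The sketch below records what `$1` contains on each match; the `/e` flag and the `@captured` array are additions made here purely for illustration and are not part of AnomalousMonk's code.

```perl
use strict;
use warnings;

my %globize = ( '=' => '={', ':' => '}:{', '' => '}' );

my $globule = 'Prefix2=A:b,c:1,2';
my @captured;

# Same pattern as in the thread. The /e flag is added only so that each
# captured value can be recorded before the lookup result is substituted in.
$globule =~ s{ ([=:] | \z) }{ push @captured, $1; $globize{$1} }xmsge;

print 'captured: ', join( ', ', map { "'$_'" } @captured ), "\n";
print "result:   $globule\n";
# captured: '=', ':', ':', ''
# result:   Prefix2={A}:{b,c}:{1,2}
```

Note that `\z` matches at the very end of the string, so if the input still carries its trailing newline, the final `}` lands after that newline.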
      > NB: You should take care to properly escape potential meta characters like * or ?

      Is there a trivial way to do this?

      - Ron

        In general, meta-quoting for regexes (as used in split) can be done with quotemeta or with the  \Q ... \E interpolation escape sequences (see Quote and Quote-like Operators), which work for both double-quote and regex interpolation.


        Give a man a fish:  <%-{-{-{-<
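To illustrate the meta-quoting described above, here is a small sketch that splits on a delimiter held in a variable. The delimiter `|` is taken from update #2 of the question; the variable names are assumptions made for this example.

```perl
use strict;
use warnings;

my $sep   = '|';                        # delimiter from update #2
my $value = 'value4a=10|value4b=20';

# Unquoted, '|' is regex alternation between two empty patterns, so it
# matches the empty string and split breaks the value into characters.
my @unquoted = split /$sep/, $value;

# With \Q...\E the '|' is matched literally.
my @quoted = split /\Q$sep\E/, $value;

print scalar(@unquoted), " pieces without \\Q\n";
print scalar(@quoted),   " pieces with \\Q: @quoted\n";
```

Keep in mind that this is quoting for regexes only. The characters that are special to glob overlap with, but are not identical to, the regex metacharacters.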

        Something like    s/[][\\?*{},]/\\$&/g might do. (Untested)

        But in the case that I'd worry about such input I'd rather write a loop multiplying arrays.

        That's easier to test. :)

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery FootballPerl is like chess, only without the dice
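A loop that multiplies arrays, as suggested above, sidesteps glob entirely, so `*` and `?` need no escaping. The following is a minimal sketch under two assumptions: the input uses the simplified format from update #4 (only `:` between groups), and the sub name `expand_spec` is hypothetical. Because the prefix is then just the first group, prefix lists (update #3) are handled without extra code.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a spec such as 'Prefix5,Prefix6:A:*:1,7'
# into all of its keys by repeatedly "multiplying" the list built so far
# with the items of the next group.
sub expand_spec {
    my ($spec) = @_;
    my @keys = ('');
    my $sep  = '';
    for my $group ( split /:/, $spec ) {
        my @items = split /,/, $group;
        my @next;
        for my $head (@keys) {
            push @next, "$head$sep$_" for @items;
        }
        @keys = @next;
        $sep  = ':';          # every group after the first is joined with ':'
    }
    return @keys;
}

my @keys = expand_spec('Prefix5,Prefix6:A:*:1,7');
print "$_\n" for @keys;
# Prefix5:A:*:1
# Prefix5:A:*:7
# Prefix6:A:*:1
# Prefix6:A:*:7
```

An empty group (for example `A::1`) produces no keys at all with this sketch, so input of that kind would need to be rejected or handled separately.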

Re: Combinations of lists, etc
by tybalt89 (Monsignor) on Oct 04, 2019 at 15:03 UTC
    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11107038
    use warnings;

    my %hash;
    while( <DATA> )
      {
      my ($prefix, $glob, $value) = split /(?<==)|\s/;
      $hash{$_} = $value for glob $prefix . $glob =~ s/[^:]+/{$&}/gr;
      }

    use Data::Dump 'dd'; dd \%hash;

    __DATA__
    Prefix1=A,B:c,d value1
    Prefix2=A:b,c:1,2 value2

    Outputs:

    {
      "Prefix1=A:c"   => "value1",
      "Prefix1=A:d"   => "value1",
      "Prefix1=B:c"   => "value1",
      "Prefix1=B:d"   => "value1",
      "Prefix2=A:b:1" => "value2",
      "Prefix2=A:b:2" => "value2",
      "Prefix2=A:c:1" => "value2",
      "Prefix2=A:c:2" => "value2",
    }
Re: Combinations of lists, etc
by tybalt89 (Monsignor) on Oct 04, 2019 at 19:46 UTC

    Another TMTOWTDI, or "Who needs glob?"

    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11107038
    use warnings;

    my @lines = <DATA>;
    @lines = map { /[^=:\s]*,[^=:\s]*/; map "$`$_$'", split /,/, $& } @lines
      while "@lines" =~ /,/;
    my %hash = split ' ', "@lines";

    use Data::Dump 'dd'; dd \%hash;

    __DATA__
    Prefix1=A,B:c,d value1
    Prefix2=A:b,c:1,2 value2

    UPDATE: Should be :

    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11107038
    use warnings;

    my @lines = <DATA>;
    @lines = map { /[^=:\s]*,[^=:\s]*/ ? map "$`$_$'", split /,/, $& : $_ } @lines
      while "@lines" =~ /,/;
    my %hash = split ' ', "@lines";

    use Data::Dump 'dd'; dd \%hash;

    __DATA__
    Prefix1=A,B:c,d value1
    Prefix2=A:b,c:1,2 value2
Re: Combinations of lists to a hash
by tybalt89 (Monsignor) on Oct 04, 2019 at 23:04 UTC

    Aww, gee, you made me add one whole line to my program to handle your update, sigh :(

    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11107038
    use warnings;

    my @lines = <DATA>;
    @lines = map /[^=:\s]*,[^=:\s]*/ ? map "$`$_$'", split /,/, $& : $_, @lines
      while "@lines" =~ /,/;
    my %hash = split ' ', "@lines";
    $_ = { split /[=|]/ } for values %hash;

    use Data::Dump 'dd'; dd \%hash;

    __DATA__
    Prefix1=A,B:c,d value1=5
    Prefix2=A:b,c:1,2 value2=6
    Prefix4=A:*:1,2 value4a=10|value4b=20
    Prefix5,Prefix6=A:*:1,7 value5=10

    Outputs:

    {
      "Prefix1=A:c"   => { value1 => 5 },
      "Prefix1=A:d"   => { value1 => 5 },
      "Prefix1=B:c"   => { value1 => 5 },
      "Prefix1=B:d"   => { value1 => 5 },
      "Prefix2=A:b:1" => { value2 => 6 },
      "Prefix2=A:b:2" => { value2 => 6 },
      "Prefix2=A:c:1" => { value2 => 6 },
      "Prefix2=A:c:2" => { value2 => 6 },
      "Prefix4=A:*:1" => { value4a => 10, value4b => 20 },
      "Prefix4=A:*:2" => { value4a => 10, value4b => 20 },
      "Prefix5=A:*:1" => { value5 => 10 },
      "Prefix5=A:*:7" => { value5 => 10 },
      "Prefix6=A:*:1" => { value5 => 10 },
      "Prefix6=A:*:7" => { value5 => 10 },
    }

    I tweaked your original tests to comply with your new specs. Is that valid? If not, you owe us a new test case.
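The one added line, `$_ = { split /[=|]/ } for values %hash;`, is what turns each value string into a hash of its own. A stand-alone sketch of that step, using the value from update #2 (the variable names here are chosen for illustration only):

```perl
use strict;
use warnings;

my $value = 'value4a=10|value4b=20';

# Splitting on either '=' or '|' yields a flat name, value, name, value list;
# wrapping it in { } builds an anonymous hash of name => value pairs.
my $ref = { split /[=|]/, $value };

print "$_ => $ref->{$_}\n" for sort keys %$ref;
# value4a => 10
# value4b => 20
```

This relies on the names and values never containing `=` or `|` themselves; a value such as `a=b=c` would produce an odd-length list and mismatched pairs.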

      > "...you made me add one whole line to my program to handle your update..."
      Yeah, sorry about that.  I tried to make it harder, honest!

      Yes, that is valid, thank you very much tybalt89!  Nice work - you forced me to vote again.

      I've added a 4th update to my original post, which may help you to simplify your code, while helping me...again.

      BTW, how did you notice my update?  Just by chance or did you get some kind of message?  Just wondering whether I need to bring it to the attention of others who have provided solutions...though I may well go with yours.

      Update: In response to Anonymous Monk's (quite valid) rebuke below, I've realised that my 4th update in my original post simply required the removal of a couple of '=' signs from 1 line of tybalt89's last solution, i.e.:

      #@lines = map /[^=:\s]*,[^=:\s]*/ ? map "$`$_$'", split /,/, $& : $_, @lines
      @lines = map /[^:\s]*,[^:\s]*/ ? map "$`$_$'", split /,/, $& : $_, @lines

      Well, it seems to work, anyway.  If there's a better way to make that adjustment, I'm all ears.
        PerlMonks is not a code writing service. Please show some effort.
Re: Combinations of lists, etc
by LanX (Saint) on Oct 04, 2019 at 18:50 UTC
    TIMTOWTDI

      DB<41> x ($pre, $ranges) = split /=/, "Prefix1=A:b,c:1,2", 2
    0  'Prefix1'
    1  'A:b,c:1,2'
      DB<42> $del="="
      DB<43> x @keys =$pre
    0  'Prefix1'
      DB<44> for $mult (split /:/, $ranges ) { @keys = map { $a=$_; map { "$a$del$_" } split /,/, $mult } @keys; $del = ':' }
      DB<45> x @keys
    0  'Prefix1=A:b:1'
    1  'Prefix1=A:b:2'
    2  'Prefix1=A:c:1'
    3  'Prefix1=A:c:2'
      DB<46> @hash{@keys} = ('value2') x @keys
      DB<47> x \%hash
    0  HASH(0x3598330)
       'Prefix1=A:b:1' => 'value2'
       'Prefix1=A:b:2' => 'value2'
       'Prefix1=A:c:1' => 'value2'
       'Prefix1=A:c:2' => 'value2'
      DB<48>

    HTH! :)

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

      Nice work, LanX, thank you!
      I tried that in a script, getting it to read all the lines of data, and it works perfectly!
      Here it is:
      #!/usr/bin/perl

      while (<DATA>) {
          #($pre, $ranges) = split /=/, $_, 2;
          ($pre, $ranges, $value) = $_ =~ /^(.+?)=([^ ]+) (.+)$/;
          $del = "=";
          @keys = $pre;
          for $mult (split /:/, $ranges) {
              @keys = map { $a=$_; map { "$a$del$_" } split /,/, $mult } @keys;
              $del = ':';
          }
          #@hash{@keys} = ('value2') x @keys;
          @hash{@keys} = ($value) x @keys;
      }

      use Data::Dump 'dd'; dd \%hash;

      __DATA__
      Prefix1=A,B:c,d value1
      Prefix2=A:b,c:1,2 value2
      Prefix3=A:*:1,2 value3

      And the output:

      {
        "Prefix1=A:c"   => "value1",
        "Prefix1=A:d"   => "value1",
        "Prefix1=B:c"   => "value1",
        "Prefix1=B:d"   => "value1",
        "Prefix2=A:b:1" => "value2",
        "Prefix2=A:b:2" => "value2",
        "Prefix2=A:c:1" => "value2",
        "Prefix2=A:c:2" => "value2",
        "Prefix3=A:*:1" => "value3",
        "Prefix3=A:*:2" => "value3",
      }
      If you have any suggested changes to my changes, I'm all ears.
      I've also added 4 updates to my original post in case you're interested.  No worries if not. You can ignore update #1.
        That's not my code. You replied to the wrong person/post (and no glob in this one).

        No idea. Sorry.

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

Re: Combinations of lists, etc
by 1nickt (Canon) on Oct 04, 2019 at 14:33 UTC

    "I'm wanting to know how to tune the engine on my lawnmower, but I don't want the overhead of using the socket sitting in the tool cabinet over there to extract the spark plug ..."


    The way forward always starts with a minimal test.
      "I'm wanting to know how to tune the engine on my lawnmower, but I don't want the overhead of using the socket sitting in the tool cabinet over there to extract the spark plug ..."

      Interesting analogy, 1nickt, but I don't want the overhead of leaving the socket in its regularly hit position (i.e. every webpage load), for the rest of my application's life, which could be decades.  I'd rather spend more time now getting it working (hopefully) slightly more efficiently.  This may or may not have been a bad call.  Who is one to judge, especially when one doesn't know the whole picture?
      (And if I were doing the job without PerlMonks, with my limited experience, I wouldn't know what appropriate socket to use.)

Node Type: perlquestion [id://11107038]
Approved by haukex