PerlMonks  

List Values As Multidimensional Hash Keys

by joule (Acolyte)
on Mar 14, 2004 at 20:44 UTC ( [id://336520] : perlquestion )

joule has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I am attempting to create a multidimensional hash whose keys are the elements of a list. Simply put, I am parsing strings from a file and splitting them into key/val pairs on the '=' character. Then, I split the key itself on the ':' character, and would like to assign the result(s) as keys to a multidimensional hash.

Example:

The string/line read in from the file is 'a1:a2:a3=foo' - as a result, 'a1:a2:a3=foo' is assigned to $_

    # $key = 'a1:a2:a3', $val = 'foo'
    my ($key, $val) = split(/=/);

    # find the number of ':' in string
    my $num = map(/:/g, $key) + 1;

    # create hash keys - split returns 'a1','a2', and 'a3'
    # hash key creation NOT WORKING
    %hash = split(/:/, $key, $num);

I'd like to create and assign $hash{'a1'}{'a2'}{'a3'} = 'foo'.

In case you're wondering, I count the ':' characters because this code runs in a loop, and the number of key parts may vary from line to line. The last split creates the keys; I just can't figure out the creation and assignment to the hash. I looked at map(), but have yet to come up with a solution.

Thanks for any help provided.

Replies are listed 'Best First'.
Re: List Values As Multidimensional Hash Keys
by diotalevi (Canon) on Mar 14, 2004 at 22:39 UTC

    Please. Do this without eval and without arbitrary restrictions on numbers of lists.

    use List::Util 'reduce';
    use Data::Dumper;

    $_ = "a1:a2:a3=foo";
    my ($key,$val) = split(/=/);
    my @keys = split /:/, $key;
    my $last = pop @keys;

    my %hash;
    ( @keys
        ? reduce( sub { $a->{$b} ||= {} }, \%hash, @keys )
        : \%hash
    )->{ $last } = $val;

    print Dumper( \%hash );

Re: List Values As Multidimensional Hash Keys
by graff (Chancellor) on Mar 14, 2004 at 23:30 UTC
    Some useful solutions have been provided, but no matter which approach you choose, you still need to be very confident about the quality of your input data for anything to work as intended. In particular, think what will happen if your input includes any two records like the following:
    a1:b2:c3=foo
    a1:b2=bar
    This would create a logical contradiction: node "b2" shows up as both a leaf node and a parent node (it's supposed to hold both a string and a hash ref). Actually, whichever of these two records happens to come second in the input would obliterate data for the one that came earlier.

    Unless you have perfect confidence in the input (that is, you have already tested it for well-formedness), you will want to include sanity checks in your hash-creation logic -- don't assign a scalar value to a hash element if it already exists as a reference, and don't use a hash element as a reference if it already contains a scalar. It may be easiest to add this sort of checking to the recursive solution proposed above -- to wit:

    use strict;
    use warnings;
    use Data::Dumper;

    my $tree = {};

    while (<DATA>) {
        chomp;
        my ( $key, $val ) = split /=/, $_, 2;
        unless ( $key and $val ) {
            warn "Skipped bad input at line $. -- $_\n";
            next;
        }
        my $result = insert( $tree, $val, split( /:/, $key ));
        warn "$result -- skipped line $. -- $_\n" if ( $result ne "ok" );
    }
    print Dumper( $tree );

    sub insert
    {
        my ( $tree, $val, @keys ) = @_;
        my $key = shift @keys;
        my $result;
        if ( @keys and exists( $tree->{$key} )) {
            if ( ref( $tree->{$key} ) eq 'HASH' ) {
                $result = insert( $tree->{$key}, $val, @keys );
            }
            else {
                $result = "Tried to overwrite string value as hash ref";
            }
        }
        elsif ( @keys ) {
            $tree->{$key} = {};
            $result = insert( $tree->{$key}, $val, @keys );
        }
        elsif ( exists( $tree->{$key} ) and ref( $tree->{$key} ) eq 'HASH' ) {
            $result = "Tried to overwrite hash ref with string value";
        }
        else {  # Note: a scalar can still overwrite a prev. scalar
            $tree->{$key} = $val;
            $result = "ok";
        }
        return $result;
    }

    __DATA__
    a1:b1:c1=first data record
    a1:b2=second data record
    a1:b2:c2=third data record
    a1:b3:c2=fourth data record
    a1:b3:c2=fifth data record
    a1:b3=sixth record
    a2:b1:c1:d1:seventh data record
    a2:b1:c1:d1=eigth data record

    __OUTPUT__
    Tried to overwrite string value as hash ref -- skipped line 3 -- a1:b2:c2=third data record
    Tried to overwrite hash ref with string value -- skipped line 6 -- a1:b3=sixth record
    Skipped bad input at line 7 -- a2:b1:c1:d1:seventh data record
    $VAR1 = {
              'a1' => {
                        'b3' => {
                                  'c2' => 'fifth data record'
                                },
                        'b2' => 'second data record',
                        'b1' => {
                                  'c1' => 'first data record'
                                }
                      },
              'a2' => {
                        'b1' => {
                                  'c1' => {
                                            'd1' => 'eigth data record'
                                          }
                                }
                      }
            };

Re: List Values As Multidimensional Hash Keys
by matija (Priest) on Mar 14, 2004 at 21:18 UTC
    Basically, you're building a tree structure, where the leaves contain the data, and the branches are labeled by the parts of the key.

    Something like this might do the trick:

    #!/usr/bin/perl -w
    my $tree={};

    # warning: recursive subroutine
    sub insert {
        my ($tree,$val,@keys)=@_;
        my $key=shift @keys;
        unless (defined($tree->{$key})) {
            $tree->{$key}={};
        }
        if (scalar @keys) {
            insert($tree->{$key},$val,@keys);
        } else {
            $tree->{$key}=$val;
        }
    }

    insert($tree,'1',qw(a b c d)); # test 1
    insert($tree,'2',qw(a b d e)); # test 2
    insert($tree,$val,split(':',$key)); # the call you were looking for.

      I've got a canned implementation like this on CPAN, Data::DRef, which will also do the key splitting for you:
      use Data::DRef qw( set_value_for_key );
      $Data::DRef::Separator = ':';
      my ($key, $val) = split /=/;
      set_value_for_key( $hash, $key, $val );
Re: List Values As Multidimensional Hash Keys
by kappa (Chaplain) on Mar 14, 2004 at 21:47 UTC
    This seemed like a rather interesting task to me :)
    @a = qw/a b c d e f/;
    $data = 'K';
    $data = { pop @a => $data } while @a;
    %hash = %$data;
      Or, building top-down instead of bottom-up:
      use Data::Dumper;

      $_ = 'a1:a2:a3=foo';
      my ($key, $val) = split(/=/);
      my @keys = split /:/, $key;
      my %hash;
      my $href = \%hash;
      $href = $href->{shift @keys} = (@keys>1) ? {} : $val while @keys;
      print Dumper(\%hash);  # pass a reference so the hash is dumped as one structure

      The PerlMonk tr/// Advocate
Please don't use eval for this! (was Re: List Values As Multidimensional Hash Keys)
by merlyn (Sage) on Mar 14, 2004 at 23:43 UTC
    As usual, this topic comes up every three to six months, and the same "eval" solutions get posted. As usual, I've downvoted any solution I've seen (or will see) in this thread that uses "eval". It's both unnecessarily inefficient, and a big security hole as well. Please use any other solution as a starter.
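For reference, here is a minimal sketch of one such non-eval approach: a plain loop that walks down the tree and creates intermediate hashes as it goes. The helper name set_path is hypothetical (it is not from any post in this thread), and the sketch assumes well-formed input; it does no collision checking, so under strict it would die if an intermediate key already holds a string.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): walk down the tree one key
# at a time, creating intermediate hashes as needed, then store the value.
sub set_path {
    my ( $root, $val, @keys ) = @_;
    my $last = pop @keys;
    my $node = $root;
    $node = $node->{$_} ||= {} for @keys;   # descend, autocreating levels
    $node->{$last} = $val;
    return $root;
}

my %hash;
my ( $key, $val ) = split /=/, 'a1:a2:a3=foo', 2;
set_path( \%hash, $val, split /:/, $key );

print $hash{a1}{a2}{a3}, "\n";   # prints "foo"
```

No string of code is ever built from the input, so there is nothing for hostile data to inject into.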

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      Being of the ornery sort, this (to me) begs the following question:
      Efficiency aside, is there a *safe* way to utilise eval as a solution to this problem? Not a "good" way, or even a "mediocre" way, just safe?

      The intrinsic problem with eval is the possibility of hostile data being introduced into the evaluated string. So, is there a way of rendering the data safe?
      The obvious way is via taint checking, and string sanitising with tr or s, but is there a better way?

      Not that this should be construed as approval of the idea - the process startup overheads alone should be reason enough to do it any other way!
      -R
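As a rough illustration of the string-sanitising idea, one option is to reject any key that does not match a strict whitelist before doing anything else with it. The helper name is_safe_key and the pattern below are assumptions for the sake of the example (word characters separated by single colons); real keys may need a different pattern, and passing such a check is still no argument for using eval.

```perl
use strict;
use warnings;

# Hypothetical whitelist: key parts may contain only word characters.
# Adjust the pattern to whatever your real keys look like.
sub is_safe_key {
    my ($key) = @_;
    return $key =~ /\A\w+(?::\w+)*\z/ ? 1 : 0;
}

for my $key ( 'a1:a2:a3', '1};print "cracked";#a1' ) {
    printf "%-25s %s\n", $key, is_safe_key($key) ? 'accepted' : 'rejected';
}
```

Whitelisting what is allowed tends to be more robust than trying to strip out what is dangerous with tr or s.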
        Taking tachyon's sample code:
        use strict;
        use warnings;
        my %hash;
        my $a = '1};print "You have just been cracked!\n";#a1:a2:a3=foo';
        my ($key, $val) = split /=/, $a, 2;
        $key =~ s/:/}{/g;
        eval "\$hash{$key}=\"$val\"";

        __END__
        You have just been cracked!
        You would replace the $key =~ s/:/... line with
        use Data::Dumper;
        $Data::Dumper::Terse = 1;
        $Data::Dumper::Useqq = 1;
        $key = join '}{', Dumper split /:/, $key, -1;
Re: List Values As Multidimensional Hash Keys
by kvale (Monsignor) on Mar 14, 2004 at 21:16 UTC
    QM's solution is clever. Here is a prosaic method that works for a known maximum number of keys:
    my ($key, $val) = split /=/;
    my @keys = split /:/, $key;
    if (@keys == 1) {
        $hash{ $keys[0] } = $val;
    }
    elsif (@keys == 2) {
        $hash{ $keys[0] }{ $keys[1] } = $val;
    }
    # etc.

    -Mark

Re: List Values As Multidimensional Hash Keys
by BrowserUk (Patriarch) on Mar 15, 2004 at 02:00 UTC

    If you can live with a hashref rather than a hash at the top level...

    use List::Util qw[ reduce ];
    use Data::Dumper;

    my $line = 'a1:a2:a3=key';

    my $href = reduce{
        my $r={}; $r->{$b}=$a; $r
    } reverse split /:|=/, $line;

    print Dumper $href;

    $VAR1 = {
              'a1' => {
                        'a2' => {
                                  'a3' => 'key'
                                }
                      }
            };

    Update: It struck me later that hash refs are a distinct advantage, as keeping one per record avoids the collision problem graff brought up.

    #! perl -slw
    use strict;
    use List::Util qw[ reduce ];
    use Data::Dumper;

    my @AoH;

    while( <DATA> ) {
        chomp;
        push @AoH, reduce{
            my $r={}; $r->{$b}=$a; $r;
        } reverse split /:|=/, $_;
    }

    print Dumper \@AoH;

    __DATA__
    a1:a2:a3=key1
    a1:a2=key2
    b1:b2:b3:b4:b5:b6:b7:b8:b9=key3

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
Re: List Values As Multidimensional Hash Keys
by QM (Parson) on Mar 14, 2004 at 21:02 UTC
    Something like this?:
    use strict;
    use warnings;
    my $a = 'a1:a2:a3=foo';
    my ($key, $val) = split /=/, $a, 2;
    $key =~ s/:/}{/g;
    my %hash;
    eval "\$hash{$key}=\"$val\"";

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

      Leaving aside the fact that this does not compile under strict as you don't declare %hash, this is a security hole just waiting for a cracker. The string form of eval is *dangerous*; don't use it until after you understand why. Here is a hint....

      use strict;
      use warnings;
      my %hash;
      my $a = '1};print "You have just been cracked!\n";#a1:a2:a3=foo';
      my ($key, $val) = split /=/, $a, 2;
      $key =~ s/:/}{/g;
      eval "\$hash{$key}=\"$val\"";

      __END__
      You have just been cracked!

      The print could be any arbitrary code. unlink, rm, shutdown....*any* code, running with the perms of whoever started the script.

      cheers

      tachyon

        Your post has been downvoted. Please don't malform internal links again (you see, manually editing the URI is just sooooo incredibly difficult). Of course I am entirely kidding.
Re: List Values As Multidimensional Hash Keys
by dragonchild (Archbishop) on Mar 15, 2004 at 14:13 UTC
    I'm curious - what's the need to have this as a tree? Are you planning on working with all the children at a specific level?

    The reason I ask is that if all you want are straight lookups from the root node, it would be easier to have the path to the leaf as the entry itself. So, you can cut out the split on ':' and just have the hash key be "a1:a2:a3". (In other words, flatten the tree.)
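A minimal sketch of the flattened version, assuming the same 'path=value' line format as in the original question (the sample lines are made up for illustration):

```perl
use strict;
use warnings;

my %flat;
for my $line ( 'a1:a2:a3=foo', 'a1:a2:a4=bar', 'b1:b2=baz' ) {
    my ( $key, $val ) = split /=/, $line, 2;
    $flat{$key} = $val;    # the whole path is the key
}

# Straight lookup from the root.
print $flat{'a1:a2:a3'}, "\n";

# Everything below a1:a2 can still be found with a prefix match.
my @children = sort grep { index( $_, 'a1:a2:' ) == 0 } keys %flat;
print "@children\n";
```

The trade-off is that finding all children of a node means scanning every key, which is fine for small configuration data but is where a real tree starts to pay off.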

    ------
    We are the carpenters and bricklayers of the Information Age.

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      The file I'm parsing contains strings which are used for a program's configuration files. The configuration files themselves are stored in various directories/sub-directories. I want to provide a logical grouping of the configuration files while also providing flexibility and extensibility with the program's configuration. Hope that makes sense, it's hard to describe with words... :)
Re: List Values As Multidimensional Hash Keys
by joule (Acolyte) on Mar 15, 2004 at 14:03 UTC
    Wow! I never imagined I would receive so many well informed replies. You guys rule!

    After working (unsuccessfully) with map(), I was attempting to code a recursive function, similar to what matija posted. The eval method is a hack, and a commonly disliked one, as some of the replies have shown.

    My gratitude and appreciation go out to you all. Thanks again.

Re: List Values As Multidimensional Hash Keys
by meredith (Friar) on Mar 15, 2004 at 18:10 UTC
    Another solution is using (abusing?) list auto-stringification, or multidimensional emulation. (How long should the term be for something so simple?) Instead of all those cool hash-reference trees, you could use a flat hash with namespaced keys, just like your file. If you feed perl a list for a hash key, it will apply a join($;, ...) to it. (The separator is the perlvar $;, which defaults to "\034" and can be changed.) After you build the hash this way, walking your tree is as simple as sort keys.
    $; = ':';
    my %Hash;

    while (<>) {
        chomp $_;
        my ($key, $val) = split(/=/);
        if ($key =~ /:/) {
            my @keyparts = split(/$;/, $key);
            $Hash{ join("$;", @keyparts) } = $val;
            # I'm not sure how to get perl to not scalar-ize it in this case
            # But it's probably better to reduce cargo-cultism ;)
        }
        else {
            $Hash{$key} = $val;
        }
    }

    foreach (sort keys %Hash) {
        print "$_ \t=> " . $Hash{$_} . "\n";
    }

    print $Hash{'a1','b1','c1'} . "\n"
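To make the emulation concrete, here is a small sketch showing that a comma-separated list in a hash subscript ends up as one joined string key. It assumes the classic multidimensional behaviour is enabled, which is the default unless something like `use v5.36` or `no feature 'multidimensional'` turns it off.

```perl
use strict;
use warnings;

my %h;
{
    local $; = ':';                   # the default separator is "\034"
    $h{ 'a1', 'b1', 'c1' } = 'foo';   # stored under the single key 'a1:b1:c1'
}

print join( ',', keys %h ), "\n";     # the one key, already joined
print $h{'a1:b1:c1'}, "\n";           # same element, addressed by its flat key
```

Note that an array in the subscript, as in $h{@parts}, is evaluated in scalar context and yields the element count, which is why the code above joins the parts explicitly.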


    mhoward - at - hattmoward.org