
split string into hash of hashes...

by bcarroll (Pilgrim)
on Mar 03, 2013 at 01:15 UTC (#1021474=perlquestion)
bcarroll has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to figure out how to split a string (a line of a text file) into a hash of hashes. The text file looks very much like the Windows Registry. Some lines contain a registry-like path, with backslashes delimiting the keys in the path.

Here is an example from the file:
LanMan:= ; REG_SZ

I am trying to figure out how to take the HKEY_LOCAL_MACHINE line and build a hash of hashes. If I were building the hash manually, I would normally do it like this:

$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}=12345;
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'LanMan'}{'Value'}="";
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'LanMan'}{'Type'}="REG_SZ";
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'LDAP'}{'Value'}="";
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'LDAP'}{'Type'}="REG_SZ";
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'ODBC'}{'Value'}="";
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'ODBC'}{'Type'}="REG_SZ";

The problem is that I am parsing a textfile and trying to build the hash of hashes dynamically, because I don't know which registry-like keys will be included.

Of course I will need another variable to store where in the hash the current key is to add the "Key name", "Value" , and "Type" for the lines that follow the HKEY... line, but I am not too concerned about that right now.

Anybody have any ideas?

Replies are listed 'Best First'.
Re: split string into hash of hashes...
by BrowserUk (Pope) on Mar 03, 2013 at 01:45 UTC

    It cannot be done.

    The same hash key cannot contain

    both a numeric value:    ...{'SerialNumberUserAttribute'} = 12345;

    and a subtree of hashes: ...{'SerialNumberUserAttribute'}{'LanMan'}{'Value'}="";
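
A minimal sketch of that conflict, assuming a shortened key path for brevity: under `use strict`, the second assignment dies because the slot already holds a plain scalar rather than a hash reference.

```perl
use strict;
use warnings;

my %hash;
$hash{SerialNumberUserAttribute} = 12345;    # slot now holds a plain scalar

# Treating that same slot as a hash ref is a runtime error under strict refs;
# the eval block traps it so the script can report the message.
my $ok = eval {
    $hash{SerialNumberUserAttribute}{LanMan}{Value} = "";
    1;
};
my $error = $ok ? '' : $@;
print "Second assignment failed: $error" unless $ok;
```

Doing the assignments in the opposite order does not die, but assigning 12345 last replaces the hash reference and silently discards the subtree.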

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: split string into hash of hashes...
by davido (Archbishop) on Mar 03, 2013 at 01:54 UTC

    BrowserUk correctly identified that your proposed structure is flawed. But beyond that, certainly someone has done this (the right way) before... enough times that there would be a module on CPAN for it: Win32::TieRegistry. It's not quite the direction you were headed, but it's a reasonably well tested solution.


Re: split string into hash of hashes...
by LanX (Bishop) on Mar 03, 2013 at 02:05 UTC
    first of all, this is not possible

    $hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}=12345;
    $hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'LanMan'}{'Value'}="";

    a hash-entry can't be 12345 and a hash-ref to {'LanMan'}... at the same time.

    consider ... {value}=12345

    > Anybody have any ideas?

    first step should be to build the mainpath as string with splits and joins.

    then eval this string to autovivify this hash, and assign a ref to a $subhash to it.

    This $subhash -ref can be populated now with all entries which come.
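
One way the eval approach might look, as a rough sketch only: the variable names are illustrative, and it assumes key names never contain a single quote, since the path is pasted into code that is then compiled. Input that is not trusted should not be handled this way.

```perl
use strict;
use warnings;

my %hash;
my $line = 'HKEY_LOCAL_MACHINE\SOFTWARE\Vendor\Product\CurrentVersion\Tokens\Encotone\SerialNumberUserAttribute=12345';
my ($path, $value) = split /=/, $line, 2;

# Build the subscript chain "{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}..." from the path.
my $subscripts = join '', map { "{'$_'}" } split /\\/, $path;

# Evaluating "$hash{...}{...} //= {}" autovivifies every level and
# yields a reference to the innermost hash.
my $subhash = eval "\$hash$subscripts //= {}" or die "eval failed: $@";

# The sub-hash can now be populated with the entries that follow.
$subhash->{value} = $value;
$subhash->{LanMan}{type} = 'REG_SZ';
```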

    Cheers Rolf


    if you don't like evals build the HoH-path in a loop (line 163)

    DB<160> $line='HKEY_LOCAL_MACHINE\SOFTWARE\Vendor\Product\CurrentVersion\Tokens\Encotone\SerialNumberUserAttribute=12345'
    DB<161> ($path,$value)= split /=/, $line
    DB<162> $subhash=$hash={}
    DB<163> $subhash = $subhash->{$_} = {} for split /\\/,$path
    DB<165> $subhash->{value}=$value
    DB<166> $subhash->{LanMan}{value}=""
    DB<167> $subhash->{LanMan}{type}="REG_SZ"
    #... and so on for all sub-entries ...
    DB<168> $hash
     => {
      HKEY_LOCAL_MACHINE => {
        SOFTWARE => {
          Vendor => {
            Product => {
              CurrentVersion => {
                Tokens => {
                  Encotone => {
                    "SerialNumberUserAttribute" => { LanMan => { type => "REG_SZ", value => "" }, value => 12345 },
                  },
                },
              },
            },
          },
        },
      },
    }

    now repeat these steps for all paths!



    this might delete older sub-structures from previous paths

    DB<163> $subhash = $subhash->{$_} = {} for split /\\/,$path

    so better do

    $subhash = $subhash->{$_} //= {} for split /\\/,$path
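
To see why `//=` matters, here is a small sketch that wraps the loop in a helper and feeds it two paths sharing a prefix. The helper name `descend` and the second path are made up for illustration; `//=` needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: walk (creating levels as needed) the nested hashes for
# one backslash-delimited path and return a ref to the innermost hash.
sub descend {
    my ($root, $path) = @_;
    my $node = $root;
    # //= only creates a new hash when the level does not exist yet,
    # so levels built by earlier paths are reused instead of replaced.
    $node = $node->{$_} //= {} for split /\\/, $path;
    return $node;
}

my %hash;
descend(\%hash, 'HKEY_LOCAL_MACHINE\SOFTWARE\Vendor\Product')->{value} = 'first';
descend(\%hash, 'HKEY_LOCAL_MACHINE\SOFTWARE\Vendor\Other')->{value}   = 'second';

# Both branches are still present under the shared prefix.
print join(', ', sort keys %{ $hash{HKEY_LOCAL_MACHINE}{SOFTWARE}{Vendor} }), "\n";
```

With plain `= {}` in the loop, the second call would replace `HKEY_LOCAL_MACHINE` with a fresh empty hash and the `Product` branch would be lost.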

      Thanks for the responses. I realize now that my manual example is flawed...
        > I realize now that my manual example is flawed...

        Line 165 in my code solved this. You need a dedicated key in a subhash for "direct" values and you're fine.

        It's disputable if "value" is a good choice for a key-name, but that's up to you.
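
Putting the pieces together, a parser for the whole file could look roughly like the sketch below. It is built on assumptions: the `Name:=value ; TYPE` layout is inferred from the single sample line, the LDAP line is inferred from the hand-built hash in the question, and `$current` is an illustrative name for the "where am I" variable the question mentions.

```perl
use strict;
use warnings;

# Sample input modeled on the lines quoted in this thread (see assumptions above).
my @lines = (
    'HKEY_LOCAL_MACHINE\SOFTWARE\Vendor\Product\CurrentVersion\Tokens\Encotone\SerialNumberUserAttribute=12345',
    'LanMan:= ; REG_SZ',
    'LDAP:= ; REG_SZ',
);

my %hash;
my $current;    # ref to the hash belonging to the most recent HKEY... line

for my $line (@lines) {
    if ($line =~ /^HKEY_/) {
        # Path line: descend from the root, creating levels as needed.
        my ($path, $value) = split /=/, $line, 2;
        $current = \%hash;
        $current = $current->{$_} //= {} for split /\\/, $path;
        # The "direct" value lives under a dedicated key.
        $current->{value} = $value if defined $value;
    }
    elsif ($current and $line =~ /^([^:]+):=\s*(.*?)\s*;\s*(\S+)\s*$/) {
        # Entry line: attach name, value and type to the current key.
        $current->{$1} = { value => $2, type => $3 };
    }
}
```

An entry that is itself named `value` would collide with the dedicated key, which is one reason to pick a key name that cannot occur in the data.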

        Cheers Rolf

Re: split string into hash of hashes...
by punch_card_don (Curate) on Mar 03, 2013 at 16:52 UTC
    Update: Working this out for myself demonstrated to me that this is just a clunky version of Rolf's more elegant code, above. But it makes each operation more clear to my less expert eyes.

    This is a case where autovivification and hashrefs are your friends. With a helping hand from some creative data structure manipulation.

    You have two issues:

    1. How to create a hash of hashes without knowing the depth of the structure in advance.
    2. How to include a value for the entire dimension three levels from the bottom, knowing, as others have pointed out, that an element cannot contain at once a value and a reference to the lower levels of the structure.

    The solution to (1) is autovivification and hash refs. The solution to (2) is to pad your hash with dummy elements down to the lowest dimension.

    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;
    $Data::Dumper::Sortkeys = 1;

    print "Content-type:text/html\n\n";

    my $line1 = 'HKEY_LOCAL_MACHINE\SOFTWARE\Vendor\Product\CurrentVersion\Tokens\Encotone\SerialNumberUserAttribute=12345';
    my $line2 = 'LanMan:= ; REG_SZ';

    my @line1_parts = split(/[=\\]/, $line1);
    my @line2_parts = split(/:= ;/, $line2);

    my %hash;
    my $href = {};
    my $i;

    for $i (0 .. $#line1_parts) {
        if ($i == 0) {
            # for the first time around, create the first level of the structure, itself as a hash.
            # Autovivification means that simply referring to it creates it. No need to put anything in it yet.
            $hash{$line1_parts[$i]} = {};
            # create a hash reference to that element (which is itself a hash)
            $href = \%{ $hash{$line1_parts[$i]} };
        } elsif ($i == $#line1_parts) {
            # last time around, create the actual hash elements you want, filling in down to the
            # bottom level with dummy elements for the SerialNumberUserAttribute.
            $href->{'dummy'} = $line1_parts[$i];
            $href->{$line2_parts[0]}{'Value'}="";
            $href->{$line2_parts[0]}{'Type'}=$line2_parts[1];
        } else {
            # each next time around, create the next level of the structure, itself as a hash,
            # tacked onto the hash that the hash ref points to
            $href->{ $line1_parts[$i] } = {};
            # move the hash ref to this new hash
            $href = \%{ $href->{$line1_parts[$i]} };
        }
    }

    print "<pre>\n";
    print Dumper(\%hash);
    print "</pre>\n";

    $VAR1 = {
      'HKEY_LOCAL_MACHINE' => {
        'SOFTWARE' => {
          'Vendor' => {
            'Product' => {
              'CurrentVersion' => {
                'Tokens' => {
                  'Encotone' => {
                    'SerialNumberUserAttribute' => {
                      'LanMan' => {
                        'Type' => ' REG_SZ',
                        'Value' => ''
                      },
                      'dummy' => '12345'
                    }
                  }
                }
              }
            }
          }
        }
      }
    }

    Time flies like an arrow. Fruit flies like a banana.
