
split string into hash of hashes...

by bcarroll (Monk)
on Mar 03, 2013 at 01:15 UTC ( #1021474=perlquestion )
bcarroll has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to figure out how to split a string (a line of a text file) into a hash of hashes. The text file looks very much like the Windows Registry. Some lines contain a registry-like path, with backslashes delimiting the keys.

Here is an example from the file:
HKEY_LOCAL_MACHINE\SOFTWARE\Vendor\Product\CurrentVersion\Tokens\Encotone\SerialNumberUserAttribute=12345
LanMan:= ; REG_SZ
LDAP:= ; REG_SZ
ODBC:= ; REG_SZ

I am trying to figure out how to take the HKEY_LOCAL_MACHINE line and build a hash of hashes. If I were building the hash manually, I would normally do it like this:

$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}=12345;
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'LanMan'}{'Value'}="";
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'LanMan'}{'Type'}="REG_SZ";
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'LDAP'}{'Value'}="";
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'LDAP'}{'Type'}="REG_SZ";
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'ODBC'}{'Value'}="";
$hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'ODBC'}{'Type'}="REG_SZ";

The problem is that I am parsing a textfile and trying to build the hash of hashes dynamically, because I don't know which registry-like keys will be included.

Of course I will need another variable to store where in the hash the current key is to add the "Key name", "Value" , and "Type" for the lines that follow the HKEY... line, but I am not too concerned about that right now.

Anybody have any ideas?

Re: split string into hash of hashes...
by BrowserUk (Pope) on Mar 03, 2013 at 01:45 UTC

    It cannot be done.

    The same hash key cannot contain

    both a numeric value:    ...{'SerialNumberUserAttribute'} = 12345;

    and a subtree of hashes: ...{'SerialNumberUserAttribute'}{'LanMan'}{'Value'}="";
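
    A minimal sketch of that conflict, assuming strict is in effect (the single-level %hash here is only for illustration, not the full registry path). Once the slot holds the scalar 12345, using it as a hash reference raises a run-time error, which eval {} traps so the message can be printed:

```perl
use strict;
use warnings;

my %hash;
$hash{SerialNumberUserAttribute} = 12345;    # the slot now holds a plain scalar

# Treating that scalar as a hash reference dies under "use strict";
# eval {} traps the exception so the script can report it.
my $ok = eval {
    $hash{SerialNumberUserAttribute}{LanMan}{Value} = "";
    1;
};
my $error = $ok ? '' : $@;
print "assignment failed: $error" unless $ok;
```

    Without strict refs, Perl would instead treat "12345" as the name of a package hash (a symbolic reference), which is rarely what anyone wants.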


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: split string into hash of hashes...
by davido (Archbishop) on Mar 03, 2013 at 01:54 UTC

    BrowserUk correctly identified that your proposed structure is flawed. But beyond that, certainly someone has done this (the right way) before... enough times that there would be a module on CPAN for it: Win32::TieRegistry. It's not quite the direction you were headed, but it's a reasonably well tested solution.


    Dave

Re: split string into hash of hashes...
by LanX (Canon) on Mar 03, 2013 at 02:05 UTC
    first of all, this is not possible

    $hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}=12345;
    $hash{'HKEY_LOCAL_MACHINE'}{'SOFTWARE'}{'Vendor'}{'Product'}{'CurrentVersion'}{'Tokens'}{'Encotone'}{'SerialNumberUserAttribute'}{'LanMan'}{'Value'}="";

    a hash-entry can't be 12345 and a hash-ref to {'LanMan'}... at the same time.

    consider ... {value}=12345

    > Anybody have any ideas?

    first step should be to build the mainpath as string with splits and joins.

    then eval this string to autovivify this hash, and assign a ref to a $subhash to it.

    This $subhash -ref can be populated now with all entries which come.
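
    A rough sketch of the eval idea, not a recommendation: it assumes a shortened sample path, and the names $subscripts and %hash are made up for the illustration. Each path component is passed through quotemeta before it goes into the evaluated string, since a string eval on raw file content would otherwise run whatever the file contains.

```perl
use strict;
use warnings;

my %hash;
my $line = 'HKEY_LOCAL_MACHINE\SOFTWARE\Vendor\Product=12345';
my ($path, $value) = split /=/, $line, 2;

# Turn each path component into a double-quoted subscript. quotemeta()
# backslash-escapes every non-word character, so a component cannot
# close the quotes or interpolate a variable.
my $subscripts = join '', map { '{"' . quotemeta($_) . '"}' } split /\\/, $path;

# Evaluates: $hash{"HKEY_LOCAL_MACHINE"}{"SOFTWARE"}{"Vendor"}{"Product"} ||= {}
# Autovivification creates the intermediate levels; ||= keeps an existing node.
my $subhash = eval "\$hash$subscripts ||= {}";
die "eval failed: $@" if $@;

# $subhash now refers to the innermost node and can be filled in directly.
$subhash->{value} = $value;
$subhash->{LanMan}{type} = "REG_SZ";
```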

    Cheers Rolf

    UPDATE

    if you don't like evals build the HoH-path in a loop (line 163)

    DB<160> $line='HKEY_LOCAL_MACHINE\SOFTWARE\Vendor\Product\CurrentVersion\Tokens\Encotone\SerialNumberUserAttribute=12345'
    DB<161> ($path,$value)= split /=/, $line
    DB<162> $subhash=$hash={}
    DB<163> $subhash = $subhash->{$_} = {} for split /\\/,$path
    DB<165> $subhash->{value}=$value
    DB<166> $subhash->{LanMan}{value}=""
    DB<167> $subhash->{LanMan}{type}="REG_SZ"
    #... and so on for all sub-entries ...
    DB<168> $hash
    => {
      HKEY_LOCAL_MACHINE => {
        SOFTWARE => {
          Vendor => {
            Product => {
              CurrentVersion => {
                Tokens => {
                  Encotone => {
                    "SerialNumberUserAttribute" => {
                      LanMan => { type => "REG_SZ", value => "" },
                      value => 12345
                    },
                  },
                },
              },
            },
          },
        },
      },
    }

    now repeat these steps for all paths!

    HTH

    UPDATE

    this might delete older sub-structures from previous paths

    DB<163> $subhash = $subhash->{$_} = {} for split /\\/,$path

    so better do

    $subhash = $subhash->{$_} //= {} for split /\\/,$path
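
    Putting the pieces together, here is a sketch of a complete pass over the sample lines from the question. The entry format "Name:= value ; TYPE" and the lower-case key names value and type are assumptions drawn from the three sample lines, and @lines stands in for whatever is read from the real file.

```perl
use 5.010;    # for the //= operator
use strict;
use warnings;

# Sample input, taken from the question.
my @lines = (
    'HKEY_LOCAL_MACHINE\SOFTWARE\Vendor\Product\CurrentVersion\Tokens\Encotone\SerialNumberUserAttribute=12345',
    'LanMan:= ; REG_SZ',
    'LDAP:= ; REG_SZ',
    'ODBC:= ; REG_SZ',
);

my %hash;
my $subhash;    # node belonging to the most recent HKEY... line

for my $line (@lines) {
    if ($line =~ /^HKEY_/) {
        my ($path, $value) = split /=/, $line, 2;
        $subhash = \%hash;
        # Walk down the path, creating a level only where none exists yet.
        $subhash = $subhash->{$_} //= {} for split /\\/, $path;
        $subhash->{value} = $value if defined $value;
    }
    elsif (defined $subhash
        and $line =~ /^([^:]+):=\s*(.*?)\s*;\s*(\S+)\s*$/) {
        # Assumed entry format: Name:= value ; TYPE
        $subhash->{$1} = { value => $2, type => $3 };
    }
}
```

    Because every level is created with //=, a later path that shares a prefix with an earlier one reuses the existing nodes instead of replacing them.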

      Thanks for the responses, I realize now that my manual example is flawed...
        > I realize now that my manual example is flawed...

        Line 165 in my code solved this. You need a dedicated key in a subhash for "direct" values and you're fine.

        It's disputable if "value" is a good choice for a key-name, but that's up to you.

        Cheers Rolf

Re: split string into hash of hashes...
by punch_card_don (Curate) on Mar 03, 2013 at 16:52 UTC
    Update: Working this out for myself demonstrated to me that this is just a clunky version of Rolf's more elegant code, above. But it makes each operation more clear to my less expert eyes.

    This is a case where autovivification and hashrefs are your friends. With a helping hand from some creative data structure manipulation.

    You have two issues:

    1. how to create a hash of hashes without knowing the depth of the structure in advance.
    2. How to include a value for the entire dimension three levels from the bottom, knowing, as others have pointed out, that an element cannot contain at once a value and a reference to the lower levels of the structure.
    The solution to (1) is autovivification and hash refs. The solution to (2) is to pad your hash with dummy elements down to the lowest dimension.
    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;
    $Data::Dumper::Sortkeys = 1;

    print "Content-type:text/html\n\n";

    my $line1 = 'HKEY_LOCAL_MACHINE\SOFTWARE\Vendor\Product\CurrentVersion\Tokens\Encotone\SerialNumberUserAttribute=12345';
    my $line2 = 'LanMan:= ; REG_SZ';

    my @line1_parts = split(/[=\\]/, $line1);
    my @line2_parts = split(/:= ;/, $line2);

    my %hash;
    my $href = {};
    my $i;

    for $i (0 .. $#line1_parts) {
        if ($i == 0) {
            # for the first time around, create the first level of the structure, itself as a hash.
            # Autovivification means that simply referring to it creates it. No need to put anything in it yet.
            $hash{$line1_parts[$i]} = {};
            # create a hash reference to that element (which is itself a hash)
            $href = \%{ $hash{$line1_parts[$i]} };
        } elsif ($i == $#line1_parts) {
            # last time around, create the actual hash elements you want, filling in down to
            # the bottom level with dummy elements for the SerialNumberUserAttribute.
            $href->{'dummy'} = $line1_parts[$i];
            $href->{$line2_parts[0]}{'Value'}="";
            $href->{$line2_parts[0]}{'Type'}=$line2_parts[1];
        } else {
            # each next time around, create the next level of the structure, itself as a hash,
            # tacked onto the hash that the hash ref points to
            $href->{ $line1_parts[$i] } = {};
            # move the hash ref to this new hash
            $href = \%{ $href->{$line1_parts[$i]} };
        }
    }

    print "<pre>\n";
    print Dumper(\%hash);
    print "</pre>\n";

    Output:
    $VAR1 = {
      'HKEY_LOCAL_MACHINE' => {
        'SOFTWARE' => {
          'Vendor' => {
            'Product' => {
              'CurrentVersion' => {
                'Tokens' => {
                  'Encotone' => {
                    'SerialNumberUserAttribute' => {
                      'LanMan' => {
                        'Type' => ' REG_SZ',
                        'Value' => ''
                      },
                      'dummy' => '12345'
                    }
                  }
                }
              }
            }
          }
        }
      }
    }



    Time flies like an arrow. Fruit flies like a banana.
