PerlMonks  

Parsing a hash of hashes using a schema file

by NewLondonPerl1 (Acolyte)
on Apr 02, 2013 at 01:09 UTC ( [id://1026558]=perlquestion )

NewLondonPerl1 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Perl monks, I am new to Perl and was hoping you could help me with how best to go about the following. I have tried many different ways and just can't seem to get this working. I have a very large file which consists of numerous hashes of hashes. I need to parse this data structure using a defined schema file and then output all the values in the order defined within the schema. Below is a very small sample of the file containing the hashes of hashes that I need to parse. I need to be able to parse this without hard-coding any of the hash names (@dev, @fred, etc.). The original file has about 50 more hashes of hashes defined.

{
  "schema":"Configuration file for our servers",
  "description":"mount points for servers",
  "@dev":{
    "home_nfs":{
      "%default":{
        "%mount_opts":"nfsvers=3,timeo=600,retrans=2",
        "%mount_user":"root",
        "%mount_group":"root",
        "%mount_acl":"0755"
      },
      "home-lnk-mpt":{
        "%export_name":"/links10",
        "%filer_device":{
          "ny_loc":"nydevnfstest10_links",
          "nj_loc":"njdevnfstest11_links"
        },
        "%filer_volume":{
          "ny_loc":"/vol/linkstest10",
          "nj_loc":"/vol/linkstest11"
        }
      }
    }
  },
  "@fred":{
    "home_nfs":{
      "%default":{
        "%mount_opts":"nfsvers=3,timeo=600,retrans=2",
        "%mount_user":"root",
        "%mount_group":"root",
        "%mount_acl":"0755"
      },
      "home-lnk-mpt":{
        "%export_name":"/links",
        "%filer_device":{
          "ny_loc":"nydevnfs_links",
          "nj_loc":"njdevnfs_links"
        },
        "%filer_volume":{
          "ny_loc":"/vol/links",
          "nj_loc":"/vol/links"
        }
      }
    }
  }
}
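One way to find the top-level tags without hard-coding them might be to decode the file and keep only the keys whose values are themselves hashes. The sketch below assumes the file is valid JSON and that JSON::PP is usable (it has shipped with core Perl since 5.14, so no extra install should be needed). The inline $json string is a cut-down stand-in for the real file, and hash_keys is a helper name made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

# Return the sorted keys of a hash ref whose values are hash refs.
# (hash_keys is a hypothetical helper, not part of any module.)
sub hash_keys {
    my ($node) = @_;
    return sort grep { ref $node->{$_} eq 'HASH' } keys %$node;
}

# Cut-down stand-in for the real configuration file.
my $json = <<'END_JSON';
{
  "schema": "Configuration file for our servers",
  "description": "mount points for servers",
  "@dev":  { "home_nfs": { "%default": {}, "home-lnk-mpt": {} } },
  "@fred": { "home_nfs": { "%default": {}, "home-lnk-mpt": {} } }
}
END_JSON

my $data = JSON::PP->new->decode($json);

# Top-level tags, then the hash-valued keys one level below each tag.
for my $tag (hash_keys($data)) {
    print $tag, ': ', join(', ', hash_keys($data->{$tag})), "\n";
}
```

The plain string entries ("schema", "description") drop out because ref returns an empty string for them, so only @dev and @fred are visited.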

I have the following schema file that defines all the keys that a hash should have (this is the %definition), and then an %output section that defines how the data should be output:

{
  "%name": "nfs mount schema",
  "%description": "Definitions for nfs_mount properties and their required fields",
  "%definition": {"nfs": {
    "%comment": "Properties for an NFS mount definition",
    "%mount_opts": {"%type": "string"},
    "%mount_user": {"%type": "string"},
    "%mount_group": {"%type": "string"},
    "%mount_acl": {"%type": "integer"},
    "%export_name": {"%type": "path"},
    "%filer_device": {"%type": "string"},
    "%filer_volume": {"%type": "path"}
  }},
  "%output": {
    "%comment": "Output format available",
    "mountnfs": {
      "%comment": "Output format needed for the mount script",
      "%uses": "nfs",
      "%format": "%tag:%filer_device:%filer_volume:%export_name"
    }
  }
}
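As a rough illustration of checking a hash against the %definition, the sketch below derives the required key list from the schema and reports which keys a given hash lacks. It assumes that %comment is documentation rather than a required property, and that the schema has already been decoded into a Perl structure. The names required_keys and missing_keys are hypothetical helpers.

```perl
use strict;
use warnings;

# The %definition -> nfs section of the schema, as a decoded Perl structure.
my %nfs_definition = (
    '%comment'      => 'Properties for an NFS mount definition',
    '%mount_opts'   => { '%type' => 'string' },
    '%mount_user'   => { '%type' => 'string' },
    '%mount_group'  => { '%type' => 'string' },
    '%mount_acl'    => { '%type' => 'integer' },
    '%export_name'  => { '%type' => 'path' },
    '%filer_device' => { '%type' => 'string' },
    '%filer_volume' => { '%type' => 'path' },
);

# Keys a mount hash must have: every key except %comment (an assumption).
sub required_keys {
    my ($definition) = @_;
    return sort grep { $_ ne '%comment' } keys %$definition;
}

# Required keys that are absent from $hash.
sub missing_keys {
    my ($definition, $hash) = @_;
    return grep { !exists $hash->{$_} } required_keys($definition);
}

# home-lnk-mpt from the @dev sample, taken on its own.
my %home_lnk_mpt = (
    '%export_name'  => '/links10',
    '%filer_device' => { ny_loc => 'nydevnfstest10_links', nj_loc => 'njdevnfstest11_links' },
    '%filer_volume' => { ny_loc => '/vol/linkstest10',     nj_loc => '/vol/linkstest11' },
);

print 'missing: ', join(', ', missing_keys(\%nfs_definition, \%home_lnk_mpt)), "\n";
```

On the sample data this reports the four %mount_* keys as missing, because they live in the sibling %default hash. Presumably the defaults would have to be merged in before the comparison for home-lnk-mpt to count as complete.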

So what I am trying to do is essentially output two files: one for ny_loc and one for nj_loc. I need to read the file containing all the hashes of hashes and compare each hash with the schema. If the hash has all the keys that are defined in the schema %definition section, then I output a line of text to the ny_loc file and another line of text to the nj_loc file, using the output format described in the schema %output section. I then continue doing the same for every hash encountered, appending lines of text to the ny_loc and nj_loc files.

For example, I want to read the file containing the hashes of hashes and find @dev, @fred, @fred1, etc. I then want to search through each of these for any more hashes. In the small sample I have provided I would find home_nfs in both @dev and @fred (note that in the original file there are loads more hashes defined at this level). Then I want to look at all the hashes defined at this level (home_nfs etc.) and check each and every hash inside them, like home-lnk-mpt, to see whether it has all the keys defined in the schema %definition section. If it has, then using the format in the %output section of the schema I want to output a line of text to the ny_loc file and a line of text to the nj_loc file for each hash that is processed. So when this particular hash of hashes is parsed I will have two files with this in:

ny_loc file =
@dev:nydevnfstest10_links:/vol/linkstest10:/links10
@fred:nydevnfs_links:/vol/links:/links

nj_loc file =
@dev:njdevnfstest11_links:/vol/linkstest11:/links10
@fred:njdevnfs_links:/vol/links:/links
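The %format string from the schema could drive these lines directly. The sketch below is one possible way to expand it, under two assumptions that the schema itself does not state: that %tag means the top-level key (@dev, @fred), and that a property whose value is a hash is indexed by the location name. The names field_value and render_line are hypothetical.

```perl
use strict;
use warnings;

# Resolve one %name token from the format string.
# Assumption: %tag is the top-level key; hash values are keyed by location.
sub field_value {
    my ($name, $tag, $mount, $loc) = @_;
    my $value = $name eq '%tag' ? $tag : $mount->{$name};
    $value = $value->{$loc} if ref $value eq 'HASH';
    return defined $value ? $value : '';
}

# Expand every %name token in $format for one tag and one location.
sub render_line {
    my ($format, $tag, $mount, $loc) = @_;
    (my $line = $format) =~ s/(%\w+)/field_value($1, $tag, $mount, $loc)/ge;
    return $line;
}

# home-lnk-mpt from the @dev sample.
my %mount = (
    '%export_name'  => '/links10',
    '%filer_device' => { ny_loc => 'nydevnfstest10_links', nj_loc => 'njdevnfstest11_links' },
    '%filer_volume' => { ny_loc => '/vol/linkstest10',     nj_loc => '/vol/linkstest11' },
);
my $format = '%tag:%filer_device:%filer_volume:%export_name';

print render_line($format, '@dev', \%mount, $_), "\n" for qw(ny_loc nj_loc);
```

Because the field order comes from the format string rather than from the code, changing %format in the schema would change the output without touching the script.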

Replies are listed 'Best First'.
Re: Parsing a hash of hashes using a schema file
by kcott (Archbishop) on Apr 02, 2013 at 13:16 UTC

    G'day NewLondonPerl1,

    I had some real problems trying to equate your schema (both code and description) with the data you presented. Here are some of the issues:

    • Your generic use of the word "hash", without further qualification, requires assumptions which may not be correct. I'm not trying to be pedantic or obtuse, and I'm pretty sure I've understood your intent throughout, but if I haven't, that may well be the reason.
    • %definition apparently "defines all the keys that a hash should have"; however, this only contains the single key nfs: the hash contains no keys called nfs.
    • %definition:nfs contains a number of keys with various issues:
      • %comment is not a key in the hash.
      • %filer_device's type is not a string; it's a hash.
      • %filer_volume's type is not a path; it's a hash.
      • Overall, there's no indication of the structure where these keys exist.
    • %output:mountnfs: What does the "mountnfs" refer to? It's not mentioned anywhere else.
    • %output:mountnfs:%format seems to have many issues:
      • %tag: What does this refer to? It's not mentioned anywhere else.
      • %filer_device: The definition suggests a string should be output here but, as mentioned above, the type is wrong.
      • %filer_volume: The definition suggests a path should be output here but, as mentioned above, the type is wrong.

    Given all these problems, I am not able to provide a solution that uses your schema. However, the following code generates the output you want. When you've sorted out the schema issues, perhaps you can integrate that into this code. Here's pm_data_1026558.pl:

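    #!/usr/bin/env perl

    use strict;
    use warnings;

    # Slurp the whole input file.
    my $input = do { local $/; <> };

    # Turn the JSON text into a Perl hash literal (" becomes ', : becomes ,)
    # and evaluate it. This assumes no key or value contains a quote or a colon.
    $input =~ y/":/',/;
    my $data = eval $input;

    for my $file (qw{ny_loc nj_loc}) {
        print "$file file =\n";
        # Only the hash-valued top-level keys (@dev, @fred, ...) are tags.
        for my $key (sort grep { ref $data->{$_} eq 'HASH' } keys %$data) {
            my $home_link_mpt = $data->{$key}{'home_nfs'}{'home-lnk-mpt'};
            print join(':',
                $key,
                $home_link_mpt->{'%filer_device'}{$file},
                $home_link_mpt->{'%filer_volume'}{$file},
                $home_link_mpt->{'%export_name'}
            ), "\n";
        }
        print "\n";
    }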

    Here's the output. (The file pm_data_1026558.dat contains the first block of data you posted, verbatim.)

    $ pm_data_1026558.pl pm_data_1026558.dat
    ny_loc file =
    @dev:nydevnfstest10_links:/vol/linkstest10:/links10
    @fred:nydevnfs_links:/vol/links:/links

    nj_loc file =
    @dev:njdevnfstest11_links:/vol/linkstest11:/links10
    @fred:njdevnfs_links:/vol/links:/links

    -- Ken

Re: Parsing a hash of hashes using a schema file
by Anonymous Monk on Apr 02, 2013 at 01:59 UTC

      Thanks a lot for your help. I will take a look at all of these now.

        I have looked at all of these modules but unfortunately we don't have any of these installed :-(

Re: Parsing a hash of hashes using a schema file
by NewLondonPerl1 (Acolyte) on Apr 02, 2013 at 01:22 UTC

    This is what I have tried so far:

    while (my($tagkey, $tagvalue) = each %{$my_hash_of_hashes}) {
        if (ref $tagvalue eq 'HASH') {
            print "TAG:::> $tagkey value $tagvalue\n";
            push(@tag_name, "$tagkey");
        }
        foreach my $hash_tag_names (@tag_name) {
            while (my ($tagkey1, $tagvalue1) = each %{$my_hash_of_hashes->{$hash_tag_names}}) {
                if (ref $tagvalue1 eq 'HASH') {
                    print "TAG:::> $tagkey1 value $tagvalue1\n";
                }
            }
        }
    }

    As I am not allowed to hard-code hash names like @dev, @fred, etc., I thought the best way would be to first parse this hash of hashes, work out which keys have values that are 'HASH', and populate an array called @tag_name with them. Once I have all the tags (@dev, @fred, etc.) in this array, I could loop over it and, for each hash, check for more hashes within it, until I get to the level where all the keys are defined as per what's in the schema %definition, and then start outputting data to the files ny_loc and nj_loc. I am just stuck as to how to proceed further with what I have done so far. Please can you help me, as I haven't done anything like this before?
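The descent described above, going down level by level until a hash has every required key, can be sketched as a single recursive function instead of one loop per level. This is only an illustration under stated assumptions: a %default child is taken to supply fallback values for its siblings, keys starting with % are treated as properties rather than levels to descend into, and the data and the required list are trimmed versions of the sample. The name find_mounts is hypothetical.

```perl
use strict;
use warnings;

# Walk nested hashes and collect every hash that has all required keys
# once inherited %default values are merged in.
# Returns a list of [ $tag, $name, \%merged ] entries.
sub find_mounts {
    my ($node, $required, $path, $defaults) = @_;
    $path     ||= [];
    $defaults ||= {};

    # Assumption: a %default child provides fallbacks for its siblings.
    my %inherited = (%$defaults, %{ $node->{'%default'} || {} });

    my @found;
    for my $key (sort keys %$node) {
        my $child = $node->{$key};
        # Skip %-prefixed properties and anything that is not a hash.
        next if $key =~ /^%/ or ref $child ne 'HASH';

        my %merged  = (%inherited, %$child);
        my @missing = grep { !exists $merged{$_} } @$required;
        if (@missing) {
            # Not complete yet: look one level deeper.
            push @found, find_mounts($child, $required, [ @$path, $key ], \%inherited);
        }
        else {
            push @found, [ (@$path ? $path->[0] : $key), $key, \%merged ];
        }
    }
    return @found;
}

# Trimmed sample data and a trimmed required-key list.
my $data = {
    'schema' => 'Configuration file for our servers',
    '@dev'   => { 'home_nfs' => {
        '%default'     => { '%mount_opts' => 'nfsvers=3,timeo=600,retrans=2' },
        'home-lnk-mpt' => { '%export_name' => '/links10' },
    } },
    '@fred'  => { 'home_nfs' => {
        '%default'     => { '%mount_opts' => 'nfsvers=3,timeo=600,retrans=2' },
        'home-lnk-mpt' => { '%export_name' => '/links' },
    } },
};
my @required = ('%mount_opts', '%export_name');

for my $hit (find_mounts($data, \@required)) {
    my ($tag, $name, $props) = @$hit;
    print join(' ', $tag, $name, $props->{'%export_name'}, $props->{'%mount_opts'}), "\n";
}
```

Each hit carries its top-level tag and the merged properties, so no hash name needs to appear in the code, and nothing depends on how many levels sit between the tag and the mount definition.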
