PerlMonks  

Re: Having an Issue updating hash of hashes

by Laurent_R (Canon)
on Jul 05, 2014 at 17:33 UTC [id://1092383]


in reply to Having an Issue updating hash of hashes

Given that $cntr is just a counter that is incremented for each data element of your list, you should probably use an array of hashes rather than a reference to a hash of hashes. For example, you could change your main procedure as follows (untested):
    sub getPeople {
        my ( $id, $first, $last, $age );
        my $file = 'list.txt';
        my @people;      # using directly an array
        # my $cntr;      -- now useless
        open( LIST, "< $file" ) or die "Can't open $file : $!";
        while ( my $row = <LIST> ) {
            # $cntr++;   -- now no longer useful
            my ( $id, $first, $last, $age ) = split( /\s/, $row );
            $id    = ( split( /=/, $id ) )[1];
            $first = ( split( /=/, $first ) )[1];
            $last  = ( split( /=/, $last ) )[1];
            $age   = ( split( /=/, $age ) )[1];
            push @people, { 'id' => "$id", 'first' => "$first", 'last' => "$last", 'age' => "$age" };
        }
        close LIST;
        return @people;  # hand the collected records back to the caller
    }
I think that getting the individual values to be stored in the hashes could be made significantly simpler, but that's not what you asked for, and I don't want to get off-topic at this point. If you're interested, other monks and I can of course help you with that.
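To make the array-of-hashes idea concrete, here is a minimal sketch of how such a structure can be read back. The sample records and the describe_person helper are made up for illustration; they only assume the four keys used in the code above.

```perl
use strict;
use warnings;

# Hypothetical sample data, shaped like what getPeople() pushes onto @people.
my @people = (
    { id => 1, first => 'John', last => 'Doe',   age => 42 },
    { id => 2, first => 'Jane', last => 'Smith', age => 37 },
);

# Each array element is a hash reference; the array index replaces the old counter.
sub describe_person {
    my ($person) = @_;
    return "$person->{id}: $person->{first} $person->{last} ($person->{age})";
}

# Walk the whole list ...
print describe_person($_), "\n" for @people;

# ... or reach directly into one record by its position.
print "Second record's first name: $people[1]{first}\n";
```

Note that array positions start at 0, whereas the original counter presumably started at 1.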

Re^2: Having an Issue updating hash of hashes
by perlguyjoe (Novice) on Jul 05, 2014 at 17:48 UTC
    I would love a way to neaten up the splits to get the proper data for the fields! I did it this way because it works, but I definitely am not getting brownie points for beauty.
      OK, one possible way:
      while (my $row = <LIST> ) {
          my ($id, $first, $last, $age) = (split /[\s=]/, $row)[1, 3, 5, 7];
          push @people, { 'id' => "$id", 'first' => "$first", 'last' => "$last", 'age' => "$age" };
      }
      It could be done in an even shorter way (one single instruction), but I do not think this would be a good idea, because it would become somewhat more difficult to understand and to maintain, whereas I think the above remains fairly clear and quite easy to understand and to maintain. Using a regex could also do the job, but I doubt it could be clearer or simpler than the above.
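For the curious, here is one guess at what a "single instruction" version might look like. This is not the author's code, only a sketch: the row_to_person helper is hypothetical, and it takes the hash keys from the data itself (lowercased), which assumes every field is a well-formed key=value pair.

```perl
use strict;
use warnings;

# Hypothetical helper: build the whole record in one expression.
# Splits the row on whitespace, then each field on its first '='.
sub row_to_person {
    my ($row) = @_;
    return +{
        map { my ($k, $v) = split( /=/, $_, 2 ); ( lc($k) => $v ) }
        split ' ', $row
    };
}

my $person = row_to_person("ID=1 First=John Last=Doe AGE=42\n");
print join( ', ', map { "$_=$person->{$_}" } sort keys %$person ), "\n";
```

Inside the loop this would collapse to a single push of that anonymous hash, which arguably illustrates the readability concern: the field names are no longer visible in the code.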
        'id' => "$id",

        Quite frequently around the monastery of late, I've noticed this practice (idiom? tic?) of (apparently) needlessly interpolating a scalar into a string. I don't understand it. Is there any benefit to be had from it? Where does it originate?

        Update: When I originally posted this, I went looking for some examples of this 'frequent' practice and, of course, couldn't find any, got annoyed, gave up. Here are some recent examples: Perl function calls. and its cousin Perl : Convert a monolithic code to a function. They are both by grasshopper user786, but I'm sure he or she is not the only 'offender'. As you will see in the code of the linked posts, none of the interpolation has anything to do with avoidance of numification.
        Also: System output variables and newline:  print "$dcr"; statement at end;
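A small sketch of what the quotes do and do not change, using made-up variables. For a plain string or number, the stored value prints the same either way; the one case shown here where quoting visibly matters is a reference, which gets flattened to text.

```perl
use strict;
use warnings;

my $id = 42;
my %plain  = ( id => $id );      # stores the scalar as-is
my %quoted = ( id => "$id" );    # stores a stringified copy

print "plain: $plain{id}, quoted: $quoted{id}\n";

# Where the quotes do change things: a reference becomes a plain string.
my $ref       = { first => 'John' };
my $kept      = $ref;      # still a hash reference
my $flattened = "$ref";    # text such as "HASH(0x...)", no longer a reference

print "kept is a ", ref($kept), " reference\n";
print "flattened is ", ( ref($flattened) ? "a reference" : "a plain string" ), "\n";
```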

        I completely agree with your stance on readability. I manage some code written by my predecessor. While he was very creative in writing short, compact, and efficient code, it can be very hard to follow if you're just looking in. Here is the final result:
        #!/usr/bin/perl
        use Data::Dumper;
        use strict;
        use warnings;

        getPeople();

        sub getPeople {
            my $file = 'list.txt';
            my $people;
            open( LIST, "< $file" ) or die "Can't open $file : $!";
            while (my $row = <LIST> ) {
                my ($id, $first, $last, $age) = (split /[\s=]/, $row)[1, 3, 5, 7];
                $people -> { $id } = { id => $id, first => $first, last => $last, age => $age };
            }
            print Dumper($people);
            print "The person with ID 3 is $people->{3}{first} $people->{3}{last}\n";
            close LIST;
        }

      Contrary to Laurent_R's aversion to using a single regex to extract data fields from a record expressed herein, I find it's often both more robust and more maintainable.

      The trick is to combine record validation and record field extraction in one operation. Of course, in the words of the famous witticism, now you have two problems: coming up with a regex to match an entire data record may not be easy (and robustly matching, e.g., a name, even if the nationality domain is well defined, can be quite tricky, so you often end up with a hack like  \S+ as a 'temporary' expedient), but once defined, the regex, properly factored, can be quite clear and fairly easy to maintain.

      The example below takes liberties with names, those tricky devils, and otherwise assumes much about the OPed dataset, but shows the basic idea.

      c:\@Work\Perl>perl -wMstrict -le
      "my $record = do {
           my $id    = qr{ \d+ }xms;
           my $name  = qr{ [[:upper:]] [[:lower:]]+ }xms;
           my $first = $name;
           my $last  = $name;
           my $age   = qr{ \d+ }xms;
           qr{ \A ID= ($id) \s+ First= ($first) \s+ Last= ($last) \s+ AGE= ($age) \z }xms;
           };
       ;;
       for my $rec ('ID=1 First=John Last=Doe AGE=42', @ARGV) {
           my ($id, $first_name, $last_name, $age) = $rec =~ m{ $record }xms
               or die qq{malformed record: '$rec'};
           print qq{id '$id' first '$first_name' last '$last_name' age '$age'};
           }
      "
      "ID=2 First=Joe Last=42 AGE=Doe"
      id '1' first 'John' last 'Doe' age '42'
      malformed record: 'ID=2 First=Joe Last=42 AGE=Doe' at -e line 1.
        Hi AnomalousMonk,

        Contrary to Laurent_R's aversion to using a single regex to extract data fields from a record ...

        I have no aversion whatsoever to regexes; I actually use them very often and I love them. ;-)

        I was only saying that, in that specific case, the use of the split function (which, BTW, explicitly uses a regex in the case in point) would IMHO lead to more concise and probably clearer code. Your suggested code definitely reaches the aims of clarity and ease of maintenance, but not the aim of concision.

        If the aim is concision, then the regex could be something like this (tested under the Perl debugger):

        DB<17> $line = "ID=1 First=John Last=Doe AGE=42";
        DB<18> $word = qr/[a-zA-Z]+/;
        DB<19> ($id, $first, $last, $age) = $line =~ /^ID=(\d+)\s+First=($word)\s+Last=($word)\s+AGE=(\d+)\s*$/;
        DB<20> x ($id, $first, $last, $age)
        0  1
        1  'John'
        2  'Doe'
        3  42
        or even in one single line:
        my ($id, $first, $last, $age) = $line =~ /^ID=(\d+)\s+First=([a-zA-Z]+)\s+Last=([a-zA-Z]+)\s+AGE=(\d+)\s*$/;
        which is now quite concise, but arguably less clear and maintainable than the simple split I originally suggested. Admittedly, the above regex does a bit more data validation than the split version, but whether you actually need validation or not depends on the situation (essentially: where is the input data coming from?). Sometimes you don't need it (e.g. you produced the data yourself and you really know what it looks like); sometimes you do, but it can be difficult to figure out how extensive your validation process should be. Maybe the $word regex definition should be something like this:
        $word = qr/[A-Z][a-z]+/;
        or maybe simply:
        $word = qr/[a-z]+/i;
        Notice that this is opening an entirely different subject. Well, I'll leave it there, as this is getting slightly off-topic.
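To show how much the choice of $word matters, here is a small sketch comparing the three definitions above against a few sample names. The names, the hash labels, and the name_ok helper are made up for illustration.

```perl
use strict;
use warnings;

# The three candidate definitions of $word from the post above.
my %word = (
    letters     => qr/[a-zA-Z]+/,
    capitalized => qr/[A-Z][a-z]+/,
    letters_i   => qr/[a-z]+/i,
);

# Returns 1 if the whole of $name matches the named pattern, 0 otherwise.
sub name_ok {
    my ( $kind, $name ) = @_;
    return $name =~ /\A$word{$kind}\z/ ? 1 : 0;
}

# Hypothetical sample names, chosen to show where the patterns disagree.
my @kinds = sort keys %word;
for my $name ( 'John', 'john', 'McDonald', "O'Brien" ) {
    my @verdicts;
    push @verdicts, "$_=" . name_ok( $_, $name ) for @kinds;
    printf "%-9s %s\n", $name, "@verdicts";
}
```

The capitalized form rejects both "john" and "McDonald", and none of the three accepts "O'Brien", which is the sort of thing that makes deciding how strict to be genuinely difficult.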
