Complex Data Structure Suggestions Wanted

by Ninthwave (Chaplain)
on Jun 07, 2005 at 11:05 UTC ( #464229=perlquestion )
Ninthwave has asked for the wisdom of the Perl Monks concerning the following question:

The many ways of Perl sometimes leave me struggling when facing what are considered complex data structures.

I have a list of items, each represented by a unique number. Entries for an item may be duplicated, so I want to create a hash keyed on the item's ID number.

I then want the value to be an anonymous array that lists the entry IDs, so that my final data structure is a hash with a key for each unique item. The value will lead me to the individual entries for that item.
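A minimal sketch of the structure described above, written out as a literal so the shape is easy to see. The item IDs, the entry IDs and the name %ItemEntries are made up for illustration; none of them come from the real data.

```perl
use strict;
use warnings;

# Hypothetical sample data: each key is an item ID, each value is an
# anonymous array holding the IDs of the entries seen for that item.
my %ItemEntries = (
    1001 => [ 'entry-a', 'entry-c' ],    # item 1001 appears in two entries
    1002 => [ 'entry-b' ],               # item 1002 appears in one entry
);

# Walk the structure: one line per item, followed by its entries.
for my $ItemID ( sort keys %ItemEntries ) {
    my @EntryIDs = @{ $ItemEntries{$ItemID} };
    print "$ItemID: @EntryIDs\n";
}
```

Looking up one item and dereferencing its value gives the list of entries for that item.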

Now comes coding it. I have decided that the routine to build the hash should live in a sub, and for future portability it should take the hash by reference. At this point my idea of notation gets tossed around.

So my question is: based on the sketch above and the code below, what is the best way to write this?

    sub ReadSource{
        my $DataSource = shift;
        my $HashRef = shift;
        foreach my $ItemID (@Data){
            # Note: @Data is not globally fed into the sub; I have
            # skipped the Data code to focus on the problem.
            if (exists($$HashRef{$ItemID})){
                push @{$$HashRef{$ItemID}}, $DataSource;
            }else{
                $$HashRef{$ItemID} = [$DataSource];
            }
        }
    }

I haven't tested the code, as this structure just looks unpleasant. I used as reference (all puns intended, as usual) Perl Objects, Reference & Modules, Chapter 4, page 40, and the code on autovivification. As I prefer indirect notation, I am just wondering if there is a way to clean this up. I am going to test my code now, but was throwing the question out to the Perl community to see if I am, as usual, missing the bleeding obvious.

"No matter where you go, there you are." BB

Replies are listed 'Best First'.
Re: Complex Data Structure Suggestions Wanted
by Roy Johnson (Monsignor) on Jun 07, 2005 at 11:15 UTC
    You don't have to check for existence. You can just use an undefined value as an arrayref and it will autovivify:
    foreach my $ItemID (@Data){
        push @{$$HashRef{$ItemID}}, $DataSource;
        # The usual preferred notation is
        # push @{$HashRef->{$ItemID}}, $DataSource;
    }
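To see the autovivification happen, here is a small self-contained sketch. The values in @Data and $DataSource are invented for the demonstration; only the push line comes from the thread.

```perl
use strict;
use warnings;

my $HashRef    = {};              # starts out empty
my $DataSource = 'source-1';      # hypothetical entry ID
my @Data       = ( 7, 9, 7 );     # hypothetical item IDs; 7 is duplicated

foreach my $ItemID (@Data) {
    # No exists() test: using the undefined slot as an array in a
    # modifying context makes Perl create the anonymous array first.
    push @{ $HashRef->{$ItemID} }, $DataSource;
}

# Show each item ID with the entries collected for it.
print "$_ => [@{ $HashRef->{$_} }]\n" for sort keys %$HashRef;
```

The duplicated item ends up with two entries in its array, and no slot was ever initialised by hand.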

    Caution: Contents may have been coded under pressure.

      Thank you. I prefer the preferred notation; nice to agree with the masses for a change.

      "No matter where you go, there you are." BB
Re: Complex Data Structure Suggestions Wanted
by tlm (Prior) on Jun 07, 2005 at 11:17 UTC

    You don't need the exists test; perl will autovivify to DWYM, so the first branch of your test actually works for both conditions. And if you want to use "indirect" (arrow?) notation, then:

    push @{$HashRef->{$ItemID}}, $DataSource;
    BTW, I could be all wrong on this, but I think that the term "indirect notation" is typically used to refer to stuff like
    my $obj = new Thingie;
    as opposed to
    my $obj = Thingie->new();
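A minimal sketch of the two call styles side by side. The Thingie class here is a made-up stand-in, defined only so that both lines have a constructor to call. It assumes a perl where indirect object syntax is still enabled, which is the default unless a recent feature bundle or `no feature 'indirect'` turns it off.

```perl
use strict;
use warnings;

# Hypothetical class, present only to give both calls a target.
package Thingie {
    sub new {
        my ($class) = @_;
        return bless {}, $class;
    }
}

my $Indirect = new Thingie;       # indirect object notation
my $Direct   = Thingie->new();    # arrow (method call) notation

# Both calls run the same constructor and return the same kind of object.
print ref($Indirect), " ", ref($Direct), "\n";
```

The arrow form is unambiguous to the parser, which is one reason it is usually the recommended one.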

    the lowliest monk

      You are correct on the definition of indirect, and it is defined, at least in the above-mentioned book, as indirect object notation. I wonder what the nomenclature for -> is. Something interesting to look up while researching.

      Thank you, your notation makes the most sense in one line for what I was trying to do. And I didn't even want to play golf with it ;)

      "No matter where you go, there you are." BB
Re: Complex Data Structure Suggestions Wanted
by monarch (Priest) on Jun 07, 2005 at 11:25 UTC

    First may I say yours was a well written question, and through your research you'd actually got most of the concepts right.. and were pretty close to an answer!

    Because data structures get very complex very fast I almost never use two dollar signs in a row. I always encapsulate in {} braces.
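As an illustration of that habit, here is a short sketch comparing the double-sigil form with the brace-wrapped and arrow forms. The hash contents are invented; the point is only that all three spellings reach the same slot.

```perl
use strict;
use warnings;

my $HashRef = { 42 => ['first'] };    # hypothetical data
my $ItemID  = 42;

my $Terse  = $$HashRef{$ItemID};      # two sigils in a row
my $Braced = ${$HashRef}{$ItemID};    # same lookup, reference wrapped in braces
my $Arrow  = $HashRef->{$ItemID};     # same lookup, arrow notation

# References compare equal numerically when they point at the same thing.
print "same slot\n" if $Terse == $Braced && $Braced == $Arrow;

# The braces earn their keep one level down, on the array dereference.
push @{ ${$HashRef}{$ItemID} }, 'second';
print "@{ $HashRef->{$ItemID} }\n";
```

With the braces in place it is visible at a glance which expression is being dereferenced.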

    Having the data would have helped a lot, because I suspect your structure is more complex than my interpretation of your subroutine.

    Just cleaning up the subroutine for my own benefit I would have written:

Re: Complex Data Structure Suggestions Wanted
by cLive ;-) (Prior) on Jun 07, 2005 at 11:17 UTC
    I haven't tested this either, but what about this:
    sub ReadSource{
        my $DataSource = shift;
        my $HashRef = shift;
        foreach my $ItemID (@Data) {
            # The assignment is parenthesised because ?: binds tighter than =
            defined $HashRef->{$ItemID}
                ? push @{$HashRef->{$ItemID}}, $DataSource
                : ($HashRef->{$ItemID} = [$DataSource]);
        }
    }

    cLive ;-)

    edit - nevermind, I forgot about autovivifying. It's late dammit...

Re: Complex Data Structure Suggestions Wanted
by thcsoft (Monk) on Jun 07, 2005 at 11:19 UTC
    the if... else can be expressed a bit more elegantly:
    # ||= goes on the scalar slot that holds the reference, not on @{...}
    $HashRef->{$ItemID} ||= [];
    push @{$HashRef->{$ItemID}}, $DataSource;
    apart from that: the notation of variable names you chose is not perlish. in perl you mix capital and small letters only for class names: package MyClass; vs:  my $hash_ref;

    language is a virus from outer space.

      Thank you for the help.

      I will have to disagree with you on the variable naming convention, though. Because Perl does not enforce a variable naming convention in the language and this is internal code, I use a convention that helps me explain what the variable is quickly. I use multiple words that describe the data as exactly as possible. I use capitals instead of underscores to keep multiple words from blurring. Now if I wanted this to be a module for CPAN I would use the Perl style guide. And I would simplify the variables to be few and one-worders, so all lowercase.

      And I do have comments to explain the variables, so this is a bit overkill, but when you are staring at chunks of code a good variable name will save you the second window that is stuck at your table of variable definitions.

      But I think this would be more of a meditation than a discussion on a question.

      "No matter where you go, there you are." BB
Re: Complex Data Structure Suggestions Wanted
by djohnston (Scribe) on Jun 07, 2005 at 17:43 UTC
    Here are a couple more alternatives...
    sub ReadSource{
        my ($DataSource, $HashRef) = @_;
        my @Data = get_data();
        for (@Data){
            my $item = $HashRef->{$_} ||= [];
            push( @$item, $DataSource );
        }
    }

    sub ReadSource{
        my ($DataSource, $HashRef) = @_;
        my @Data = get_data();
        push( @{ $HashRef->{$_} ||= [] }, $DataSource ) for @Data;
    }
    There really is more than one way to do it!
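For anyone who wants to run the idea end to end, here is a hedged sketch that pulls the thread's suggestions together. The thread never shows where @Data comes from, so this version assumes the item IDs arrive as an array reference in a third argument; that parameter, the returned hash reference and the sample values are all assumptions made for the example.

```perl
use strict;
use warnings;

# Assumption: item IDs are passed in as an array reference ($ItemIDs),
# because the original sub relies on an @Data that is not shown.
sub ReadSource {
    my ( $DataSource, $HashRef, $ItemIDs ) = @_;

    # Autovivification creates each array the first time an ID is seen.
    push @{ $HashRef->{$_} }, $DataSource for @$ItemIDs;

    return $HashRef;
}

my %Items;
ReadSource( 'fileA', \%Items, [ 10, 20 ] );    # hypothetical sources and IDs
ReadSource( 'fileB', \%Items, [ 20, 30 ] );

# One line per item ID, listing every source that mentioned it.
for my $ItemID ( sort { $a <=> $b } keys %Items ) {
    print "$ItemID: @{ $Items{$ItemID} }\n";
}
```

Calling the sub once per data source accumulates entries in the same hash, so an item seen in both sources ends up with both listed.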
