Hash vs constant vs package vs other for data structure

oldtechaa has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Hash vs constant vs package vs other for data structure by haukex (Archbishop) on Mar 27, 2017 at 18:50 UTC
1. Use an AoAoH instead, and refer to each data member by name
2. Use constant declarations to name each index; Use upper-case variables to show their status as constant names for indices
3. Use a package with setter and getter functions or public data members and use an AoA with the package objects as the contents

The first is probably the way I would go. The second is probably mostly useful if performance is a concern; it would of course also mean you don't have to change your data structure. The third, OO approach might be useful if your objects have not only properties but also methods.

The only thing to take into consideration with the first solution is that you don't get automatic protection against typos in property names, as you do with the other two solutions. One possible solution is locked hashes, like I showed here, but note also what I mentioned lower down in that same thread (here): they might be deprecated someday.
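As a rough illustration of the first option combined with locked hashes, here is a minimal sketch using `lock_keys` from the core `Hash::Util` module. The helper `new_cell` and the property names `flags` and `label` are made up for the example; the original question does not name the real properties.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys);

# Hypothetical grid: $grid[$x][$y] holds a hash of named properties.
my @grid;

# Hypothetical helper: build a cell whose set of keys is fixed.
sub new_cell {
    my %cell = (flags => 0, label => '');
    lock_keys(%cell);    # keys other than flags/label are now fatal
    return \%cell;
}

$grid[3][5] = new_cell();
$grid[3][5]{flags} = 1;          # fine: known key
$grid[3][5]{label} = 'start';    # fine: known key

# A typo in a property name dies instead of silently adding a key.
my $ok = eval { $grid[3][5]{flag} = 1; 1 };
print "typo caught: $@" unless $ok;
```

The values stay writable; only the set of keys is frozen, which is what gives the typo protection.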
Re^2: Hash vs constant vs package vs other for data structure by oldtechaa (Beadle) on Mar 27, 2017 at 19:03 UTC
I don't have methods, so I don't believe I'll go the OO route. It seems like the constant or uppercase methods perform better, are just as readable, and don't make me change my data structure. What are the advantages of the hash method?
Re^3: Hash vs constant vs package vs other for data structure by haukex (Archbishop) on Mar 27, 2017 at 19:09 UTC
"What are the advantages of the hash method?"

Two that I can think of off the top of my head: you can have sparse property sets (e.g. if one object has `{foo=>1,bar=>2}` and another has `{quz=>1,baz=>2}`, an array would need four elements per object to cover that, a bit of a waste), and textual serializations of the data would be self-documenting. But if neither of those is a concern to you, then at the moment I can't think of any major disadvantages to using constants for the array indices.
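For comparison, a minimal sketch of the constants-as-indices option, keeping the existing three-dimensional array. The names `FLAGS` and `LABEL` are placeholders, since the thread does not list the real properties.

```perl
use strict;
use warnings;

# Hypothetical property names mapped to positions in the third dimension.
use constant {
    FLAGS => 0,
    LABEL => 1,
};

my @grid;
$grid[3][5][FLAGS] = 1;
$grid[3][5][LABEL] = 'start';

# With "use strict", a misspelled name such as FLAG is a compile-time
# error (bareword not allowed), so typos are caught early.
print "flags at (3,5): ", $grid[3][5][FLAGS], "\n";
```

The data structure itself is unchanged; only the way the third index is written differs.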
Re^4: Hash vs constant vs package vs other for data structure by oldtechaa (Beadle) on Mar 27, 2017 at 19:46 UTC
Re^5: Hash vs constant vs package vs other for data structure by Anonymous Monk on Mar 27, 2017 at 20:10 UTC
Re^5: Hash vs constant vs package vs other for data structure by Anonymous Monk on Mar 27, 2017 at 20:24 UTC
Re^5: Hash vs constant vs package vs other for data structure by BillKSmith (Monsignor) on Mar 27, 2017 at 21:54 UTC
Re^3: Hash vs constant vs package vs other for data structure by FreeBeerReekingMonk (Deacon) on Mar 27, 2017 at 21:14 UTC
`perl -MData::Dumper -E '$A[3][5][0]="foo"; die Dumper \@A'`
output: `$VAR1 = [ undef, undef, undef, [ undef, undef, undef, undef, undef, [ 'foo' ] ] ];`
Long? Just try 300 by 500!

`perl -MData::Dumper -E '$A{3}{5}{0}="foo"; die Dumper \%A'`
output: `$VAR1 = { '3' => { '5' => { '0' => 'foo' } } };`
Smaller, but slower to iterate through if you have lots of entries. Still, you can use an intermediate variable that acts like a pointer:

`perl -MData::Dumper -E '$A{3}{5}{0}="foo"; $v = $A{3}{5}; $v->{1}="bar"; die Dumper \%A'`

If you have fixed dimensions, you can also try `$NUM = $x + $y * $WIDTH`, so if your grid is 300 pixels wide, (x,y)=(3,5) becomes 3 + 5*300 = 1503. But check that you will not run above your maxint: 718414 with that method
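The fixed-width mapping can be sketched as a pair of small helpers. The names `to_index` and `to_xy` are invented here, and the width of 300 is just the figure from the example above.

```perl
use strict;
use warnings;

my $WIDTH = 300;    # assumed fixed grid width

# Map (x, y) to a single index, and back again.
sub to_index { my ($x, $y) = @_; return $x + $y * $WIDTH }
sub to_xy    { my ($i) = @_; return ($i % $WIDTH, int($i / $WIDTH)) }

my %cells;                          # a hash keeps sparse data cheap
$cells{ to_index(3, 5) } = 'foo';

my ($x, $y) = to_xy(1503);
print "index ", to_index(3, 5), " is ($x, $y)\n";   # index 1503 is (3, 5)
```

The round trip only works while every x stays below the width, so the width has to be known up front.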
Re: Hash vs constant vs package vs other for data structure by AppleFritter (Vicar) on Mar 27, 2017 at 18:43 UTC
Given this...

"the first two dimensions refer to the location of an object and the third refers to properties of that object such as flags and other data"

...I'd definitely do this:

"Use an AoAoH instead, and refer to each data member by name"

YMMV, of course.
Re: Hash vs constant vs package vs other for data structure by Laurent_R (Canon) on Mar 27, 2017 at 19:57 UTC
The best way to store your data really depends on what you're doing with it afterwards. Also, a top-level hash might be better if your data points are sparse.

Assuming that you'll never use the horizontal coordinate (abscissa) without the vertical coordinate (ordinate), you might even store them as a concatenated value in a hash, thereby simplifying your data structure by removing one level of nesting:

`my %notes; # notes is now a hash`
`#...`
`$notes{"$x;$y"}[0] = ...`

or:

`$notes{"$x;$y"}{...} = ...`
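A slightly fuller sketch of the concatenated-key idea, including how a traversal might look. The property names `flags` and `label` and the sample coordinates are invented for illustration.

```perl
use strict;
use warnings;

my %notes;    # key is "x;y", value is a hash of properties

# Hypothetical property names and sample points.
$notes{"3;5"}      = { flags => 1, label => 'start' };
$notes{"800;1200"} = { flags => 0, label => 'end' };

# One flat loop visits every occupied location.
for my $key (sort keys %notes) {
    my ($x, $y) = split /;/, $key;
    print "($x, $y): $notes{$key}{label}\n";
}

# exists() tests a location without autovivifying anything.
print "nothing at (1, 1)\n" unless exists $notes{"1;1"};
```

The coordinates are recovered with `split` when needed, so the separator must be a character that cannot appear in either coordinate.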
Re^2: Hash vs constant vs package vs other for data structure by oldtechaa (Beadle) on Mar 27, 2017 at 20:34 UTC
This is an interesting technique. It would certainly work and I do have a sparse data set, but it doesn't seem clearer. What are the performance impacts of this?
Re^3: Hash vs constant vs package vs other for data structure by Laurent_R (Canon) on Mar 27, 2017 at 21:19 UTC
"What are the performance impacts of this?"

It would have to be measured, i.e. benchmarked, with real data. However, my gut feeling is that removing one level of nesting is likely to speed things up a bit, but probably not by a large margin. I doubt that you really care about the difference for what you're doing. So, don't worry too much about performance unless you really have to.

The hash solution (especially with concatenated keys) is very likely to use far less memory, at least with sparse data. Suppose you've got only one data point with coordinates (800, 1200). With an array of arrays, Perl has to extend the outer array to 801 slots and the inner array to 1201 slots, and once every outer index has been touched you approach 800 * 1200 slots; that's quite a lot of memory for just a few data pieces. But with a hash you need to allocate only one or two hash entries per data point; even considering that a hash entry uses more memory than an array entry, there is a significant win here.
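A quick way to see the difference for a single data point is to count the slots each structure ends up with. This is only a slot count, not a byte measurement; for real numbers you would need to measure with your own data, for instance with a module such as Devel::Size.

```perl
use strict;
use warnings;

my (@aoa, %flat);
$aoa[800][1200]   = 'x';    # one data point in an array of arrays
$flat{"800;1200"} = 'x';    # the same point under a concatenated key

my $outer = scalar @aoa;              # 801 slots, 800 of them undef
my $inner = scalar @{ $aoa[800] };    # 1201 slots, 1200 of them undef
my $keys  = scalar keys %flat;        # 1 key

print "AoA slots: ", $outer + $inner, ", hash keys: $keys\n";
```

Perl only extends the arrays that are actually touched, so the array cost grows with each new outer index in use, while the hash grows by one entry per data point.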
Re^4: Hash vs constant vs package vs other for data structure by oldtechaa (Beadle) on Mar 28, 2017 at 13:12 UTC
Re^3: Hash vs constant vs package vs other for data structure by Laurent_R (Canon) on Mar 27, 2017 at 21:50 UTC
"but it doesn't seem clearer"

Granted, but it makes things simpler (and easier) if you need to traverse your entire data structure. You essentially get a better data abstraction if you think in terms of "location", rather than "x-y coordinates".
Re^3: Hash vs constant vs package vs other for data structure by oldtechaa (Beadle) on Mar 27, 2017 at 20:41 UTC
I should probably note here that since I do have a sparse data set, the array members are undefined until needed.
Re: Hash vs constant vs package vs other for data structure by Anonymous Monk on Mar 27, 2017 at 19:35 UTC
"As you can see, the first two dimensions refer to the location of an object and the third refers to properties of that object such as flags and other data." However you later state: "I don't have methods, so I don't believe I'll go the OO route." Well, too late. You already have and now you are trying to change the rules. Instead you should store the fact that an object is on the canvas and it's location. Use an array to store this info like so: `my @widgets = ( { object => $object, x => $x_location, y => $y_location }, { object => $object, x => $x_location, y => $y_location }, { object => $object, x => $x_location, y => $y_location }, );` [download] Note that those variables names are just placeholders for the real variables and their data. This way you only have to be concerned with the grid points that actually have data.	[reply] [d/l]
Re: Hash vs constant vs package vs other for data structure by Anonymous Monk on Mar 27, 2017 at 21:13 UTC
This thread seems like a lot of premature optimization. If you don't have an actual performance problem (too much memory used, or some operation is taking too long), then don't worry about it, and definitely don't spend any time re-engineering your existing code.
Re^2: Hash vs constant vs package vs other for data structure by oldtechaa (Beadle) on Mar 28, 2017 at 00:46 UTC
It actually was originally intended just to find a way to make my 3D array more readable. Optimization is a side-point that I feel must be taken into consideration when choosing a better solution.
Re: Hash vs constant vs package vs other for data structure by BrowserUk (Patriarch) on Mar 29, 2017 at 13:22 UTC
FWIW: my preference would be to use enum for the constants and stick with the AoAoA (unless sparsity is required): `use enum qw[ X Y ];`
Re^2: Hash vs constant vs package vs other for data structure by oldtechaa (Beadle) on Mar 29, 2017 at 21:51 UTC
This was another option I didn't add to the list, but sparse data support would be nice.
Re: Hash vs constant vs package vs other for data structure by Anonymous Monk on Mar 27, 2017 at 19:04 UTC
"the first two dimensions refer to the location of an object ..." Why? This seems to be the root of your problem.	[reply]
Re^2: Hash vs constant vs package vs other for data structure by oldtechaa (Beadle) on Mar 27, 2017 at 19:07 UTC
This array holds the data from a custom GTK widget, and I need to be able to address every point of a grid on the widget in order to store data for it. Obviously, two-dimensional access is the easiest way of managing it.
Re^3: Hash vs constant vs package vs other for data structure by Anonymous Monk on Mar 27, 2017 at 19:15 UTC
This makes no sense ...

