PerlMonks  

A question of style: Composite keys to multiple values

by Voronich (Hermit)
on Aug 06, 2012 at 18:46 UTC ( #985798=perlquestion )
Voronich has asked for the wisdom of the Perl Monks concerning the following question:

Day 1: There's a driver table that has four columns. (note: that's a lie, but a useful one.)

External_Name External_Value Internal_Name Internal_Value
I retrieve rather a lot of data from someplace. It comes in with External Name and Value fields. But we don't want to deal with the 'External_*' values. Instead they need to be translated to their internal equivalents. For instance we could have:
External_Name  External_Value  Internal_Name  Internal_Value
GLD_SPT        PX_VALUE        Gold           Price
AXXX111        PX_VALUE        Frob           Spread

and a data row would come in with "GLD_SPT,PX_VALUE" and I need to translate those to "Gold,Price" before sending the data down the pipeline.

But I REALLY don't want to be tagging the database for every row as I iterate across the set (which notably does NOT come from a database.) It's remarkably stupid given the volume. No problem, preload.

So I preloaded a hash with the following(ish.)

$Name_Resolver{lc("$External_Name|$External_Value")} = $Internal_Name;
and it actually works surprisingly well. For every row of data I get, I smack together a composite key and resolve it to Internal Name. The lookup table is prefetched. All happy nice nice.
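
A minimal sketch of the preload-then-lookup pattern described above. The `@driver_rows` data and the `resolve_name` helper are assumptions made for illustration; the real rows would be prefetched from the driver table once, before the data set is iterated.

```perl
use strict;
use warnings;

# Hypothetical driver rows, standing in for the one-time prefetch
# from the driver table.
my @driver_rows = (
    [ 'GLD_SPT', 'PX_VALUE', 'Gold', 'Price'  ],
    [ 'AXXX111', 'PX_VALUE', 'Frob', 'Spread' ],
);

# Preload: lowercased composite key -> Internal_Name.
my %Name_Resolver;
for my $row (@driver_rows) {
    my ( $External_Name, $External_Value, $Internal_Name ) = @$row;
    $Name_Resolver{ lc("$External_Name|$External_Value") } = $Internal_Name;
}

# Per-row lookup: builds the same key, no database round trip.
# Returns undef when the pair is not in the table.
sub resolve_name {
    my ( $External_Name, $External_Value ) = @_;
    return $Name_Resolver{ lc("$External_Name|$External_Value") };
}

print resolve_name( 'GLD_SPT', 'PX_VALUE' ), "\n";    # prints "Gold"
```

Because the key is lowercased on both the store and the fetch side, incoming rows match regardless of case.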

Day 2: You should've seen this coming because it always happens to you. Yeah ok. But we need Internal_Value now as well....by tomorrow.

Ok, "by tomorrow" means add a parallel "%Value_Resolver" with the same composite string key that resolves to Internal_Value. It (of course) prefetches on the same load as the other hash.

And... sure. It works. But the whole approach makes me want to disavow it. The right "value" should be a simple list (or sub-hash? seems unnecessary) containing both values. But using a "two strings smacked together" key just seems wrong. I'm just not sure what would ACTUALLY be simpler (for values of 'simpler' approximating "easier for a programmer to understand.")

I'm in the peculiar position of being able to spend a (very) little time refactoring this code and cleaning it up a bit.

Thoughts?

EDIT: Been a while. I forget people can't read my mind for the rest of the context on these things. Hopefully this is a bit more clear.

Re: A question of style: Composite keys to multiple values (context)
by tye (Cardinal) on Aug 06, 2012 at 19:27 UTC

    I can't really make heads nor tails out of your description. Consider this line of code:

    $Field_A_Resolver{lc("$ExternalA|ExternalB")} = Internal_A;

    What is the purpose of $ExternalA? What values might it hold? Did you mean to type $ExternalB instead of ExternalB (without the dollar sign)? What is Internal_A? Is that supposed to be a bareword string or a variable? If a variable, what does it hold?

    - tye        

      Yep. Fair points all. Updated. Thanks o/
Re: A question of style: Composite keys to multiple values (hash)
by tye (Cardinal) on Aug 06, 2012 at 19:40 UTC
    $Fields{lc("$ExtA|$ExtB")} = { A => $IntA, B => $IntB };

    You might also want a subroutine as interface to your %Fields hash so you aren't writing lc("$ExtA|$ExtB") in more than one place.
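
    One way that suggestion might look, as a sketch only. The sub names `field_key`, `add_field` and `lookup_field` are made up for illustration and do not come from the original code.

```perl
use strict;
use warnings;

my %Fields;

# The one place that knows how the composite key is built.
sub field_key {
    my ( $ExtA, $ExtB ) = @_;
    return lc("$ExtA|$ExtB");
}

# Store both internal values under a single composite key.
sub add_field {
    my ( $ExtA, $ExtB, $IntA, $IntB ) = @_;
    $Fields{ field_key( $ExtA, $ExtB ) } = { A => $IntA, B => $IntB };
    return;
}

# Returns the { A => ..., B => ... } hash ref, or undef for an unknown pair.
sub lookup_field {
    my ( $ExtA, $ExtB ) = @_;
    return $Fields{ field_key( $ExtA, $ExtB ) };
}

add_field( 'GLD_SPT', 'PX_VALUE', 'Gold', 'Price' );
my $hit = lookup_field( 'gld_spt', 'px_value' );
print "$hit->{A},$hit->{B}\n";    # prints "Gold,Price"
```

    If the key format ever changes, only `field_key` has to be touched.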

    - tye        

Re: A question of style: Composite keys to multiple values
by tobyink (Abbot) on Aug 06, 2012 at 19:41 UTC

    Have you seen $;? (perlvar for info.)
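
    For anyone who has not met it: a comma-separated hash subscript is joined with `$;` (by default "\034") to form a single key, which is the same composite-key idea with a separator that is unlikely to appear in the data. A small sketch; the `resolve` helper is hypothetical, and the core English module is used only to get the readable `$SUBSEP` alias for `$;`.

```perl
use strict;
use warnings;
use English qw( -no_match_vars );    # provides $SUBSEP as an alias for $;

my %resolver;

# $resolver{$x, $y} really means $resolver{ join($;, $x, $y) }.
$resolver{ lc('GLD_SPT'), lc('PX_VALUE') } = 'Gold';

# Hypothetical lookup helper using the same multi-part subscript.
sub resolve {
    my ( $name, $value ) = @_;
    return $resolver{ lc($name), lc($value) };
}

print resolve( 'GLD_SPT', 'PX_VALUE' ), "\n";    # prints "Gold"

# A stored key can be split back into its parts on the separator.
for my $key ( keys %resolver ) {
    my ( $name, $value ) = split /\Q$SUBSEP\E/, $key;
    print "$name / $value\n";                    # prints "gld_spt / px_value"
}
```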

    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: A question of style: Composite keys to multiple values
by BrowserUk (Pope) on Aug 06, 2012 at 19:50 UTC
    means add a parallel "%Field_B_Resolver"

    Why a parallel hash? Why not just preload one hash with both values:

    $resolver{ lc("$External_A|$External_B") } = [ $Internal_A, $Internal_B ];

    And retrieve them together.

    Or if the table is large and space is tight:

    $resolver{ lc("$External_A|$External_B") } = "$Internal_A $Internal_B";

    And split them for use.
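
    A sketch of both variants side by side, under the assumption that neither internal value contains a space (otherwise the packed form would need a different separator). The hash names `%as_list` and `%as_string` and the `composite_key` helper are illustrative only.

```perl
use strict;
use warnings;

my ( %as_list, %as_string );

sub composite_key {
    my ( $External_A, $External_B ) = @_;
    return lc("$External_A|$External_B");
}

my $key = composite_key( 'GLD_SPT', 'PX_VALUE' );

# Variant 1: an array ref holding both internal values.
$as_list{$key} = [ 'Gold', 'Price' ];
my ( $name1, $value1 ) = @{ $as_list{$key} };

# Variant 2: one packed string, split on use.
# Assumes neither value contains a space.
$as_string{$key} = join ' ', 'Gold', 'Price';
my ( $name2, $value2 ) = split / /, $as_string{$key};

print "$name1,$value1\n";    # prints "Gold,Price"
print "$name2,$value2\n";    # prints "Gold,Price"
```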

    But using a "two strings smacked together" key just seems wrong.

    Again why? Composite keys are a perfectly natural solution used in all manner of DB/data-retrieval mechanisms.

    You could use nested hashes: $resolver{ $External_A }{ $External_B }.
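
    A sketch of the nested-hash form, for comparison. The `resolve` helper is hypothetical; it looks up one level at a time so that a miss on the first key does not autovivify an empty inner hash.

```perl
use strict;
use warnings;

my %resolver;

# Two levels of keys instead of one joined string.
$resolver{ lc('GLD_SPT') }{ lc('PX_VALUE') } = [ 'Gold', 'Price' ];

# Returns (Internal_A, Internal_B), or an empty list for an unknown pair.
sub resolve {
    my ( $External_A, $External_B ) = @_;
    my $inner = $resolver{ lc($External_A) } or return;
    my $pair  = $inner->{ lc($External_B) }  or return;
    return @$pair;
}

my ( $name, $value ) = resolve( 'GLD_SPT', 'PX_VALUE' );
print "$name,$value\n";    # prints "Gold,Price"
```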

    But if the code is working as is, I'd want a reason (beyond it "seeming wrong") before I'd change it.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      The second hash was a matter of expedience based on the incredibly limited amount of time I had to get a change done and in SCM. It fell deep into the land of "unforgivable mess that I could be absolutely confident about, thereby making the business customer happy as a clam."

      As my business customer is hands down the best one I've EVER had in my career I was able to have the "ok I can get this in for tomorrow. But that means that you get no changes next week while I fix what I had to do to do this" conversation with him. That condition, along with the desperation on my face, compelled him to concede.

Re: A question of style: Composite keys to multiple values
by moritz (Cardinal) on Aug 06, 2012 at 19:51 UTC
    And... sure. It works. But the whole approach makes me want to disavow it. The right "value" should be a simple list (or sub-hash? seems unnecessary) containing both values. But using a "two strings smacked together" key just seems wrong. I'm just not sure what would ACTUALLY be simpler (for values of 'simpler' approximating "easier for a programmer to understand.")

    Well, an in-memory database would be nicest, if there is an easy way to query it. And to be fast, it would build two indexes, which would either be b-trees or... *drumroll* hash tables.

    But I wouldn't introduce such a big dependency for a relatively trivial feature.

    So in the end I think it boils down to two hashes, which is the most pragmatic solution in Perl space.

    It works, it isn't very complicated, it's easy to understand for the reader -- what more do you want? Unless memory becomes scarce, I'd stay with the current solution.

      Yeah ok. I'll buy most of that. I'm going to merge the values into a list, since there isn't really any need for multiple hashes like that, and the values need to resolve at the same point in the code. (It's not like there's any semantic difference in the usage at all.)

      The problem about elegance (or, more importantly: simplicity) is that this is the beginning of a code base that's going to be expanding rather a lot. I can see the additional exception logic and strange additional conditions coming in to pollute this, so I'm trying to nip it in the bud.

      But then I'm almost certainly playing "Premature Optimization" games.

      Thanks o/
Re: A question of style: Composite keys to multiple values
by Anonymous Monk on Aug 06, 2012 at 20:02 UTC
    Make it work. Get it done. Build a hash for each. No one's going to be wagging their finger at you and saying that you didn't do it sufficiently elegantly.
      Not other than me, no ;)
        Okay, okay, "besides Voronich." But you already know about Voronich ... please don't judge the whole community by ... ;-) ;-)
Re: A question of style: Composite keys to multiple values
by TGI (Vicar) on Aug 07, 2012 at 17:27 UTC
    Hide the complexity behind an interface.
    package My::FieldMap;

    our %INTERNAL_VALUES = ( 'foo|bar' => 'fb_val' );
    our %INTERNAL_NAMES  = ( 'foo|bar' => 'fb_val' );

    sub _build_key {
        return lc( join '|', @_ );
    }

    sub get_internal_name_value {
        shift if $_[0] eq __PACKAGE__;    # Can call as pkg method

        my $ext_name  = shift;
        my $ext_value = shift;

        my $composite_key = _build_key( $ext_name, $ext_value );

        # probably want to put some error checking code here.
        # What do you do if called in scalar context?
        # What if one or both keys don't exist?

        return ( $INTERNAL_NAMES{$composite_key}, $INTERNAL_VALUES{$composite_key} );
    }

    Use it like this:

    my ($in,$iv) = My::FieldMap->get_internal_name_value( $ext_name, $ext_value );
    # or
    my ($in,$iv) = My::FieldMap::get_internal_name_value( $ext_name, $ext_value );

    Now, when you have to deal with millions of field/name pairs, and you move to Redis or memcached to store them, you just have to update the lookup function.

    A good interface with horrible code behind it is 10000% better than the most elegant snippets sprinkled everywhere.


    TGI says moo

      Yep. I like the idea. It "seems" heavyhanded at first blush. But the benefit on the aesthetics of the client side code would be marked.

      I expect it won't grow past an absolute outer bound of a few hundred k pairsets. By then the thing to do would be to migrate all of it into stored procs (along with a couple levels farther out so it's all internal to the database.)

      But I'll burn that bridge when I get to it.

Node Type: perlquestion [id://985798]
Approved by davies
Front-paged by MidLifeXis