in reply to Wildcard for key in hash lookup to skip over level


I spent a little time refactoring your code so I could make better sense of it. I still don't know exactly how the columns map to your text description. That being said, it looks like the original suggestion, while often helpful, might not work in your case if you will be re-using %Pos_overlap for something else.

If that's the case, you might indeed be better served by a more complex data structure, which could mean two hashes (trees, actually) or sub-trees of the current hash. It is essentially a memory/performance tradeoff: storing multiple representations takes more memory, but can reduce lookups that would otherwise be O(n) to roughly O(1).
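To make the tradeoff concrete, here is a minimal sketch of the "two trees" idea. I don't know your real columns, so the record layout and the names %by_outer and %by_inner are made up for illustration. The same data is stored twice, keyed in opposite orders, so that skipping the outer level becomes a single lookup rather than a scan over every outer key.

```perl
use strict;
use warnings;

# Hypothetical records: [outer key, inner key, value].
# The real column meanings are unknown; these are placeholders.
my @records = (
    [ 'chr1', 100, 'a' ],
    [ 'chr2', 100, 'b' ],
    [ 'chr1', 200, 'c' ],
);

my (%by_outer, %by_inner);
for my $rec (@records) {
    my ($outer, $inner, $value) = @$rec;
    $by_outer{$outer}{$inner} = $value;   # primary tree: outer -> inner
    $by_inner{$inner}{$outer} = $value;   # second tree:  inner -> outer
}

# "Wildcard" on the outer level: one hash lookup in the second tree
# instead of looping over every key of %by_outer.
my @hits = sort keys %{ $by_inner{100} || {} };
print "@hits\n";   # prints "chr1 chr2"
```

The cost is that every value is stored twice and both trees have to be kept in sync whenever you insert or delete.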

It looks like memory is probably not a concern, as you already read and store the complete contents of both files in memory, in addition to the hash. Still, if you convert your @var = (<FILE>) loops to while (<FILE>) { ... } loops, you can save a good deal of memory right now, for free.
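As a rough sketch of the line-at-a-time version (the in-memory "file" and the names $name, $count and %seen are stand-ins, not taken from your code):

```perl
use strict;
use warnings;

# In-memory stand-in for a real file, so the sketch runs on its own.
my $data = "a\t1\nb\t2\nc\t3\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my %seen;
while (my $line = <$fh>) {      # reads one line per iteration
    chomp $line;
    my ($name, $count) = split /\t/, $line;
    $seen{$name} = $count;      # keep only what is needed later
}
close $fh;

print join(',', map { "$_=$seen{$_}" } sort keys %seen), "\n";   # prints "a=1,b=2,c=3"
```

Only the current line and the hash are held in memory, rather than a full array of lines plus the hash.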

The usual way to accomplish something like this is to write your own hashing function. In Perl, this is roughly equivalent to passing your preliminary key through some sort of filter subroutine before you access it. You probably have done something like $names{lc($name)}++ without even thinking about it.

This has a small cost (depending on how complex your function is), but if your potential wildcard expansion is more than a handful of elements (or even countably infinite...), it's a huge win.
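Here is a minimal sketch of such a filter. The key format "name:position:strand" and the sub name normalize_key are assumptions made up for the example. The point is only that every store and every lookup goes through the same subroutine, so the ignored field behaves like a wildcard without any expansion.

```perl
use strict;
use warnings;

# Hypothetical filter: keys look like "name:position:strand", and the
# strand field should be ignored (treated as a wildcard) on lookup.
sub normalize_key {
    my ($raw) = @_;
    my ($name, $pos) = split /:/, $raw;   # the third field is dropped
    return lc($name) . ':' . $pos;
}

my %count;
$count{ normalize_key($_) }++ for qw(Chr1:100:+ chr1:100:- chr2:50:+);

# Any strand (and any letter case) maps to the same stored key.
my $n = $count{ normalize_key('CHR1:100:*') };
print "$n\n";   # prints "2"
```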

By the way, your code was a bit hard to follow with the 200+ character lines, and would have made more sense if you had named the columns like so:

my ($foo, $bar, $baz, $qux) = split /\t/;

(Of course replacing those names with whatever your columns should be called.)