Sorry for my terseness yesterday; that bloody "real world" kept interfering with my PerlMonks time :-)
Here's a breakdown of what I came up with, with the change that we now split the data only once.
# create a "cache": "is this a NETID?"
# (if I understand your variables correctly...)
#
# map takes in the list of values of %totalnetname, which
# I am "assuming" are all of the "interesting" net ID's.
# In order to look them up quickly, we build a hash by
# emitting (from inside the {}) key=>value pairs of the
# form $id => 1.
#
# If there were "uninteresting" ID's, now would be the
# ideal time (IMHO) to "filter" them out. This could be
# most easily done by changing the BLOCK to something like
# { &is_it_interesting($_) ? ( $_ => 1 ) : () }
#
# Better yet, never put in the "uninteresting" values
# (via %totalnetname) to begin with...
#
# (As a guess: is %totalnetname perhaps a tied DB_File?
# That would probably make editing it directly unwise in
# this situation. If it's not, and the NETIDs are fairly
# constant, putting it into a DB_File would likely be a
# good idea, YMMV/TIMTOWTDI/standard disclaimers...)
#
my %ids = map { $_ => 1 } values %totalnetname;
#
# Clear out the 'output' hash, declaring it lexically
# (lexically ~~ "my")
#
my %ECLDATA = ();
#
# This brace starts a lexical scope so that our messy
# temporary variables are garbage-collected when we're
# done with them. There's likely a way to do this without
# temporary variables -- it seems that there always is --
# but that will have to be [merlyn]'s problem to right ;-)
#
{ # lexical scope for leftovers
#
# Where do bad records go? If the dataset is known (or at
# least expected) to contain only NETID|data... records,
# there's no need to have a @leftovers array.
#
my @leftovers = ();
#
# This is almost a 'for my $entry (@ECL_STAT),' but it
# isn't. Looking at it again today, I don't remember why
# it isn't, so it probably could be, or even should be,
# if only for readability.
#
# UPDATED: It's not, because, of course, shift removes its
# value from @ECL_STAT, leaving it empty for @leftovers.
# This is largely unimportant, since @ECL_STAT will be
# clobbered afterwards anyway, but it might decrease memory
# usage somewhat, since each record is in RAM only about
# once at a time.
#
# (defined() keeps a false record such as "0" or "" from
# ending the loop early.)
while (defined(my $entry = shift @ECL_STAT))
{
# ignore non-|-delimited lines
#
# This 'rejects' any lines which don't contain |'s.
# Reading 'inner-first:'
# 1. split $entry on | (note my typo yesterday of $_!)
# 2. assign (as a list) to $netid and @stuff
# 3. a list assignment evaluated in scalar context (here,
# the > comparison) returns the number of items on its
# right-hand side, so the 'unless' is testing a number
# >= 1 for any non-empty $entry (split returns $entry
# itself if there are no |'s). The missing >1 was
# another hasty mistake of mine yesterday :-P
# 4. go through the loop again if we got 1 (or 0, which
# is what split returns for an empty string)
next unless
(my ($netid, @stuff) = split /\|/, $entry) > 1;
# push lines which begin with any NETID
#
# This stores a reference to a new list in either
# $ECLDATA{$netid} (if the $netid we got, above, is in
# %ids) or @leftovers (if not). This could be simplified
# to push @{ $ECLDATA{$netid} }, [ $netid, @stuff ];
# if every possible netid is "interesting." (This would
# also remove the my %ids = map... above)
#
# NB. that @{ } is the 'array dereference' operator,
# which just means 'give me the array which this is a
# reference to, rather than the reference.' I read it
# aloud as "the list/array referred to by ..."
#
# The use of references is due to the fact that an
# array cannot be (directly) the 'value' of a hash
# element.
#
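# (push needs a real array as its first argument, so the
# ternary picks an array reference, which @{ } then
# dereferences.)
push @{ $ids{$netid}
? ( $ECLDATA{$netid} ||= [] )
: \@leftovers }, [ $netid, @stuff ];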
}
#
# For compatibility with the original script, we push
# back 'leftovers' into @ECL_STAT; this might be
# unnecessary.
#
@ECL_STAT = @leftovers;
#
# At this point, @leftovers goes out of scope and is
# removed from memory.
#
}
#
# This routine doesn't need to split the data, or any of
# that, because it's picking up its data from the %ECLDATA
# hash.
#
for my $key (@totalkeys)
{
my $NETID = $totalnetname{$key};
#
# Don't try to do any work if no data was found.
#
next unless exists $ECLDATA{$NETID};
#
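# Each element of @{ $ECLDATA{$NETID} } is a reference to
# one [ $netid, @stuff ] record, so the various 'field'
# variables you might have could be assigned per record, eg.
# for my $rec (@{ $ECLDATA{$NETID} }) {
# my ($netid, $name, $address, $city) = @$rec;
# }
#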
my @ECLDATA = @{ $ECLDATA{$NETID} };
# ... lots, as before.
}
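If you want to try the whole approach end to end, here is a minimal self-contained sketch. The values in %totalnetname, @totalkeys and @ECL_STAT are made up purely for illustration (the real ones come from your script), and I've folded the leftovers handling into a single scope for brevity. It uses defined() in the loop test and picks the target array by reference before pushing; treat it as one way to do it, not the only one.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real %totalnetname, @totalkeys and
# @ECL_STAT come from the original script.
my %totalnetname = ( alpha => 'N1', beta => 'N2', gamma => 'N3' );
my @totalkeys    = sort keys %totalnetname;
my @ECL_STAT     = (
    'N1|up|10',
    'no delimiter here',    # rejected: split yields a single field
    'N2|down|20',
    'N9|up|30',             # unknown NETID: ends up in @leftovers
    '0',                    # false but defined: must not end the loop
    'N1|down|40',
);

# Lookup cache: "is this a NETID we care about?"
my %ids = map { $_ => 1 } values %totalnetname;

my %ECLDATA;
my @leftovers;

while ( defined( my $entry = shift @ECL_STAT ) ) {
    # A list assignment in scalar context yields the number of
    # fields that split returned.
    next unless ( my ( $netid, @stuff ) = split /\|/, $entry ) > 1;

    # Choose the target array by reference, then dereference it
    # for push.
    my $target = $ids{$netid} ? ( $ECLDATA{$netid} ||= [] ) : \@leftovers;
    push @$target, [ $netid, @stuff ];
}

# Second pass: no splitting needed, the records are already grouped.
for my $key (@totalkeys) {
    my $NETID = $totalnetname{$key};
    next unless exists $ECLDATA{$NETID};
    for my $record ( @{ $ECLDATA{$NETID} } ) {
        print join( ',', @$record ), "\n";
    }
}
```

With the sample data above this prints N1,up,10 and N1,down,40, then N2,down,20; the N9 record is the only one left in @leftovers, and nothing is stored for N3.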
Hopefully, that's a bit easier to understand. Good luck in your efforts!