PerlMonks  

Sorry for my terseness yesterday, that bloody "real world" kept interfering with my PerlMonks time :-)

Here's a breakdown of what I came up with, with the change that we now split the data only once.

    # create a "cache": "is this a NETID?"
    # (if I understand your variables correctly...)
    #
    # map takes in the list of values of %totalnetname, which
    # I am "assuming" are all of the "interesting" net IDs.
    # In order to look them up quickly, we build a hash by
    # emitting (from inside the {}) key=>value pairs of the
    # form $id => 1.
    #
    # If there were "uninteresting" IDs, now would be the
    # ideal time (IMHO) to "filter" them out. This could be
    # most easily done by changing the BLOCK to something like
    #   { &is_it_interesting($_) ? ( $_ => 1 ) : () }
    #
    # Better yet, never put in the "uninteresting" values
    # (via %totalnetname) to begin with...
    #
    # (As a guess: is %totalnetname perhaps a tied DB_File?
    # That would probably make editing it directly unwise in
    # this situation. If it's not, and the NETIDs are fairly
    # constant, putting it into a DB_File would likely be a
    # good idea, YMMV/TIMTOWTDI/standard disclaimers...)
    #
    my %ids = map { $_ => 1 } values %totalnetname;

    #
    # Clear out the 'output' hash, declaring it lexically
    # (lexically ~~ "my")
    #
    my %ECLDATA = ();

    #
    # This brace starts a lexical scope so that our messy
    # temporary variables are garbage-collected when we're
    # done with them. There's likely a way to do this without
    # temporary variables -- it seems that there always is --
    # but that will have to be [merlyn]'s problem to right ;-)
    #
    { # lexical scope for leftovers

        #
        # Where do bad records go? If the dataset is known (or at
        # least expected) to contain only NETID|data... records,
        # there's no need to have a @leftovers array.
        #
        my @leftovers = ();

        #
        # This is almost a 'for my $entry (@ECL_STAT),' but it
        # isn't. Looking at it again today, I don't remember why
        # it isn't, so it probably could be, or even should be,
        # if only for readability.
        #
        # UPDATED: It's not, because, of course, shift removes its
        # value from @ECL_STAT, leaving it empty for @leftovers.
        # This is marginally unimportant, since @ECL_STAT will be
        # clobbered afterwards anyway, but might decrease memory
        # usage somewhat since each record is in RAM only ~~ once
        # at a time.
        #
        # The defined() is there so that an empty or "0" record
        # doesn't end the loop early.
        #
        while (defined(my $entry = shift @ECL_STAT)) {

            # ignore non-|-delimited lines
            #
            # This 'rejects' any lines which don't contain |'s.
            # Reading 'inner-first:'
            #  1. split $entry on | (note my typo yesterday of $_!)
            #  2. assign (in list context) to $netid and @stuff
            #  3. the list assignment, evaluated in scalar context
            #     by the >, returns the number of items split
            #     produced, so the 'unless' is comparing a number
            #     >= 1 (since split will return $entry if we have
            #     no |'s). The missing >1 was another hasty
            #     mistake of mine yesterday :-P
            #  4. go through the loop again if we got 1 (or 0, for
            #     an empty line)
            next unless (my ($netid, @stuff) = split /\|/, $entry) > 1;

            # push lines which begin with any NETID
            #
            # This stores a reference to a new list in either
            # $ECLDATA{$netid} (if the $netid we got, above, is in
            # %ids) or @leftovers (if not). This could be simplified
            # to
            #   push @{ $ECLDATA{$netid} }, [ $netid, @stuff ];
            # if every possible netid is "interesting." (This would
            # also remove the my %ids = map... above)
            #
            # NB. that @{ } is the 'array dereference' operator,
            # which just means 'give me the array which this is a
            # reference to, rather than the reference.' I read it
            # aloud as "the list/array referred to by ..."
            #
            # The use of references is due to the fact that an
            # array cannot be (directly) the 'value' of a hash
            # element.
            #
            # push wants a real array as its first argument, so the
            # ?: picks an array *reference* and @{ } dereferences
            # whichever one it chose.
            #
            push @{ $ids{$netid} ? ( $ECLDATA{$netid} ||= [] ) : \@leftovers },
                 [ $netid, @stuff ];
        }

        #
        # For compatibility with the original script, we push
        # back 'leftovers' into @ECL_STAT; this might be
        # unnecessary. (Note that the leftovers are now array
        # references, not the original |-delimited strings.)
        #
        @ECL_STAT = @leftovers;

        #
        # At this point, @leftovers goes out of scope and is
        # removed from memory.
        #
    }

    #
    # This routine doesn't need to split the data, or any of
    # that, because it's picking up its data from the %ECLDATA
    # hash.
    #
    for my $key (@totalkeys) {
        my $NETID = $totalnetname{$key};

        #
        # Don't try to do any work if no data was found.
        #
        next unless exists $ECLDATA{$NETID};

        #
        # Each element of @{ $ECLDATA{$NETID} } is a reference to
        # one record, [ $netid, @stuff ], so the various 'field'
        # variables you might have can be filled per record, eg.
        #   my ($id, $name, $address, $city) = @$record;
        #
        my @ECLDATA = @{ $ECLDATA{$NETID} };

        # ... lots, as before.
    }
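If you want to try the idea without the rest of the script, here is a minimal, self-contained sketch of the same single-pass bucketing. The sample contents of %totalnetname and @ECL_STAT (and the names site_a, N100, and so on) are made up for illustration; your real data will certainly look different. The @report array is likewise only there so there is something to print.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data (an assumption, not the original script's).
my %totalnetname = ( site_a => 'N100', site_b => 'N200' );
my @totalkeys    = sort keys %totalnetname;
my @ECL_STAT     = (
    'N100|alpha|1',
    'N200|beta|2',
    'N100|gamma|3',
    'N999|unknown|4',       # NETID not listed in %totalnetname
    'no delimiter here',    # rejected: contains no | at all
);

# Lookup hash: "is this a NETID we care about?"
my %ids = map { $_ => 1 } values %totalnetname;

my %ECLDATA;
my @leftovers;

# Single pass: split each record once and file it under its NETID.
while ( defined( my $entry = shift @ECL_STAT ) ) {

    # The list assignment in scalar context counts what split
    # returned, so a line without any | (count of 1) is skipped.
    next unless ( my ( $netid, @stuff ) = split /\|/, $entry ) > 1;

    # Pick the destination array by reference, then push onto it.
    my $bucket = $ids{$netid} ? ( $ECLDATA{$netid} ||= [] ) : \@leftovers;
    push @$bucket, [ $netid, @stuff ];
}

# Per-key work: a hash lookup, with no further splitting or grepping.
my @report;
for my $key (@totalkeys) {
    my $NETID = $totalnetname{$key};
    next unless exists $ECLDATA{$NETID};

    for my $record ( @{ $ECLDATA{$NETID} } ) {
        my ( $id, @fields ) = @$record;
        push @report, "$key $id: @fields";
    }
}

print "$_\n" for @report;
```

With the sample data above, the two N100 records end up under site_a, the N200 record under site_b, the N999 record in @leftovers, and the line without a | is dropped. Each record is split exactly once, no matter how many keys are in @totalkeys.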

Hopefully, that's a bit easier to understand. Good luck in your efforts!


In reply to Ra: Grep Effeciency by baku
in thread Grep Effeciency by ImpalaSS
