PerlMonks  


The basic answer is to make a hash from the relationships in input1, and use that to parse and process the information you need from input2. If I understand your problem, in this case I would probably create a hash of arrays, keyed on the values from column4, so I'd have something like this:

my %hoa = (
    'frog-n'      => ['alligator-n'],
    'crocodile-n' => ['alligator-n'],
);

(I'd use a hash of arrays instead of a simple hash because I assume other values from column1 could have a relationship with 'frog-n'. If that's not true, then this could be a simple hash.) Even if input1 is 4GB, since you're only interested in parts of certain lines, your hash may be much smaller.
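A minimal sketch of that first pass, reading line by line so the whole file never has to sit in memory. The column layout is an assumption (whitespace-separated, related word in column 1, lookup word in column 4), the sample lines are made up, and an in-memory filehandle stands in for the real input1:

```perl
use strict;
use warnings;

# Made-up sample standing in for input1. Assumption: whitespace-separated
# columns, with the related word in column 1 and the lookup word in column 4.
my $input1 = <<'END';
alligator-n x y frog-n
alligator-n x y crocodile-n
lizard-n x y frog-n
END

# Read from an in-memory filehandle; swap in the real file name for \$input1.
open my $in, '<', \$input1 or die "Cannot open input1: $!";

my %hoa;
while ( my $line = <$in> ) {
    my @cols = split ' ', $line;
    next unless @cols >= 4;                  # skip blank or short lines
    push @{ $hoa{ $cols[3] } }, $cols[0];    # column4 => [ column1, ... ]
}
close $in;
```

Only columns 1 and 4 are kept, which is why the hash can stay much smaller than the file it was built from.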

Then I'd start going through input2, building a new multilevel hash based on the array elements from %hoa, with sub-keys from the new file, so I would be assigning values like this:

# from the first line: frog-n about adage-n 8.8016
for my $key ( @{ $hoa{'frog-n'} } ) {
    $newhash{$key}{about}{'adage-n'} += 8.8016;
}

That will sum up repeated patterns as it goes, and it won't matter whether they are consecutive. When it's done, go through that second hash and print it out in whatever format you like. There are still details to work out (for example, if you really want the individual values displayed next to their sum, you may want to store them in an array and sum them in the last step), but that's the basic structure.
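Here is a rough sketch of the second pass using the store-as-array variant, so the individual values can be shown next to their sum. The four-column layout of input2 is inferred from the one sample line; every other input line, the %scores name, and the output format are my own assumptions:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Lookup built from input1, same shape as %hoa above.
my %hoa = (
    'frog-n'      => ['alligator-n'],
    'crocodile-n' => ['alligator-n'],
);

# Assumption: input2 has four whitespace-separated columns
# (word, relation, feature, score). Only the first line is from the thread.
my $input2 = <<'END';
frog-n about adage-n 8.8016
crocodile-n about adage-n 1.5
toad-n about adage-n 3.0
END

open my $in, '<', \$input2 or die "Cannot open input2: $!";
my %scores;
while ( my $line = <$in> ) {
    my ( $word, $rel, $feat, $score ) = split ' ', $line;
    next unless defined $score && exists $hoa{$word};   # ignore unrelated words
    for my $key ( @{ $hoa{$word} } ) {
        push @{ $scores{$key}{$rel}{$feat} }, $score;   # keep each value
    }
}
close $in;

# Last step: walk the nested hash, summing each stored list.
my @report;
for my $key ( sort keys %scores ) {
    for my $rel ( sort keys %{ $scores{$key} } ) {
        for my $feat ( sort keys %{ $scores{$key}{$rel} } ) {
            my @parts = @{ $scores{$key}{$rel}{$feat} };
            push @report, sprintf '%s %s %s %s (%s)',
                $key, $rel, $feat, sum(@parts), join ' + ', @parts;
        }
    }
}
print "$_\n" for @report;
```

If you don't need the individual values, replace the push with `+=` on a scalar and drop the `sum` call.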

Aaron B.
My Woefully Neglected Blog, where I occasionally mention Perl.


In reply to Re: Select only desired features from a text by aaron_baugher
in thread Select only desired features from a text by remluvr
