http://www.perlmonks.org?node_id=1029807

devi has asked for the wisdom of the Perl Monks concerning the following question:

I have a large data set that has to be processed. When I tried a small data set, an awk one-liner seemed to work; with the full data, however, I run into a memory problem and the job is getting killed. How do I include it in a Perl script for qsub?

Replies are listed 'Best First'.
Re: calling awk one liner from perl
by igelkott (Priest) on Apr 22, 2013 at 08:23 UTC

    Consider using a2p to help "upgrade" your awk command to Perl. Actually, I'd convert this by hand, but a2p might help you get a first approximation.
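    By hand, the translation is usually direct. As an illustration (the OP never posted the actual one-liner, so the column-summing command here is a made-up stand-in), `awk '{ sum += $2 } END { print sum }'` becomes:

```perl
use strict;
use warnings;

# Sum the second whitespace-separated column, reading line by line
# so the whole file is never held in memory (just as awk does).
sub sum_second_column {
    my ($fh) = @_;
    my $sum = 0;
    while ( my $line = <$fh> ) {
        my @fields = split ' ', $line;
        $sum += $fields[1] // 0;
    }
    return $sum;
}
```

    a2p would produce something in this spirit, though its output usually needs hand cleanup afterwards.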

Re: calling awk one liner from perl
by 2teez (Vicar) on Apr 22, 2013 at 07:12 UTC

    Why would you want to call an awk one-liner from Perl? Have you tried solving the problem in Perl at all?
    However, if you insist, you can check the Perl function system and its several cousins...
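    A sketch of those cousins; `echo hello` stands in for whatever external command you would run, since the OP's awk command was never posted:

```perl
use strict;
use warnings;

# system(): run a command, get only its exit status back.
system( 'echo', 'hello' ) == 0
    or die "command failed with status $?";

# Backticks: capture the command's entire output at once.
my $captured = `echo hello`;
chomp $captured;

# Pipe-open: stream the command's output line by line --
# the memory-friendly choice when the output is large.
open my $pipe, '-|', 'echo', 'hello'
    or die "cannot start command: $!";
while ( my $line = <$pipe> ) {
    chomp $line;
    # ...process $line here...
}
close $pipe;
```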

    If you tell me, I'll forget.
    If you show me, I'll remember.
    If you involve me, I'll understand.
    --- Author unknown to me
Re: calling awk one liner from perl
by Anonymous Monk on Apr 22, 2013 at 08:34 UTC
Re: calling awk one liner from perl
by hdb (Monsignor) on Apr 22, 2013 at 07:30 UTC

    If your awk command runs out of memory, calling it from Perl will not help; it will run out of memory just the same. You probably need to translate it into Perl and then subdivide the work into smaller, less memory-intensive pieces.

    As always, people will be helpful if you provide details and show what you have tried so far.
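    One way to subdivide, sketched under assumptions (the batch size and the per-batch work are made up, since we do not know what the one-liner computes):

```perl
use strict;
use warnings;

use constant BATCH_SIZE => 10_000;    # assumed; tune to available memory

# Placeholder per-batch work: here it just counts the lines.
sub process_batch {
    my ($batch) = @_;
    return scalar @$batch;
}

# Read the input in fixed-size batches so that no more than
# BATCH_SIZE lines are ever held in memory at once.
sub process_in_batches {
    my ($fh) = @_;
    my $total = 0;
    my @batch;
    while ( my $line = <$fh> ) {
        push @batch, $line;
        if ( @batch >= BATCH_SIZE ) {
            $total += process_batch( \@batch );
            @batch = ();
        }
    }
    $total += process_batch( \@batch ) if @batch;
    return $total;
}
```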

Re: calling awk one liner from perl
by Rahul6990 (Beadle) on Apr 22, 2013 at 07:12 UTC
    Hi vroom, sed and awk cannot handle large amounts of data, because they try to load the entire file into memory at once and then apply the changes. Even if you call an awk command from a Perl script, the same thing happens.

    But here is an example that you can try:
    my $command = '<awk command>';    # the one-liner goes here
    system($command) == 0 or die "awk failed: $?";
      ... because they try to load the entire file into memory at once and then apply the changes.

      No, they don't. Perhaps you're saving too much data into an associative array? Anyway, that's enough guessing games w/o any actual examples.
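      To make that guess concrete: even when the file streams through line by line, a per-key tally keeps one hash entry for every distinct key seen, so memory grows with key cardinality rather than file size (the first-field layout here is an assumption):

```perl
use strict;
use warnings;

# Count occurrences of the first field.  The file itself streams
# through, but %count grows by one entry per DISTINCT key and is
# only freed at the end -- with millions of distinct keys, that
# hash is the memory hog, not the file.
sub count_by_key {
    my ($fh) = @_;
    my %count;
    while ( my $line = <$fh> ) {
        my ($key) = split ' ', $line;
        $count{$key}++ if defined $key;
    }
    return \%count;
}
```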

      sed and awk cannot handle large amounts of data, because they try to load the entire file into memory at once and then apply the changes

      That is quite some assertion. I'd be interested (and very surprised) to see any evidence which backs this up.

      ... sed and awk ... load the entire file

      No. Quite the converse. You talk about "sed" but apparently don't know that it is short for STREAM editor ... RTFM.