http://www.perlmonks.org?node_id=1077006



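So am I missing something in thinking the following? Read line by line and check whether the pattern HDR exists; if it does, cut off the header and write the rest of the line to the outfile. Write the lines that follow to the outfile as they are, and keep going until the pattern FTR is found; then drop everything from FTR to the end of the line and write what is left to the outfile. Would this not work?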

Re^4: header footer
by Kenosis (Priest) on Mar 04, 2014 at 23:34 UTC

    No, I don't think you're missing a thing, and it may look like:

    use strict;
    use warnings;

    while (<>) {
        s/^HDR.{47}|\KFTR.+//;
        print;
    }

    In fact, it may be faster than substr on the 3GB file, but I'm not sure. Pattern matching to remove the header seems just fine. However, if FTR occurs anywhere else in the record, the substitution will cut the record short at that point, which is something substr will not do.
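To make the FTR caveat concrete, here is a minimal sketch comparing the two approaches on a record whose data happens to contain "FTR". The sample record, the helper names, and the fixed footer length of 7 are all assumptions made up for illustration; the real footer layout is not given in this thread.

```perl
use strict;
use warnings;

# Hypothetical footer length ("FTR" plus 4 more characters); adjust to the real format.
my $FOOTER_LEN = 7;

# Pattern-based removal: cuts from the FIRST "FTR" on the line to the end.
sub strip_footer_regex {
    my ($record) = @_;
    $record =~ s/FTR.+//;
    return $record;
}

# Position-based removal: drops a fixed number of trailing characters,
# so an "FTR" inside the data is left alone.
sub strip_footer_substr {
    my ($record) = @_;
    return substr($record, 0, length($record) - $FOOTER_LEN);
}
```

With the made-up record "abcFTRdefFTR0001", strip_footer_regex returns "abc" (the data is truncated), while strip_footer_substr returns "abcFTRdef".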

Re^4: header footer
by graff (Chancellor) on Mar 04, 2014 at 23:34 UTC
    would this not work?

    Translate all that into perl code and see. If it doesn't work, post the perl code; if you don't know how to translate that, let us know where you're stuck.

    I think the code posted above by Kenosis (about 10 minutes before you posted this question) should be a pretty good start, if not the full answer. It sets the "input record separator" ($/) to "HDR" instead of newline, so each read returns one whole record rather than one line.

    On the first read, it'll just get "HDR", and output nothing. On each subsequent read, it will get a whole record (including the next occurrence of the string "HDR"), skip the first 47 characters (the rest of the header string), trim off the "FTR" and following text, and output just the remaining record content (including whatever line breaks it contains).
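The read-by-record behaviour described above can be sketched roughly as follows. This is not Kenosis's actual code, only an illustration under assumptions: the helper name strip_records is hypothetical, the header is taken to be "HDR" plus 47 more characters, and the footer is taken to start at the last "FTR" in each record.

```perl
use strict;
use warnings;

# Hypothetical helper: copy records from $in to $out, minus header and footer.
sub strip_records {
    my ($in, $out) = @_;
    local $/ = 'HDR';                      # each read ends at the next "HDR"
    while (my $chunk = <$in>) {
        chomp $chunk;                      # chomp removes the trailing "HDR" (the current $/)
        next if length($chunk) < 47;       # the first read yields only "HDR", so nothing is left
        my $body = substr($chunk, 47);     # skip the remaining 47 header characters
        my $ftr  = rindex($body, 'FTR');   # assumption: the footer starts at the LAST "FTR"
        $body = substr($body, 0, $ftr) if $ftr >= 0;
        print {$out} $body;                # line breaks inside the record are preserved
    }
    return;
}
```

It could be called as strip_records(\*STDIN, \*STDOUT). Because $/ is a plain string here, any "HDR" that appears inside the data would split a record early, which is the same kind of caveat as FTR appearing in the data.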