The record is arbitrary in length, yes. But like I said, the header and footer lengths are always 50 and 30. An individual record is only about 10 KB, but there are way too many records. The record size is not encoded in the header.
So am I missing something in thinking that:
read line by line
check if the pattern HDR exists - keep from the end of the header to the end of the line and > outfile
rest of lines > outfile
keep going until the pattern FTR is found - drop everything from FTR to the end of the line, and what precedes it > outfile
Would this not work?
In fact, it may be faster than substr on the 3GB file, but I am not sure. Pattern matching to remove the header seems just fine. However, if FTR exists anywhere else in the record, a substitution will mess up the record, which is something substr will not do.
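To make the FTR concern concrete, here is a minimal sketch comparing the two ways of trimming a single record that is already in memory. The sample record, the filler characters, and the helper names strip_by_length and strip_by_pattern are made up for illustration; it assumes the 50 and 30 lengths include the HDR and FTR markers themselves.

```perl
use strict;
use warnings;

# Hypothetical record: 50-char header, a body that happens to contain "FTR",
# and a 30-char footer. The filler characters are invented for this demo.
my $record = "HDR" . ("h" x 47) . "data FTR more data" . "FTR" . ("f" x 27);

# Fixed-length approach: skip 50 chars at the front, leave 30 off the end.
# Assumes the record is at least 80 characters long.
sub strip_by_length {
    my ($rec) = @_;
    return substr($rec, 50, -30);
}

# Pattern approach: remove the header, then cut from the FIRST "FTR" onward.
sub strip_by_pattern {
    my ($rec) = @_;
    (my $body = $rec) =~ s/^HDR.{47}//s;
    $body =~ s/FTR.*//s;
    return $body;
}

print "[", strip_by_length($record),  "]\n";   # [data FTR more data]
print "[", strip_by_pattern($record), "]\n";   # [data ]
```

The substitution stops at the first FTR it sees, so the body gets truncated, while substr never looks at the content at all.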
Translate all that into perl code and see. If it doesn't work, post the perl code; if you don't know how to translate that, let us know where you're stuck.
I think the code posted above by kenosis (about 10 minutes before you posted this question) should be a pretty good start, if not the full answer. It sets the "input record separator" ($/) to "HDR" instead of newline.
On the first read, it'll just get "HDR", and output nothing. On each subsequent read, it will get a whole record (including the next occurrence of the string "HDR"), skip the first 47 characters (the rest of the header string), trim off the "FTR" and following text, and output just the remaining record content (including whatever line breaks it contains).
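For reference, the behaviour described above could look roughly like the sketch below. This is not kenosis's code, just an illustration under assumed conditions: the file is a plain sequence of "HDR" plus 47 header characters, the body, then "FTR" plus 27 footer characters, and "HDR" never appears inside a body. The sub name strip_records and the sample data are made up. It uses rindex to find the last "FTR" in each chunk, so an "FTR" inside the body survives as long as the footer itself contains only the one marker.

```perl
use strict;
use warnings;

# Assumed layout (a guess based on this thread, not confirmed):
#   "HDR" + 47 header chars + body + "FTR" + 27 footer chars, repeated.
my $HEADER_REST = 47;    # 50-char header minus the "HDR" marker itself

# Copy record bodies from $in to $out, dropping each header and footer.
sub strip_records {
    my ($in, $out) = @_;
    local $/ = "HDR";                             # each read now ends at the next "HDR"
    while (my $chunk = <$in>) {
        chomp $chunk;                             # drop the trailing "HDR", if any
        next if length($chunk) <= $HEADER_REST;   # first read: nothing before the first "HDR"
        my $body = substr($chunk, $HEADER_REST);  # skip the rest of the header
        my $ftr  = rindex($body, "FTR");          # last "FTR" marks the footer
        $body = substr($body, 0, $ftr) if $ftr >= 0;
        print {$out} $body;
    }
}

# Tiny in-memory demo; for the real 3GB file, open handles on disk instead.
my $sample = "HDR" . ("h" x 47) . "line one\nline two\n" . "FTR" . ("f" x 27)
           . "HDR" . ("h" x 47) . "second record\n"      . "FTR" . ("f" x 27);
my $result = '';
open my $in,  '<', \$sample or die "cannot read sample: $!";
open my $out, '>', \$result or die "cannot write result: $!";
strip_records($in, $out);
close $out;
print $result;
```

Only one record (about 10 KB here) is held in memory at a time, so the size of the whole file should not matter much.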