http://www.perlmonks.org?node_id=188133


in reply to Re: Re: Extracting Header
in thread Extracting Header

I'm guessing that you want to "cut" the header lines out of the file, editing the file and leaving it with only the data in it. Here's the gradual evolution of something that will do that:

First off, we need a general plan. We want to read the file line by line and write a new file as we go; the trick is to skip any line that's a header line, so the new file ends up with only the data in it. I'm assuming here that data lines start with a digit and header lines don't. So here's something that does that:

#!/usr/bin/perl -w

open IN, "file.in" or die "file.in: $!";
open OUT, ">file.out" or die ">file.out: $!";

# copy only the lines that start with a digit (the data lines)
while (<IN>) {
    if ($_ =~ /^\d/) {
        print OUT $_;
    }
}

close IN;
close OUT;

Next step is to make this code a little classier. First off, we can use the -n and -i command line options to make this a lot shorter. The -n does the work of the while loop, and the -i does the work of making a new file and then replacing the original. Give -i a suffix such as .bak and it'll even keep a backup for you! Here's what it looks like using -ni, then:

#!/usr/bin/perl -ni.bak
if ($_ =~ /^\d/) {
    print $_;
}

...which can be condensed and made a lot more canonical:

perl -ni.bak -e 'print if /^\d/' file.dat

Voila!
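If you'd rather do the same in-place edit from inside a larger script, here's a minimal sketch using $^I, the variable behind -i. The temp directory, the sample contents, and the strip_header name are all made up for the demo, and it keeps the same assumption that data lines start with a digit:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a throwaway sample file; name and contents are invented for this demo.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/file.dat";
open my $fh, '>', $file or die "$file: $!";
print $fh "Name Value\n", "---- -----\n", "1 alpha\n", "2 beta\n";
close $fh or die "$file: $!";

# Keep only the lines that start with a digit, editing the file in place.
# strip_header is a hypothetical helper name, not part of any module.
sub strip_header {
    my ($path) = @_;
    # Setting $^I turns on in-place editing; '.bak' asks for a backup copy.
    local ($^I, @ARGV) = ('.bak', $path);
    while (<>) {
        print if /^\d/;    # while $^I is set, print goes to the rewritten file
    }
    return;
}

strip_header($file);
```

Afterwards the file holds just the two data lines, and the untouched original sits next to it with .bak on the end, same as with the one-liner.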

Update: Many thanks to ChemBoy. I said -ni in two places, then gradually drifted off to saying -pi. Oopsie.

perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'

Re: Re: Re: Re: Extracting Header
by ChemBoy (Priest) on Aug 07, 2002 at 04:02 UTC

    I think you meant -ni.bak, not -pi.bak, there. As it stands you'll print out every line in the file once, and the interesting ones twice, which is suboptimal. :-)

    As an alternative solution, if you know the number of lines in your header ahead of time, you can skip that number of lines using the flip-flop operator (..), thus:

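    #!/usr/bin/perl -w
    # I'll assume for the sake of variety that we want to process the information
    # in the file, not just delete the fluff
    my $header_lines = 3;
    while (<>) {
        # Only a literal constant is compared to $. implicitly; a variable on the
        # right of .. is just tested for truth, so spell the comparison out.
        next if 1 .. $. == $header_lines;
        chomp;
        my @data = split;
        &munge(@data);    # munge() stands in for your own processing routine
    }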

    Note that I used split to divide the line up, but if the data fields are fixed-width and could contain embedded spaces, you'd be better off using unpack to split the line up (if they're variable-width and could contain embedded spaces, I recommend a different data format).
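For the fixed-width case, here's a minimal sketch of the unpack approach. The column layout (12-character name, 5-character quantity, 8-character price) and the parse_fixed name are invented for illustration; you'd substitute the widths from your own file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical layout: 12-char name, 5-char quantity, 8-char price.
my $template = 'A12 A5 A8';

# parse_fixed is a made-up helper name for this sketch.
sub parse_fixed {
    my ($line) = @_;
    chomp $line;
    # 'A' strips trailing spaces from each field but keeps embedded ones.
    return unpack $template, $line;
}

# Build a sample record with sprintf so the column widths are exact.
my $record = sprintf "%-12s%-5s%-8s\n", 'blue widget', 12, '3.50';
my @fields = parse_fixed($record);
print join('|', @fields), "\n";    # prints "blue widget|12|3.50"
```

A plain split on the same record would hand back four fields, because "blue widget" contains a space; unpack goes by column position instead, so the name stays in one piece.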



    If God had meant us to fly, he would *never* have given us the railroads.
        --Michael Flanders