Re: Extracting Header

by kvale (Monsignor)
on Aug 06, 2002 at 18:49 UTC (#188118)

in reply to Extracting Header

I'm not sure what you mean by "cut out the header", but here is a way to ignore it. Suppose header lines start with an octothorpe (#):
open FILE, "<file.txt" or die "Couldn't open file.txt: $!";
while (<FILE>) {
    if (/^#/) {
        # ignore the header
    }
    else {
        # Do some processing
    }
}
If your goal is to remove the header from the file, the easiest approach is to copy from the original file to a new file, leaving out the header lines in the process.
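A minimal sketch of that copy approach, assuming (as above) that every header line starts with #. The sub name strip_header and the file names file.txt and file.new are made up for illustration. Note that any later line starting with # would be dropped as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy $src to $dst, dropping every line that begins with '#'.
# Returns the number of lines written to $dst.
sub strip_header {
    my ($src, $dst) = @_;
    open my $in,  '<', $src or die "Couldn't open $src: $!";
    open my $out, '>', $dst or die "Couldn't open $dst: $!";
    my $kept = 0;
    while (my $line = <$in>) {
        next if $line =~ /^#/;    # header line: leave it out of the copy
        print {$out} $line;
        $kept++;
    }
    close $in;
    close $out or die "Couldn't close $dst: $!";
    return $kept;
}

# Hypothetical file names; only runs if the input file is present.
strip_header('file.txt', 'file.new') if -e 'file.txt';
```

Once you are happy with the result, rename file.new over the original.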


Replies are listed 'Best First'.
Re: Re: Extracting Header
by raj8 (Sexton) on Aug 06, 2002 at 19:26 UTC
    For example:
    media media robot robot robot side/
    ------------------------------------------
    000381 8MM NONE  12 06/20/2002 16:44
    000443 8MM TL8 0 2 11/02/2001 13:12

      I'm guessing that you want to "cut" the header lines out of the file, editing the file and leaving it with only the data in it. Here's the gradual evolution of something that will do that:

      First off, we need to get a general plan. We want to read in the file line by line, and make a new file as we do so -- the trick is to not print the line if it's a header line. This means our new file will only have the data in it. So here's something that does that:

      #!/usr/bin/perl -w
      open IN, "" or die " $!";
      open OUT, ">file.out" or die ">file.out: $!";
      while (<IN>) {
          if ($_ =~ /^\d/) {
              print OUT $_;
          }
      }
      close IN;
      close OUT;

      Next step is to make this code a little classier. First off, we can use the -n and -i command line options to make this a lot shorter. The -n does the work of the while loop, and the -i does the work of making a new file, and then replacing the original. It'll even keep a backup for you! Here's what it looks like using -ni, then:

      #!/usr/bin/perl -ni.bak
      if ($_ =~ /^\d/) {
          print $_;
      }

      ...which can be condensed and made a lot more canonical:

      perl -ni.bak -e 'print if /^\d/' file.dat
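If it helps to see what -n and -i.bak are doing for you, here is a rough script-level equivalent. It is a sketch, not what perl literally generates. Setting $^I turns on in-place editing for files read through <>, and the while loop is the part that -n normally supplies. The sub name keep_data_lines is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Edit $file in place, keeping only the lines that start with a digit.
# The untouched original is left behind as "$file.bak".
sub keep_data_lines {
    my ($file) = @_;
    local $^I   = '.bak';     # what -i.bak sets: in-place edit, keep a backup
    local @ARGV = ($file);    # the files that <> will read
    while (<>) {              # the loop that -n wraps around your code
        print if /^\d/;       # print now goes to the rewritten file
    }
    return;
}

# Hypothetical file name; only runs if the file is present.
keep_data_lines('file.dat') if -e 'file.dat';
```

With -p instead of -n, the implicit loop would also print every line after your code runs, which is why the choice of switch matters here.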


      Update: Many thanks to ChemBoy. I said -ni in two places, then gradually drifted off to saying -pi. Oopsie.

      perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'

        I think you meant -ni.bak, not -pi.bak, there. As it stands you'll print out every line in the file once, and the interesting ones twice, which is suboptimal. :-)

        As an alternative solution, if you know the number of lines in your header ahead of time, you can skip that number of lines using the flip-flop operator (..), thus:

        #!/usr/bin/perl -w
        # I'll assume for the sake of variety that we want to process the information
        # in the file, not just delete the fluff
        my $header_lines = 3;
        while (<>) {
            # Only a constant operand is compared against $. implicitly,
            # so the variable end point needs an explicit test.
            next if 1 .. $. == $header_lines;
            chomp;
            my @data = split;
            &munge(@data);
        }

        Note that I used split to divide the line up, but if the data fields are fixed-width and could contain embedded spaces, you'd be better off using unpack to split the line up (if they're variable-width and could contain embedded spaces, I recommend a different data format).
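To make the unpack suggestion concrete, here is a small sketch. The column layout is purely an assumption for illustration (I don't know the real field widths in the file above), and parse_record is a made-up name. The point is that a field with an embedded space stays in one piece, where split would break it in two.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical fixed-width layout (an assumption, not taken from the thread):
#   columns  1-6    media ID
#   column   7      separator
#   columns  8-15   media type, may contain embedded spaces
#   column   16     separator
#   columns 17-26   date
my $template = 'A6 x A8 x A10';

sub parse_record {
    my ($line) = @_;
    chomp $line;
    # 'A' fields lose their trailing spaces on unpack; 'x' skips one byte.
    my ($id, $type, $date) = unpack $template, $line;
    return ($id, $type, $date);
}

# Made-up sample record that matches the layout above.
my @fields = parse_record("000381 8MM HC   06/20/2002\n");
print join('|', @fields), "\n";    # prints 000381|8MM HC|06/20/2002
```

A plain split on the same line would return four fields, because "8MM HC" contains a space.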

        If God had meant us to fly, he would *never* have given us the railroads.
            --Michael Flanders
