PerlMonks  

Re: Re: Extracting Header

by raj8 (Sexton)
on Aug 06, 2002 at 19:26 UTC


in reply to Re: Extracting Header
in thread Extracting Header

For Example:

media media robot robot robot side/
------------------------------------------
000381 8MM NONE 12 06/20/2002 16:44
000443 8MM TL8 0 2 11/02/2001 13:12


Re: Re: Re: Extracting Header
by Chmrr (Vicar) on Aug 06, 2002 at 19:35 UTC

    I'm guessing that you want to "cut" the header lines out of the file, editing the file and leaving it with only the data in it. Here's the gradual evolution of something that will do that:

    First off, we need to get a general plan. We want to read in the file line by line, and make a new file as we do so -- the trick is to not print the line if it's a header line. This means our new file will only have the data in it. So here's something that does that:

    #!/usr/bin/perl -w
    open IN, "file.in" or die "file.in: $!";
    open OUT, ">file.out" or die ">file.out: $!";
    # Copy only the data lines, i.e. those that start with a digit.
    while (<IN>) {
        if ($_ =~ /^\d/) {
            print OUT $_;
        }
    }
    close IN;
    close OUT;

    Next step is to make this code a little classier. First off, we can use the -n and -i command line options to make this a lot shorter. The -n does the work of the while loop, and the -i does the work of making a new file and then replacing the original. Give -i an extension (here, .bak) and it'll even keep a backup for you! Here's what it looks like using -ni, then:

    #!/usr/bin/perl -ni.bak
    if ($_ =~ /^\d/) {
        print $_;
    }

    ..Which can be condensed and made a lot more canonical:

    perl -ni.bak -e 'print if /^\d/' file.dat

    Voila!
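If you'd rather keep this inside a script than on the command line, here is a rough sketch of the same idea using $^I, the variable behind -i. The sub name strip_header_in_place is made up for illustration, and the /^\d/ test assumes, as above, that only data lines start with a digit.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the lines that start with a digit,
# rewriting $file in place and leaving the original in "$file.bak".
sub strip_header_in_place {
    my ($file) = @_;

    # Setting $^I turns on in-place editing, like -i.bak on the command line.
    # @ARGV tells the <> operator which file(s) to read.
    local ($^I, @ARGV) = ('.bak', $file);

    while (<>) {            # the loop that -n would otherwise supply
        print if /^\d/;     # print writes to the rewritten file
    }
    return;
}
```

Calling strip_header_in_place('file.dat') should leave file.dat with only the data lines and the untouched original in file.dat.bak.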

    Update: Many thanks to ChemBoy. I said -ni in two places, then gradually drifted off to saying -pi. Oopsie.

    perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'

      I think you meant -ni.bak, not -pi.bak, there. As it stands you'll print out every line in the file once, and the interesting ones twice, which is suboptimal. :-)
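To see the difference without touching a real file, here is a small sketch that mimics the two loops on an in-memory list of lines. The sub names are hypothetical, and the loops only approximate what the switches do: per perlrun, -p is the -n loop plus a print of $_ at the end of every iteration.

```perl
use strict;
use warnings;

# Roughly what "perl -ne 'print if /^\d/'" emits for the given lines.
sub filter_like_n {
    my @out;
    for (@_) {
        push @out, $_ if /^\d/;   # the explicit print
    }
    return @out;
}

# Roughly what "perl -pe 'print if /^\d/'" emits for the same lines.
sub filter_like_p {
    my @out;
    for (@_) {
        push @out, $_ if /^\d/;   # the explicit print
        push @out, $_;            # the implicit print that -p adds
    }
    return @out;
}
```

Fed one header line and one data line, filter_like_n returns just the data line, while filter_like_p returns the header once and the data line twice.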

      As an alternative solution, if you know the number of lines in your header ahead of time, you can skip that number of lines using the flip-flop operator (..), thus:

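      #!/usr/bin/perl -w
      # I'll assume for the sake of variety that we want to process the
      # information in the file, not just delete the fluff
      my $header_lines = 3;
      while (<>) {
          # Compare against $. explicitly: the flip-flop only does that
          # implicitly for constant operands, not for a variable.
          next if 1 .. ($. == $header_lines);
          chomp;
          my @data = split;
          &munge(@data);    # munge() stands in for your own processing
      }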

      Note that I used split to divide the line up, but if the data fields are fixed-width and could contain embedded spaces, you'd be better off using unpack to split the line up (if they're variable-width and could contain embedded spaces, I recommend a different data format).
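A minimal sketch of the unpack approach, assuming a made-up layout: an 8-character media ID, a 6-character media type, and free text for the rest of the line. The widths and the sub name parse_fixed are assumptions for illustration only; measure the real columns in your file before relying on them.

```perl
use strict;
use warnings;

# Hypothetical fixed-width layout:
#   A8 = media ID   (8 chars, trailing spaces stripped)
#   A6 = media type (6 chars, may contain embedded spaces)
#   A* = whatever is left on the line
sub parse_fixed {
    my ($line) = @_;
    return unpack 'A8 A6 A*', $line;
}
```

With that layout, parse_fixed("000443  TL 8  0 2") returns ("000443", "TL 8", "0 2"): the embedded space in the second field survives, which a plain split would not manage.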



      If God had meant us to fly, he would *never* have given us the railroads.
          --Michael Flanders
