PerlMonks  

scanning a text file and deleting lines that are double

by theroninwins (Friar)
on Sep 03, 2004 at 06:27 UTC (#388210=perlquestion)

theroninwins has asked for the wisdom of the Perl Monks concerning the following question:

Hey everyone.
My question is a simple one, but I somehow got stuck. I have a very large text file made up of single lines. The text is divided into blocks, each with an empty line above and below it. How can I compare each line with every other line in the same block, but not with lines in the other blocks of the file, and delete every line that appears two or three times in that block, leaving only the first occurrence? It shouldn't be that difficult, but somehow I can't get it to work. I hope you can help me with this one.

everyone is there
i am there
everyone is there <- should be deleted
everyone is there <- should be deleted

hello
everyone is there
what is is

Re: scanning a text file and deleting lines that are double
by ysth (Canon) on Sep 03, 2004 at 06:34 UTC
    Paragraph mode, and unique check (untested):
    $/ = ""; while (<IN>) { my %seen; print OUT grep !$seen{$_}++, split /^/m, $_; }

      Very snazzy.
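
For readers who want to try ysth's (untested) one-liner on the sample data from the question, here is a minimal sketch of the same idea as a standalone script. The sub name `dedupe_blocks` and the in-memory filehandles are assumptions made for illustration and are not part of the original reply, which reads from `IN` and writes to `OUT`. The `<- should be deleted` annotations from the question are left out of the sample input.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): copies $in to $out,
# dropping lines that repeat within the same blank-line-separated block.
sub dedupe_blocks {
    my ($in, $out) = @_;
    local $/ = "";                  # paragraph mode: each read returns one block
    while (my $block = <$in>) {
        my %seen;                   # fresh hash, so blocks are checked independently
        # split /^/m keeps each line's newline; grep keeps first occurrences only
        print {$out} grep { !$seen{$_}++ } split /^/m, $block;
    }
    return;
}

# Sample data from the question, read through an in-memory filehandle.
my $input = <<'END';
everyone is there
i am there
everyone is there
everyone is there

hello
everyone is there
what is is
END

open my $in, '<', \$input or die "cannot open input: $!";
my $result = '';
open my $out, '>', \$result or die "cannot open output: $!";
dedupe_blocks($in, $out);
close $out;

print $result;
```

The second block keeps its own "everyone is there" because `%seen` is declared inside the loop. Two caveats under these assumptions: paragraph mode collapses runs of several blank lines into a single separator, and a final line without a trailing newline will not compare equal to an earlier copy that has one.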
