Re: to avoid redundacy in a file

by amphiplex (Monk) on Jul 15, 2002 at 09:31 UTC


in reply to to avoid redundacy in a file

You could remove all duplicate lines with something like this:
my %seen; while (<>) { next if $seen{$_}; print; $seen{$_}++; }

---- amphiplex

Re^2: to avoid redundacy in a file
by tadman (Prior) on Jul 15, 2002 at 09:43 UTC
    To avoid redundancy in your code, you could do this:
    my %seen; while (<>) { next if ($seen{$_}++); print; }
    You can do it in one shot, so you might as well. Note that this code eliminates every duplicate line anywhere in the input, not just consecutive repeats. If you only want to ditch consecutive repeats, use this:
    my $last; while (<>) { next if ($_ eq $last); $last = $_; print; }
    Thus the lines "A A A B B B A A C C" become "A B A C", not "A B C" as with the previous snippet.
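To make the difference between the two snippets concrete, here is a minimal sketch that runs both strategies over the sample input quoted above. The array names (`@lines`, `@unique`, `@collapsed`) are made up for illustration, and the input is held in an array rather than read from `<>` so the example is self-contained.

```perl
use strict;
use warnings;

# Sample input from the post above, one item per "line".
my @lines = qw(A A A B B B A A C C);

# Strategy 1: drop every duplicate, wherever it occurs (the %seen approach).
my %seen;
my @unique = grep { !$seen{$_}++ } @lines;

# Strategy 2: drop only consecutive repeats (the $last approach).
my $last;
my @collapsed;
for my $line (@lines) {
    # The defined() check avoids an "uninitialized value" warning on the first line.
    next if defined $last && $line eq $last;
    push @collapsed, $line;
    $last = $line;
}

print "unique:    @unique\n";       # unique:    A B C
print "collapsed: @collapsed\n";    # collapsed: A B A C
```

The `%seen` version has to remember every distinct line it has read, while the `$last` version only remembers one line, so it may be the better fit for large input that is already sorted or grouped.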
      You can do it in one shot, so you might as well.
      That makes your second snippet
      my $prev; while (<>) { next if ($_ eq $prev); print $prev = $_; }
      :^)

      Wait, we can shorten that..
      my $prev; while (<>) { print $prev = $_ unless $_ eq $prev; }
      Hmm..
      my $prev; $_ ne $prev and print $prev = $_ while <>;
      Err.. sorry, got carried away for a second.. Perl is just too seductive. Sigh. :-)

      Makeshifts last the longest.

        I was going to condense it down to a single line:
        % perl -ne '$p ne$_&&print;$p=$_' ...
Re: Re: to avoid redundacy in a file
by Purdy (Hermit) on Jul 15, 2002 at 15:00 UTC
    This won't do exactly what the AM wants: some lines will be duplicates to the user, but not to Perl:
    $ more file.txt
    N AB TX NC
    AB N TX NC
    FOO BAR
    N AB TX NC
    $ perl test.pl file.txt
    N AB TX NC
    AB N TX NC
    FOO BAR

    The first two lines of file.txt are "the same" to the user, but not to your program. zejames' solution meets the AM's needs, as it creates a unique key for the hash, based on the AM's definition of a duplicate.
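One way to get a key that matches the user's idea of a duplicate is to normalize each line before using it as the hash key. The sketch below is not zejames' code (that post is not shown here). It assumes that two lines count as "the same" when they hold the same whitespace-separated fields in any order, which is a guess based on the sample file above. `canonical_key` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash key that ignores field order.
# Assumption: fields are separated by whitespace.
sub canonical_key {
    my ($line) = @_;
    my @fields = split ' ', $line;
    return join ' ', sort @fields;
}

# Sample data copied from file.txt above.
my @lines = ("N AB TX NC", "AB N TX NC", "FOO BAR", "N AB TX NC");

# Keep the first line seen for each canonical key.
my %seen;
my @kept = grep { !$seen{ canonical_key($_) }++ } @lines;

print "$_\n" for @kept;
# N AB TX NC
# FOO BAR
```

To run this against a real file, the same test could go inside a `while (<>)` loop, with `chomp` applied before building the key.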

    Jason
