You may already know that redundancy among DNA sequences does not require 100% conservation. Sequences that are not 100% identical may still be considered redundant, because the few differences between them may have no functional impact. For example, if the sequences code for a protein, mutations can leave the encoded protein unchanged thanks to the degeneracy of the genetic code. So by filtering only 100% identical sequences, you may be leaving behind other sequences that are, biologically speaking, redundant.
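To make the degeneracy point concrete, here is a minimal sketch in plain Perl. The two sequences and the `translate_dna` helper are made up for illustration, and the codon table is deliberately partial (only the codons used here), so treat it as a toy rather than a real translator.

```perl
use strict;
use warnings;

# Partial codon table: only the codons needed for this example.
my %codon_to_aa = (
    GCT => 'A', GCC => 'A', GCA => 'A', GCG => 'A',   # alanine
    CTT => 'L', CTC => 'L', CTA => 'L', CTG => 'L',   # leucine
    GGT => 'G', GGC => 'G', GGA => 'G', GGG => 'G',   # glycine
    ATG => 'M',                                       # methionine (start)
);

# Translate a DNA string codon by codon; codons missing from the
# partial table above are reported as 'X'.
sub translate_dna {
    my ($dna) = @_;
    my $protein = '';
    for (my $i = 0; $i + 3 <= length($dna); $i += 3) {
        my $codon = uc substr($dna, $i, 3);
        $protein .= $codon_to_aa{$codon} // 'X';
    }
    return $protein;
}

my $seq_a = 'ATGGCTCTTGGT';
my $seq_b = 'ATGGCCCTGGGA';   # differs from $seq_a at three third-codon positions

print "DNA identical:     ", ($seq_a eq $seq_b ? "yes" : "no"), "\n";
print "Protein identical: ",
    (translate_dna($seq_a) eq translate_dna($seq_b) ? "yes" : "no"), "\n";
```

An exact-match filter would keep both sequences, even though they encode the same peptide.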
My answer is similar to what choroba and Athanasius have suggested, with one modification: I list every ID that shares an identical sequence, and to make my life a bit easier I use BioPerl. You can then include just one ID in your analysis to represent each cluster of identical sequences.
    use strict;
    use warnings;
    use Bio::SeqIO;

    my %hash;    # sequence string => list of IDs sharing that sequence

    # Read the sequence file in FASTA format
    my $in = Bio::SeqIO->new(
        -file   => "sequences.fa",
        -format => "fasta",
    );

    # Collect the IDs of identical sequences under the same hash key
    while ( my $seq = $in->next_seq() ) {
        push @{ $hash{ $seq->seq } }, $seq->id;
    }

    # Print each group on its own line: group size, then the IDs.
    # Every group is printed, singletons included; change >= 1 to > 1
    # to list only the sequences that occur more than once.
    foreach my $key ( keys %hash ) {
        if ( scalar @{ $hash{$key} } >= 1 ) {
            print scalar @{ $hash{$key} }, "\t";
            print "@{$hash{$key}}", "\n";
        }
    }
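Picking one ID per cluster could look like the sketch below. It uses plain Perl on in-memory `[id, sequence]` pairs instead of a FASTA file so that it runs on its own; the `group_identical` and `representatives` helpers and the sample records are hypothetical, and taking the first ID seen as the representative is just one possible choice.

```perl
use strict;
use warnings;

# Group IDs by exact sequence string, remembering first-seen order.
# $records is an array ref of [id, sequence] pairs (assumed input shape).
# Returns an array ref of [sequence, [ids...]] entries.
sub group_identical {
    my ($records) = @_;
    my ( %ids_by_seq, @order );
    for my $rec (@$records) {
        my ( $id, $seq ) = @$rec;
        push @order, $seq unless exists $ids_by_seq{$seq};
        push @{ $ids_by_seq{$seq} }, $id;
    }
    return [ map { [ $_, $ids_by_seq{$_} ] } @order ];
}

# Keep the first ID of each group as its representative.
# Returns an array ref of [id, sequence] pairs.
sub representatives {
    my ($records) = @_;
    return [ map { [ $_->[1][0], $_->[0] ] } @{ group_identical($records) } ];
}

my @records = (
    [ seq1 => 'ACGT' ],
    [ seq2 => 'TTGA' ],
    [ seq3 => 'ACGT' ],    # same sequence as seq1
);

# Print the non-redundant set in FASTA format.
print ">$_->[0]\n$_->[1]\n" for @{ representatives( \@records ) };
```

Tracking first-seen order in a separate array keeps the output stable, which a bare `keys %hash` loop does not guarantee.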
If your dataset is really huge, you may want to cluster by sequence similarity rather than sequence identity: with a sensible similarity threshold you reduce redundancy further without losing much of the biological signal. There are routinely used tools you can explore for that purpose, for example CD-HIT-EST and UCLUST.
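To show what clustering around a similarity threshold means, here is a toy sketch in plain Perl. It is not how CD-HIT-EST or UCLUST are implemented: it assumes equal-length sequences, does no alignment, and simply compares each sequence to the first member of every existing cluster. The `identity` and `greedy_cluster` helpers, the sample records, and the 0.85 threshold are all made up for illustration.

```perl
use strict;
use warnings;

# Fraction of matching positions between two equal-length strings.
# Returns 0 for empty or different-length input, because no alignment
# is attempted in this toy version.
sub identity {
    my ( $s, $t ) = @_;
    return 0 if length($s) != length($t) || length($s) == 0;
    my $matches = 0;
    for my $i ( 0 .. length($s) - 1 ) {
        $matches++ if substr( $s, $i, 1 ) eq substr( $t, $i, 1 );
    }
    return $matches / length($s);
}

# Greedy clustering: each sequence joins the first cluster whose
# representative it matches at >= $threshold; otherwise it founds
# a new cluster and becomes that cluster's representative.
sub greedy_cluster {
    my ( $records, $threshold ) = @_;
    my @clusters;    # each element: { rep => sequence, ids => [ids...] }
    RECORD: for my $rec (@$records) {
        my ( $id, $seq ) = @$rec;
        for my $cluster (@clusters) {
            if ( identity( $seq, $cluster->{rep} ) >= $threshold ) {
                push @{ $cluster->{ids} }, $id;
                next RECORD;
            }
        }
        push @clusters, { rep => $seq, ids => [$id] };
    }
    return \@clusters;
}

my @records = (
    [ seq1 => 'ACGTACGTAC' ],
    [ seq2 => 'ACGTACGTAT' ],    # one mismatch against seq1
    [ seq3 => 'TTTTTTTTTT' ],
);

# Same output layout as above: cluster size, then the member IDs.
for my $cluster ( @{ greedy_cluster( \@records, 0.85 ) } ) {
    print scalar @{ $cluster->{ids} }, "\t@{ $cluster->{ids} }\n";
}
```

With this threshold seq1 and seq2 end up in one cluster and seq3 in its own. For real data the dedicated tools are the better choice, since they handle sequences of different lengths and scale to large inputs.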
UPDATE (09/09/2015): Predeclared the hash in response to the suggestion from Not_a_Number. I wrote the code off the top of my head without testing it, so I missed the variable declaration.