Re: Find duplicate values in hash

in reply to Find duplicate values in hash

I'd keep track of the keys while you're assigning to the definition hash. That is, keep a second hash, keyed by definition, that lists every key sharing that definition. There is probably a more efficient way to do this, but here's what came out of some random puttering around while I wait for my work script to finish:

#!/usr/bin/perl
use strict;
use warnings;

my (%hash, %dup_hash);
# Minor tweak to read from DATA rather than a file
while (my $line = <DATA>) {
    chomp($line);
    my ($enu, $deu) = split /\t/, $line;
    
    $hash{$enu} = $deu;

    # Group the keys by definition; any list longer than one is a duplicate
    push @{$dup_hash{$deu}}, $enu;
}

for my $key (keys %hash) {
    print "$key\n";
}
for my $value (values %hash) {
    print "$value\n";
}

print "\nDuplicate definitions:\n";
for my $deu (keys %dup_hash) {
    if (scalar @{$dup_hash{$deu}} > 1) {
        for my $en (@{$dup_hash{$deu}}) {
            print "$deu => $en\n";
        }
        print "\n";
    }
}


__DATA__
Retire a document	Dokument deaktivieren
Remove a document from the knowledge base	Dokument aus der Knowledge Base entfernen
Promote document retirement	Dokument deaktivieren
Document Expired	Dokument abgelaufen

Gives the output:

Remove a document from the knowledge base
Document Expired
Promote document retirement
Retire a document
Dokument aus der Knowledge Base entfernen
Dokument abgelaufen
Dokument deaktivieren
Dokument deaktivieren

Duplicate definitions:
Dokument deaktivieren => Retire a document
Dokument deaktivieren => Promote document retirement
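
If the hash has already been built and you can't hook into the assignment, the same grouping can be done after the fact by inverting it. This is only a rough sketch: the %keys_for and @dups names are mine, and the sample hash just reuses three of the entries from the data above. Sorting the keys is optional; it only makes the output order repeatable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data: an already-built key => definition hash
# (three entries borrowed from the DATA section above).
my %hash = (
    'Retire a document'           => 'Dokument deaktivieren',
    'Promote document retirement' => 'Dokument deaktivieren',
    'Document Expired'            => 'Dokument abgelaufen',
);

# Invert the hash: each definition maps to the list of keys that use it.
my %keys_for;
for my $enu (sort keys %hash) {
    push @{ $keys_for{ $hash{$enu} } }, $enu;
}

# Keep only the definitions shared by more than one key.
my @dups = grep { @{ $keys_for{$_} } > 1 } sort keys %keys_for;

for my $deu (@dups) {
    print "$deu => $_\n" for @{ $keys_for{$deu} };
}
```

With that sample hash it should print the two "Dokument deaktivieren" lines and nothing else. It costs one extra pass over the hash compared with tracking the keys while reading the input.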

Edit: Renamed some of the variables to accurately reflect their contents. Second edit: This is pretty much the same thing as what JavaFan has, just with a different syntactic approach.
