PerlMonks  

Re: Help with removing duplicates in array

by hdb (Monsignor)
on Mar 27, 2015 at 12:51 UTC ( [id://1121499] )


in reply to Help with removing duplicates in array

Can you try with

use strict; use warnings;

At a minimum it will tell you that, in the line where you use $#filtered, the underlying array @filtered is already out of scope. You will probably find more problems in your code.
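A minimal sketch (not the OP's code) of what strict catches here: a lexical (my) array exists only inside the block that declares it, and under strict an out-of-scope access becomes a hard compile error instead of a silent empty global:

```perl
use strict;
use warnings;

# Compile the same pattern as the OP's code inside an eval STRING so we
# can show the error strict produces:
my $code = 'use strict; { my @filtered = (1, 2, 3); } print $#filtered;';
eval $code;
print "strict caught it: $@";    # Global symbol "@filtered" requires explicit package name ...
```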

Replies are listed 'Best First'.
Re^2: Help with removing duplicates in array
by beanscake (Acolyte) on Mar 27, 2015 at 13:29 UTC
    Good, I have done that:
    use strict;
    use warnings;
    #use List::MoreUtils qw(uniq); # i have tried this to handle the duplicated emails
    use Data::Dumper qw(Dumper);

    my $Directory = $ARGV[0]; # Directory to scan for emails
    my $Filename  = $ARGV[1]; # where to write out found emails
                              # note: avoid forever loop, make sure it's not the same directory

    my $success  = "\n [+] $0 is Scanning For E-mails \n\n";
    my $tryagain = "\n [?] perl $0 Directory fileto.txt \n\n";

    if (@ARGV != 2) {
        print $tryagain;
        exit();
    }
    else {
        print $success;
    }

    sub uniq { # and this to handle the duplicated emails
        return keys %{ { map { $_ => 1 } @_ } };
    }

    #sub uniq { # with this to handle the duplicated emails
    #    my %seen;
    #    grep !$seen{$_}++, @_;
    #}

    my $total_filesscanned = 0;
    my $total_email        = 0;

    my @files = grep( -f, <$Directory*.txt*> ); # scanning directory

    open( my $fh, '>>', $Filename ) or die $!;

    foreach my $file (@files) {
        $total_filesscanned++; # begin to count number of files to be scanned
        open my $open, '<', $file or die $!;
        while (<$open>) {
            chomp;
            my @findemails = split(' ');
            my @filtered   = uniq(@findemails); # meant to avoid duplicates
            #my @filtered = join(" ", uniq(@findemails)); # also took this approach
            foreach my $emails (@filtered) {
                if ( $emails =~ /^\w+\@([\da-zA-Z\-]{1,}\.){1,}[\da-zA-Z-]{2,6}$/ ) { # grab the emails
                    $total_email++;           # begin to count emails
                    print $fh "$emails\n";    # write the emails to file
                }
            }
        }
        close $open; # close the input handle (was: close $file, a bug strict exposes)
        print "$file\n";
    }
    close $fh; # close the file to write

    #my $removed = @findemails - @filtered; # am expecting it to avoid duplicate emails
    #                                       # but both arrays are out of scope here

    print "Files Scanned: $total_filesscanned\n";
    print "E-mail Found: $total_email\n";
    #print "Filtered Total: $removed\n";
    print "done\n";
    The beginning of knowledge is the discovery of something we do not understand.
        Frank Herbert (1920 - 1986)

      And what do you get? Does it run w/o warnings? Do you still get duplicates? Are you aware that you only remove duplicates within the same $file, but that there could still be duplicates across files? Change your printing to

      print $fh "$file:$emails\n"; # write the emails to file

      to check which file each email was retrieved from.
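      To also remove duplicates across files, one common approach (a sketch, not the OP's code, with made-up addresses) is a single %seen hash that lives outside the per-file loop, so an address printed from one file is skipped in every later one:

```perl
use strict;
use warnings;

my %seen;    # shared across all files, so a repeated address is skipped everywhere

# Hypothetical data standing in for the @filtered lists from two scanned files:
my @file1 = qw(a@example.com b@example.com);
my @file2 = qw(b@example.com c@example.com);

my @unique;
for my $email (@file1, @file2) {
    push @unique, $email unless $seen{$email}++;    # keep only first sighting
}

print "@unique\n";    # a@example.com b@example.com c@example.com
```

      Inside the OP's loop, the same test ( next if $seen{$emails}++; ) before the print would have the same effect.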
