http://www.perlmonks.org?node_id=989655

diamondsandperls has asked for the wisdom of the Perl Monks concerning the following question:

My unless statement toward the bottom of the code simply is not working. Any help would be appreciated.

My goal is to process a given src address only once. After that I do not want to print a line for it again.

use strict;
use warnings;
use Cwd;
use LWP::UserAgent;
use DateTime;
use File::Slurp;

print "Type the malicious IP: ";
my $IP = <>;
chomp $IP;

#calculating times and dates
my $dt_now = DateTime->now;
$dt_now->subtract( hours => 5 );
my $now_Hour  = sprintf("%02d",$dt_now->hour_12());
my $now_Year  = $dt_now->year();
my $now_Month = sprintf("%02d",$dt_now->month());
my $now_Day   = sprintf("%02d",$dt_now->day());
my $now_Min   = sprintf("%02d",$dt_now->minute());
my $am_pm     = $dt_now->am_or_pm();

my $oldSSO = qx{whoami};
chomp $oldSSO;
my ($sso) = $oldSSO =~ /.*(\w{2}\d{5})/;

my $ua = new LWP::UserAgent;
my $response = $ua->get("http://referencedatasite/$sso");
my $content = $response->content;
my ($newcontent) = $content =~ /<geid>(\d+)/;

my @textfiles = <*.txt *.log>;
my $input_file;
my $input_fh;
my $src;
my $dst;
my @srcs;
my %seen;

my $output_file = "simon.csv";
open(my $output_fh, '>', $output_file) or die "Failed to open $output_file - $!";
print {$output_fh} "uploadfiles,submitter,description,SIP,DIP,Date_occurred_detected,Time_occurred_detected,Report_Severity,Incident_Type_Details\n";
close $output_fh;

foreach my $textfile (@textfiles) {
    if ($textfile =~ /(\d+.\d+.\d+.\d+)/) {
        my ($ipaddy) = $textfile =~ /(\d+.\d+.\d+.\d+)/;
        print "Processing $textfile\n";
        my @lines = read_file( $textfile ) ;
        open($output_fh, '>>', $output_file) or die "Failed to open $output_file - $!";
        foreach my $line (@lines) {
            ($src) = $line =~ /\d{4}-\d+-\d+\s\d{2}:\d{2}:\d{2}\s\d+\s(\d+.\d+.\d+.\d+)/;
            %seen = ();
            unless ($seen{$src}++) {
                if ($line =~ $IP) {
                    print {$output_fh} "$src.zip,$newcontent,Malicious activity found when mining proxylog data,$src,";
                    ($dst) = $line =~ /SG-HTTP-Service (\d+.\d+.\d+.\d+)/g;
                    print {$output_fh} "$dst,$now_Month/$now_Day/$now_Year,$now_Hour:$now_Min $am_pm,3,24\n";
                }
            }
        }
    }
}

Re: Parsing out uniques
by GrandFather (Saint) on Aug 25, 2012 at 00:21 UTC

    As a general rule, declare variables where they are first needed so that their scope is clear. Your code declares a bunch of variables outside the outer for loop, almost none of which are used anywhere except in the innermost for loop.

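    Your problem, however, is with the one variable that does need to be declared outside the outer for loop: %seen. Even though it is declared there, you reset it immediately before you test it, so $seen{$src}++ always returns a false value and the unless block runs for every line. The immediate fix is simply to remove %seen = ();.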
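To illustrate both points, here is a minimal sketch rather than a drop-in replacement for the original script. The sample log lines, the unique_sources helper, and the documentation-range IP addresses are all hypothetical, and the log layout is only guessed from the regexes in the question. %seen is declared once outside the loop and never reset, while $src and $dst are declared inside the loop where they are first used. The sketch also checks for the IP before marking a source as seen, which is an assumption about the intent: the original marks the source first.

```perl
use strict;
use warnings;

# Hypothetical sample lines, shaped to fit the regexes in the question;
# the real proxy log format is an assumption.
my @lines = (
    "2012-08-24 10:15:01 12 10.0.0.5 SG-HTTP-Service 203.0.113.9",
    "2012-08-24 10:15:07 48 10.0.0.5 SG-HTTP-Service 203.0.113.9",
    "2012-08-24 10:16:30 7 10.0.0.8 SG-HTTP-Service 203.0.113.9",
    "2012-08-24 10:17:02 3 10.0.0.8 SG-HTTP-Service 198.51.100.4",
);

# Hypothetical helper: returns one "src,dst" string for the first line
# of each source address that mentions $ip.
sub unique_sources {
    my ($ip, @log_lines) = @_;

    my %seen;      # declared once, outside the loop, and never reset inside it
    my @report;

    for my $line (@log_lines) {
        # Loop-local variables are declared where they are first needed.
        my ($src) = $line =~ /^\d{4}-\d+-\d+\s\d{2}:\d{2}:\d{2}\s\d+\s(\d+\.\d+\.\d+\.\d+)/;
        next unless defined $src;              # skip lines that do not parse
        next unless index($line, $ip) >= 0;    # literal match on the IP
        next if $seen{$src}++;                 # true from the second sighting on

        my ($dst) = $line =~ /SG-HTTP-Service (\d+\.\d+\.\d+\.\d+)/;
        push @report, join ',', $src, $dst // '';
    }
    return @report;
}

print "$_\n" for unique_sources('203.0.113.9', @lines);
```

With the sample data this prints one line for 10.0.0.5 and one for 10.0.0.8, even though each source appears twice. The dots in the patterns are escaped here so that they match only a literal "." rather than any character.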

    True laziness is hard work