Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris

Parsing out uniques

by diamondsandperls (Beadle)
on Aug 24, 2012 at 23:04 UTC ( #989655=perlquestion: print w/replies, xml ) Need Help??
diamondsandperls has asked for the wisdom of the Perl Monks concerning the following question:

My unless statement toward the bottom of the code simply is not working any help would be appreciated.

My goal is to only process a given src address once and not again I do not want to print this line again.

use strict; use warnings; use Cwd; use LWP::UserAgent; use DateTime; use File::Slurp; print "Type the malicious IP: "; my $IP = <>; chomp $IP; #calculating times and dates my $dt_now = DateTime->now; $dt_now->subtract( hours => 5 ); my $now_Hour = sprintf("%02d",$dt_now->hour_12()); my $now_Year = $dt_now->year(); my $now_Month = sprintf("%02d",$dt_now->month()); my $now_Day = sprintf("%02d",$dt_now->day()); my $now_Min = sprintf("%02d",$dt_now->minute()); my $am_pm = $dt_now->am_or_pm(); my $oldSSO = qx{whoami}; chomp $oldSSO; my ($sso) = $oldSSO =~ /.*(\w{2}\d{5})/; my $ua = new LWP::UserAgent; my $response = $ua->get("http://referencedatasite/$sso"); my $content = $response->content; my ($newcontent) = $content =~ /<geid>(\d+)/; my @textfiles = <*.txt *.log>; my $input_file; my $input_fh; my $src; my $dst; my @srcs; my %seen; my $output_file = "simon.csv"; open(my $output_fh, '>', $output_file) or die "Failed to open $output_file - $!"; print {$output_fh} "uploadfiles,submitter,description,SIP,DIP, +Date_occurred_detected,Time_occurred_detected,Report_Severity,Inciden +t_Type_Details\n"; close $output_fh; foreach my $textfile (@textfiles) { if ($textfile =~ /(\d+.\d+.\d+.\d+)/) { my ($ipaddy) = $textfile =~ /(\d+.\d+.\d+.\d+)/; print "Processing $textfile\n"; my @lines = read_file( $textfile ) ; open($output_fh, '>>', $output_file) or die "Failed to open $output_file - $!"; foreach my $line (@lines) { ($src) = $line =~ /\d{4}-\d+-\d+\s\d{2}:\d{2}:\d{2}\s\d+\s(\d+ +.\d+.\d+.\d+)/; %seen = (); unless ($seen{$src}++) { if ($line =~ $IP) { print {$output_fh} "$,$newcontent,Malicious + activity found when mining proxylog data,$src,"; ($dst) = $line =~ /SG-HTTP-Service (\d+.\d+.\d+.\d ++)/g; print {$output_fh} "$dst,$now_Month/$now_Day/$now_ +Year,$now_Hour:$now_Min $am_pm,3,24\n"; } } } } }

Replies are listed 'Best First'.
Re: Parsing out uniques
by GrandFather (Sage) on Aug 25, 2012 at 00:21 UTC

    As a general thing declare variables where they are first needed so their scope is clear. In your code you declare a bunch of variables outside the outer for loop, almost none of which are used except in the innermost for loop.

    Your problem however is with the one variable that does need to be global to the outer for loop. Even though declared there, you reset it immediately before you test it! The immediate fix is to simply remove %seen = ();.

    True laziness is hard work

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://989655]
Approved by GrandFather
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others examining the Monastery: (6)
As of 2018-06-22 21:34 GMT
Find Nodes?
    Voting Booth?
    Should cpanminus be part of the standard Perl release?

    Results (124 votes). Check out past polls.