
Parsing out uniques

by diamondsandperls (Beadle)
on Aug 24, 2012 at 23:04 UTC ( #989655=perlquestion )
diamondsandperls has asked for the wisdom of the Perl Monks concerning the following question:

The unless statement toward the bottom of my code is not working. Any help would be appreciated.

My goal is to process each src address only once; I do not want to print a line for the same address again.

use strict;
use warnings;
use Cwd;
use LWP::UserAgent;
use DateTime;
use File::Slurp;

print "Type the malicious IP: ";
my $IP = <>;
chomp $IP;

#calculating times and dates
my $dt_now = DateTime->now;
$dt_now->subtract( hours => 5 );
my $now_Hour  = sprintf("%02d",$dt_now->hour_12());
my $now_Year  = $dt_now->year();
my $now_Month = sprintf("%02d",$dt_now->month());
my $now_Day   = sprintf("%02d",$dt_now->day());
my $now_Min   = sprintf("%02d",$dt_now->minute());
my $am_pm     = $dt_now->am_or_pm();

my $oldSSO = qx{whoami};
chomp $oldSSO;
my ($sso) = $oldSSO =~ /.*(\w{2}\d{5})/;

my $ua = new LWP::UserAgent;
my $response = $ua->get("http://referencedatasite/$sso");
my $content = $response->content;
my ($newcontent) = $content =~ /<geid>(\d+)/;

my @textfiles = <*.txt *.log>;
my $input_file;
my $input_fh;
my $src;
my $dst;
my @srcs;
my %seen;
my $output_file = "simon.csv";

open(my $output_fh, '>', $output_file) or die "Failed to open $output_file - $!";
print {$output_fh} "uploadfiles,submitter,description,SIP,DIP,Date_occurred_detected,Time_occurred_detected,Report_Severity,Incident_Type_Details\n";
close $output_fh;

foreach my $textfile (@textfiles) {
    if ($textfile =~ /(\d+.\d+.\d+.\d+)/) {
        my ($ipaddy) = $textfile =~ /(\d+.\d+.\d+.\d+)/;
        print "Processing $textfile\n";
        my @lines = read_file( $textfile ) ;
        open($output_fh, '>>', $output_file) or die "Failed to open $output_file - $!";
        foreach my $line (@lines) {
            ($src) = $line =~ /\d{4}-\d+-\d+\s\d{2}:\d{2}:\d{2}\s\d+\s(\d+.\d+.\d+.\d+)/;
            %seen = ();
            unless ($seen{$src}++) {
                if ($line =~ $IP) {
                    print {$output_fh} "$,$newcontent,Malicious activity found when mining proxylog data,$src,";
                    ($dst) = $line =~ /SG-HTTP-Service (\d+.\d+.\d+.\d+)/g;
                    print {$output_fh} "$dst,$now_Month/$now_Day/$now_Year,$now_Hour:$now_Min $am_pm,3,24\n";
                }
            }
        }
    }
}

Replies are listed 'Best First'.
Re: Parsing out uniques
by GrandFather (Sage) on Aug 25, 2012 at 00:21 UTC

    As a general rule, declare variables where they are first needed so that their scope is clear. Your code declares a bunch of variables outside the outer for loop, almost none of which are used anywhere except in the innermost for loop.
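To illustrate the scoping advice, here is a minimal sketch, not taken from the original script. The helper name addresses and the simplified address pattern are assumptions made for the example. The capture variable is declared with my inside the loop body, so it is visible only where it is used.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the first dotted-quad address found on each line.
sub addresses {
    my @found;
    for my $line (@_) {
        # $addr is scoped to this loop body, so it cannot carry a stale
        # value over from a previous iteration.
        my ($addr) = $line =~ /(\d+\.\d+\.\d+\.\d+)/;
        push @found, $addr if defined $addr;
    }
    return @found;
}

print join(",", addresses("src 10.0.0.1", "no address here", "src 10.0.0.2")), "\n";
```

With use strict in effect, any use of $addr outside the loop is a compile-time error, which is what makes the narrow scope useful.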

    Your problem, however, is with the one variable that does need to be declared outside the outer for loop: %seen. Even though it is declared there, you reset it immediately before you test it, so $seen{$src}++ always finds an empty hash and the unless block runs for every line. The immediate fix is simply to remove %seen = ();.
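The effect of the fix can be sketched in isolation. This is a minimal example under stated assumptions: the function name first_line_per_src, the sample lines, and the simplified address pattern are all hypothetical and stand in for the proxy-log format in the question. The point is only that %seen is declared once and never cleared inside the loop.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the first line seen for each source address.
sub first_line_per_src {
    my @lines = @_;
    my %seen;    # declared once, outside the loop, and never reset inside it
    my @kept;
    for my $line (@lines) {
        my ($src) = $line =~ /(\d+\.\d+\.\d+\.\d+)/;
        next unless defined $src;    # skip lines with no address
        next if $seen{$src}++;       # true from the second occurrence onward
        push @kept, $line;
    }
    return @kept;
}

# Made-up sample data: 10.0.0.1 appears twice.
my @sample = (
    "10.0.0.1 GET /a",
    "10.0.0.2 GET /b",
    "10.0.0.1 GET /c",
);
print "$_\n" for first_line_per_src(@sample);
```

If a line such as %seen = (); were added inside the loop, every line would be kept, which is the behavior described in the question. In the original script the hash would also need to live outside the per-file loop if an address should be reported only once across all files.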

    True laziness is hard work
