PerlMonks  

array processing

by Anonymous Monk
on Dec 06, 2005 at 10:54 UTC (#514399)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

As a newcomer to Perl, I'm trying to figure out the best way of doing the following.
I have an error log file. Here's a sample:
    00:00000:00014:2005/11/30 10:01:23.77 server Configuration file '/opt/sybase/SERVER1.cfg' has been written and the previous version has been renamed to '/opt/sybase/SERVER1.030'.
    00:00000:00014:2005/11/30 10:01:23.80 server The configuration option 'log audit logon success' has been changed by 'sa' from '0' to '1'.
    00:00000:00015:2005/11/30 10:01:37.36 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
    00:00000:00016:2005/11/30 10:10:50.91 Logon Login succeeded. User: xyz, Client IP address: '169.123.26.124'.
    00:00000:00017:2005/11/30 16:31:31.02 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
    00:00000:00020:2005/11/30 16:51:11.90 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
    00:00000:00021:2005/12/01 09:49:23.44 Logon Login succeeded. User: abc, Client IP address: '169.123.26.124'.
    00:00000:00022:2005/12/01 09:49:23.90 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
    00:00000:00023:2005/12/01 09:51:29.65 kernel Cannot read, host process disconnected: SERVER1 spid: 23
    00:00000:00025:2005/12/01 09:52:27.74 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
    00:00000:00026:2005/12/01 09:55:24.06 Logon Login succeeded. User: qwr, Client IP address: '169.123.26.124'.
    00:00000:00027:2005/12/01 09:55:47.15 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
    00:00000:00026:2005/12/01 10:02:25.95 server Configuration file '/opt/sybase/SERVER1.cfg' has been written and the previous version has been renamed to '/opt/sybase/SERVER1.031'.
I need to extract the records that contain "Login succeeded", but only the user name from each record, e.g. sa, xyz, abc, qwr.

I actually just need one occurrence of each different user that has successfully logged in. In the above snippet sa has logged in several times, but all I need to know is that they have logged in at least once.
I started off by doing this:

    use strict;
    my $line;
    my @array;

    # Open the errorlog file
    open(errlog,"/opt/sybase/logs/errorlog_SERVER1") or die "Can't open errorlog file!";

    # Open the output file
    open(loginf,">/tmp/login.dat");

    while ($line = <errlog>) {
        chomp $line;
        next unless ($line =~ /Login succeeded/);
        push @array, $line;
    }

    foreach (@array) {
        print "$array[7]\n";
    }

    close errlog;
    close loginf;
This just prints the whole line out. I was under the impression that the array is naturally split by a space?
Anyhow, how do I get just one distinct occurrence of each user into another file?
Thanks

Re: array processing
by Tomte (Priest) on Dec 06, 2005 at 11:04 UTC

    Use a hash. Something like the following:

        # opening of files etc.
        my %users = ();
        while ($line = <errlog>) {
            chomp $line;
            next unless ($line =~ /Login succeeded/);
            $users{(split / /, $line)[6]} ||= 1;
        }
        foreach (sort keys %users) {
            print LOGINF $_, "\n";
        }
        # closing fhs etc
    This is untested.

    NB: consider using lexical filehandles.
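
For anyone unsure what "lexical filehandles" means in practice, here is a minimal sketch. The helper name login_lines is made up for illustration and is not from this thread; the idea is just a three-argument open into a my variable instead of a bareword handle such as errlog.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read a log through a lexical
# filehandle and return the lines that contain "Login succeeded".
sub login_lines {
    my ($path) = @_;

    # Three-argument open into a lexical; $! carries the OS error message.
    open(my $in, '<', $path) or die "Can't open $path: $!";

    # In list context <$in> returns all remaining lines.
    my @hits = grep { /Login succeeded/ } <$in>;
    close($in);

    chomp @hits;
    return @hits;
}
```

A lexical handle is scoped like any other my variable, so it cannot collide with a bareword of the same name elsewhere in the program, and it can be passed to subs directly.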

    hth,

    regards,
    tomte


    An intellectual is someone whose mind watches itself.
    -- Albert Camus

      I'd prefer a regex over a split, since we only want the user name and don't need any of the other fields. You can increment the user's login count in one line inside the while loop:
      $users{$1}++ if $line =~ /Login succeeded. User: (\w+)/i;
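
To make the regex approach easy to try on its own, here is a small sketch run against two of the sample records. The sub name login_user is hypothetical, and the dot after "succeeded" is escaped here so that it only matches a literal period.

```perl
use strict;
use warnings;

# Hypothetical helper: return the user name from a "Login succeeded" record,
# or undef for any other kind of line.
sub login_user {
    my ($line) = @_;
    return $line =~ /Login succeeded\. User: (\w+)/ ? $1 : undef;
}

my @sample = (
    "00:00000:00015:2005/11/30 10:01:37.36 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.",
    "00:00000:00016:2005/11/30 10:10:50.91 Logon Login succeeded. User: xyz, Client IP address: '169.123.26.124'.",
    "00:00000:00023:2005/12/01 09:51:29.65 kernel Cannot read, host process disconnected: SERVER1 spid: 23",
);

my %users;
for my $line (@sample) {
    my $user = login_user($line);
    $users{$user}++ if defined $user;   # hash keys are unique by construction
}

print join(", ", sort keys %users), "\n";   # prints "sa, xyz"
```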
Re: array processing
by tirwhan (Abbot) on Dec 06, 2005 at 11:07 UTC

    split does not happen automatically; you need to call it. If you're only trying to keep one record of each user who has logged in, you could use a hash. So your loop could be:

        my %user_login;
        while (my $line = <errlog>) {
            chomp $line;
            next unless ($line =~ /Login succeeded/);
            my ($date,$time,$username)=(split(' ',$line))[0,1,6];
            $user_login{$username}="$date $time";
        }
        for my $record(sort keys %user_login) {
            print "$record logged in at $user_login{$record}\n";
        }
    This will print out a sorted list of users who have logged in, along with the date/time of the last login.
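
If it helps to see what split actually hands back, here is a quick sketch on one sample record. Two things are worth noticing with this particular log format: field 6 still carries the trailing comma, and field 0 is the whole colon-separated prefix with the date at its end. The helper user_field is hypothetical and simply strips that comma.

```perl
use strict;
use warnings;

my $line = "00:00000:00016:2005/11/30 10:10:50.91 Logon Login succeeded. "
         . "User: xyz, Client IP address: '169.123.26.124'.";

# split ' ' splits on runs of whitespace and ignores leading whitespace.
my @fields = split ' ', $line;
print "$_: $fields[$_]\n" for 0 .. 6;

# Hypothetical helper: field 6 of a record with a trailing comma removed,
# or undef if the record has fewer than seven fields.
sub user_field {
    my ($record) = @_;
    my $user = (split ' ', $record)[6];
    return undef unless defined $user;
    $user =~ s/,$//;
    return $user;
}

my $user = user_field($line);
print "$user\n";   # prints "xyz"
```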

    Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan
      I obviously have a lot to learn. Out of interest, could this have been done with an array instead of a hash, or would it have been too messy?

      Thanks for your help

        Yes, it would be possible, but wasteful.

            use List::MoreUtils qw(uniq);
            my @client_list;
            while (my $line = <DATA>) {
                chomp $line;
                next unless ($line =~ /Login succeeded/);
                push (@client_list,(split(' ',$line))[6]);
            }
            @client_list = uniq(@client_list);
            print join("\n",@client_list);
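
List::MoreUtils comes from CPAN rather than the core distribution, so in case it isn't installed, here is a sketch of the usual %seen idiom (the one described in perlfaq4) that does the same job. The sub name unique_in_order is made up for this example. Note that it still uses a hash internally, which is rather the point of the earlier answers.

```perl
use strict;
use warnings;

# Hypothetical helper: return the arguments with duplicates removed,
# keeping the first occurrence of each value in its original position.
sub unique_in_order {
    my %seen;
    # $seen{$_}++ is false the first time a value appears, true afterwards.
    return grep { !$seen{$_}++ } @_;
}

# User names in the order they log in within the sample log.
my @logins = qw(sa xyz sa sa abc sa sa qwr sa);

print join("\n", unique_in_order(@logins)), "\n";
# prints sa, xyz, abc and qwr, one per line
```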

      OK, having taken this on board, I decided to add a line-number count so that I don't process the same part of the file every time this utility is run, e.g.

          my $linenum = 1;
          my $lastline = 200;
          my %user_login;
          while (my $line = <errlog>) {
              chomp $line;
              next unless ($linenum = $lastline);
              next unless ($line =~ /Login succeeded/);
              my ($date,$time,$username)=(split(' ',$line))[0,1,6];
              $user_login{$username}="$date $time";
              ++$linenum;
          }
          for my $record(sort keys %user_login) {
              print "$record logged in at $user_login{$record}\n";
          }
          print "line number is $linenum\n";
      This doesn't work, though, since $linenum is still set to 1 at the print above. How can I use the $linenum variable outside the while loop to achieve this?

        Your test on $linenum is wrong: = assigns, it does not compare. Also, the increment sits after the next statements, so it is skipped whenever next fires. Change

        next unless ($linenum = $lastline);
        to
        next unless ($linenum++ >= $lastline);

        Update: sorry, I didn't do this carefully enough, changed to work
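
Putting the pieces together, here is one way the whole loop might look. This is only a sketch under a few assumptions: scan_new_lines is a made-up name, $lastline is taken to mean "number of lines already handled by an earlier run", and the user is pulled out with a regex rather than split. The counter is bumped before any next, so it counts every line read.

```perl
use strict;
use warnings;

# Hypothetical helper: read $fh, ignore the first $lastline lines, and
# return (hashref of users seen in the new lines, number of lines read).
sub scan_new_lines {
    my ($fh, $lastline) = @_;
    my %users;
    my $linenum = 0;
    while (my $line = <$fh>) {
        $linenum++;                      # count every line, before any "next"
        next if $linenum <= $lastline;   # already handled on an earlier run
        $users{$1} = 1 if $line =~ /Login succeeded\. User: (\w+)/;
    }
    return (\%users, $linenum);
}

# Demo on an in-memory filehandle standing in for the real error log.
my $log = "x Logon Login succeeded. User: abc, y\n"
        . "x Logon Login succeeded. User: sa, y\n"
        . "x kernel Cannot read\n";
open(my $fh, '<', \$log) or die "Can't open in-memory log: $!";
my ($users, $count) = scan_new_lines($fh, 1);
close($fh);

print join(", ", sort keys %$users), "\n";   # prints "sa"
print "line number is $count\n";             # prints "line number is 3"
```

The returned count is what you would save somewhere and pass in as $lastline on the next run.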


Re: array processing
by McDarren (Abbot) on Dec 06, 2005 at 11:26 UTC

    Update: Arg!! I just realised I completely misread. You want unique users, not IP's!
    Update 2: Ok, fixed so that it returns Users rather than IP's

    I was under the impression that the array is naturally split by a space?

    Yes, but you must explicitly call the split function.

    In any case, a hash is probably more suited to this task. Here is one solution, using a hash for your unique users, and a regex to pull them out of each line:

        #!/usr/bin/perl -w
        use strict;
        use Data::Dumper::Simple;

        my %users;
        while (<DATA>) {
            next unless /Login succeeded/;
            chomp;
            if (/User: ([\w]+),/) {
                $users{$1}++;
            }
        }
        print Dumper(%users);

        __DATA__
        00:00000:00014:2005/11/30 10:01:23.77 server Configuration file '/opt/sybase/SERVER1.cfg' has been written and the previous version has been renamed to '/opt/sybase/SERVER1.030'.
        00:00000:00014:2005/11/30 10:01:23.80 server The configuration option 'log audit logon success' has been changed by 'sa' from '0' to '1'.
        00:00000:00015:2005/11/30 10:01:37.36 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
        00:00000:00016:2005/11/30 10:10:50.91 Logon Login succeeded. User: xyz, Client IP address: '169.123.26.124'.
        00:00000:00017:2005/11/30 16:31:31.02 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
        00:00000:00020:2005/11/30 16:51:11.90 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
        00:00000:00021:2005/12/01 09:49:23.44 Logon Login succeeded. User: abc, Client IP address: '169.123.26.124'.
        00:00000:00022:2005/12/01 09:49:23.90 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
        00:00000:00023:2005/12/01 09:51:29.65 kernel Cannot read, host process disconnected: SERVER1 spid:23
        00:00000:00025:2005/12/01 09:52:27.74 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
        00:00000:00026:2005/12/01 09:55:24.06 Logon Login succeeded. User: qwr, Client IP address: '169.123.26.124'.
        00:00000:00027:2005/12/01 09:55:47.15 Logon Login succeeded. User: sa, Client IP address: '169.123.26.124'.
        00:00000:00026:2005/12/01 10:02:25.95 server Configuration file '/opt/sybase/SERVER1.cfg' has been written and the previous version has been renamed to '/opt/sybase/SERVER1.031'.

    Which gives:

        %users = (
                   'qwr' => 1,
                   'abc' => 1,
                   'sa' => 6,
                   'xyz' => 1
                 );

    Hope this helps,
    Darren :)

Re: array processing
by holli (Monsignor) on Dec 06, 2005 at 11:26 UTC
        while (<errorlog>) {
            $logins{$1}++ if /Login succeeded\. User: ([^,]+)/;
        }
        print map { "$_: $logins{$_}\n" } sort keys %logins;

    Outputs usernames and number of logins:

        abc: 1
        qwr: 1
        sa: 6
        xyz: 1


    holli, /regexed monk/
Re: array processing
by McDarren (Abbot) on Dec 06, 2005 at 11:51 UTC
    Of course, if you are in a *nix shell, you could do this straight from the command line without using Perl at all. I'm no whiz at this sort of thing, so the following could probably be shortened quite a bit - but it should give you the idea:
        grep 'Login succeeded' log | awk '{print $7}' | sed s/,// | sort | uniq -c
              1 abc
              1 qwr
              6 sa
              1 xyz

    Cheers,
    Darren :)

Node Type: perlquestion [id://514399]
Approved by ysth