Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Re^3: log parser fails

by aitap (Deacon)
on Aug 07, 2012 at 11:40 UTC ( #985942=note: print w/ replies, xml ) Need Help??


in reply to Re^2: log parser fails
in thread log parser fails

After moving some brackets in your regular expression it looks like this:

m@^(\S+?), (\S+) (\S+) - - (\[\d{2}/\w+/\d{4}:\s*\d{2}:\d{2}:\s*\d{2} +\+\d{4}\]) "(.*?)" (\d{3}) (\d+) "(.*?)" "(.*?)"@;
I can run the following code:
#!/usr/bin/perl use warnings; use strict; use File::ReadBackwards; use Data::Dumper; my $fh_in = File::ReadBackwards->new($ARGV[0]) or die("Unable to open +\"$_\": $!\n"); my ($line,$source_host,$my_host,$internal_redirect,$date,$url_with_met +hod,$status,$size,$referrer,$agent,$end_time,$check_time,$vhost_name) +; while (defined($line = $fh_in->readline())) { chomp($line); print "$line|\n"; ($source_host,$my_host,$internal_redirect,$date,$url_with_meth +od,$status,$size,$referrer,$agent) = $line =~ m@^(\S+?), (\S+) (\S+) +- - \[(\d{2}/\w+/\d{4}:\s*\d{2}:\d{2}:\s*\d{2} \+\d{4})\] "(.*?)" (\d +{3}) (\d+) "(.*?)" "(.*?)"@; print Data::Dumper->Dump( [ \$line, \$source_host,\$my_host,\$internal_r +edirect,\$date,\$url_with_method,\$status,\$size,\$referrer,\$agent ] +, [qw(*line *source_host *my_host *internal_redi +rect *date *url_with_method *status *size *referrer *agent)], ), qq{\n}; print "SH:$source_host, MH:$my_host, IR:$internal_redirect, D: +$date, U:$url_with_method, S:$status, SZ:$size, R:$referrer, A:$agent +\n"; }
on your sample data, and it seems to detect all fields properly.

// Удачи!
Sorry if my advice was wrong.


Comment on Re^3: log parser fails
Select or Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://985942]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (14)
As of 2014-07-30 08:49 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (229 votes), past polls