http://www.perlmonks.org?node_id=1130779


in reply to Re^2: Question about the most efficient way to read Apache log files without All-In-One Modules from CPAN (personal learning exercise)
in thread Question about the most efficient way to read Apache log files without All-In-One Modules from CPAN (personal learning exercise)

...we're pretty much beyond what split can do...

Mh, we know the format:

    use Data::Dump;
    use feature qw(say);

    my $line = qq(127.0.0.1 - - [22/Apr/2015:13:35:04 +1000] "GET /bin/admin.pl HTTP/1.1" 401 509);
    my @bits = split /\s/, $line;

    dd \@bits;

    say qq(Host: $bits[0]);
    say qq(Logname: $bits[1]);
    say qq(User: $bits[2]);
    say qq(Time: $bits[3] $bits[4]);
    say qq(Request: $bits[5] $bits[6] $bits[7]);
    say qq(Status: $bits[8]);
    say qq(Size: $bits[9]);

    __END__

    monks>apache.pl
    [
      "127.0.0.1",
      "-",
      "-",
      "[22/Apr/2015:13:35:04",
      "+1000]",
      "\"GET",
      "/bin/admin.pl",
      "HTTP/1.1\"",
      401,
      509,
    ]
    Host: 127.0.0.1
    Logname: -
    User: -
    Time: [22/Apr/2015:13:35:04 +1000]
    Request: "GET /bin/admin.pl HTTP/1.1"
    Status: 401
    Size: 509

Regards, Karl

«The Crux of the Biscuit is the Apostrophe»


Re^4: Question about the most efficient way to read Apache log files without All-In-One Modules from CPAN (personal learning exercise)
by lulz (Initiate) on Jun 17, 2015 at 19:13 UTC
    Thanks for your reply!

    A quick question dealing with the internal workings of what you wrote:

    I understand that split can take any expression as its pattern and then operate on the scalar, but what are the more nuanced differences, if any, particularly in memory usage and processing speed, between using split and a general pattern match?

    Thanks!

      You can measure the speed of your code with the time command. Or use Time::HiRes. Or Benchmark. See also Devel::Size and Devel::NYTProf.

      And don't forget to try Super Search. I'm sure that you will find many examples that use time, Benchmark, Time::HiRes, Devel::Size and Devel::NYTProf.
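
As a starting point for such a measurement, here is a minimal sketch using the core Benchmark module. It compares the split approach from the parent post with a single pattern match on the same sample line. The helper names with_split and with_regex are made up for this example, and the regex is only one possible way to match this particular line, so treat the numbers you get as a rough indication for your own data rather than a general verdict.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $line = qq(127.0.0.1 - - [22/Apr/2015:13:35:04 +1000] "GET /bin/admin.pl HTTP/1.1" 401 509);

# Hypothetical helper: split on whitespace, then glue the time and
# request pieces back together, as in the parent post.
sub with_split {
    my @b = split /\s/, $_[0];
    return ($b[0], $b[1], $b[2], "$b[3] $b[4]", "$b[5] $b[6] $b[7]", $b[8], $b[9]);
}

# Hypothetical helper: one pattern match that captures the same seven fields.
sub with_regex {
    return $_[0] =~ /^(\S+) (\S+) (\S+) (\[[^\]]+\]) ("[^"]*") (\d+) (\S+)$/;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    split => sub { my @f = with_split($line) },
    regex => sub { my @f = with_regex($line) },
});
```

Both helpers return the same seven fields for the sample line, so the comparison is like for like. For memory usage you would need something such as Devel::Size, which is not a core module.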

      Regards, Karl

      «The Crux of the Biscuit is the Apostrophe»

        Thank you!
Re^4: Question about the most efficient way to read Apache log files without All-In-One Modules from CPAN (personal learning exercise)
by wrog (Friar) on Jun 17, 2015 at 15:43 UTC
    I'm not sure I'd want to bet my life that none of logname, user or the request URI can have spaces in them.
      "I'm not sure I'd want to bet my life..."

      I guess a beer would be fair. Please see also URI scheme ;-)

      Best regards, Karl

      «The Crux of the Biscuit is the Apostrophe»

        The scheme isn't the problem (and doesn't show up in the log anyway); it's the path and query parts.
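
One way to stop depending on the absence of spaces is to anchor on the delimiters the line does have: the square brackets around the time and the double quotes around the request. The following is a minimal sketch, not a complete log parser. The function name parse_clf and the hash keys are invented for this example. It assumes the Common Log Format shown earlier in the thread, assumes the logname contains no spaces, and assumes that any double quote inside the request is backslash-escaped in the log; check your own server's output before relying on that.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical helper: parse one log line into a hash reference.
# Returns nothing (undef in scalar context) if the line does not match.
sub parse_clf {
    my ($line) = @_;

    my ($host, $logname, $user, $time, $request, $status, $size) = $line =~ m{
        ^ (\S+)                            # host
        \s (\S+)                           # logname, assumed to have no spaces
        \s (.+?)                           # user, may contain spaces
        \s \[ ([^\]]+) \]                  # time, without the brackets
        \s " ( (?: [^"\\] | \\. )* ) "     # request, without the outer quotes
        \s (\d{3})                         # status
        \s (\S+)                           # size, or a dash
        \s* $
    }x or return;

    # The method is the first word and the protocol the last word;
    # everything in between is the URI, spaces included.
    my ($method, $uri, $proto) = $request =~ /^(\S+) (.*) (\S+)$/;

    return {
        host   => $host,   logname => $logname, user  => $user,
        time   => $time,   request => $request, status => $status,
        size   => $size,   method  => $method,  uri   => $uri,
        proto  => $proto,
    };
}

# Sample line with spaces in both the path and the query part.
my $rec = parse_clf(
    qq(127.0.0.1 - - [22/Apr/2015:13:35:04 +1000] "GET /bin/my admin.pl?q=a b HTTP/1.1" 401 509)
);
say "URI: $rec->{uri}";    # prints "URI: /bin/my admin.pl?q=a b"
```

With split /\s/ the same line would yield twelve pieces instead of ten, and the fixed indices for status and size would point at the wrong elements.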