PerlMonks  

Re: [off-site] Bash + Perl oneliners basics

by merlyn (Sage)
on Mar 17, 2005 at 07:06 UTC


in reply to [off-site] Bash + Perl oneliners basics

cat /var/log/httpd/access_log | perl -l -a -n -e 'print $F[6]' | sort | uniq -c | sort -n | tail -10
Hmm. A Useless Use of Cat, using Perl like it was awk, and then chaining together a few other tools like forking is free. Hmm.

I'd probably have written that as:

@ARGV = qw(/var/log/httpd/access_log);
my %count;
while (<>) {
    my ($f) = (split)[6];    # seventh whitespace-separated field
    $count{$f}++;
}
my $n = 0;
# keys ordered by descending count
for (sort { $count{$b} <=> $count{$a} } keys %count) {
    print "$_\n";
    last if ++$n >= 10;
}
I bet mine runs with 1/4th the CPU.

-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.


Re^2: [off-site] Bash + Perl oneliners basics
by grinder (Bishop) on Mar 17, 2005 at 08:22 UTC

    I know you're just tossing that code off quickly, but I'm curious to know why you chose to write:

    while (<>) { my ($f) = (split)[6]; $count{$f}++; }

    ...rather than...

    while (<>) { $count{(split)[6]}++; }

    It makes me wonder if there's some robustness principle at work that eludes me. And of course, there is even...

    $count{(split)[6]}++ while <>;

    ... but then we are getting into the realms of the cryptic, and I don't see a more concise way of printing the top N values that doesn't sacrifice economy.

    - another intruder with the mooring in the heart of the Perl

Re^2: [off-site] Bash + Perl oneliners basics
by Anonymous Monk on Mar 17, 2005 at 09:50 UTC
    I bet mine runs with 1/4th the CPU.
    Except that for small to medium-sized files, it doesn't matter, and the additional programming (and debugging) time dwarfs the running time. And for really long files, your program might actually be slower, or even fail to finish, as it will consume significant amounts of memory. The elegant one-liner, consisting of several tools that each do one thing well, won't suffer from memory problems, as 'sort' knows when to switch to using temporary files.

    Having said that, I would have written the one-liner as:

    awk '{print $7}' /var/log/httpd/access_log | sort | uniq -c | sort -n | tail -10
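
For readers following along, here is a minimal sketch of that kind of pipeline wrapped in a shell function, with one comment per stage. The function name `top_requests` and its arguments are hypothetical and not from the thread; it assumes the request path is the seventh whitespace-separated field, as in the Perl `$F[6]` of the original one-liner.

```shell
# Hypothetical helper: top_requests FILE [N]
# Prints the N most frequent values of field 7 as "count value" lines,
# least frequent first (the same order as the original pipeline).
top_requests() {
    file=$1
    n=${2:-10}                    # default to the top 10
    awk '{ print $7 }' "$file" |  # extract the seventh field
        sort |                    # bring identical values together
        uniq -c |                 # collapse each run into "count value"
        sort -n |                 # order by count, ascending
        tail -n "$n"              # keep the N largest counts
}
```

Calling `top_requests /var/log/httpd/access_log 10` should reproduce the one-liner. Note that `uniq -c` pads the count with leading spaces, so the output is meant for reading rather than exact string comparison.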
Re^2: [off-site] Bash + Perl oneliners basics
by thor (Priest) on Mar 17, 2005 at 12:32 UTC
    chaining together a few other tools like forking is free
    By your reasoning, doing anything with the computer is not free, so why try at all? In my opinion, the cost of something like this is akin to the cost of gum balls: individually, they're so cheap that almost no one has a hard time justifying quantities of less than 100. And if you find yourself arguing with someone over the cost of a gum ball or 1000, just walk away. Your time is better spent.

    thor

    Feel the white light, the light within
    Be your own disciple, fan the sparks of will
    For all of us waiting, your kingdom will come

Re^2: [off-site] Bash + Perl oneliners basics
by gellyfish (Monsignor) on Mar 17, 2005 at 13:08 UTC

    TBH I'd lose the Perl altogether:

    awk '{ file[$7]++ } END { for ( v in file ) print file[v], v }' /var/log/httpd/access_log | sort -n | tail -10
    I'm sure you could lose the rest of the pipe too but I never got my head around AWK's asort() for cases like this.

    /J\
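
One possible way to drop the rest of the pipe without asort() (which is a gawk extension) is to select the largest remaining count N times inside the END block. This is only a sketch, not anything posted in the thread: the function name `top_requests_awk` is hypothetical, and it prints most frequent first, unlike the `sort -n | tail` versions above.

```shell
# Hypothetical helper: top_requests_awk FILE [N]
# Counts field 7 and prints the N largest counts, most frequent first,
# using only POSIX awk features (no asort(), no external sort).
top_requests_awk() {
    awk -v n="${2:-10}" '
        { count[$7]++ }                    # tally each value of field 7
        END {
            for (i = 0; i < n; i++) {      # one selection pass per output line
                best = ""; max = 0
                for (v in count)
                    if (count[v] > max) { max = count[v]; best = v }
                if (max == 0) break        # ran out of distinct values
                print max, best
                delete count[best]         # exclude it from later passes
            }
        }
    ' "$1"
}
```

Each pass scans every remaining key, so the cost is roughly N times the number of distinct values. That is fine for a top 10, but for large N the external `sort` is the better tool. Ties come out in whatever order awk iterates the array, which is unspecified.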
