PerlMonks  

#! /usr/bin/perl -w
#
# david landgren 24-apr-2001

use strict;

my %domain;
my $total_size;

foreach my $file( @ARGV ) {
    # three-argument open, so odd characters in a file name are not
    # interpreted as an open mode
    open my $fh, '<', $file or die "Cannot open $file for input: $!\n";
    while( <$fh> ) {
        chomp;
        # field 4 is the byte count, field 8 the hierarchy code/peer host
        my( $size, $command ) = (split)[4,8];
        next unless defined $command;    # skip short or blank lines
        if( my( $dom ) = ( $command =~ /^DIRECT\/(.*)/ )) {
            $total_size += $size;
            $domain{$dom}{SIZE} += $size;
            $domain{$dom}{HITS}++;
        }
    }
    close $fh;
}

my $count;
my $cum_percent = 0;

# busiest domains (by bytes transferred) first
foreach my $d ( sort {$domain{$b}{SIZE} <=> $domain{$a}{SIZE}} keys %domain ) {
    ++$count;
    $cum_percent += (my $percent = $domain{$d}{SIZE}*100/$total_size);
    my $percent_rounded     = sprintf '%0.3f%%', $percent;
    my $cum_percent_rounded = sprintf '%0.3f%%', $cum_percent;
    print "$count\t$domain{$d}{HITS}\t$domain{$d}{SIZE}\t$percent_rounded\t$cum_percent_rounded\t$d\n";
}

=head1 NAME

topweb - Determine biggest targets of inbound HTTP traffic

=head1 SYNOPSIS

B<topweb> filespec [filespec...]

=head1 DESCRIPTION

Generate a snapshot of direct web traffic recorded by a Squid proxy.
Scan the Squid access logs specified on the command line, looking for
DIRECT connections. Accumulate the number of hits and bytes transferred
for each FQDN. Sort and print the results based on bytes transferred.
The goal is to see how much real traffic is coming in due to cache
misses.

=head1 OUTPUT

This program outputs a tab-delimited text file. The fields are as
follows:

=over 4

=item * rank -- from 1 to n, the rank in terms of bytes transferred for the domain.

=item * hits -- the number of separate transfers logged.

=item * bytes -- the total number of bytes transferred from the above hits.

=item * percent -- the percentage that this site represents in terms of the total traffic.

=item * cumulative percent -- the percentage that this site and all busier sites represent in terms of the total traffic.

=item * fqdn -- the fully qualified domain name of the host, or numeric IP address if the address does not resolve.

=back

Here is a sample output, which indicates, among other things, that the
four most demanded sites in this data sample represent 10% of incoming
traffic:

  1  25226  106606531  2.877%   2.877%  www.cadremploi.fr
  2  15996  104380579  2.817%   5.693%  mailv2.voila.fr
  3  24842   97149410  2.621%   8.315%  www.apec.asso.fr
  4  16861   81954034  2.211%  10.526%  www.voila.fr

=head1 EXAMPLES

C</usr/local/bin/topweb /home/squid/logs/access.log* | head -25>

C</usr/local/bin/topweb /home/squid/logs/access.log* E<gt>topweb.yyyymmdd>

=head1 SEE ALSO

topwebdiff - A report tool to analyse the day to day changes of the
output from topweb.

=head1 COPYRIGHT

Copyright (c) 2001 David Landgren.

This script is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

=head1 AUTHOR

David "grinder" Landgren

grinder on perlmonks (http://www.perlmonks.org/)

eval {join chr(64) => qw[landgren bpinet.com]}

=cut
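For readers unsure what the script pulls out of each log line, here is a minimal sketch of the same parse-and-rank logic on three made-up lines. It assumes Squid's native access.log layout, in which whitespace-separated field 4 is the byte count and field 8 is the hierarchy code and peer host (for example DIRECT/www.example.com). The names parse_direct and rank_hosts and the sample lines are hypothetical, written for illustration only; they are not part of topweb.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one line in Squid's native access.log layout (an assumption).
# Returns (host, bytes) for a DIRECT fetch, or nothing for any other line.
sub parse_direct {
    my ($line) = @_;
    # Fields: 0 time, 1 elapsed, 2 client, 3 code/status, 4 bytes,
    #         5 method, 6 URL, 7 ident, 8 hierarchy/peer, 9 content type
    my ($size, $peer) = (split ' ', $line)[4, 8];
    return unless defined $peer and $peer =~ m{^DIRECT/(.+)};
    return ($1, $size);
}

# Sum bytes per host, then return rows of [host, bytes, percent,
# cumulative percent], busiest host first.
sub rank_hosts {
    my @lines = @_;
    my (%bytes, $total);
    for my $line (@lines) {
        my ($host, $size) = parse_direct($line) or next;
        $bytes{$host} += $size;
        $total += $size;
    }
    return unless $total;    # nothing to rank, and avoids dividing by zero

    my ($cum, @rows) = (0);
    for my $host (sort { $bytes{$b} <=> $bytes{$a} } keys %bytes) {
        my $pct = $bytes{$host} * 100 / $total;
        $cum += $pct;
        push @rows, [$host, $bytes{$host}, $pct, $cum];
    }
    return @rows;
}

# Hypothetical sample lines, not real log data.
my @sample = (
    '987654321.123 250 10.0.0.1 TCP_MISS/200 1500 GET http://www.example.com/ - DIRECT/www.example.com text/html',
    '987654322.456 12 10.0.0.2 TCP_HIT/200 9000 GET http://www.example.org/a.png - NONE/- image/png',
    '987654323.789 310 10.0.0.1 TCP_MISS/200 500 GET http://www.example.net/ - DIRECT/www.example.net text/html',
);

my $rank = 0;
for my $row (rank_hosts(@sample)) {
    printf "%d\t%d\t%.3f%%\t%.3f%%\t%s\n", ++$rank, @{$row}[1, 2, 3, 0];
}
```

The TCP_HIT line is served from the cache, so its hierarchy field is NONE/- and it is ignored. Only the two cache misses count towards the totals, which is the point of the report.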

In reply to topweb - Squid access.log analyser by grinder
