in reply to Poor Man's Web Logger
Nice script! A few notes:
- For better portability, consider making the "my ISP" IP numbers into
variables that can be easily changed at the top, like $home and $logfile
are.
- Use the $^T variable (the time the script started) instead of calling 'time'
- You could also just say $date=scalar localtime; :)
- Put the gethostbyaddr line before the open and flock. That way,
the data can be dumped in very quickly and the file closed again. Even better,
write all the data into one variable and have a single print LOG "$data\n";
line between the lock and the close.
- If you can't get to your access logs, the die
statement at the end of 'open PIC' does not really help much, since you
probably can't read the server's error log either. On a similar note, you
should check the return value of open in 'open LOG' and bypass the rest
of the loop if it fails.
- Some servers will report REMOTE_HOST as well as REMOTE_ADDR. If your server is
lucky enough to do that, you can remove all the socket stuff!
- Put the "Content-type" line at the start of the script, so the browser
knows as early as possible that it is receiving an image.
- You should also tell the browser the size, with a Content-Length header.
This info is easily grabbed with -s $pic
- For even better speed, consider hard-coding the image into the script itself. A small gif
can be squeezed as small as 35 bytes!
- Output the gif, then do your log file. Not only does the
client not have to wait for the log file writing, but you
can simply die if 'open LOG' or 'flock LOG' does not work.
- Using all of the above, I can squeeze it down to a 6 line script!
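To show how several of these notes fit together, here is a rough sketch. It is not the original script and not the 6 line version. The $logfile path and the sub names (image_response, log_entry, append_entry) are made up for illustration, and the hex string is assumed to be a valid 35-byte 1x1 GIF, so check that it renders before relying on it. The sketch sends the headers and image first, builds the log entry in one variable, and only then opens and locks the log.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical log location; change it to suit your account.
my $logfile = '/tmp/weblog.txt';

# A 35-byte 1x1 GIF written as hex (assumption: verify it in a browser).
my $gif = pack 'H*', '47494638396101000100800000ffffff000000'
                   . '2c000000000100010000'
                   . '0202440100'
                   . '3b';

# Headers first, with the length taken from the image data itself.
sub image_response {
    my ($image) = @_;
    return "Content-type: image/gif\r\n"
         . 'Content-Length: ' . length($image) . "\r\n\r\n"
         . $image;
}

# Build the whole entry before the log file is opened.
# Uses REMOTE_HOST when the server supplies it, else the bare address.
sub log_entry {
    my ($env) = @_;
    my $who = $env->{REMOTE_HOST} || $env->{REMOTE_ADDR} || 'unknown';
    return join ' ', scalar localtime($^T), $who;
}

# Hold the lock only for a single print.
sub append_entry {
    my ($file, $data) = @_;
    open my $log, '>>', $file or die "Cannot open $file: $!";
    flock $log, LOCK_EX       or die "Cannot lock $file: $!";
    print {$log} "$data\n";
    close $log                or die "Cannot close $file: $!";
}

# CGI servers set REQUEST_METHOD, so this part only runs as a CGI.
if ($ENV{REQUEST_METHOD}) {
    binmode STDOUT;
    $| = 1;                       # flush the image before logging starts
    print image_response($gif);
    append_entry($logfile, log_entry(\%ENV));
}
```

Because the image has already been printed by the time the log is touched, a plain die is enough if the open or flock fails.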
RE: RE: Poor Man's Web Logger
by comatose (Monk) on Apr 06, 2000 at 21:53 UTC
|
Thanks for the good feedback. I took some of your comments
and incorporated them into a new version and expanded on
some ideas as well.
For example, the reason I combined $date and $hostname
into a single variable was so that I could put a different
label on each entry. With the %entries hash, you can now
set up different pattern matches and the corresponding log
entry.
I left it doing the hostname lookup every time simply because
every sane web server administrator has hostname lookups for
access turned off, leaving that variable empty.
If I were cruel, I'd have it send SERVER_ADMIN an email
every time it got a REMOTE_HOST variable. :)
| [reply] |
|
A question about this REMOTE_HOST stuff--generally, server
admins disable the hostname translation because it slows
down the server, correct? I agree that it's the "sane"
thing to do--but in this case, aren't you sort of defeating
the purpose by doing the hostname translation yourself?
The whole point, I thought, was to disable runtime hostname
translation, because it slows things down; you seem to
agree, so why re-introduce this step? Why not just
log IP addresses, then run a cron job later as part of your
stat analysis to do the hostname translation?
Just something to think about. If I'm missing the main
issues, let me know.
| [reply] |
|
I think in this case, the server parses more than just his
page, and more than just his domain, perhaps. This is just
his way of "turning it back on" even though the sysadmin
for the server that hosts his site has it turned off. Just
a guess.
I have sites that have it both ways. btrott has a
good point, however: doing it later would also allow
you to cache the answers, saving some (perhaps a lot)
of calls to gethostbyaddr. Unless it is a really, really
busy page, however, it probably does not hurt too much
to look it up each time.
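A minimal sketch of the "do it later and cache the answers" idea, for a script run after the fact (from cron, say). The sub names are made up for illustration, and it assumes each log line starts with a dotted IP address, which may not match the real log format. The lookup is passed in as a code reference so it can be swapped out or tested without touching DNS.

```perl
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);

# Default lookup: one real reverse DNS query per call.
sub dns_lookup {
    my ($ip) = @_;
    my $packed = inet_aton($ip) or return;
    return scalar gethostbyaddr($packed, AF_INET);
}

# Returns a closure that remembers every answer, so each address
# costs at most one call to the underlying lookup.
sub make_cached_resolver {
    my ($lookup) = @_;
    $lookup ||= \&dns_lookup;
    my %cache;
    return sub {
        my ($ip) = @_;
        # Fall back to the bare IP when no hostname is found.
        $cache{$ip} = $lookup->($ip) || $ip unless exists $cache{$ip};
        return $cache{$ip};
    };
}

# Replace a leading IP address on each line with its hostname.
# Lines that do not start with an IP are passed through unchanged.
sub resolve_lines {
    my ($resolve, @lines) = @_;
    my @out;
    for my $line (@lines) {
        (my $copy = $line) =~ s/^(\d{1,3}(?:\.\d{1,3}){3})\b/$resolve->($1)/e;
        push @out, $copy;
    }
    return @out;
}

# Possible use as a filter:
#   print resolve_lines(make_cached_resolver(), <STDIN>);
```

The cache here only lives for one run; keeping it between runs would need a file or DBM hash, which is left out.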
| [reply] |
|
What you say is true to a certain extent. However, this logger
is for when you don't have access to regular log files. My
dialup ISP actually lets me do CGI in my home directory
but doesn't let me have access to log files.
And since my site gets about 4 or 5 visitors a day, the
system hit is almost nil. If I were getting a visitor a
minute, I might change it. Also as it is, it's only
doing one lookup per page. That's a lot less than the
standard number of lookups for a full page (html plus
images).
| [reply] |