Re: General perl question. Multiple servers.
by mwah (Hermit)
on Oct 06, 2007 at 16:05 UTC
dbmathis: I have a group of about 150 linux application
servers that a process runs on nightly and then a SUCCESS gets written
to a logfile of each of the servers when the process completes.
Currently I have to log into each server via ssh and grep each
log to see if the process completed.
YMMV, but I had (and still have) to deal with a similar problem
in a "computational chemistry" environment. The number of servers
or nodes is about half of yours.
What I learned from all that: keep it dead simple, and try to get it installed OOTB if possible.
My current solution:
1. programs & logging
- One of the (older) boxes acts as the server and keeps the
node cluster in a subnet (a private one in my case).
- The server exports (via NFS; SMB is also possible) its /usr/local/bin
(read-only) and its /srv/cluster (read-write) to the subnet.
- The nodes load their applications from the centrally mounted
/usr/local/bin and write logs, with date and IP in the
filenames, into separate files in /srv/cluster.
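A minimal sketch of how a node could build such a filename and append to it. The layout (date, underscore, IP, .log), the sub names log_path and log_status, and the sample address are assumptions made for the example, not the naming actually used on this cluster.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Build a per-node, per-day log path, e.g.
#   /srv/cluster/2007-10-06_192.168.1.17.log
# The layout is an assumption; any scheme embedding date and IP would do.
sub log_path {
    my ($dir, $ip, $epoch) = @_;
    my $date = strftime('%Y-%m-%d', gmtime($epoch));
    return "$dir/${date}_$ip.log";
}

# Append one timestamped status line to today's log file for this node
# and return the path that was written.
sub log_status {
    my ($dir, $ip, $message) = @_;
    my $path = log_path($dir, $ip, time);
    open(my $fh, '>>', $path) or die "cannot append to $path: $!";
    print {$fh} scalar(gmtime), " $message\n";
    close($fh) or die "cannot close $path: $!";
    return $path;
}

# Show the path a (hypothetical) node 192.168.1.17 would use.
print log_path('/srv/cluster', '192.168.1.17', 1191686700), "\n";
```

A nightly job would then end with something like log_status('/srv/cluster', $my_ip, 'SUCCESS'), where $my_ip is however the node determines its own address.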
2. job overview
- The server has some Perl scripts for job overview.
If required, the number and respective
IPs of running nodes are found by "nmapping" the subnet
(nmap -sP). This runs very fast (at least here, from
a non-root account) and can provide
"real time" info on running nodes as an HTML page.
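The nmap step and the HTML page could look roughly like the sketch below. It is an illustration under stated assumptions, not the scripts running here: the grepable output option (-oG -) is added because its "Host: ... Status: Up" lines are easy to match, and the sub names up_hosts and html_table, the subnet, and the sample addresses are made up for the example.

```perl
use strict;
use warnings;

# Extract the addresses of hosts reported as up from nmap's grepable
# output (-oG -), where such lines look like:
#   Host: 192.168.1.11 ()<TAB>Status: Up
sub up_hosts {
    my ($nmap_output) = @_;
    my @up;
    for my $line (split /\n/, $nmap_output) {
        push @up, $1 if $line =~ /^Host:\s+(\S+)\s.*\bStatus:\s+Up\b/;
    }
    return @up;
}

# Turn a list of addresses into a small HTML table, one row per node.
sub html_table {
    my @ips  = @_;
    my $rows = '';
    $rows .= "<tr><td>$_</td><td>up</td></tr>\n" for @ips;
    return "<table>\n<tr><th>node</th><th>status</th></tr>\n$rows</table>\n";
}

# Canned sample so the sketch runs without nmap. On the server this
# would be something like (the subnet is a placeholder):
#   my $scan = `nmap -sP -oG - 192.168.1.0/24`;
my $scan = join "\n",
    '# Nmap scan initiated (sample data)',
    "Host: 192.168.1.11 ()\tStatus: Up",
    "Host: 192.168.1.12 ()\tStatus: Up",
    '# Nmap done';

print html_table(up_hosts($scan));
```

Keeping the parsing and the table building in separate subs means the same node list can also feed whatever is done with the nodes next.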
The nodes found can then be rsh'ed (if it's a private
subnet, you won't be killed for using rsh/rexec).
In the end, you'll have a browser interface to the
running processes (build a nice HTML table from the list
of found nodes) and a central directory full of log files, which
might even be exported (SMB) to Windows machines for
coworkers who prefer Explorer ;-)
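With all logs in one directory, the original "did every node write SUCCESS?" check becomes a single local loop instead of 150 ssh logins. A minimal sketch, assuming the logs end in .log and that a line containing SUCCESS marks completion (the sub name success_report is made up for the example):

```perl
use strict;
use warnings;

# Map each log file in $dir to 1 if it contains a line with SUCCESS,
# 0 otherwise. The *.log suffix is an assumption about the naming.
# Note: glob splits on whitespace, so $dir must not contain spaces.
sub success_report {
    my ($dir) = @_;
    my %done;
    for my $path (glob "$dir/*.log") {
        open(my $fh, '<', $path) or die "cannot read $path: $!";
        $done{$path} = 0;
        while (my $line = <$fh>) {
            if ($line =~ /SUCCESS/) { $done{$path} = 1; last }
        }
        close $fh;
    }
    return %done;
}

# Print one line per log file; prints nothing if the directory is empty
# or does not exist.
my %done = success_report('/srv/cluster');
for my $path (sort keys %done) {
    printf "%-8s %s\n", ($done{$path} ? 'SUCCESS' : 'MISSING'), $path;
}
```

Nodes that never wrote a log at all will not show up here; comparing the filenames against the list of known node IPs would catch those.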
The only "complication" (additional work per node) would
be "installing and enabling the nfs client".