http://www.perlmonks.org?node_id=547474


in reply to Re: 'A' web server takes another "time out"
in thread 'A' web server takes another "time out"

I'd be a bit disappointed if a mature system like FreeBSD contained this feedback loop in resource allocation. No system is perfect, but I'd come to expect better behavior when memory becomes scarce: not a feedback loop that keeps making the problem worse while every part continues to fight to do its thing, so that nothing at all gets done and it takes a long time before the system finally gives up and reboots (or is it that the system never gives up and pair.com notices the lock-up and eventually cycles power?). I recall much older systems noticing a problem and selecting processes to be completely "swapped out" (different from "paging", a more accurate term for what is often mislabeled "swapping"). Those processes stop fighting and other, luckier processes get a chance to finish, so that the resource exhaustion might pass, or at least the system can get enough done that someone can "get in" and clean up "by hand". Note that when this happens to the 'A' web server, there is no hope of logging in to the system.

But perhaps this is just a case of bad tuning such that Apache fights too hard and it takes a while for FreeBSD to overcome it... Perhaps that is why many processes go to "0K" resident memory usage, though I'd expect a state much different than "RUN" to be reported for a swapped-out process. This led me to notice again the angle brackets, such as on "<httpd>". Searching "man top" for what those mean, I find "COMMAND is the name of the command that the process is currently running (if the process is swapped out, this column is marked '<swapped>')", which isn't completely clear but somewhat supports that interpretation.

Since I don't have root access, I don't think trying to roll my own replacement for 'top' or 'sar' will be possible. At least, my assumption was that I'd not have access to what 'top' and 'ps' use to get all of that information about other processes. Indeed, I don't have any access to /proc (it is a symlink to /root/proc and I have no access to even /root). But I see that neither 'top' nor 'ps' is set-UID or set-GID, so I'm not sure how the security is arranged. 'man ps' mentions needing procfs mounted (and references /proc and /dev/kmem). So would a self-built 'top' on an unprivileged FreeBSD account be useful? If not, I think just adding "ps" output to the existing "top" output would be one of the next steps.

- tye        

Re^3: 'A' web server takes another "time out" (root)
by Hue-Bond (Priest) on May 04, 2006 at 18:58 UTC
    I don't have any access to /proc [...] 'man ps' mentions needing procfs mounted [...] I think just adding "ps" output to the existing "top" output would be one of the next steps.

    If top took 5.5 minutes to show output between two of the snapshots above, I think adding ps won't improve the situation, because the ps data won't be correlated at all with top's. My bet would be to play with the o argument of ps, which allows you to get the information top gives and more. Setting PERSONALITY to "bsd" on this Linux machine allows me to run ps as if I were on FreeBSD. I hope...

    $ PERSONALITY=bsd ps faxo pid,euid,egid,ni:2,vsz:6,rss:6,pcpu,pmem,stat:3=ST,tname:6,stime,bsdtime,args
      PID  EUID  EGID NI    VSZ    RSS %CPU %MEM ST  TTY    STIME   TIME COMMAND
        1     0     0  0   1924    652  0.0  0.0 S   ?      19:24   0:00 init [2]
        2     0     0 19      0      0  0.0  0.0 SN  ?      19:24   0:00 [ksoftirqd/0]
        3     0     0 -5      0      0  0.0  0.0 S<  ?      19:24   0:00 [events/0]
    [...]
     1368   111   111  0  26580    912  0.0  0.0 Ssl ?      19:26   0:00 /usr/sbin/ippl -c /var/run/ippl/ippl.conf
     1423     0     0  0   4800   1608  0.0  0.1 Ss  ?      19:26   0:00 /usr/lib/postfix/master
     1428   101   104  0   4812   1604  0.0  0.1 S   ?      19:26   0:00  \_ pickup -l -t fifo -u -c

    You can s/args$/comm/ in order not to show parameters of commands:

     1368   111   111  0  26580    912  0.0  0.0 Ssl ?      19:26   0:00 ippl
     1423     0     0  0   4800   1608  0.0  0.1 Ss  ?      19:26   0:00 master
     1428   101   104  0   4812   1604  0.0  0.1 S   ?      19:26   0:00  \_ pickup

    HTH.

    --
    David Serrano

      Heh, but that doesn't show me the one thing I'm interested in, the parent PID. The 'top' and 'ps' output don't have to be in sync; I just need a snapshot of 'ps' output at some point during the "bad time" in order to see who owns the newest 'httpd' processes.

      FYI, your hoping wasn't enough (:

      ps: euid: keyword not found
      ps: egid: keyword not found
      ps: ni:2: keyword not found
      ps: vsz:6: keyword not found
      ps: rss:6: keyword not found
      ps: stat:3: keyword not found
      ps: tname:6: keyword not found
      ps: stime: keyword not found
      ps: bsdtime: keyword not found
      ps: args: keyword not found
        PID %CPU %MEM
          0  0.0  0.0
          1  0.0  0.0
          2  0.0  0.0
      ...

      - tye        

        ps -axo pid,ppid,command

            --k.
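For the question of who owns the newest 'httpd' processes, the output of the command above can be summarized per parent PID. This is only a minimal sketch: the function name `count_httpd_by_parent` is made up for illustration, it assumes the column order PID, PPID, COMMAND produced by `ps -axo pid,ppid,command`, and it assumes that a command shown in parentheses or angle brackets should still be counted as 'httpd'.

```shell
# count_httpd_by_parent (hypothetical helper): read "PID PPID COMMAND" lines
# on stdin and print "count ppid" for every parent that owns httpd
# processes, busiest parent first.
count_httpd_by_parent() {
    awk '
        NR == 1 && $1 == "PID" { next }       # skip the header line, if present
        {
            cmd = $3                          # first word of the command column
            gsub(/^[(<]|[)>:]$/, "", cmd)     # assumption: "(httpd)", "<httpd>", "httpd:" all count
            sub(/^.*\//, "", cmd)             # drop any leading directory path
            if (cmd == "httpd") count[$2]++   # tally by parent PID
        }
        END { for (p in count) print count[p], p }
    ' | sort -k1,1nr -k2,2n
}
```

Usage would be something like `ps -axo pid,ppid,command | count_httpd_by_parent`, run at some point during the "bad time".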


        ps: euid: keyword not found
        ps: egid: keyword not found
        ps: ni:2: keyword not found

        Great :^(. It seems that your ps doesn't support field widths. The field for the PPID is surprisingly ;^) called "ppid". After searching for the manpage on Google, I'd try something like ps -j, ps -l and ps -a -x -o pid,ppid (this last one is just to test whether ppid works).

        --
        David Serrano

        Well, "ppid" is one of the keywords that the "-o" option of "ps(1)" accepts. Try something like this (tested on one of the Pair shared hosts running FreeBSD 4.8-STABLE) ...
        ps -wwax -o ppid,pid,pgid,rss,vsz,nice,%mem,%cpu,rgid,ruser,user,stat,command \
        | sort -k1,1n -k2,2n
        ... there are other keywords listed that relate to paging & swapping, and to (real & saved) user & group ids. If you specify the "-c" option along with "-o command", only the command name will show up (without the arguments).
Re: 'A' web server takes another "time out" (root)
by jonadab (Parson) on May 05, 2006 at 02:19 UTC
    I'd be a bit disappointed if a mature system like FreeBSD contained this feedback loop in resource allocation

    Oh, is the perlmonks server running FreeBSD? I didn't realize. In that case, top doesn't appear to show parent process IDs, unless I'm missing something. There are things I like about FreeBSD, but its version of top is not one of them. The ps that comes with FreeBSD is rather better, but in a scenario where you can't start a new process, top could be already running, and I don't know of a way to make ps do that (i.e., be already running and report output periodically).
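    One partial workaround is to keep a loop running that appends a timestamped snapshot to a log at a fixed interval. This is a rough sketch under stated assumptions: `snapshot_loop` and the log path in the usage note are invented for illustration, and each iteration still has to start the command, so it does not help once new processes can't be created at all. It only leaves the last successful snapshot on disk for later reading.

```shell
# snapshot_loop INTERVAL COUNT LOGFILE CMD [ARGS...]   (hypothetical helper)
# Run CMD COUNT times, INTERVAL seconds apart, appending its output to
# LOGFILE under a timestamp line.
snapshot_loop() {
    interval=$1 count=$2 log=$3
    shift 3
    i=0
    while [ "$i" -lt "$count" ]; do
        {
            printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
            "$@"                      # note: this still starts a new process each time
        } >> "$log" 2>&1
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

    For example, `snapshot_loop 60 1440 "$HOME/ps.log" ps -axo pid,ppid,command &` would try to record one snapshot a minute for a day.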

    I recall much older systems noticing a problem and selecting processes to be completely "swapped out"

    I've observed on my desktop that FreeBSD will kill a process if it consumes too much RAM (in situations where Linux wouldn't, although Linux since circa 2.2 will also do this if the entire system is low on RAM, which is better than the Linux 2.0 behavior; but FreeBSD will kill a process for this even when there's unused swap space, if it surpasses some per-process memory usage quota). However, one process using lots of RAM is a very different scenario from many processes being spawned. I don't know what FreeBSD does with that. I could test that here with a forkbomb, I suppose...

    Indeed, I don't have any access to /proc

    That could make it hard to get a good look at the process tree.

    So would a self-built 'top' on an unprivileged FreeBSD account be useful?

    I don't know. It also seems like there _ought_ to be a tool designed to prepare a process ahead of time (preload it into RAM, go ahead and ask the operating system for a process table entry, and so forth) so that it can be launched quickly. That might allow you to set up ps to run and then, when the problem is noticed, trigger it to go ahead. I do not, however, actually know of such a utility.
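    Something in that spirit can be approximated with plain shell, though only loosely. The sketch below assumes a shell that is started ahead of time, blocks on a named pipe, and then uses `exec` to turn itself into the real command when triggered. `wait_then_exec` and the FIFO path are hypothetical names. Because `exec` replaces the waiting shell rather than forking, no new process table entry is needed at trigger time, but the command's binary still has to be loaded, and something still has to write to the pipe.

```shell
# wait_then_exec FIFO CMD [ARGS...]   (hypothetical helper)
# Block until a line is written to FIFO, then replace this shell with CMD.
wait_then_exec() {
    fifo=$1
    shift
    # 'read' is a shell builtin, so waiting here does not start a process.
    read -r _trigger < "$fifo"
    # 'exec' reuses the current process instead of forking a new one.
    exec "$@"
}
```

    Set-up might look like `mkfifo /tmp/trigger; ( wait_then_exec /tmp/trigger ps -axo pid,ppid,command > ps.out ) &`, with `echo go > /tmp/trigger` as the trigger. Whether this actually survives the kind of memory exhaustion described here is untested.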

    I feel your pain. Having to work around the lack of root access to accomplish things that would be much easier _with_ root access is certainly something that can be annoying. (I can also understand why the hosting company doesn't want to hand out root access, of course, but that doesn't make your situation any less frustrating.)


    Sanity? Oh, yeah, I've got all kinds of sanity. In fact, I've developed whole new kinds of sanity. Why, I've got so much sanity it's driving me crazy.