Disk usage by customer

by rdfield (Priest)
on Oct 09, 2002 at 09:59 UTC

in reply to Disk usage by customer

Maybe I'm off track here but what's wrong with good ole du -sk * from the top level directory?


Re: Disk usage by customer
by Preceptor (Deacon) on Oct 09, 2002 at 10:10 UTC
    Well, for starters, du -sk suffers from some of the same problems: large numbers of NFS getattr calls (I'm not entirely sure it does exactly the same thing, but it has to count file sizes somehow).
    The second is that we were looking for some flexibility in the system. The 'top level' is owned by the customer, but the lower levels are owned by the individual departments, and we need to aggregate the du results accordingly. Of course, du -sk /fs001/* followed by 'for i in `ls /fs001`; do du -sk /fs001/$i/*; done' would give roughly the same results (until we reached the point of 'polluted' directories), but then you are running du _twice_. We are talking about approximately 2.5 TB of data, so I'd rather avoid doing it that way.
      Run du -kx once, parse the results, then postprocess. Your usage is the usage of your directory minus the usage of all the subdirectories that are owned by someone else, which is just simple arithmetic.

      Plus, du should already handle tricky issues, like the space a directory takes up to hold the names of the files in it.

      Beyond that, you need to get a certain amount of information over your network. See how du performs. If you can, run du from a machine right next to your NAS to keep as much traffic as possible off the network. Perhaps you can even run it on the NAS box itself?
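
The "usage minus the subdirectories owned by someone else" arithmetic suggested above could be sketched roughly as follows. This is only an illustration in Python, not anything from the thread: it assumes the output of a single du -kx run has been captured as text, and that a mapping from directory to owner is available from somewhere else (du does not report ownership). The names parse_du, nearest_root, usage_by_owner, SAMPLE_DU and OWNERS, along with the paths and sizes, are all hypothetical.

```python
import posixpath
from collections import defaultdict


def parse_du(text):
    """Map each path in `du -k` style output to its cumulative size in KB."""
    totals = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # du prints "<size><whitespace><path>"; the path may contain spaces.
        size, path = line.split(None, 1)
        totals[path.rstrip("/") or "/"] = int(size)
    return totals


def nearest_root(path, roots):
    """Return the closest strict ancestor of `path` found in `roots`, or None."""
    parent = posixpath.dirname(path)
    while parent != path:
        if parent in roots:
            return parent
        path, parent = parent, posixpath.dirname(parent)
    return None


def usage_by_owner(du_text, owners):
    """Charge each owned directory to its owner, minus nested owned directories.

    `owners` maps a directory to the (hypothetical) owner of everything below
    it, up to the next directory that appears in `owners` itself.
    """
    totals = parse_du(du_text)
    usage = defaultdict(int)
    for root, owner in owners.items():
        usage[owner] += totals[root]
        parent = nearest_root(root, owners)
        if parent is not None:
            # The enclosing owner's cumulative total includes this subtree.
            usage[owners[parent]] -= totals[root]
    return dict(usage)


# Made-up sample data: one customer directory with two department subtrees.
SAMPLE_DU = (
    "200\t/fs001/acme/eng/build\n"
    "500\t/fs001/acme/eng\n"
    "300\t/fs001/acme/sales\n"
    "1000\t/fs001/acme\n"
)
OWNERS = {
    "/fs001/acme": "acme",
    "/fs001/acme/eng": "eng",
    "/fs001/acme/sales": "sales",
}
```

Calling usage_by_owner(SAMPLE_DU, OWNERS) on the made-up data charges each department for its own subtree and the customer only for what is left over. Because du already reports cumulative totals per directory, one pass over the filesystem is enough and the rest is dictionary arithmetic.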

