|P is for Practical|
SANScape - A Storage Consumption Monitor and Capacity Planning Toolby bpoag (Monk)
|on Feb 06, 2014 at 19:28 UTC||Need Help??|
A couple years ago where I work, we purchased a rather large amount (at least by 2011 standards) of additional storage to cope with the growing needs of the business; this purchase included the mandate that we track space consumption on an ongoing basis, so that when the next round of purchasing happens, we have the raw numbers to back up any subsequent purchase requests with.
We developed an in-house tool called Napalm (NetApp Provision And Allocation Log Monitor) for this, and set it to work gathering configuration and usage metrics on our NetApps; what was made, what was destroyed, what took up what, everything, even down to the level of individual spindles. It's done a good job of continually documenting everything that's happened, the results of which now reside in a rather large MySQL database for us to draw upon.
2 years later, it's time to look at our growth statistics in preparation for the next round of purchasing. This presented a bit of a challenge, because our storage consumption is given by the NetApp(s) in terms of volumes, not applications. Management has requested a per-application breakdown of the data, to determine where the bulk of our storage is being used up. This required building a quick script in Perl that queried Napalm's database, collected a list of every single volume that has ever existed, and then attempted to determine what application that volume belonged to via simple regex pattern matching. The results of this are then stored back in the database as well, for future reference.
SANScape exists as a web-based app. It presents the user with a simple interface at the top that allows them to specify a date, and a series of checkboxes for different reports that can be run against the data in realtime. From this data, we can construct a picture of what our per-application storage consumption numbers were, on any given date over the past 2 years.
Now, here's the fun part. About 6 months ago, Napalm was moved to a newer monitoring server.. And a bug emerged. The code we'd written to gather up raw statistics off the NetApps stopped returning any results, but the script itself still appeared to be working. As a result, we have a pretty sizable chunk of the "recording" that is completely absent. We have the before picture, and the after picture, but nothing but silence over a period spanning several months until we caught the error and fixed it.
What began as a Perl script to simply fetch numbers and draw a resulting set of human-readable bar charts then became something a bit more intelligent. We modded the code such that for any date the user supplied, the resulting numbers will be interpolated against the last known "good" values before and after the date specified. This effectively filled in the giant hole in the data, allowing for reasonably accurate numbers to be presented regarding overall growth trends.
The interpolated data being generated during these queries is actually useful from a graphing perspective as well. Obviously, lacking real data points, interpolated data is as close as we can get to what actually occurred growth-wise during the blackout; Thankfully, the fact we're missing this data and are relying on interpolated data doesn't appear to change the overall picture that much.
The net result is a 100% Perl-based single pane of glass for doing realtime historical and statistical analysis on 2 years worth of SAN growth data. The main script underpinning it all, everything from the SQL calls to perform the data collection, parsing, and resulting HTML to render a respectable interface weighs in at less than 17KB; To do the equivalent in something like Java or some other monstrosity not only would have taken ages to develop and deploy, but it would have been slower, and the code base practically obese by comparison.
Here's a link to how SANScape looks in the wild (application names blurred out..it's my employer's info, after all.)
Bowie J. Poag