I've been hesitant to respond because I know you've put a lot of work and thought into this essay, but there is something here that just isn't ringing true for me. I'm struggling to put it into words, but I think it might be this.
The entire essay relies very heavily on three assumptions:
The problem with such undefined and assumed distinctions is that they are hard to apply to any sort of real world situation or even to discuss and debate.
Rewriting vs. Refactoring
I think once you start trying to define what one really means by "rewriting" vs. "refactoring", you quickly realize that neither "rewriting" nor "refactoring" is really the issue here. Suppose I start with a clean empty file - am I rewriting or refactoring? Now suppose I fill that file and two others carefully, one subroutine at a time, as I identify function points and data in the original code and apportion them between the three class files. Now, am I refactoring or rewriting? Now suppose the decision to split the former single file into three is based on a sub-system model developed from three months of research into alternative models for that subsystem, the history of failed and successful standards for that problem domain, plus some original research work. Now am I merely refactoring or rewriting?
I would hazard a guess that the well-researched, well-investigated and well-thought-out rewritten module is likely to last longer and require less maintenance than the code that was merely refactored to meet the latest client need using no more input than old code and current needs. If that is the case, then the determining factor is not rewriting vs. refactoring, but rather "well-thought-out and researched" vs. "consider only what is before my eyes".
What is over-design?
Is over-design anything beyond extracting and studying customer stories? Is over-design researching standards? Is over-design understanding the business theory and context of a customer problem? Is over-design exploring alternate architectures and assessing them in terms of growth potential? When is research too much and when is it just enough?
The classic answer is that "over-design" means going beyond the customer's needs or trying to nail down details in your head that are best nailed down by doing. But this only brings us to the next question.
What exactly do we mean by "customer needs"?
Can customer needs really be assessed without research and knowledge apart from the customer? How is acquiring that knowledge different from over-design? In my experience those needs can look very very different depending on the questions you ask and the background knowledge you bring to those questions.
Suppose I go to a customer who has only used cash-basis accounting and say "what do you need?" Since they've only ever used cash-basis accounting, they describe to me a cash system.
Now if I know nothing about accounting, I'm liable to take that at face value and build/buy them a cash accounting system. However, if I know something about accounting theory and typical growth patterns of small businesses, I'm likely to probe further. Most expanding businesses (not all) outgrow cash-basis accounting rather quickly and I'd rather get a read on their near-term needs. In this case, after a brief conversation it becomes clear that the potential customer should have been doing accrual accounting long ago. They just had a conversation about this with their accountant and they are simply afraid to make the switch. If I give them a cash system, it needs to have a migration path to accrual basis. If I give them an accrual system, it needs to have a strong training program. This is a very different set of requirements from the one my first "tell me what you want and I'll do just that" set of questions would have produced.
I've seen this "know the right questions" story many, many times. Until you ask, people aren't always aware of cross-department information flows that the new system needs to satisfy. Until you ask, people often forget that some larger customers have multiple accounts from a sales point of view but a single account from a legal/billing point of view. Until you ask, people aren't always aware that with a small tweak in their way of viewing information, a whole new product or service line could open up. Though this isn't something they want to do now, if there is a way to build the system just as quickly from the start and give them the ability to plug in new modules to support these services, they would very much like that. Further probing shows that product lines come and go very quickly, so that if not these products, then some other batch of product opportunities is likely to develop and quickly make a fixed system obsolete. (Gasp - they have a business need for a framework architecture?)
What really motivates people?
Are all people driven by this velocity you talk about? By the opportunity to have a BMW in the right spot in the parking lot? Some are, but others aren't. If not, how does that affect the way you manage a project? Will focusing so much on velocity promote incentive or unintentionally kill it? I think it depends very much on the team. There is no formula and no way around tuning management practices to the individuals involved in the work. That is what makes good management hard work.
When I was doing my MSc in management, one of my professors had us take a test on our own work motivations. Then he presented studies carried out over a much wider range of the work force. The point he drove home was that the people in our classroom (all upper-mid and senior managers) had very different motivations from those of the larger workforce.
During the height of the dot-com boom, a friend of mine did a compensation study for a lagging start-up. The founder wanted to know what he could do to get better performance from his team. He'd tried options. He'd tried raises. It didn't work. What did they want? Toys. Cool graphics equipment that was a pleasure to use and write code on. They spent so many hours at work that what they wanted most was for the work environment to be a fun place to be.
What is success?
If assessing customer needs requires at least some background research, then perhaps we need a deeper rule for sorting out what counts as good research/design vs. over-design. Perhaps success should be the measure? If so, what measure of success do we use?
Let's take an extreme but concrete example. Is *nix a success story or a failure? By one measure it is a total failure. Most of the companies involved in supporting it have either gone out of business or been absorbed by larger entities (most recently Sun eaten by Oracle). By another standard it is a crazy success story.
By common definitions of design, Unix and its *nix cousins could possibly be one of the most overdesigned systems in computing history. It had a gigabyte of addressable memory when 1 MB of RAM was considered huge and the largest chips came nowhere close to a gigabyte. It identified every system resource with a path even when they looked nothing alike to the end user. There was no market research, no storyboards, no end-user discussions. Its design was driven by deep, even academic, reflection and questioning on the nature of operating systems and how they could and should work.
Microsoft DOS, on the other hand, is a good example of "design what you need". According to a book written in the 1990s on the history of the operating system (the title is buried in storage in the US, so I can't cite it), the original DOS operating system was stuck with 16-bit pointers because it was designed for a chip that had only that much addressable space. Why waste computing power on a pointer that was longer than the addressable space of any known memory chip at the time?
Because of its design-to-need decisions Microsoft has spent literally billions of dollars first patching and then discarding operating systems. It has spent billions more on marketing explaining away product delays, bloated resource requirements, breakage of software that depended on older operating system features and assumptions, security breaches, and so on. Thankfully, for Microsoft, it has the money for all that spin, but is this success?
Obviously, the Microsoft family of operating systems has been a commercial success. However, there is more than one kind of success in the world. Remember that poem about Ozymandias:
And on the pedestal these words appear:
"My name is Ozymandias, king of kings:
Look on my works, ye Mighty, and despair!"
Nothing beside remains. Round the decay
Of that colossal wreck, boundless and bare
The lone and level sands stretch far away.
Survivorship is another form of success. I would wager, given the huge and increasing resource requirements of each successive generation of Microsoft OS (we're now up to a recommended minimum of 2 GB for Win7), that Microsoft's OS will eventually end up in the history books. If the company survives, it will be by virtue of re-invention, along the lines of IBM, which saved itself by leveraging strong corporate relationships and sacrificing an 80+ year history of selling business hardware to turn itself into a consulting services company at the end of the 20th century.
The *nixes, on the other hand, despite their lackluster commercial success, seem to have staying power. I would not be at all surprised if, 100 years from now, the next generation of operating systems traces its ancestry to some form of *nix. They are often the system of choice for the increasing array of smart personal accessories and home equipment.
This is partly due to it being open source and partly due to its small footprint and stability. Code bases come and go, but the core architecture hasn't changed in literally decades.
In my own work, I've always been aggressively focused on customer needs, but I've always also coupled it with intense research. I don't think one can exist without the other. The research helps me (and my team) put the needs in context and avoid short-sighted decisions. The customer needs keep the research focused on a particular problem rather than interesting tangents. I'm uncomfortable with the sense of either/or that seems to pervade many discussions of scrum/agile. The technique seems to be becoming more important than the cooperative judgment calls it was meant to facilitate.
Gradual refactoring is a technique I often use (refactor a little, test a little), but it only works when (a) you have a clear vision of the end goal and (b) that end goal is a better architecture than the one you started with, rather than one with just different mistakes.
I think 'scrum' has its place, but not as a model for all project management everywhere. It works very well when either (a) some critical mass of people involved (either the users or the team) has broad familiarity with the problem domain and can integrate specific needs and background knowledge, or (b) no one really knows the domain, but the team members are determined, creative learners who, without even thinking much about it, sweep wide for knowledge and reflect deeply on the problem at hand.
When the above criteria are met, scrum enables such workers to thrive. When they are not met, scrum/agile design can quickly turn into a case of the blind leading the blind. You'll still end up with a Jenga tower even if you refactor a little bit every day.
Update: struck out disputed line for which I can't find supporting citations.
Update: reworded the description of IBM's reinvention. After reflection the wording was confusing.