Thank you for a well-thought-out response.
While the newer word "refactoring" seems to be pretty well-defined,
I feel that the older word "rewriting" is not.
From Martin Fowler's original Refactoring book:
Refactoring is the process of changing a software system in such a way
that it does not alter the external behavior of the code yet improves
its internal structure. ...
In essence when you refactor you are improving the design
of the code after it has been written.
Refactoring is a disciplined technique for restructuring an existing body of code,
altering its internal structure without changing its external behavior.
Its heart is a series of small behavior preserving transformations.
Each transformation (called a 'refactoring') does little, but a sequence of
transformations can produce a significant restructuring.
Since each refactoring is small, it's less likely to go wrong.
The system is also kept fully working after each small refactoring,
reducing the chances that a system can get seriously broken during the restructuring.
Hopefully, most folks will agree with those definitions.
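To make "small behavior preserving transformations" concrete, here is a minimal sketch of one such step (an "extract function" refactoring). The function names, the pricing rule and the tax rate are all invented for illustration; they do not come from Fowler's book.

```python
# Hypothetical example: names, discount rule and tax rate are assumptions
# made up for illustration. Amounts are integer cents to keep the
# arithmetic exact.

def invoice_total_before(items):
    """Original version: line pricing, discount and tax tangled in one loop."""
    total = 0
    for price_cents, qty in items:
        line = price_cents * qty
        if qty >= 10:               # bulk discount of 10%
            line = line * 9 // 10
        total += line
    return total * 120 // 100       # add 20% tax


def line_total(price_cents, qty):
    """Extracted helper: cost of one invoice line, including bulk discount."""
    line = price_cents * qty
    if qty >= 10:
        line = line * 9 // 10
    return line


def invoice_total_after(items):
    """Refactored version: same external behavior, clearer internal structure."""
    subtotal = sum(line_total(price_cents, qty) for price_cents, qty in items)
    return subtotal * 120 // 100


# The check that makes this a refactoring rather than a rewrite:
# old and new versions must agree on the same inputs.
items = [(250, 4), (100, 10), (1999, 1)]
assert invoice_total_before(items) == invoice_total_after(items)
```

The final assertion is the important part: the system is kept working after the step, and the old and new code can be compared directly. A from-scratch rewrite usually has no such cheap, step-by-step check.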
Now it gets much harder.
For example, your opinion that
"Subversion, git, and Mercurial are not rewrites of CVS"
does not agree with mine.
My personal view is that Subversion was
a "rewrite" of CVS, while the other two were not.
I don't feel strongly though.
I may well be "unorthodox", as you claim, yet I was pleasantly surprised
to discover that many others, including Joel Spolsky, share my opinion.
From Joel Spolsky:
You may also want to look into Subversion, a ground-up rewrite
of CVS with many advantages.
From Open Source Software Development (wikipedia):
A good example of a complete rewrite was the Subversion version control system,
whose developers started from scratch: they believed the codebase of CVS
(an older attempt at creating a version control system), was useless and
needed to be completely scrapped.
From Concurrent Versions System (c2.com):
SubVersion is a project to rewrite CVS from scratch,
in a more flexible and extendible way - and then to extend it.
Finally, a probing question (and one relevant to this thread) from
Shlomi Fish interviews Ben Collins-Sussman:
Subversion was a re-write from the grounds up done by many of the original CVS workers.
Do you think it could have been faster to replace CVS (or CVSNT) component by component,
thus yielding Subversion?
To take another example, while I view Perl 6 as a "rewrite" of Perl 5,
I suspect many monks would disagree with that view;
a couple of them have already made that plain in this thread.
Note, however, that Larry Wall at least seems to view
Perl 6 as a "rewrite" of Perl:
Perl 5 was my rewrite of Perl.
I want Perl 6 to be the community's rewrite of Perl and of the community.
Admittedly, that quote was taken from State of the Onion, TPC4,
and the direction of Perl 6 has changed a bit since then.
I'd be interested to know if Larry still views Perl 6 as a "rewrite" of Perl 5.
Open Source Software Development (wikipedia) neatly summarizes the available rewrite/refactor options:
Often open source developers feel that their code requires a revamp.
This can be either because the code was written or maintained without
proper refactoring (as is often the case if the code was inherited
from a previous developer), or because a proposed enhancement or
extension of it cannot be cleanly implemented with the existing codebase.
A final reason for wishing to revamp the code is that the code "smells bad"
(to quote Martin Fowler's Refactoring book) and does not meet the
developer's standards. There are several kinds of revamps:
- Refactoring implies that the code is moved from one place to another, methods, functions or classes are extracted, duplicate code is eliminated and so forth - all while maintaining an integrity of the code. Such refactoring can be done in small amounts (so-called "continuous refactoring") to justify a certain change, or one can decide on large amounts of refactoring to an existing code that last for several days or weeks.
- "Partial rewrites" involve rewriting a certain part of the code from scratch, while keeping the rest of the code. Such partial rewrites have been common in the Linux kernel development, where several subsystems were rewritten or re-implemented from scratch, while keeping the rest of the code intact.
- Complete rewrites involve starting the project from scratch, while possibly still making use of some old code. A good example of a complete rewrite was the Subversion version control system, whose developers started from scratch: they believed the codebase of CVS (an older attempt at creating a version control system), was useless and needed to be completely scrapped. Another good example of such a rewrite was the Apache web server, which was almost completely re-written between version 1.3.x and version 2.0.x.
Apart from arguing over semantics, the interesting strategic decision
we face is whether to extend an existing legacy code base
or throw it away and start from scratch.
There is no one "right" answer to that question: it depends on the
project, the team, the quality of the existing code base, and many other factors.
Perhaps the most important thing is to prevent legacy code
from degenerating into a tangled mess in the first place.