The approach that I have seen with this much code is to start with statistical tools that estimate the number of defects in various subsystems. Test and fix some sections of the code to see if the estimated defects are really there. Refine your estimates by using the results of the fixes.
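As a rough illustration of the estimate-then-refine loop, here is a minimal sketch that extrapolates the defect density found in inspected code to a whole subsystem. Pooled defect density is my assumption of one simple estimator, not the only statistical tool for this, and the names (`Sample`, `estimate_defects`) and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    loc_inspected: int   # lines of code tested or reviewed in this round
    defects_found: int   # defects confirmed (and fixed) in those lines


def estimate_defects(total_loc: int, samples: list[Sample]) -> float:
    """Extrapolate the pooled defect density of all samples to the subsystem."""
    inspected = sum(s.loc_inspected for s in samples)
    found = sum(s.defects_found for s in samples)
    if inspected == 0:
        raise ValueError("need at least one inspected line to estimate")
    return total_loc * found / inspected


# Hypothetical subsystem of 100,000 lines.
samples = [Sample(loc_inspected=2_000, defects_found=6)]
first_estimate = estimate_defects(100_000, samples)    # 300.0

# A second round of testing and fixing refines the estimate.
samples.append(Sample(loc_inspected=3_000, defects_found=4))
refined_estimate = estimate_defects(100_000, samples)  # 200.0
```

Each round of fixes adds a sample, so the estimate leans more on what was actually found and less on the first guess.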
Figure out some sort of reasonable quality goal that uses the same measures as your estimates. Then develop an estimate of the effort needed to reach the goal. Share your estimates with others to see if you are on the right track. Agree on a plan to improve the quality, and execute it.
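To make the goal and effort arithmetic concrete, here is a small sketch. It assumes the goal is stated as defects per thousand lines (KLOC) and that an average fix takes a fixed number of hours; both are simplifications of mine, and every number is made up.

```python
import math


def defects_to_fix(estimated_defects: float, kloc: float,
                   goal_per_kloc: float) -> int:
    """How many defects must go before the subsystem meets the goal density."""
    allowed = goal_per_kloc * kloc
    return max(0, math.ceil(estimated_defects - allowed))


def effort_hours(defect_count: int, hours_per_fix: float) -> float:
    """Effort under the assumption that every fix costs about the same."""
    return defect_count * hours_per_fix


# Hypothetical: 200 estimated defects in 100 KLOC, goal of 0.5 defects/KLOC.
to_fix = defects_to_fix(estimated_defects=200, kloc=100, goal_per_kloc=0.5)
hours = effort_hours(to_fix, hours_per_fix=4.0)
```

With these numbers the goal allows 50 defects, so 150 need fixing, or about 600 hours. That is the kind of figure to share with others before agreeing on a plan.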
Testing this much code requires a good build system to automate compilation and the running of the tests. Without this infrastructure, the other test tactics that you have listed will be difficult to implement in a reasonable way. A daily build and smoke test is a great way to evaluate the system quality.
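A daily build and smoke test can start as a very small script run from a scheduler. The sketch below assumes the build and the smoke tests are each a single command that exits non-zero on failure. The `make` targets in `DAILY_STEPS` are placeholders of mine, not something from your project.

```python
import subprocess


def run_steps(steps):
    """Run each (name, argv) step in order and stop at the first failure.

    Returns a list of (name, passed) for the steps that were attempted.
    """
    results = []
    for name, argv in steps:
        try:
            passed = subprocess.run(argv).returncode == 0
        except OSError:
            # The command could not be started at all (e.g. not installed).
            passed = False
        results.append((name, passed))
        if not passed:
            break  # no point smoke-testing a broken build
    return results


# Hypothetical commands; substitute whatever builds and smoke-tests the system.
DAILY_STEPS = [
    ("build", ["make", "all"]),
    ("smoke", ["make", "smoke-test"]),
]
```

Calling `run_steps(DAILY_STEPS)` once a day and recording the result gives a simple pass/fail history of system quality.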
Version control needs to be working, too. If code changes are not measured, it will be difficult to estimate the code quality and to focus the effort.
In a large system like this, once the build is automated, the versions are controlled, and the quality estimates are done, there are usually one or two subsystems that obviously have the most problems. These are your hot spots. Much of the time the hot spots are already known, but sometimes there are surprises, such as when a problematic low-level data store causes what appear to be UI problems.
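One cheap way to look for hot spots is to count how often each subsystem shows up in the change history. This sketch assumes that a subsystem corresponds to a top-level directory and that you already have a list of changed file paths. With Git, for example, `git log --name-only --pretty=format:` prints such a list. The paths below are invented.

```python
from collections import Counter


def rank_hot_spots(changed_paths, top=2):
    """Count changes per top-level directory and return the busiest ones."""
    counts = Counter()
    for path in changed_paths:
        path = path.strip()
        if not path:
            continue  # skip blank separator lines in the log output
        subsystem = path.split("/", 1)[0]
        counts[subsystem] += 1
    return counts.most_common(top)


# Hypothetical list of files touched by recent defect fixes.
paths = [
    "storage/db.c", "ui/form.c", "storage/cache.c",
    "storage/db.c", "net/sock.c", "ui/list.c",
]
hot = rank_hot_spots(paths)
```

Here `hot` ranks `storage` ahead of `ui`, which matches the kind of surprise described above: symptoms that show up in the UI can still trace back to a low-level data store.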
I like the various books by Steve McConnell, which touch on this topic. There are other worthwhile books that I'm sure others will recommend.
It should work perfectly the first time! - toma