Can Microsoft build better code?

What steps might Microsoft take to improve the reliability of its products?

As we reported last week, Microsoft has hatched a plan to patch up its patching procedures. It has also delayed the next release of SQL Server, apparently in an effort to improve its reliability. And these moves follow the firm's high-profile Trustworthy Computing campaign, which began with a scrub of the Windows source code.

Clearly Microsoft is working hard to shed its reputation for building shoddy software, a reputation based on its undeniable record of actually shipping shoddy software.

But what might Microsoft realistically do to improve matters?

An obvious source of "inspiration" is the open-source model that has Microsoft worrying about quality in the first place. The firm can't adopt open-source practices wholesale, but it can learn lessons.

The most obvious difference is that open-source code can be inspected by anyone. It's possible - though not easy - to spot errors by examining code without ever running it, a technique called static testing. Linux tends to rely on expert eyeballs, but automated tools can also find such errors. Microsoft, by contrast, prefers dynamic testing: running software to see what breaks.
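
To see the difference in practice, here is a sketch in C - the routine and its names are invented for illustration, not drawn from any real product - containing the kind of flaw static testing catches at a glance:

    #include <string.h>

    /* Hypothetical routine, invented for illustration: copies a
       caller-supplied name into a fixed-size buffer. */
    void copy_name(const char *input)
    {
        char buf[16];

        /* Unbounded copy: overflows buf whenever input holds more
           than 15 characters. A reviewer, or a static analysis
           tool, can flag this line without running the program. */
        strcpy(buf, input);
    }

Dynamic testing would find the same bug only if a test case happened to supply a long enough name; that, in a nutshell, is the argument for doing both.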

Both dynamic and static testing have their merits, and ideally vendors should do plenty of both. So I suspect the Windows source-code scrub means Microsoft has already elevated the role of static testing.

A further step is to avoid or at least detect programming structures that are known to foster bugs. The automotive industry, for example, has created a set of guidelines called Misra C to identify "those aspects of the C language which should be avoided in safety-related systems".
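
For a flavour of what such guidelines forbid, consider this fragment (the helper functions are hypothetical). The first form is perfectly legal C, but Misra-style rules reject it, because an assignment inside a condition is indistinguishable at a glance from a mistyped comparison:

    extern int read_sensor(void);        /* hypothetical helper */
    extern void handle_reading(int v);   /* hypothetical helper */

    void poll(void)
    {
        int status;

        /* Legal C, but flagged by Misra-style rules: is the single
           = a deliberate assignment or a mistyped == ? */
        if (status = read_sensor())
        {
            handle_reading(status);
        }

        /* The compliant form separates the assignment from the
           test, so the intent is unambiguous. */
        status = read_sensor();
        if (status != 0)
        {
            handle_reading(status);
        }
    }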

This kind of change can make life harder for programmers, but a lot better for users.

Other possible changes may be counter-intuitive. In 2000, Les Hatton, professor of software reliability at Kent University, published a comparison of open-source methods with the formal methods often used in corporate development. He notes that despite a "chaotic" rating under the Carnegie Mellon Capability Maturity Model, open source succeeds in producing high-quality software; formal processes, by contrast, show little correlation with lower defect rates.

In other papers Hatton has shown that the development language makes little difference to final quality. Linux bears this out: although mostly written in C, it includes parts built in a host of other languages.

Hatton's data also suggests an interesting theory: software flaws are fundamentally human errors, and may be governed by the limits of our short-term memory.

In any large program, components of between 150 and 250 lines of code are by far the most reliable, irrespective of the language used. Smaller and larger components contain disproportionately more errors. According to Hatton, this distribution has been verified for Ada, C, C++, Fortran, Pascal and various assembly programs in a variety of industries.

I haven't found data about component size in Linux, but it is built by people who take on work as they see fit. It would be interesting to see whether its developers have unconsciously gravitated towards the optimum component size.

Overall, it is clear there are many practices Microsoft might use to improve its software quality. But Hatton's data does show one other significant trend: complex software that starts out unreliable tends to stay that way, no matter how hard authors try to patch things up.

Have your say: reply to IT Week