Building software for a global mass market

In the comments on my intellisense blog entry, people asked a reasonable question: why not just ship some of the imperfect T-SQL intellisense that Microsoft developers have built internally, and thus help our customers be more productive? I thought an explanation might give you some interesting insights into how Microsoft works and what it takes to build software for a global mass market.

It is easy to forget that Microsoft is composed of ordinary, fallible people, so no matter how sound a business decision is, pride of work and other human foibles will likely get in the way. As well, shipping global mass-market software is time-consuming and expensive, which discourages shipping throwaway code. Below are some of the details of what shipping global mass-market software at Microsoft entails.

Because we ship to and localize for numerous locales (i.e. cultures and regions) and make our products highly accessible (e.g. supporting screen readers and high contrast mode for the vision impaired), the development and testing requirements of our products are greater than you may first expect. As well, all our products must meet our SDL (Security Development Lifecycle) criteria before release. All this requires a huge investment of effort in creating even seemingly simple applications. The benefit for Microsoft is that we can sell our software to a greater proportion of the world population and so can amortize the cost over many more users.

Because we ship to numerous cultures, we need to test to make sure the product is appropriate for those cultures. When we get this wrong it is both embarrassing and costly, and in some countries Microsoft employees can even go to jail for the mistakes of their colleagues in other countries. As well as having comprehensive guidelines to ensure products are not offensive or culturally insensitive, we have a tool called policheck that automatically scans code for known issues. Often we build in support for special features for some locales (e.g. the really cool office assistant characters for some East Asian cultures), which makes the product more locale-appropriate but adds to the cost of building it.

To support localizing products for various locales, strings that are locale specific need to be loaded through resources (and only those strings, to avoid bugs caused by localizing non-locale-specific strings). Dialogs must be built to handle labels etc. changing size, and perhaps right-to-left text and even bidirectional text. Our software must internally use Unicode strings, take care to process them in locale-neutral ways, and perhaps convert them to and from ANSI strings for some database servers. This requires careful engineering and testing. We use pseudo-localization tools to automatically localize products in a way that mimics various locales and tests for common bugs. This exposes numerous bugs that must be fixed. Trial localization into a couple of languages (usually an East Asian and a European language) typically exposes some more bugs, which must also be fixed. Then there is the cost of localization itself: glossaries, automatic tools and careful development help reduce it, but each locale still adds considerable cost.
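To make the pseudo-localization idea concrete, here is a minimal sketch in Python. It is not one of the internal tools mentioned above; the function name, the accent mapping, the 30% expansion factor and the bracket markers are all assumptions chosen for illustration. It replaces some letters with accented look-alikes, pads the string to mimic longer translations, and leaves format placeholders such as {0} untouched.

```python
import re

# Hypothetical mapping from plain letters to accented look-alikes, so a
# tester can still read the string but can tell it went through resources.
ACCENTED = str.maketrans(
    "aceinouyACEINOUY",
    "àçéîñöûýÀÇÉÎÑÖÛÝ",
)

# Format placeholders such as {0} or {name} must survive unchanged.
PLACEHOLDER = re.compile(r"(\{[^{}]*\})")


def pseudo_localize(text: str, expansion: float = 0.3) -> str:
    """Return a pseudo-localized copy of a resource string (illustrative only)."""
    parts = PLACEHOLDER.split(text)
    # With one capturing group, odd indices hold the placeholders.
    body = "".join(
        part if index % 2 else part.translate(ACCENTED)
        for index, part in enumerate(parts)
    )
    # Pad to mimic translations that are longer than the English original.
    padding = "~" * max(1, round(len(text) * expansion))
    # Brackets make truncated labels easy to spot in a dialog.
    return f"[{body}{padding}]"


print(pseudo_localize("Save {0} files"))
```

A label that shows up on screen without brackets was probably hard-coded rather than loaded through resources, and a label missing its closing bracket has been truncated by a dialog that does not resize.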

For each locale a product ships in, it needs to be retested on all supported operating system versions. This makes it more economical for us to create automated tests that can be rerun on the localized versions. Since the user interface can change by locale, these tests need to work against APIs like the IAccessible API and not mouse positions. All this requires about one tester per developer to write the entire set of customer-facing automated tests (developers write automated unit and integration tests). These testers must themselves be good developers and active participants in the design process.
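The reason for avoiding mouse positions can be shown with a toy model. The sketch below does not call the real IAccessible API; the Control class, the automation_id field and the two sample layouts are assumptions made up for illustration. The point is only that a test which looks a control up by a stable identifier keeps working when localization moves that control.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Control:
    """Toy stand-in for a UI element exposed through an accessibility tree."""
    automation_id: str  # hypothetical stable identifier, the same in every locale
    role: str
    x: int = 0
    y: int = 0
    children: List["Control"] = field(default_factory=list)

    def walk(self) -> Iterator["Control"]:
        # Depth-first traversal of this control and all of its descendants.
        yield self
        for child in self.children:
            yield from child.walk()


def find_control(root: Control, automation_id: str) -> Optional[Control]:
    """Locate a control by identifier instead of by screen coordinates."""
    for control in root.walk():
        if control.automation_id == automation_id:
            return control
    return None


# The same dialog in two locales: the longer German labels push the button right.
english = Control("dialog", "dialog", children=[Control("ok-button", "button", 200, 150)])
german = Control("dialog", "dialog", children=[Control("ok-button", "button", 260, 150)])

for layout in (english, german):
    button = find_control(layout, "ok-button")
    print(button.x, button.y)
```

A test that clicked at a fixed position tuned for the English layout would miss the button in the German one, while the lookup by identifier finds it in both.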

Criminals can devote considerable time and resources to finding exploitable security issues in Microsoft’s software. To reduce the number of exploitable security issues in our products, Microsoft has developed the SDL, a comprehensive set of security guidelines, processes and tools. Creating a threat analysis, running automatic analysis tools, doing security code reviews and fixing any issues, even for seemingly innocuous code, is time-consuming. This lengthens the time and raises the cost of building software but is, of course, better than customers having to deal with security issues.

By the way, if you are interested in Microsoft’s recommendations for building secure enterprise applications, have a look at the new Security Guidance site on MSDN. The guidance was put together by the Patterns and Practices team, which also puts out the popular Microsoft Application Blocks. It is easy to follow and is the result of detailed thinking about best practices.

In some circles, Microsoft has a partially deserved reputation for shipping buggy code. I am biased, but I think that, compared to most other software companies, Microsoft generally ships very good code. With better tooling and processes we hope to raise this quality bar further. Automatic crash and hang reporting gives us great data on the real causes of crashes and hangs for Windows and Windows applications. Third-party device drivers, malware, third-party add-ons and hardware problems feature strongly in the data.

In practice, we test and fix code until it reaches a high level of quality. This means that we spend significantly more time driving quality into the product than adding features. No one wants to ship buggy code, but because our customers have high expectations and worldwide customer support is so expensive, this level of quality also makes good business sense: the cost of a high quality bar is amortized over all the happy customers we gain and all the support calls we avoid.

As you can see, Microsoft’s business model makes developing software very expensive. Much of this cost can be amortized over subsequent versions (e.g. threat analysis, automated tests, documentation and globalized dialogs), which is a strong incentive to get it right the first time. Microsoft’s processes require a lot of organization, training, individual excellence and teamwork. To support designers, developers, documentation writers, localizers, marketing, program managers, support staff, testers and usability engineers all working together requires a lot of process documentation and coordination. With this scale of investment, it is understandable why every version of a product needs to be a solid step forward that builds on the previous version.

Another reason we are unlikely to ship imperfect intellisense is the way the company approaches new markets. Many large software companies tend to market an idea and, if the market is enthusiastic, then try to build the technology. Microsoft tends to take the opposite approach: make sure the technology is right, do detailed market and customer analysis, and then publicly market the idea. With community efforts like CTPs we are trying to find a middle ground where we get earlier broad customer feedback but without promising vaporware. However, the philosophy of making sure you are providing significant customer value and are on the right technical path before shipping anything is still very strong in the company. This makes it almost impossible to ship anything that is not technically sound, even where there is clear customer value.

This posting is provided "AS IS" with no warranties, and confers no rights.