Microsoft Windows Sustained Engineering in Spotlight
Update, November 2003


The following is the full text of an article published by Directions on Microsoft, an independent research firm focused exclusively on Microsoft strategy & technology. Each month we make one or more key articles available to non-subscribers.

As security issues and patches garner unwelcome headlines for Microsoft, the immediate task of creating a fix falls increasingly on the shoulders of a little-known group in the Windows division. Called Windows Sustained Engineering (WSE), this team of program managers, software design engineers, and testers works closely with Microsoft’s Product Support Services (PSS), the Security Response Team, and the Windows product groups to research, develop, and test any changes to the shipping products. The ongoing barrage of security vulnerabilities, focus on new releases, and decreasing resources could put this team under increasing pressure.

Windows Sustained Engineering Responsibilities

Security fixes are not WSE’s only concern. In fact, once a version of Windows is released to manufacturing—or declared "golden"—the product team that developed it transfers the source code to the group. WSE then has primary responsibility for any further work over the next seven years (the supported life of the product), including hotfixes, security patches, updates (critical and noncritical), security rollups, feature packs, and service packs. WSE is also central to Microsoft's efforts to improve the patching process itself. (For definitions of these deliverables, see the chart "Sustained Engineering Deliverables".)

Although it is primarily responsible for ongoing modifications, WSE does maintain links with the product design team. All changes to the released code are reviewed by the appropriate developer ("buddy") on the core development team. Ideally, this developer was involved in writing the particular area of code that is the subject of the WSE effort.

Triage, Fix, and Test

Although WSE uses a team structure and process similar to those of the Windows product group, it differs in how it works on bugs and triages new problems. Rather than creating specifications and designs for new features, WSE typically reviews any bug fixes that were postponed during the development process and works with PSS and the Security Response Team to triage and fix new problems.

(For an illustration of the process, see "Sustained Engineering Process".)

Triage includes understanding the problem, vulnerability, or bug; reproducing it; and then supplying a fix. During this triage phase, WSE may create a "private" version of the fix for a customer to ensure that the fix addresses the reported problem.

Once the problem is understood, a public fix will be developed and WSE will begin to build the patch for all the required languages, versions, and installation types (e.g., full install versus upgrade).
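
The size of that build matrix grows multiplicatively with each dimension. The following is a minimal sketch in Python of how such a matrix could be enumerated. The specific languages, versions, and installation types are hypothetical placeholders chosen for illustration; the article does not list the actual values WSE builds for, and this is not a description of Microsoft's tooling.

```python
from itertools import product

# Hypothetical values for illustration only; the article does not list
# the actual languages, versions, or installation types WSE builds for.
LANGUAGES = ["en", "de", "ja"]
VERSIONS = ["Windows 2000", "Windows XP"]
INSTALL_TYPES = ["full install", "upgrade"]


def build_matrix(languages, versions, install_types):
    """Return every (language, version, install type) combination."""
    return list(product(languages, versions, install_types))


matrix = build_matrix(LANGUAGES, VERSIONS, INSTALL_TYPES)
print(len(matrix))  # 3 * 2 * 2 = 12 combinations
```

Even with these small placeholder lists, a single fix yields a dozen packages to build and test, which suggests why adding a language or a supported version raises the cost of every patch.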

Testing, which is performed across all languages and versions simultaneously, parallels the original release testing and falls into three basic categories: depth, integration, and setup testing.

Depth testing. First, fixes are tested to ensure that the reported defect has indeed been fixed. In addition, they are tested for functionality (the rest of the feature still works), security (the fix does not create a new vulnerability), interoperability (a fix to a single feature does not break other features), and code coverage (to ensure that all paths through the source code have been tested and that every instance of the code was in fact fixed).

Integration testing. Second, WSE tests the entire product, including the fix, through application compatibility testing, internal use of the product (self-hosting, or as Microsoft often calls it, "dogfooding"), stress testing (running multiple applications and services simultaneously), long-haul testing (running test suites for long durations without restarting), and performance and scalability testing.

Setup testing. Third, the fixed code is run through the full matrix of installation variations, such as new installations and upgrades over existing versions.

Only service packs and feature packs receive the same full testing cycle that a released product receives. The main difference in testing between a service pack and a hotfix is the number and duration of the tests. Microsoft also runs service packs through a beta and release candidate process similar to the process used for a new version of the product. This provides customer feedback and exposure to hardware and software environments to which WSE might not have access. The full test cycle for a service pack can take from six to nine months.

In contrast, the test cycle for a security patch (in the absence of an exploit) or critical update can take five weeks, and a hotfix less than a week. The amount of testing that a security patch receives can depend on when an exploit for the vulnerability starts to circulate. The more imminent the threat of an exploit is, the shorter the testing cycle is likely to be.

Publishing Fixes

Most fixes are published for all customers via Windows Update, but in some cases a fix may be so specific to a particular customer's circumstances that it will not be generally available until a full set of tests is complete and the fix becomes part of a service pack.

One other mechanism for publishing fixes is worthy of mention: the security rollup CD. Just after announcing its Trustworthy Computing initiative, Microsoft announced that it would release bimonthly security rollups, which would package together fixes for known security vulnerabilities. However, only one such rollup was released. Microsoft says that although some customers were positive about the initial security rollup, others indicated they could cope with the proliferation of fixes by using tools such as the newly introduced Microsoft Software Update Services (SUS) and the improved Microsoft Systems Management Server (SMS). Although these methods work for corporate customers, they do not address the needs of consumers with dial-up access to Windows Update.

Future Directions and Challenges

In addition to delivering fixes, WSE is developing technologies that ease their deployment. Among other initiatives, WSE is overseeing the effort to move Microsoft from eight patching technologies to two (one for OSs and one for applications) and is trying to shrink patches overall to reduce the time it takes to download and install fixes and service packs.

WSE will face two big challenges in the future. First, Microsoft is putting a lot of focus on the next release of Windows, code-named Longhorn. As the Windows development team works toward releasing Longhorn, the WSE team could find it more challenging to defend the current versions against the continuing onslaught of security problems and to handle other demands, such as requests to extend functionality or support new hardware. For example, the buddies who help develop and review changes will be focusing on Longhorn work. The second challenge may come from reduced resources for WSE: Microsoft is starting to put less money aside from the current sales of Windows and other products to cover the costs of future support. This may be a natural consequence of the age of the products and the efficiencies of the sustained engineering process, but it could also mean fewer resources for support in the future.

For more information on how bugs in Windows are automatically collected and reported to Microsoft, see "Windows Error Reporting Tracks Down Bugs" on page 3 of the July 2003 Update.

For more information on bugs in software, see "A Bug's Life" on page 19 of the Aug. 2003 Update.