On the Art of Predicting Compatibility Regression when Assessing Risk


In my last blog post (https://blogs.technet.microsoft.com/gladiatormsft/2018/02/20/determining-an-applications-impact-when-assessing-risk/), I discussed my first recommended vector for assessing an application’s risk: impact to the business or organization. In this post, I will discuss the second recommended vector: probability of regression. Internally at Microsoft, we refer to these as technical blockers. Throughout the first 5 iterations of Windows 10, we’ve been able to get a good grasp on the overall potential for application compatibility regressions in terms of predicting potential blockers. Over the next 2-3 months, you will likely see more specific guidance in those areas coming from me and other experts in the compatibility space. These technical factors should be incorporated into an overall strategy for assessing regression probability that is specifically tailored to your organization.

In general, we’ve learned that Major-Version-to-Major-Version compatibility has drastically improved over the last few releases of Windows. Significant decisions were made to make the Windows operating system more stable and secure, and as a result the Windows XP to Windows 7 upgrade (well, technically Vista too) broke a lot of applications. You will hear ranges from 25% to 50% depending on the application portfolio’s age and the technologies leveraged. For Windows 7 to Windows 8.1, as well as Windows 7 to Windows 10, the percentage was less than 10% even as a worst-case scenario (assuming applications were properly remediated and no technical debt was retained).

For those customers who have been canaries in the coal mine and have traveled with us for these first iterations of Windows 10, the percentage of technical compatibility regressions between feature releases has been extremely low in comparison (around 1-2%). It *IS* important to point out, however, that even if the percentage is low, an application critical to the business that falls into that bucket is still a blocker, which is why we preceded this post with the one on business impact. :)

Factors Determining Regression Probability

Organizations often promote applications with histories of regressions to the top of the list:

Internal Test History Data: Pre-Windows 10, test history does not span more than 3 or 4 major operating system upgrades, although many organizations treated Service Pack upgrades in the same manner. For example, an internal LOB (Line of Business) application developed in 2000 [NOTE: These are not nearly the oldest apps out there] might have been thoroughly regression tested for at least these operating systems, with these example initial testing results:

  • Windows 98 - PASS
  • Windows 2000 Terminal Services SP1 - FAIL [Remediated]
  • Windows 2000 Terminal Services SP2 - PASS
  • Windows 2000 Terminal Services SP3 - PASS
  • Windows 2000 Terminal Services SP4 - PASS
  • Windows XP - PASS
  • Windows XP SP1 - PASS
  • Windows XP SP2 - FAIL [Remediated]
  • Windows XP SP3 - FAIL [Remediated]
  • Windows 7 - FAIL [Remediated]
  • Windows 7 SP1 - PASS
  • Windows 2003 Terminal Services - PASS
  • Windows 2003 Terminal Services SP1 - PASS
  • Windows 2003 Terminal Services SP2 - FAIL [Remediated]
  • Windows 2008 R2 RDS - FAIL [Remediated]
  • Windows 2008 R2 RDS SP1 - PASS

In the above scenario, the application initially failed 6 of its 16 test passes (three out of eight, or 37.5% of the time), which would likely warrant elevating it to a higher risk category.
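The arithmetic behind that percentage can be sketched in a few lines of Python. This is only an illustration: the `history` dictionary transcribes the example results above, while the `failure_rate` helper and the `ELEVATION_THRESHOLD` value are assumptions made for this sketch, not a Microsoft-defined formula or cutoff. Your organization would pick its own threshold.

```python
# Initial test results from the example history above (True = PASS, False = FAIL).
history = {
    "Windows 98": True,
    "Windows 2000 Terminal Services SP1": False,
    "Windows 2000 Terminal Services SP2": True,
    "Windows 2000 Terminal Services SP3": True,
    "Windows 2000 Terminal Services SP4": True,
    "Windows XP": True,
    "Windows XP SP1": True,
    "Windows XP SP2": False,
    "Windows XP SP3": False,
    "Windows 7": False,
    "Windows 7 SP1": True,
    "Windows 2003 Terminal Services": True,
    "Windows 2003 Terminal Services SP1": True,
    "Windows 2003 Terminal Services SP2": False,
    "Windows 2008 R2 RDS": False,
    "Windows 2008 R2 RDS SP1": True,
}

# Hypothetical cutoff for this sketch only; tune it to your own portfolio.
ELEVATION_THRESHOLD = 0.25


def failure_rate(results):
    """Return the fraction of test passes whose initial result was FAIL."""
    if not results:
        raise ValueError("no test history recorded")
    failures = sum(1 for passed in results.values() if not passed)
    return failures / len(results)


rate = failure_rate(history)
elevate = rate >= ELEVATION_THRESHOLD
print(f"{rate:.1%}")  # 37.5%
```

With the example data, 6 failures out of 16 results gives 37.5%, so under the assumed threshold this application would be flagged for a higher risk category.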

Dependent Middleware Regression History: In some cases, groups of applications sharing common middleware might get hit with periodic regressions simply when these middleware or component dependencies are patched. For example, I recently visited a customer who had an Office-integrated LOB application that was broken after each of the following MDAC (legacy ODBC data access) patches:

  • Q823718 Update
  • Q832483
  • KB92779
  • KB954326 Update
  • KB952287
  • MDAC 2.8 SP1 Update

In this customer’s case, as in most, it’s difficult and too much overhead to track percentages so only patches where regressions occurred were noted. However, 6 patch regressions across the lifetime of the application in question (in this example 17 years) was enough to warrant elevation of risk priority when coupled with the impact ranking.

Known Technical Factors

When the Windows operating system evolves, efforts are made on multiple fronts to ensure compatibility. One of the most prominent is providing methods within the operating system to maintain compatibility through technical means that are seamless to the user. In most cases this is automatic (as in recent releases of Windows), although in some cases it requires tweaking and configuration on the application manager’s part. Another front is developer evangelism, where encouraging the use of modern development technologies and platforms gives better odds of ongoing compatibility. This is part of an ongoing pledge to make Windows 10 the most compatible Windows operating system to date. So far, the numbers are proving that out, with greater and greater improvement as subsequent releases of Windows 10 evolve (going into our sixth, 1803, as of this writing).

There will still be exceptions, and they account for a single-digit percentage of overall applications. The factors most often behind that small percentage of issues include:

  • Applications that rely on kernel mode drivers (filters, devices, etc.)
  • Security Software: Often because of drivers
  • 3rd-party firewalls/Win32 VPN Clients: Also often because of drivers
  • Use of Undocumented APIs

These technical risk factors will likely elevate the application to a higher prioritization for testing.
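One way to record these factors during an application inventory is as simple flags per application. The following is a minimal sketch, assuming a hypothetical `AppProfile` record whose field names are invented here to mirror the list above; it is not part of any Microsoft tool.

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    """Hypothetical inventory record; fields mirror the risk factors listed above."""
    name: str
    uses_kernel_mode_driver: bool = False
    is_security_software: bool = False
    is_firewall_or_vpn_client: bool = False
    uses_undocumented_apis: bool = False


# Maps each flag to the label used in the list above.
RISK_FACTOR_LABELS = {
    "uses_kernel_mode_driver": "Relies on kernel mode drivers",
    "is_security_software": "Security software",
    "is_firewall_or_vpn_client": "3rd-party firewall/Win32 VPN client",
    "uses_undocumented_apis": "Uses undocumented APIs",
}


def technical_risk_factors(app):
    """Return the labels of every known technical risk factor set on the app."""
    return [label for attr, label in RISK_FACTOR_LABELS.items() if getattr(app, attr)]


def needs_priority_testing(app):
    """An app with at least one known risk factor is elevated for testing."""
    return bool(technical_risk_factors(app))


vpn_client = AppProfile("Example VPN client", uses_kernel_mode_driver=True,
                        is_firewall_or_vpn_client=True)
for factor in technical_risk_factors(vpn_client):
    print(factor)
```

Treating any single factor as enough to elevate the application is a design choice made for this sketch; a weighted score would work just as well if your organization prefers finer-grained prioritization.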

Retention of Technical Debt

One of the most important (and only major) exceptions to the Windows 10 ongoing compatibility vector revolves around retention of technical debt. We use the term “technical debt” to describe those applications which leveraged some type of last-mile technology that allowed you to “punt” [to use an American football reference] and borrow more time. Doing this creates technical debt that will need to be paid at some point, and there will be a cost to servicing that debt. In many cases, extending the life of that mortgage could create costs that exceed those of retiring the technical debt earlier. For example, we’ve seen cases where the cost of limping an application along for extra years exceeded the cost of redeveloping the application, hence the analogy. Common ways of retaining technical debt include:

  • Remaining on legacy operating systems through Custom Support Agreements
  • Moving to Session Hosts (a large example of this was extending XP-compatible applications into 2015/2016 using Server 2003 Terminal Services)
  • Leveraging virtualization or containerization technologies to preserve older applications by providing compatibility through emulation or OS virtualization (i.e. in-line virtual machines)
  • Turning off UAC: This was a large one leveraged for Windows 7. Turning off UAC in Windows 10 is unsupported and untested [and will likely break a lot of other things in Windows 10.] If your organization resolved a lot of Windows XP/Windows application compatibility issues by turning off UAC, you now will have to deal with truly remediating these application issues.

In the scope of application compatibility testing, these applications will need to be tested early, and here’s the bad news: many will likely need remediation. The good news is that if the remediation is solid and the technical debt is truly retired, the application will likely not run into further compatibility issues during iterative feature updates. This may allow many of them to be placed in lower risk groupings.