Software Testing Cage Match: Amazon.com vs. Microsoft
While I previously made some comparisons between Amazon.com and Microsoft's different approaches to software testing in Building Services: The Death of Big Up-Front Testing (BUFT)?, I think now would be a fun and interesting time to do a deeper dive on this.
Before I joined Amazon.com in 2005 as an SDET, while I was interviewing for said position in fact, I was told about the “QA situation” there. I was told “it’s improving”. Improving from what? you may ask. Well, the perception was QA got short shrift with the 1 to 10 (or 1 to 7, or 0 to infinity) Test to Dev ratio held up as proof.
“Improving, eh?” Did I buy that? Not necessarily, but I quickly came to a realization: I had used Amazon.com a lot and had rarely noticed any problems…it seemed to work. Even so, after I joined the QA Team there, it was still a frequent source of grousing that Amazon as a whole did not value the Quality Assurance profession; otherwise they would surely fund more of us. I later shared this grouse with my former director of engineering at my previous company over a game of cards. I expected sympathy, but instead he simply asked, “And is this the right thing for Amazon?” That eureka moment, plus years more of experience, taught me that it’s not about the ratio, but about what you expect from your software teams (Dev, QA, product specialists, and managers) and how you handle and mitigate risk.
In 2009 I changed companies and moved to Microsoft (across Lake Washington from Amazon’s Seattle HQ). Microsoft has a reputation as a place where testers are respected and software quality is given priority. I was eager to see the “Microsoft way” of quality…the fabled 1 to 1 ratio. Turns out that’s the Office and Windows way, but a nascent realization that testing services differs from testing such “shrinkwrap” products was taking hold, and experiments in quality processes abounded. However, I think there are still fundamental differences in how Amazon.com approaches software quality versus Microsoft.
Head to Head
I manage a Test Team at Microsoft with a 1.5 to 1 Dev to Test ratio. At Amazon I had 1 SDET for every 7 or so Devs. So my new job must be easier, right? Nope. Ratio is not an input into the equation; it’s an output. You set your quality expectations and you employ processes to get you there. One path takes 10 SDETs and another takes 1. How can this be? Well, let’s compare and answer the question:
How does Amazon get by with so few hours spent by its QA teams relative to Microsoft?
1. At Amazon.com whole features, services, and code paths went untested. Amazonians have to pick and choose where to apply their scarce resources. At Microsoft, other than prototypes or “garage” projects, you can expect the complete “triad” of Development-Test-Product Management teams to be engaged at every step of the way.
Exclusion of code from testing cuts down your need for testers. Maybe SDETs to lines of code tested is a more interesting ratio than Test to Dev? If you leave the untested features out of the count at Amazon, then the test to dev hours ratio moves closer to Microsoft standards.
2. Amazon has “Quality Assurance” teams while Microsoft has “Test” teams. However, QA at Amazon almost never got involved in anything but testing. That is to say, Microsoft and Amazon should swap the names they use for their QA teams, since Amazon's are much more "test" only teams while at Microsoft we seem to achieve more actual QA.
Saving time by not reviewing the design or designing for testability is not saving time at all.
3. Functional-only testing was common at Amazon.com; performance testing was either not done, done by developers, or given second class status.
Performance testing was often done by the dev teams, so these test hours were actually spent, just not by the test team.
4. A high operations cost was considered acceptable (developers carried pagers), so releasing a bug was OK because it was relatively quick to fix. In other words, there was a lower quality bar for release. Amazon also had better tools and processes for build and deployment, which enabled rapid deployment of hot fixes.
This is essentially a form of testing in production (TiP). Again, tally up the hours and put them on Dev's tab.
5. Better self-service analysis tools. Any issue that was found in production was easier to analyze and turn around quickly due to better tools for monitoring servers and services, and for sending alerts.
Reducing cost through automation (and tools)... this is a real savings.
6. Cheap manual testing. I am of two minds about listing this, since I spent a great deal of energy encouraging the manual testers to automate their tests, but Amazon employs overseas teams to bang on the product via the black box interface and find problems before production users do. This had a decent yield for finding defects.
Hidden test hours. When people talk about the test to dev ratio at Amazon they often do not count these offshore teams.
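To make the "tally up the hours" idea from the list above concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `TeamHours` fields, the two ratio functions, and especially the numbers, which are made up and not real Amazon or Microsoft data. It only shows how a headline 1 to 7 ratio can move once testing hours spent outside the test team are counted and untested work is left out.

```python
from dataclasses import dataclass


@dataclass
class TeamHours:
    """Hypothetical weekly hours for one product group (illustrative only)."""
    dev_feature_hours: float             # dev hours spent building features
    sdet_hours: float                    # hours spent by the test team itself
    dev_perf_test_hours: float = 0.0     # performance testing done by devs
    dev_on_call_hours: float = 0.0       # pager duty / hot fixes (testing in production)
    offshore_manual_hours: float = 0.0   # manual black box testing by offshore teams
    untested_fraction: float = 0.0       # share of dev work nobody tests at all


def headline_ratio(t: TeamHours) -> float:
    """Test to dev ratio as usually quoted: test team hours over dev hours."""
    return t.sdet_hours / t.dev_feature_hours


def effective_ratio(t: TeamHours) -> float:
    """Test to dev ratio after counting hidden test hours and
    dropping the dev work that is never tested."""
    test_hours = (
        t.sdet_hours
        + t.dev_perf_test_hours
        + t.dev_on_call_hours
        + t.offshore_manual_hours
    )
    tested_dev_hours = t.dev_feature_hours * (1.0 - t.untested_fraction)
    return test_hours / tested_dev_hours


# Made-up numbers: 1 SDET hour for every 7 dev hours on paper.
example = TeamHours(
    dev_feature_hours=700,
    sdet_hours=100,
    dev_perf_test_hours=50,
    dev_on_call_hours=70,
    offshore_manual_hours=130,
    untested_fraction=0.3,
)

print(f"headline:  {headline_ratio(example):.2f}")
print(f"effective: {effective_ratio(example):.2f}")
```

With these invented inputs the effective ratio comes out several times higher than the headline one. The exact figure means nothing; the point is that the ratio falls out of choices about what gets tested and who does the testing.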
A friend of mine who is a QA manager at Amazon recently lamented:
“The test to dev ratio [is] insanely stretched …. there's soo much more we could do, but no we just rush and rush and cut things and get [blamed] when we miss something”
So maybe my “head to head” comparison does not explain away all the differences, but the message I would like to convey is that it is about expectations. I originally wrote the above list in response to a Dev manager who asked me why we couldn’t be more like Amazon and “pay less” for QA. Amazon has one expectation and Microsoft has another about quality and about risk… that’s why.
I’ve made a lot of generalizations about how things are done at Microsoft and Amazon.com, which means what I said is going to be simply wrong when applied to several teams. Feel free to let me know in the comments how I screwed up in portraying your team. But be aware I know it’s not one size fits all…hopefully I’ve captured the big picture.
And to close, I will say that other than the ratio, Amazon did improve while I was there. I saw the QA community come together and start interacting in positive ways. Amazon’s first ever Engineering Excellence forum was organized by the QA community. So that just leaves the final questions: Does Amazon’s ratio need to be improved, and what does improved look like? Do Microsoft’s expectations need to be changed, and what would those look like?