Code coverage: what is it for, after all?

I was reading some discussions on code coverage in software testing. They were dominated by one question: what is the correct or desired code coverage number for unit tests, or for tests overall? 40%, 60%, 80%, even 100%? Some claim that Google only requires 60%. Someone asked me what the code coverage requirement is at Microsoft. As far as I know, there is no code coverage requirement per se at Microsoft, only guidance. I can't speak for Google, but very likely it is the same there: for code coverage, there is only guidance, no requirement. So instead of just telling him 'no requirement', I thought it better to write down a few thoughts I have on code coverage. (Disclaimer: when I speak about Microsoft or Google, I have to say "in general", "in most cases", or "in many product teams". Both companies are so big and diverse, with so many product teams, that no engineering practice is done in one universal way. Some teams do it one way, some teams do it another. However, there are common practices adopted by many or most teams, and code coverage is one of them.)

What is code coverage anyway? Code coverage is a way to measure how much of the product code is exercised by a set of tests. It can be measured by lines, by blocks, by arcs, by classes, by files, etc. In most cases, we use blocks as the code coverage unit. Note: we only collect code coverage from automated tests; manual tests are not considered.
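To make the definition concrete, here is a minimal sketch of a block coverage calculation. The block IDs and the function name are made up for illustration; real coverage tools get this data from instrumented builds, and this is not how any particular tool works.

```python
def coverage_percent(all_blocks, executed_blocks):
    """Return the percentage of product code blocks hit by at least one test."""
    all_blocks = set(all_blocks)
    if not all_blocks:
        return 0.0
    # only count executed blocks that actually belong to the product
    covered = all_blocks & set(executed_blocks)
    return 100.0 * len(covered) / len(all_blocks)


# hypothetical product with 5 blocks; the automated tests executed 3 of them
product_blocks = {"b1", "b2", "b3", "b4", "b5"}
hit_by_tests = {"b1", "b2", "b4"}
print(coverage_percent(product_blocks, hit_by_tests))  # 60.0
```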

In most Microsoft product teams, we do require collecting code coverage numbers. We collect different code coverage numbers for different types of tests: for instance, code coverage for unit tests, for component tests, and for scenario (e2e) tests. Unit test code coverage is collected automatically whenever unit tests are run. When devs finish writing code and unit tests, they run a set of tests before checkin (the checkin quality gate), which includes the unit tests, so you get unit test code coverage automatically. Code coverage for component tests and scenario tests is collected on a code coverage build periodically, for example once a week, or on demand.

There is always an argument about the real benefit of code coverage. Some say the code coverage number represents product quality: the higher the number, the higher the quality. Others say higher code coverage doesn't mean higher quality, because 100% covered code can still have bugs, which is true.

Here are my takes on code coverage:

1. Code coverage is important. It's simple to collect and a fast way to get a sense of how well the code is tested. It lets you visualize and examine how the code is tested. It's kind of like a flashlight in the dark, which lets you see many objects more clearly. Does it guarantee you will see every object in the dark? Of course not. But without the flashlight, it will be quite difficult to see anything.

2. Although 100% code coverage does not mean bug free, 0% code coverage DOES mean a huge risk to product quality.

3. Code coverage only measures how well the code is tested, not how well the product is tested.


So, do we need a requirement on the code coverage number? If yes, what number is best?

First off, any number is meaningless without context. The number itself is not the goal; it's an indicator for whatever action needs to follow. It's like scoring 100 points on a school test: is that good or bad? The answer is: it depends. It depends on the total points possible, how easy or difficult the test was, what your peers scored, etc. It's the same with the code coverage number. 60%, 80% or 100% don't mean anything without context.

Then what should I do with the code coverage number after collecting it? That is exactly the point of collecting it: to find out what you should do with it, or how to use and interpret it:

1. Test gap analysis. I would say this is the most important benefit of code coverage. There are many ways to do test gap analysis, but analyzing code coverage is one of the most effective. Examine the code coverage numbers, see which areas have low or zero coverage, and decide whether any tests are missing or whether these are potential risk areas.

2. Test effectiveness analysis. Say I have 1000 test cases which take 1 day to run. Can they be reduced to 500 test cases, or 200, without compromising overall test coverage?

3. Test and code change association analysis. One of the hard problems to solve in software testing is this: I just changed one line of code, and I need to run 1000 test cases taking one day to verify there is no regression? There has got to be a better way. Well, the better way is to run only the affected tests. The question is how you know which tests are affected. There are tools and research available to help, using code coverage.

4. Trend analysis. Watch your code coverage trend over time: is it getting higher and higher, or lower and lower?
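The first three uses above can be sketched with one piece of data: a map from each test to the set of blocks it executes. Everything below is an illustrative assumption on my part (the test names, the block IDs, the function names), not a description of any real tool. The suite reduction is a simple greedy heuristic, so the result is not guaranteed to be the smallest possible suite, and keeping the same block coverage does not guarantee the same bug-finding power.

```python
def uncovered_blocks(all_blocks, coverage_map):
    """Test gap analysis: blocks that no test touches."""
    covered = set().union(*coverage_map.values())
    return set(all_blocks) - covered


def minimize_suite(coverage_map):
    """Test effectiveness: greedily pick tests until the picked tests
    cover every block the full suite covers."""
    remaining = set().union(*coverage_map.values())
    chosen = []
    while remaining:
        # take the test covering the most still-uncovered blocks;
        # iterating in sorted order makes ties resolve by test name
        best = max(sorted(coverage_map),
                   key=lambda t: len(coverage_map[t] & remaining))
        chosen.append(best)
        remaining -= coverage_map[best]
    return chosen


def affected_tests(coverage_map, changed_blocks):
    """Change association: tests that execute at least one changed block."""
    changed = set(changed_blocks)
    return sorted(t for t, blocks in coverage_map.items() if blocks & changed)


# hypothetical data: which blocks each test executed
coverage_map = {
    "test_login": {"b1", "b2"},
    "test_logout": {"b2"},
    "test_checkout": {"b3", "b4"},
    "test_cart": {"b3"},
}
all_blocks = {"b1", "b2", "b3", "b4", "b5", "b6"}

print(sorted(uncovered_blocks(all_blocks, coverage_map)))  # ['b5', 'b6']
print(minimize_suite(coverage_map))          # ['test_checkout', 'test_login']
print(affected_tests(coverage_map, {"b3"}))  # ['test_cart', 'test_checkout']
```

In this toy data, two of the four tests give the same block coverage as the full suite, and a change to block b3 only needs two tests rerun.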


Last, if you must have a number from me, here is my guidance. Again, any number is meaningless without context.

1. If your unit test code coverage (cc) is > 80%, you should take action only when you have time.

2. If your unit test cc is between 50% and 80%, you should allocate some time within your current milestone to take action on it.

3. If your unit test cc is < 50%, you should allocate some time asap to take action on it.

4. If your unit test cc is < 30%, you call yourself a developer???? :)
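If you want to wire this guidance into a report, it could look something like the sketch below. The function name and the message strings are my own invention, and I am assuming that exactly 50% and exactly 80% fall into the middle band. The <30% case gets the same action as <50%, since item 4 is mostly a joke.

```python
def coverage_guidance(unit_test_cc):
    """Map a unit test code coverage percentage (0-100) to a suggested action."""
    if unit_test_cc > 80:
        return "take action only when you have time"
    if unit_test_cc >= 50:
        return "allocate some time within the current milestone"
    if unit_test_cc >= 30:
        return "allocate some time asap"
    # below 30%: same action, plus some soul searching
    return "allocate some time asap (and you call yourself a developer?)"


for cc in (92, 65, 41, 12):
    print(cc, "->", coverage_guidance(cc))
```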

Hope this helps....




Follow me: Sina Weibo: @billliu_seattle or Twitter: @billliu_seattle