How Effective Are Code Coverage Criteria? An Empirical Analysis of 274 Faults

QRS2015

BugDataset

Written by: H. Hemmati. IEEE International Conference on Software Quality, Reliability and Security (QRS), 2015. 

 

Code coverage is one of the main metrics for measuring the adequacy of a test case or test suite. It has been studied extensively in academia and is used even more widely in industry. The underlying assumption of using code coverage is that to detect a fault one must first execute the faulty code. However, running the faulty code does not guarantee catching the fault. Therefore, a test case may cover a piece of code (regardless of which coverage metric is used) yet miss its faults. In this paper, we studied several existing standard control-flow and data-flow coverage criteria on a set of 274 developer-written fault-revealing test cases from several releases of five open source projects. For each fault, we identify the single improvement to the test suite (modifying a test case or adding one) that resulted in a failing test case. We examined the difference in code coverage before and after the improvement and found that around 7% to 35% of the faults may not be detected by any of the standard code coverage criteria. We then classified the undetected faults using their corresponding failing test cases. The classification showed that most of the missed faults have to do with the specification (either a change in the specification or a misunderstanding of it).
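To make the before/after comparison concrete, here is a minimal conceptual sketch (not code from the paper) of the underlying idea: a coverage criterion can only "point to" the fault if the improved, fault-revealing execution covers at least one item (statement, branch, def-use pair, etc.) that the original suite did not. All names and identifiers below are illustrative assumptions, with coverage modeled abstractly as a set of covered items.

```python
def criterion_distinguishes_fault(coverage_before: set, coverage_after: set) -> bool:
    """Return True if the improved (fault-revealing) execution covers at least
    one item the original suite did not. If not, the criterion cannot tell the
    two suites apart, so satisfying it would not have driven a tester to add
    the new fault-revealing test."""
    return bool(coverage_after - coverage_before)


# Illustrative usage with made-up coverage item identifiers:
before = {"stmt:12", "stmt:13", "branch:7T"}
after = {"stmt:12", "stmt:13", "branch:7T", "branch:7F"}  # the improvement covers a new branch

print(criterion_distinguishes_fault(before, after))  # True: branch coverage could expose this fault
```

Under this view, the faults the paper reports as "undetected" are those where the fault-revealing improvement added no new coverage items for any of the studied criteria, which is why specification-related faults dominate that category.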