Monday, February 8, 2016

Unit Tests Assertion Coverage

A unit test case is made of three parts: the "given", the "when" and the "then". The "given" is the piece of code that sets up mock objects, data that needs to be available and dependencies that need to be injected. The "when" executes the method under test. The "then" verifies the results of that execution.
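As a sketch, with JUnit and Mockito on the classpath, a test following this structure might look like the following (OrderService, OrderRepository, Order, Item and totalFor() are hypothetical names used only for illustration):

```java
import static java.util.Arrays.asList;
import static org.junit.Assert.assertEquals;

import org.junit.Test;
import org.mockito.Mockito;

public class OrderServiceTest {

    @Test
    public void calculatesOrderTotal() {
        // given: mock objects, test data and the dependencies to be injected
        OrderRepository repository = Mockito.mock(OrderRepository.class);
        Mockito.when(repository.findById(42L))
               .thenReturn(new Order(asList(new Item("book", 20), new Item("pen", 5))));
        OrderService service = new OrderService(repository);

        // when: execute the method under test
        int total = service.totalFor(42L);

        // then: verify the result of the execution
        assertEquals(25, total);
    }
}
```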

Too often, unit tests do not have enough "then" statements verifying the returned results. This happens most often where existing code running in production does not have sufficient unit tests available. When new features are built on top of that existing code, it is often difficult to write proper unit tests with good verifications and assertions.

The problem described above is also a symptom of a team's focus on code coverage. While code coverage is a good metric to observe and use, it only proves that the application code was executed. A unit test case with a "given" and a "when" but no "then" will also produce a good code coverage number. Such a test is not a good test, because the verifications do not exist.
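Continuing the hypothetical example above, a test like the one below (added to the same OrderServiceTest class, with the same imports) executes every line of totalFor() and therefore reports good coverage, yet it checks nothing about the result:

```java
@Test
public void calculatesOrderTotal_coverageOnly() {
    // given: the same setup as in the previous test
    OrderRepository repository = Mockito.mock(OrderRepository.class);
    Mockito.when(repository.findById(42L))
           .thenReturn(new Order(asList(new Item("book", 20), new Item("pen", 5))));
    OrderService service = new OrderService(repository);

    // when: the production code runs, so coverage tools mark its lines as covered
    service.totalFor(42L);

    // no "then": the test passes no matter what totalFor() returns
}
```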

I recently tried out a solution to this problem with really good success. The idea is to capture the return value of the method under test (the "when" step of the unit test) and run it through the Jackson ObjectMapper. The result is a JSON representation of the actual return value of the method under test. Save this JSON in a file under the src/test/resources folder; this becomes the "expected JSON" of the unit test. From then on, every time the method under test is executed, take the actual return value, convert it to JSON again and compare it with the "expected JSON" saved in src/test/resources.
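A minimal sketch of this idea, assuming Jackson 2.x and JUnit, and reusing the hypothetical OrderService from above (the findOrder() method, InMemoryOrderRepository and the expected-order.json resource are illustrative names, not an existing API):

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class OrderServiceCharacterizationTest {

    private final ObjectMapper mapper = new ObjectMapper();

    @Test
    public void findOrderMatchesExpectedJson() throws Exception {
        // given: the collaborators the method under test needs (real or stubbed)
        OrderService service = new OrderService(new InMemoryOrderRepository());

        // when: execute the method under test and capture its actual return value
        Order actual = service.findOrder(42L);

        // then: serialize the result with Jackson and compare it, as a JSON tree,
        // with the expected JSON stored under src/test/resources
        JsonNode actualJson = mapper.readTree(mapper.writeValueAsString(actual));
        JsonNode expectedJson = mapper.readTree(
                getClass().getResource("/expected-order.json"));

        assertEquals(expectedJson, actualJson);
    }
}
```

The first time the test runs, the serialized output can be printed or written out and saved as expected-order.json; from then on the comparison guards the method's output against unintended changes. Comparing parsed JsonNode trees rather than raw strings keeps the test independent of formatting differences.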

This approach is also known as Characterization Testing. As described on Wikipedia:
In computer programming, a characterization test is a means to describe (characterize) the actual behavior of an existing piece of software, and therefore protect existing behavior of legacy code against unintended changes via automated testing. This term was coined by Michael Feathers. [1]
This approach allows us to gain the critical "Assertion Coverage" from a test case. Assertion Coverage is a measure of how much of the data produced by the method under test is actually verified. Contrary to what the name suggests, this approach does not produce a metric like the code coverage metrics; it's not something we can see in red or green colors. The benefit is clearly visible, though.

Another important benefit of this approach is that the "expected JSON" serves as documentation of what a specific method produces. The entire object is converted to JSON, so we get a good visual of the data that flows through the code.

In the next post I will elaborate more on this approach with an example.