Metrics for Measuring Software Quality

Software quality is vital if you are going to keep your users engaged and satisfied. So, what are the metrics for measuring software quality?


January 13, 2021
Tamas Cser

Test automation is vital nowadays. But sometimes it can be hard to measure its true value. Here, we explain how to measure the impact of test automation.

Test automation has been instrumental in driving modern application development. It helps prevent testing from becoming a bottleneck in your software delivery process. However, it is hard to put a real figure on the value it delivers for your company. In this blog, we introduce new metrics that allow you to accurately assess the impact of test automation. We also explain how you can measure the costs of your testing. Taken together, this will allow you to measure the value of your test automation.

The problem with traditional test metrics

Measuring the effectiveness of testing is definitely not a new idea. People have been trying to do it for years. But all the current metrics have downsides when it comes to measuring the impact of test automation. Here’s why:

Number of tests. This is the classic metric for how much testing you are doing. Simply count how many tests you have. Because more tests must mean better software. Only it doesn’t. Firstly, your tests will constantly need to evolve as your software does. Secondly, it may be better to merge several tests into one larger one. Thirdly, you don’t want to generate a false incentive to create tests just for the sake of increasing a metric. 

Number of tests run. This metric at least measures how much testing you are actually doing. However, it totally neglects whether that testing is effective, necessary, or efficient. For instance, your team probably performs smoke tests continually. That gives the impression that lots of testing is happening, but it could hide the fact that large parts of your application are getting no testing at all. 

Defects found. This is a really misplaced metric. The idea is that the more defects you find, the better your team is at testing. But that’s clearly nonsense. If your developers are doing their job well, the number of defects should drop over time as your code evolves. The fact that you aren’t finding defects may simply mean your code is stable and mature. 

Code coverage. Project managers love this metric. It measures how much of the code base has unit tests associated with it. But there are two major flaws: achieving 100% code coverage is often impractical, and the metric encourages developers to write tests that are designed to pass. For more on this, check out our previous blog.  

Automated test coverage. You often hear companies aiming to automate as many of their tests as possible. Surely, that must be a good thing? Well, no. Some tests simply aren’t ideal for automation, for instance exploratory testing to track down new bugs, or tests that are only run extremely rarely. So, while a high percentage of test automation is generally better than a low one, your actual target is probably nearer to 75% than 100%.

How should you measure the impact of test automation?

So, if these metrics are no good, what should you be measuring instead? Well, there are two distinct forms of testing to consider. Regression testing looks to verify that new code hasn’t broken your existing product. On the other hand, progression testing is about verifying that new features work as expected. In other words, one is about running your existing tests, the other is about creating new tests.

Measuring the effectiveness of regression testing

Regression testing involves running a subset of your existing tests to check whether new code has introduced any bugs. It’s important to stress that regression testing shouldn’t seek to rerun every single test every single time. Some tests will include duplicate steps, for instance when a test needs to put your system in a certain state. More of an issue is how long tests take to run. Most test suites in the real world consist of thousands of tests. There is no way you can run that many tests every time you release a new feature. Not even with the most advanced test automation systems in the world! And of course, a large proportion of your tests may still be manual.

So, what metrics are useful for measuring regression testing? 

Test coverage: Sound familiar? It turns out we can adapt this metric to make it more relevant. The key is to understand where your tests are duplicating effort. What you need to measure is what proportion of your functionality is covered by tests. Of course, this isn’t as simple as testing each functional element once, because you need to test under all conditions. That means using different combinations of data, deliberately using bad data, etc. 
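As a rough illustration, functional coverage can be tracked as the share of known functional elements that have at least one associated test. This is only a sketch; all feature and test names below are hypothetical:

```python
# Sketch: functional test coverage as the fraction of tracked
# features that have at least one associated test case.
# Feature and test names are hypothetical examples.

features = {"login", "checkout", "search", "profile", "reporting"}

# Map each test case to the feature(s) it exercises.
tests = {
    "test_login_valid": {"login"},
    "test_login_bad_password": {"login"},   # same feature, different data
    "test_checkout_happy_path": {"checkout"},
    "test_search_empty_query": {"search"},
}

covered = set().union(*tests.values())
coverage = len(covered & features) / len(features)

print(f"Functional coverage: {coverage:.0%}")  # 3 of 5 features -> 60%
```

A fuller version would also count the data variations exercised per feature, since covering each functional element once is rarely enough.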

Time needed to test: Also known as Speed of Testing, this metric measures how long it takes you to complete the required tests for each release. The more tests you automate, the quicker you will get through them, especially if you use a cloud-based system that allows you to run large numbers of tests in parallel.
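To see why parallel execution matters, here is a minimal sketch that estimates wall-clock testing time for a given number of parallel runners, using a greedy longest-test-first schedule. The durations are made-up examples:

```python
import heapq

# Sketch: estimating wall-clock test time with parallel runners.
# Durations (in minutes) are hypothetical examples.
durations = [12, 8, 8, 6, 5, 4, 3, 2]

def wall_clock_time(durations, workers):
    """Greedily assign the longest tests first to the least-loaded runner."""
    loads = [0] * workers  # min-heap of per-runner total time
    for d in sorted(durations, reverse=True):
        heapq.heappush(loads, heapq.heappop(loads) + d)
    return max(loads)

print(wall_clock_time(durations, workers=1))  # serial: 48 minutes
print(wall_clock_time(durations, workers=4))  # parallel: 13 minutes
```

Real schedulers are more sophisticated, but even this toy model shows how adding runners shrinks the critical path of a test run.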

Measuring progression testing

Progression testing happens for three main reasons:

  1. When you need to create tests for new features
  2. To identify steps to reproduce a reported bug
  3. For ad hoc testing to try to find new bugs

Measuring progression testing is hard. But here are a couple of metrics we’d recommend if you want to understand the impact of test automation.

Time to create new tests: One of the most important things is being able to create robust tests fast. Ideally, these should be automated, certainly for testing new features or reported bugs. So, you should measure how long each test takes to create. Your aim should be to reduce the average time, without sacrificing test quality of course! 
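A simple way to track this is to log how long each test took to create and watch the per-sprint average. The figures below are invented for illustration:

```python
from statistics import mean

# Sketch: tracking average test-creation time across sprints.
# Times (in hours per test) are hypothetical examples.
creation_times = {
    "sprint_1": [4.0, 3.5, 5.0],
    "sprint_2": [3.0, 2.5, 3.5, 2.0],
}

for sprint, times in creation_times.items():
    # A falling average suggests test creation is getting faster.
    print(f"{sprint}: avg {mean(times):.1f} h/test")
```

Pairing this with a quality measure (for example, how often new tests need rework) guards against gaming the metric with flimsy tests.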

Ratio of bugs found vs bugs reported: The aim of progression testing is to find bugs before they are actually reported. This ratio measures how well you are achieving this. If you find more bugs before release than get reported by users, you’re doing well. It’s worth contrasting this with Defects Found. In this case, we aren’t trying to find more defects, we are trying to find those defects earlier.
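The computation itself is simple arithmetic per release; a ratio above 1.0 means more bugs were caught internally than escaped to users. The counts below are hypothetical:

```python
# Sketch: bugs-found-to-bugs-reported ratio for one release.
# Counts are hypothetical examples.
found_in_testing = 24   # defects caught before release
reported_by_users = 6   # defects that escaped to production

ratio = found_in_testing / reported_by_users
print(f"Found/reported ratio: {ratio:.1f}")  # 4.0 -- above 1.0 is good
```

Tracking this ratio release over release shows whether your testing is catching defects earlier, which is the real goal.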

How Functionize improves the impact of test automation

Functionize is an AI-powered test automation platform. Our aim is to help you test smarter, not harder by making test automation easy and more productive. Using our system helps improve all the above metrics. In turn, that equates to a significantly better ROI.

Test coverage

Good test coverage requires two things: automating as many tests as possible and reducing duplicated test effort. This requires the leadership of a really experienced Head of Testing, who can work alongside the Product Manager to strategically plan your test coverage. Functionize lets you use real production data to identify what tests may be missing. Functionize also allows you to synthesize new tests from existing functional blocks, making it easy to cover more ground. In turn, this increases the impact of test automation.

Time needed to test

The Functionize test cloud allows you to run thousands of tests in parallel. That will dramatically speed up your testing. Moreover, you can create complex test orchestrations, allowing you to run complete test suites autonomously. And because of our SmartFix feature, you can be certain your tests won’t need constant ongoing maintenance.

Time to create new tests

Architect makes it simple and fast for anyone to record new intelligent automated tests. You simply need to install the Chrome plugin and step through your test case. In the background, our system is using machine learning to understand your system and the intent behind the test case. Alternatively, you can create tests in bulk using natural language with our NLP test creation tool. And for systems in production, you can create tests based on how real users are interacting with your application. Taken together, these solutions dramatically speed up the time needed to create new tests.

Ratio of bugs found to bugs reported

The last metric is a really good measure of the impact of test automation. The best way to improve this metric is to find bugs before they get reported. But of course, it’s hard to find bugs before new code is released. This is even harder if you need to spend your time working out which of your test failures were real and which were simply caused by tests needing maintenance. Functionize helps eliminate the need for test maintenance. But we also make it easy to monitor how users are actually interacting with your system. So, you can make sure you cover all the bases with your testing.

How can I increase the impact of test automation?

Hopefully, you now have a clearer insight into the impact of test automation and how to measure it. If you want to learn how Functionize can maximize the impact of your test automation, sign up for a free trial today.