Test debt comes in many forms. Some of these are very visible, like lack of test coverage or tests that run too slowly.
As a result, we often find QA teams looking for solutions that speed up test creation or allow them to move their testing into the cloud. However, while these are very real issues, there is a much more significant form of test debt: one that is much harder to solve and often completely overlooked. We’re talking about maintenance test debt. In this blog, we will use real-world numbers to show why this is such an expensive form of test debt. We hope to convince you that merely servicing this debt isn’t enough. Rather, you should be looking to pay it off as quickly as you can.
Test debt is a pervasive problem for QA teams. In many ways, it is equivalent to the technical debt that besets so many dev teams. But test debt is often harder to solve. Test debt comes in two forms. Intrinsic test debt affects all QA teams regardless of how far they are along the test automation journey. Extrinsic test debt starts to appear once a team begins to automate tests. Curiously enough, reducing intrinsic debt often increases extrinsic debt, as we shall see later.
This form of test debt exists in pretty much any organization. It can be summed up as a lack of test coverage where there should be testing. This can be found in every part of the testing pyramid.
Unit and integration testing. Dev teams are well used to the idea of trying to increase code coverage for unit tests. However, few teams actually achieve better than 80% coverage. There will always be a few functions that aren’t getting tested as thoroughly as they should be. This problem also extends to integration testing. Teams may well have created scripts and fake APIs to test modules or micro-services as they are created. But seldom do these tests cover every eventuality.
System and UI testing. Modern applications are often vast and have complex dynamic UIs. Creating tests for every possible interaction is probably infeasible in many cases. As a result, most teams focus on the most likely user flows and on overall application stability. Note that this remains true whether or not any of your tests are automated. Effectively, QA managers have to make a judgment call about how much testing is needed before a release. Of course, if a bug does slip through into production, they will develop a test for it and reevaluate whether they need to test that area more often.
Unlike intrinsic test debt, extrinsic debt generally comes about as a result of starting to automate your system tests. Essentially, most of the debt is directly related to the framework and infrastructure used for test automation.
Aside: it’s worth noting that extrinsic test debt can also exist in unit testing. Specifically, many teams end up with unit tests that are designed to pass and exist purely to boost test coverage. Fortunately, there is a simple solution that can help reduce this. Namely, start doing mutation testing to validate how good your unit tests are.
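To make the idea concrete, here is a minimal, self-contained sketch of how mutation testing works (the function names are ours, purely for illustration): a deliberately seeded bug, a "mutant", should make a good test suite fail. If the suite still passes against the mutant, the tests are providing coverage but no real protection.

```python
def add(a, b):
    return a + b

def add_mutant(a, b):
    # Mutation operator applied: '+' flipped to '-'
    return a - b

def strong_suite(fn):
    """A suite with a meaningful assertion."""
    return fn(2, 3) == 5

def weak_suite(fn):
    """A coverage-only suite: it executes the code but can't usefully fail."""
    return fn(0, 0) == 0

# A good suite "kills" the mutant (it fails against the seeded bug)...
assert strong_suite(add) and not strong_suite(add_mutant)
# ...while the weak suite lets the mutant survive, exposing its weakness.
assert weak_suite(add) and weak_suite(add_mutant)
```

In practice, tools such as mutmut (Python) or PIT (Java) automate this process across a whole codebase, reporting which mutants survived your suite.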
There are three forms of extrinsic test debt: low automation coverage (too few of your tests are automated), inefficient test automation (tests that are slow to create or run), and test maintenance (the ongoing work of fixing tests that break).
In our experience, teams are typically very aware of their intrinsic test debt. In most cases, they have clear plans in place to manage this. In financial terms, it is similar to a mortgage. Over the years you are able to slowly pay off this debt and reduce it. Of course, you are constantly developing and improving your app, which can serve to increase the debt. You could think of this as the interest on the loan (although maybe we are pushing the financial analogy too far now!).
Extrinsic debt is more problematic for teams. Typically, we see people focused mainly on low automation test coverage. Sometimes, they are also trying to address issues with inefficient test automation. But all too often test maintenance takes a back seat. So, we see people turning to new solutions that help speed up test creation and move test execution into the cloud. Now don’t get us wrong. Both these are important things to do. But if you ignore test maintenance, you actually risk driving up your overall test debt without realizing it.
Test maintenance is nothing new. It happens because automated tests need some way to interact with your application, which nearly always means Selenium WebDriver. But to control the application, you need to be able to specify which elements to interact with. Here’s where the problem lies. Choosing elements is done by defining selectors, typically in the test script itself, at the time you create the test. But these selectors are prone to being unstable, for two reasons. Firstly, a selector may be badly chosen to begin with, matching the wrong element or no element at all. Secondly, selectors stop matching whenever developers change the structure of the UI.
The first problem can be solved with clever element selection, the use of multiple selectors and script debugging. But the second problem can only be solved by checking each test failure and updating the test script if the selectors have changed.
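The multiple-selector approach can be sketched in a few lines of Python. The toy "DOM" below is just a dictionary standing in for whichever selectors currently match an element (it is purely illustrative, not the Selenium API):

```python
# Toy "DOM": maps the selectors that currently match to an element id.
dom_before_change = {
    "#submit-btn": "button-1",             # id-based selector
    "[data-testid='submit']": "button-1",  # attribute-based fallback
}
dom_after_change = {
    # Developers renamed the id, so "#submit-btn" no longer matches...
    "[data-testid='submit']": "button-1",  # ...but the fallback still does.
    ".btn-primary": "button-1",
}

def find_element(dom, selectors):
    """Try each selector in priority order; return the first match, else None."""
    for selector in selectors:
        if selector in dom:
            return dom[selector]
    return None

SELECTORS = ["#submit-btn", "[data-testid='submit']"]
```

With only the single id-based selector, the lookup after the UI change would return nothing and the test would break; the fallback keeps it working until someone updates the script.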
To get a better feel for why test maintenance debt is so costly, let’s look at some simple numbers. Imagine you have a test suite with 2,000 tests already automated. Every week you have a new release that needs 10 new tests to be created to cover the new functionality. Creating a new test takes 4 hours. But each release also causes 10% of your existing tests to break on average. Fixing each of these takes 24 minutes (10% of the time taken to create them). From this, you can see that it takes:

- Test maintenance: 24 mins × 200 tests = 80 hours
- Test creation for new features: 4 hours × 10 tests = 40 hours
- TOTAL = 120 hours
On top of this, your team needs to spend some of its time analyzing any test failures, in order to diagnose which tests just need to be refactored and which are true failures. To keep the numbers conservative, let’s say analyzing each test failure takes 4 minutes; for 200 failures, that adds another 13.3 hours. Thus, every week, your team needs to spend roughly 133 hours on test automation tasks. For context, that is ~3.3 FTEs.
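The arithmetic is easy to reproduce. Here is a quick Python sketch of this week's workload, using the figures above:

```python
# This week's test automation workload for a 2,000-test suite,
# using the assumptions from the text.
tests = 2000
broken = 0.10 * tests            # 200 tests break per release
maintenance = broken * 24 / 60   # 24 min per fix        -> 80 hours
creation = 10 * 4                # 10 new tests, 4h each -> 40 hours
analysis = broken * 4 / 60       # 4 min triage each     -> ~13.3 hours

total = maintenance + creation + analysis
print(round(total, 1), "hours,", round(total / 40, 1), "FTEs")
# -> 133.3 hours, 3.3 FTEs
```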
But now move the clock forward 12 months. In that period, you have had an additional 52 releases. That means your test suite now contains 2,520 tests. Let’s calculate the numbers again:
- Test maintenance: 24 mins × 252 tests = 101 hours
- Test creation for new features: 4 hours × 10 tests = 40 hours
- Analysis of test failures: 4 mins × 252 tests = 17 hours
- TOTAL = 158 hours
Now, your team is spending 158 hours on test tasks per week, an increase of 20%. Maintaining more tests will require either increasing your team, slowing down your releases, or reducing quality/coverage.
Roll forward another 12 months. Now you have a test suite with 3,040 tests. If you do the math, you will find that testing now takes almost 182 hours every week, and maintenance and analysis account for nearly 80% of that time. In other words, you are now paying 3.5 test engineers purely to do test maintenance and failure triage rather than any real testing. And incidentally, this is based on very conservative figures for how much maintenance you might need and how long it takes. In practice, it would not be uncommon for 20% or more of your tests to break each release, depending on the complexity and rate of change of your application.
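All of these projections fall out of one simple model. The sketch below (our own formulation of the assumptions in the text) computes the weekly workload for any suite size and rolls the suite forward week by week:

```python
def weekly_hours(n_tests, new_per_week=10, create_h=4.0,
                 break_rate=0.10, fix_min=24, triage_min=4):
    """Weekly test-automation workload in hours for a suite of n_tests."""
    broken = break_rate * n_tests
    return (broken * fix_min / 60        # maintenance (fixing broken tests)
            + new_per_week * create_h    # creating tests for new features
            + broken * triage_min / 60)  # triaging failures

# Roll the suite forward: 10 new tests per weekly release, for two years.
suite = 2000
for week in range(104):
    suite += 10

# suite is now 3,040 tests; weekly_hours(suite) is almost 182 hours.
```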
In the model above, the choice has been made to keep pace with new releases but not to try to increase overall test coverage. In practice, most teams also aim to increase the overall proportion of tests that are automated. Let’s imagine that the team above dedicates an additional full-time test engineer to automating new tests. At 4 hours per test, in the space of 1 year they should be able to create some 520 new tests. Those new tests require an extra 21 hours of maintenance each week. Essentially, automating more tests has a compound effect on your maintenance overhead. It soon becomes as expensive as taking out a payday loan! So, what should you do instead?
Functionize’s philosophy is that test maintenance shouldn’t really exist. We use AI models to get rid of over 80% of maintenance. This is possible because we don’t rely on static selectors. Instead, we model every element in the UI and use machine learning to identify the correct element you are trying to select. This means that rather than breaking when the element moves or changes, our system naturally self-heals. Over time, we can also speed up test creation as our AI learns about your application. So, what does this do to the numbers above? Let’s look at the case above after 12 months.
- Test maintenance: 6 mins × 164 tests = 16.4 hours (fixes are 75% faster and 35% fewer tests break)
- Test creation for new features: 2 hours × 10 tests = 20 hours (tests are on average 50% faster to create)
- Analysis of test failures: 4 mins × 164 tests = 10.9 hours (35% fewer tests break, but analysis takes the same time)
- TOTAL = 47.3 hours
That is a saving of over 110 hours—well over ⅔ of the time you were spending! That time can be used for automating more tests, better exploratory testing, or even increasing your release frequency.
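Plugging the quoted improvements (75% faster fixes, 50% faster creation, 35% fewer breakages) into the same weekly model reproduces the comparison. The model itself is our sketch of the assumptions, not Functionize's:

```python
tests = 2520  # suite size after 12 months

# Static selectors (the original model).
broken = round(0.10 * tests)                          # 252 failures
before = broken * 24 / 60 + 10 * 4 + broken * 4 / 60  # ~157.6 hours

# With self-healing: 35% fewer breaks, fixes 24->6 min, creation 4h->2h.
healed = round(0.65 * 0.10 * tests)                   # 164 failures
after = healed * 6 / 60 + 10 * 2 + healed * 4 / 60    # ~47.3 hours

saving = before - after                               # over 110 hours/week
```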
Test debt comes in various forms, all of which are damaging to overall quality and productivity. But by far the most important one to solve is test maintenance debt. If you get rid of this, you will free up key resources to focus on more important tasks. Moreover, you will also be able to reduce other forms of test debt as a direct result. To see how Functionize does this in practice, book a demo today.