Shift Right testing – tools and techniques

Shift Right enables rapid deployment of new features. In part 2 of our miniseries, we explore what you need to enable Shift Right in your company.

The story so far

In the first part of this blog series, we looked at how software development has evolved. We saw how this led to a drive to Shift Left, with testing happening earlier in the development cycle. We also saw how the adoption of the SaaS model triggered a drive towards CI/CD.

In this second (and final) part, we examine the strategies you need to make a success of deploying new features. We explain why successful companies are adopting Shift Right. We then describe some of the approaches in detail, showing how they enable Shift Right. Finally, we show why the apparent tension between Shift Right and Shift Left is an illusion.

Shift Right

What really is it?

Shift Right is a set of approaches that simplify the process of testing in production. In turn, this speeds up your ability to deliver new features.

We already mentioned Shift Right in the first part of this series, but you may still be wondering what the term really means. In essence, it is a collection of practices for moving testing into production. This allows you to test with real users, to release potentially risky code in a controlled way, and to experiment with new features.

Many of you may be thinking that Shift Right implies taking unnecessary risks. But far from it – when you adopt Shift Right, the whole team buys into the process. Developers are encouraged to push quality code and are rewarded by seeing their new features in production. Testers are able to make far better assessments of code quality by seeing how it works in production. And DevOps teams become a vital part of the whole product life cycle.

Companies such as Google, Amazon, and Facebook owe much of their dominance to the speed with which they can release new features. Central to this ability is their use of production testing and monitoring, which allows them to reduce the risk inherent in each release. These companies often release many new features every day. Without Shift Right, this would be risky and impractical.

How do you Shift Right?

Shift Right encompasses a wide range of concepts. Below we look at some of the more important ones in detail.

Dark Launching and Feature Flags

Developing new features can be scary for the product team. Your instinct tells you the new feature will be popular, but how do you know that? The traditional approach was to use focus groups, interactive prototypes, and feedback from existing users. But this has problems. Focus groups are, by definition, small, and all too often it’s impossible to recruit a truly representative one. Prototypes are all well and good, but they usually have limited functionality. Finally, feedback only tells you what to improve or add, not how to achieve it.

This is where Dark Launching comes in. Dark Launching involves adding a new feature and releasing it to a set of users without letting them know. It is “dark” because there is no fanfare and there are no announcements. You then use UX instrumentation to monitor how users are interacting with and responding to the feature. You might also monitor their feedback. Some of the more perceptive among you may have noticed how much Google makes use of this approach.

One critical requirement for Dark Launching is the use of Feature Flags. These allow you to turn specific features in your frontend on and off. In turn, this makes it easy to test a feature with a subset of users and, importantly, to roll back the new feature if it proves unpopular. CD systems such as Spinnaker make Dark Launching and Canary Testing (next section) really easy.
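To make the idea concrete, here is a minimal sketch of a percentage-based feature flag. It is an illustration only, not how any particular product implements flags: the `FeatureFlags` class, the `bucket` helper, and the `new_checkout` flag name are all hypothetical. Hashing the user ID gives each user a stable bucket, so the same user sees the same behaviour on every visit.

```python
import hashlib


def bucket(user_id: str, flag_name: str) -> int:
    """Map a user to a stable bucket in the range 0-99 for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


class FeatureFlags:
    """Hypothetical in-memory flag store; a real system would load this from a service."""

    def __init__(self):
        self._rollout = {}  # flag name -> percentage of users who see the feature

    def set_rollout(self, flag_name: str, percent: int) -> None:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[flag_name] = percent

    def is_enabled(self, flag_name: str, user_id: str) -> bool:
        # Unknown flags default to off, so a missing entry never exposes a feature.
        percent = self._rollout.get(flag_name, 0)
        return bucket(user_id, flag_name) < percent


flags = FeatureFlags()
flags.set_rollout("new_checkout", 10)  # dark launch to roughly 10% of users
seen_by_alice = flags.is_enabled("new_checkout", "alice")
flags.set_rollout("new_checkout", 0)   # roll back: nobody sees the feature
```

Because the rollout percentage is just data, rolling back is a configuration change rather than a redeployment.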

Canary Testing

Dark Launching is all about testing new features in your frontend. But how do you test changes that impact the backend? For instance, you might be moving to a different database to allow you to scale up, or checking whether your refactoring has improved the efficiency of the code. The answer is Canary Testing. As with Dark Launching, you move a share of your customers over to the new code. You then use instrumentation to monitor performance. If the new code is stable, you roll it out to all users; otherwise, you roll it back. Typically, you might test with 5% of your users to start with, and then increase this in stages.
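The staged process described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the stage percentages, the 1% error-rate threshold, and the function names `staged_rollout` and `error_rate_check` are all hypothetical, and the observed error rates are supplied as a plain dictionary rather than read from a live monitoring system.

```python
from typing import Callable, Dict, Sequence


def staged_rollout(stages: Sequence[int], is_healthy: Callable[[int], bool]) -> int:
    """Step through rollout percentages and return the share of users left on the new code.

    Returns 0 as soon as any stage fails its health check (a full rollback).
    """
    current = 0
    for percent in stages:
        # A real system would shift traffic here, then wait while metrics accumulate.
        if not is_healthy(percent):
            return 0
        current = percent
    return current


def error_rate_check(observed: Dict[int, float],
                     max_error_rate: float = 0.01) -> Callable[[int], bool]:
    """Build a health check from error rates observed at each stage (assumed data)."""
    def check(percent: int) -> bool:
        return observed[percent] <= max_error_rate
    return check


stages = [5, 25, 50, 100]
good_run = staged_rollout(stages, error_rate_check({5: 0.002, 25: 0.003, 50: 0.004, 100: 0.004}))
bad_run = staged_rollout(stages, error_rate_check({5: 0.002, 25: 0.030, 50: 0.004, 100: 0.004}))
```

In the second run the error rate at the 25% stage exceeds the threshold, so the rollout stops and every user is returned to the old code.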

According to their technology blog, Netflix further refines this process. Rather than compare the performance of the canary servers with the existing production servers, they create new instances of the existing servers as well as the canary servers. This so-called Baseline cluster is the same size as the Canary cluster. The performance of the Canary cluster is compared with the baseline. This gives them a directly comparable set of results against a clean setup. Importantly, it avoids potential issues caused by long-running processes in the production cluster.
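A comparison of this kind might look something like the sketch below. It is not Netflix's actual analysis: the choice of mean latency as the metric, the 10% tolerance, and the `canary_passes` function are assumptions made purely for illustration. The point is that the canary is judged against a freshly started baseline of the same size, not against the long-running production cluster.

```python
from statistics import mean
from typing import Sequence


def canary_passes(baseline_latencies_ms: Sequence[float],
                  canary_latencies_ms: Sequence[float],
                  tolerance: float = 0.10) -> bool:
    """Return True if the canary's mean latency is within `tolerance` of the baseline's."""
    baseline = mean(baseline_latencies_ms)
    canary = mean(canary_latencies_ms)
    return canary <= baseline * (1 + tolerance)


# Hypothetical samples collected from equally sized baseline and canary clusters.
baseline_samples = [100.0, 110.0, 90.0]
acceptable = canary_passes(baseline_samples, [105.0, 100.0, 102.0])
too_slow = canary_passes(baseline_samples, [130.0, 120.0, 125.0])
```

A production system would compare many metrics and use a proper statistical test, but the structure of the decision is the same.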

Companies such as Amazon use this approach all the time. When they launch new features, these are tested on an increasingly large scale, starting with one server, then a rack, then a pod, then a whole datacenter, and finally a whole availability zone. Here at Functionize, we have developed a new intelligent Canary Testing solution. We described the technology in a blog post last October, and you will be able to try it out soon.

User monitoring and instrumentation

Decent monitoring and instrumentation are critical requirements for production testing. A couple of months back I heard a talk from Björn Rabenstein, an SRE at SoundCloud and one of the main developers of Prometheus. SoundCloud developed Prometheus in-house as a full production monitoring system, which is now released under the Apache 2.0 license. Björn highlighted that without such a system both DevOps and Site Reliability Engineering are next to impossible.

Another critical requirement is proper frontend instrumentation. This includes several things. Instrumenting the code allows you to record user interactions, which is important for identifying whether users are behaving as you expected. Logging API calls allows you to analyze how the frontend is being used. Analyzing crash logs is vital for spotting previously missed bugs. And, as our new Canary Testing solution demonstrates, you can now automatically track and predict user journeys.
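As a rough illustration of what recording user interactions makes possible, the sketch below keeps an in-memory event log and checks which users followed an expected journey. The `EventLog` class and the event names are hypothetical; a real frontend would send these events to an analytics or monitoring backend.

```python
from collections import Counter, defaultdict
from typing import List, Set


class EventLog:
    """Hypothetical in-memory store of (user, event) pairs in arrival order."""

    def __init__(self):
        self._events = []

    def record(self, user_id: str, event_name: str) -> None:
        self._events.append((user_id, event_name))

    def counts(self) -> Counter:
        """How often each event occurred, across all users."""
        return Counter(name for _, name in self._events)

    def users_completing(self, journey: List[str]) -> Set[str]:
        """Users whose events contain every step of `journey`, in order."""
        per_user = defaultdict(list)
        for user_id, name in self._events:
            per_user[user_id].append(name)
        completed = set()
        for user_id, names in per_user.items():
            remaining = iter(names)
            # `step in remaining` advances the iterator, so steps must appear in order.
            if all(step in remaining for step in journey):
                completed.add(user_id)
        return completed


log = EventLog()
for user, event in [("u1", "view_product"), ("u1", "add_to_cart"), ("u1", "checkout"),
                    ("u2", "view_product"), ("u2", "checkout")]:
    log.record(user, event)

expected_journey = ["view_product", "add_to_cart", "checkout"]
followed_journey = log.users_completing(expected_journey)
```

Here only `u1` followed the expected journey; `u2` reached checkout by a route you did not anticipate, which is exactly the kind of behaviour instrumentation is meant to reveal.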

Beta testing and A/B testing

Beta testing is hardly a new concept. But what you might not realize is that it is actually a form of Shift Right thinking. During beta testing, you are releasing potentially-unstable code to users and getting them to test it under real conditions. Often the early versions of beta software include features that are then removed, or extensively reworked. Users understand they are receiving unstable code, and they are encouraged to give useful feedback. Google is notorious for keeping products in beta for months if not years. Gmail, for instance, was in beta for over 5 years.

Dark Launching is about testing a new feature to see how it performs. A/B testing is used when you have two distinct ideas you want to compare. The concept of A/B testing comes from the worlds of product design and Human-Computer Interaction. For instance, you might want to test whether it is better to have your Buy Now button at the top of the page or the bottom. So, you create two versions of your frontend code and release them to selected groups. Ideally, these groups should be the same size and should be representative of your user base. You then see which option gives the better outcome (e.g. which button position generated more sales).
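The Buy Now example can be sketched as follows. This is a minimal illustration, not a complete experimentation framework: the function names, the experiment name, and the visitor and sales figures are all made up. Users are split into two groups by hashing their ID, and a standard two-proportion z-statistic gives a rough sense of whether the difference in conversion rates is more than noise.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str = "buy_button_position") -> str:
    """Deterministically place a user in group A or B, splitting users roughly evenly."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def z_score(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-proportion z-statistic; positive means B converted better than A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0  # no variation at all (everyone or no one converted)
    return (p_b - p_a) / std_err


# Made-up results: button at the top (A) versus button at the bottom (B).
z = z_score(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
significant_at_5_percent = abs(z) > 1.96
```

With these invented numbers the statistic is a little above 2, so the difference would count as significant at the conventional 5% level. With smaller groups, the same rates would not.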

Combining Shift Left and Shift Right

Shift Right might seem to be the exact opposite of Shift Left. However, if you don’t have good quality code, you can’t ever expect to be able to Shift Right. Shift Right is also about speeding up your whole development cycle. Both of these require you to adopt Shift Left thinking. So I would argue that Shift Left is a prerequisite for Shift Right.

The central message of both Shift Left and Shift Right is that testing should be done throughout the product cycle. This doesn’t imply Test-Driven Development (though that may well be useful for you). What it means is that at every stage you should be aware of the importance of testing. One key thing is to design your system for testability. Shift Right requires good instrumentation, so this needs to be incorporated from the start. Shift Right also needs code that doesn’t trigger regressions, so the use of automated regression testing is vital. Similarly, CI/CD requires Shift Left, as we discussed in a previous blog. Ultimately, everyone in the company needs to buy into the testing process.

Conclusions

Unlike software development methodologies, Shift Right is a collection of different approaches. Which ones are right for you will depend on your circumstances. These ideas and approaches are not new. Most of them were developed by the big SaaS and IaaS providers. Their scale and size require them to approach things differently, and their customers react badly when they get it wrong. As a result, they have come up with strategies that are proven to work. However, it is only recently that smaller companies have started to adopt similar ideas. Nowadays, even quite small companies can benefit from this approach.