Cloud-native applications function differently from the software you deploy on-site. Testing them works differently, too.
As enterprise computing shifts to the cloud, so does software development, along with the tools and business processes that support it. That means your approach to testing software has to change as well.
The term “cloud-native application” has muddy and sometimes contradictory definitions. For the purposes of this article, I default to the description from the Cloud Native Computing Foundation, founded in 2015 by the Linux Foundation: Cloud-native applications use an open-source software stack; they are containerized, with each part of the application packaged in its own container; dynamically orchestrated, so containers are actively scheduled and managed to optimize resource utilization; and microservices-oriented, to increase the overall agility and maintainability of applications.
Because of their dynamic nature, cloud-native applications run differently than their on-premises elders do. Some things don’t change, of course, such as attention to user experience and responsiveness to users’ needs. But other differences between cloud-native applications and those running in your data center suggest that you should rethink the way you look for defects. Here’s a summary of the testing adjustments you should consider.
Use DevOps methodologies
In a cloud application development environment, teams work in parallel instead of working independently of each other. Each team works on different branches of an application or project, says Jason Bloomberg, president at Intellyx, a cloud computing consultancy.
That introduces a new set of challenges for integration testing, which Bloomberg says is addressed by adopting a Continuous Integration/Continuous Delivery (CI/CD) approach. Traditionally, people think in waterfall terms for testing. “But in the cloud it’s a continuous test phase,” he says. “Whatever you are doing with the code involves testing, and more testing has to take place. The cloud native approach says we have to test all the time, in an inherently dynamic and ephemeral way.”
Isaac Sacolick, president of consultancy StarCIO, echoes the importance of DevOps. “Development teams deploying to the cloud are more likely to pick DevOps tools and implement CI/CD and infrastructure as code (IaC),” he says. Developers also must invest in continuous testing. “It’s part of the DevOps culture, and the tools are more readily available for cloud-native applications.”
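The “test all the time” idea can be sketched as a fail-fast chain of stages, the way a CI/CD pipeline runs them on every commit. This is a minimal illustration, not any particular CI tool’s API; the stage names are hypothetical.

```python
# Minimal sketch of continuous testing in a CI/CD pipeline: every change runs
# through an ordered chain of test stages, stopping at the first failure.
def run_pipeline(stages):
    """stages: ordered list of (name, check) pairs; each check returns True/False."""
    results = {}
    for name, check in stages:
        results[name] = check()
        if not results[name]:
            break  # fail fast: later stages never run against a broken build
    return results

# Hypothetical stages standing in for real test suites:
stages = [
    ("unit", lambda: True),
    ("integration", lambda: True),
    ("end_to_end", lambda: True),
]
print(run_pipeline(stages))
```

The fail-fast ordering is the point: cheap unit tests gate the slower integration and end-to-end suites, so feedback stays continuous rather than waterfall-shaped.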
Test with a component infrastructure in mind
Cloud-native applications operate in a dynamic, elastic, and distributed environment. They scale up for increased capacity and scale down when demand falls off. Moreover, the applications are loosely coupled, so they do not depend on specific infrastructure components.
For example, SOA and microservices generate more individual components. For testers, that means, “You have to test each component individually as well as end-to-end workflow. That adds a new layer of decision making at a micro level and more end-to-end level testing,” says Sacolick.
Because cloud-native apps use so many services, you have to test against each individual service, turning them on and off, just as it occurs in the real world.
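One way to exercise that on/off behavior is to test a component with its dependency stubbed in both states: reachable and unreachable. This sketch is illustrative, with made-up service and function names, not a real service client.

```python
# Hypothetical sketch: a component that depends on a downstream microservice,
# tested with the service both available ("on") and unavailable ("off").
class StubCatalog:
    """Stand-in for a real catalog microservice."""
    def __init__(self, available=True):
        self.available = available

    def lookup(self, sku):
        if not self.available:
            raise ConnectionError("catalog service is down")
        return {"sku": sku, "price": 9.99}

def get_price(catalog, sku):
    """Component under test: degrades gracefully when its dependency is off."""
    try:
        return catalog.lookup(sku)["price"]
    except ConnectionError:
        return None  # fall back instead of crashing

assert get_price(StubCatalog(), "A-100") == 9.99                  # service on
assert get_price(StubCatalog(available=False), "A-100") is None   # service off
```

The same pair of assertions, repeated per dependency, approximates “turning services on and off, just as it occurs in the real world.”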
Test the software as a collective harmonic. “Think of it as a symphony,” says Sabourin. “Imagine tuning the orchestra without the brass, then with the brass but without the woodwinds.”
Testing a collection of resources, such as dozens of containerized applications, means you have to test each application against every resource as well as against the other applications. For example, in a scenario where 20 containers all use the same database, you should test how the database responds to every permutation of the collection. You might find, for example, that the combination of containers 2, 4, 12, and 19 results in a significant performance problem. You will never know until you test against every possible combination.
Cloud-native software also changes the balance between functional and non-functional testing. Sure, you still care that the software meets user requirements. But, says Bloomberg, “Cloud-native application development places an additional emphasis on non-functional testing, where you ensure deployed software meets the non-functional requirements for scalability, flexibility, and resilience.”
Get a troop of chaos monkeys
Netflix introduced “chaos monkeys” for its system testing. On its live production system, not a test environment, Netflix randomly kills processes and watches how the system recovers. The goal of chaos monkeys is to test resiliency so you can be confident that your applications can tolerate random instance failures. (The Chaos Monkey code is open source, in case you want to explore.) Customers are asked if they want to take part and can opt in, so no one is an unwilling guinea pig.
In other words, Netflix randomly causes chaos. “If you are doing your AWS job correctly, you are building in the failsafe to make sure it recovers. Forcing it to fail is one way to test how well it recovers,” says Rob Sabourin, software engineering consultant with AmiBug.com and adjunct professor at Montreal’s McGill University.
Let’s say you have a microservices environment with a thousand servers. “Go test a live system and crash the server on purpose,” Sabourin suggests. “If your system is designed well, another server will take over.”
This isn’t something you do on a staging server. For failure mode testing, you must be unafraid to do it on live systems. (Though obviously, not in situations where the test creates a life-threatening risk.)
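The kill-and-observe loop can be simulated in a few lines. This is a toy model of the idea, not Netflix’s Chaos Monkey; the fleet, server names, and resilience threshold are all made up for illustration.

```python
import random

# Toy chaos-monkey sketch: randomly kill one instance in a simulated fleet,
# then check that enough replicas survive to take over the load.
def chaos_step(fleet, rng):
    """Remove a random instance from the fleet, simulating an instance failure."""
    victim = rng.choice(sorted(fleet))  # sort for a reproducible choice order
    fleet.discard(victim)
    return victim

def is_resilient(fleet, min_replicas):
    """A well-designed system still meets its replica floor after a failure."""
    return len(fleet) >= min_replicas

fleet = {f"server-{i}" for i in range(5)}
rng = random.Random(42)  # seeded for repeatable experiments
killed = chaos_step(fleet, rng)
print(killed, is_resilient(fleet, min_replicas=2))
```

In a real environment, the “discard” is an actual instance termination and the resilience check is whether traffic keeps flowing, which is exactly why it only proves anything when run against live systems.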
Beware fluid resources
Traditional on-premises testing is done against a known quantity of server resources. You know the server where the application resides, along with its CPUs, memory, and network bandwidth, and you can test against those expectations.
Cloud-native applications have the added challenge of never knowing just what kind of resources you are dealing with. There might be a particularly high load on your provider, or you might have to deal with a proverbial “noisy neighbor.”
As a result, the dynamic nature of both the software and the cloud creates unknown behaviors that are difficult to plan for, and it’s hard to test against these abnormalities.
“In traditional testing you know how the infrastructure will behave. In cloud-native application development, you don’t know how it will behave and there are things you just can’t know about when writing code. So testing includes knowing cloud-native system requirements in production,” says Bloomberg.
The testing tools need to shift to observability, so they understand the behavior of the application, Bloomberg says. Observability leverages logs, traces, metrics, and events to provide SREs or other operators with active control over the operational environment, so they can identify and mitigate issues quickly, or ideally, prevent them from impacting users. “Ask, what is this software supposed to do? Then fix those problems,” he says.
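A small example of turning raw telemetry into something an operator can act on: deriving an error-rate metric from structured log events. The event shape and field names here are hypothetical, standing in for whatever your logging pipeline emits.

```python
# Minimal observability sketch: derive an error-rate metric from structured
# log events so an operator can alert on it before users are affected.
def error_rate(events):
    """events: iterable of dicts like {"level": "error", "msg": "..."}."""
    events = list(events)
    if not events:
        return 0.0
    errors = sum(1 for e in events if e.get("level") == "error")
    return errors / len(events)

logs = [
    {"level": "info", "msg": "request served"},
    {"level": "error", "msg": "timeout calling payments"},
    {"level": "info", "msg": "request served"},
    {"level": "info", "msg": "request served"},
]
print(error_rate(logs))  # 0.25
```

Real observability stacks do this aggregation continuously across logs, traces, and metrics; the point is that the signal comes from production behavior, not from a test plan written in advance.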
Consider rollback issues
Services like Kubernetes permit you to dynamically change the versions of containers. The result is that developers working with cloud applications need to consider versioning in a different way, notes Sabourin.
When you deploy software, you are trusting that you can go back to an older version, Sabourin points out, and you may be running multiple versions of an application at the same time. “The question then becomes, ‘Can I roll back?’” he says.
That can get complicated. If version B of a container holds a data set not found in version A or version C, what happens to that data when you roll back? The takeaway: If you test a multi-version scenario, keep track of the changes from one version to the next.
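One way to track that is a rollback pre-check: compare what the newer version writes against what the older version understands. This is an illustrative sketch with invented function and field names, comparing schemas as simple field sets rather than modeling a real migration tool.

```python
# Sketch of a rollback pre-check: any field the newer version writes that the
# older version does not know about would be orphaned (or lost) on rollback.
def orphaned_fields(new_version_fields, old_version_fields):
    """Return the fields only the newer version understands."""
    return set(new_version_fields) - set(old_version_fields)

def rollback_is_safe(new_version_fields, old_version_fields):
    return not orphaned_fields(new_version_fields, old_version_fields)

# Version B added a "tags" field that version A never had:
print(orphaned_fields(["id", "name", "tags"], ["id", "name"]))  # {'tags'}
```

Running a check like this per version pair is a cheap way to turn “keep track of changes” into an automated test gate before a rollback is attempted.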
A shift in importance
The cloud is a different world from software deployed to a datacenter, which means some things grow in importance while others fade. Among the nuances are security and penetration testing.
“The stakes are a little higher,” says Sacolick. “There are a few more risks when you put things in the cloud that you have to test for, like security, depending on whether the application is open-ended [user facing] or closed [non-user accessible].”
However, there are tradeoffs, including matters where you can give up personal responsibility for testing, such as in layers of infrastructure, Sacolick adds. “Serverless computing means you have no responsibility for the operating system and hardware infrastructure. You get an endpoint and deploy a code and everything is done for you. That’s fewer things you have to test for,” he says.
by Andy Patrizio
Andy Patrizio has been covering the Silicon Valley (and everywhere else tech) for more than 25 years, having written for a range of publications that includes InformationWeek, Network World, Dr. Dobbs Journal, and Ars Technica. He is currently based in Orange County, California.