5 survival tips for when software testing accidents happen

Serious defects happen, despite the best efforts of QA departments. Here are some software testing tips learned from real-world screw-ups.

October 27, 2020
Tamas Cser

Serious defects happen, despite the best efforts of QA departments. Here’s what you can learn from real-world screw-ups, so you can avoid making the same mistakes.

Accidents in software development and testing have brought headlines ranging from hilarious to horrific. Back in 2013, for example, millions of PayPal users were tickled to learn that the long-standing online payment system had erroneously awarded more than $92 quadrillion – yes, quadrillion – to another user. Yet software mishaps have also been blamed for devastating breaches at credit rating services, massive billing mistakes by health insurers, and, tragically, even fatal airplane crashes.

The rise of mobile devices and software apps has also helped bring software issues more into public view. On sites like Apple’s App Store and Google Play, consumers rate mobile apps and sometimes leave bug reports, thereby participating in software testing themselves, albeit after the app has already seen the proverbial light of day.

Software testing accidents happen all the time, and most go unheralded except by company managers, software developers and testers, and any end users directly impacted. 

Generally speaking, companies are tightlipped about the causes of software flaws, even when the error gains public attention. If you’ve kept up with the progress of any mobile apps in online stores, you’ll see that a bug related to this-or-that was fixed in a subsequent release, although nobody would tell you why the bug got there or why it wasn’t eradicated earlier through the company’s quality assurance testing. 

If you’re a software developer or tester, you already know that bugs can slip through the cracks at any of the manifold stages of software testing. There can’t be a tech person on Earth, however, who’s experienced every single sort of software disaster. 

Here are five tips gleaned from lessons learned by seasoned testers and developers, along with the nitty-gritty details on real-life examples of software projects gone awry. Let’s take the positive approach here, with the useful takeaways, rather than indulge wholly in schadenfreude.

Tip #1: Enforce a culture of testing at every level

“Software testing is not foolproof by any means. Apple, Microsoft, Zoom, and almost any company with an app in an app store keeps releasing updates with fixes to holes and flaws in both functionality and security,” maintains David Galownia, CEO of the Slingshot software and app development firm.

“Even when companies have rigorous testing processes, problems can slip through anyway,” Galownia adds. “You’re talking hundreds of thousands of lines of code and untold numbers of different ways users can use the software. It’s impossible to test them all.” 

Even when a software error is relatively innocuous and rarely evident, chasing it down through testing can be a highly frustrating experience. Galownia recalls a time early in his company’s history when testers kept trying, again and again, to discover the cause of a bug in a client’s software.

“For a long time, we couldn’t duplicate the bug no matter what we did. We tested every scenario we could think of, tested with multiple browsers, and put multiple eyes on it,” Galownia remembers. “When we eventually did stumble on to the cause of the error, we found it happened during a unique series in which the user went forward exactly seven steps in a process, then went back exactly four steps using the backspace button, and then went forward clicking something else. Once we finally found that out, it was an easy fix.” 
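Once a hard-won reproduction sequence like that is pinned down, it pays to capture it as an automated regression test so the bug can’t quietly return. Here is a minimal sketch using Playwright for Python; the URL, button labels, and assertion are hypothetical stand-ins, since the client’s actual application isn’t described.

```python
# Hypothetical regression test replaying the exact reproduction sequence:
# forward seven steps, back four steps, then forward by clicking something else.
# URL and selectors are illustrative only.
from playwright.sync_api import sync_playwright

def test_forward_seven_back_four_then_continue():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example.com/wizard/start")  # stand-in for the client app

        for _ in range(7):           # go forward exactly seven steps
            page.click("text=Next")

        for _ in range(4):           # go back exactly four steps
            page.go_back()

        page.click("text=Save")      # then click something else going forward

        # The original defect surfaced at this point; assert the page stays healthy.
        assert "error" not in page.title().lower()
        browser.close()
```

Even if the fix itself is easy once found, a replayed sequence like this documents the edge case for the next person who touches that flow.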

Due to all the potential complexities involved, the CEO recommends “enforcing a culture of testing at every level.”  Testing should start with “developers testing their own code and only saying they’re done when they themselves believe they have no bugs,”  Galownia elaborates. 

“Ideally, at that point a QA analyst is testing, but project and product managers should also test. Designers even should test and finally the client or end users should test in an alpha or beta setting,” Galownia says. The key is to have many different types of people who come at your software from different angles.

Tip #2: Negotiate test deadlines

We don’t live in an ideal world, though. All too often, projects aren’t given as much time for testing as developers and testers think they need.

In one particularly egregious case, a major statewide health insurance company rolled out a new software system before it was ready to go full scale. As a result, more than 25,000 customers were enrolled in the wrong health care plans; the problem became public when a company whistleblower approached a local TV network affiliate. Documents suggested that the insurance company’s management had been aware of technical issues and testing delays for months, but got impatient and deployed the software anyway.

Alan Zucker, founding principal at Project Management Essentials, once grappled with his own time crunch. He was managing the development of a specialized accounting program that integrated the processing of a dozen applications. 

The accounting program Zucker worked on required building a processing hub that received and shared information with the other applications. The accounting process ran daily, weekly, and monthly, with specific timing requirements.

“Our project timeline was very short because the company needed to quickly enable this capability,” Zucker says. “We did not have time to run an end-to-end monthly test that would allow us to replicate the process. So we created a testing strategy where we simulated the processing cycle.  Rather than having the process triggers execute based on the cycle clock, we manually executed the workflow.” 

So far, so good. In test, everything worked fine. But executing the cycle for the first time in production turned out to be a different matter. “Everything fell apart. In order to have the information loaded into the application and available to the accountants by 8:00 a.m., the first process executed at midnight and the second process kicked off at 2:00 a.m. Well, the first process did not take 30 minutes as expected. It took over two hours to complete,” Zucker recalls.

“The cascading impacts were disastrous,” Zucker says. “We had to take the application out of production and start over. We made the decision to execute a full end-to-end test simulating the actual daily, weekly, and monthly processing cycles.  We spent nearly two months fixing the bugs and retesting the application suite.” 
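One thing the simulated cycle could never show was whether the first process actually fit inside the two-hour window before the dependent job kicked off. That is exactly the kind of check a full end-to-end run provides. The sketch below is purely illustrative; run_nightly_load() is a hypothetical stand-in for the real processing-hub workload.

```python
# Hypothetical end-to-end timing check: does the midnight load finish before
# the dependent 2:00 a.m. process is due to start?
import time
from datetime import timedelta

FIRST_PROCESS_START = timedelta(hours=0)    # midnight kickoff
SECOND_PROCESS_START = timedelta(hours=2)   # dependent job at 2:00 a.m.

def run_nightly_load():
    """Stand-in for the real processing-hub workload."""
    time.sleep(1)  # the actual production run took over two hours

def test_first_process_fits_its_window():
    started = time.monotonic()
    run_nightly_load()
    elapsed = timedelta(seconds=time.monotonic() - started)

    window = SECOND_PROCESS_START - FIRST_PROCESS_START
    assert elapsed < window, f"nightly load took {elapsed}, window is {window}"
```

A manually triggered workflow can pass a check like this trivially on a small test data set; only realistic volumes and real scheduling expose the gap Zucker’s team hit in production.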

Zucker learned three lessons from this debacle:

  1. Creating a robust and well-considered testing strategy early in the project is essential when building a complicated system;
  2. Simulating a process cycle using test data is not a substitute for real end-to-end testing;
  3. Compressing project schedules to meet externally mandated timelines can be disastrous. Project managers need to negotiate deadlines and expectations with their project sponsors.

Tip #3: Users come first

Software problems are often caused by not enough testing, agrees David (Grue) DeBry, CTO and co-founder at web startup company Volley. Yet the opposite problem – too much testing – can lead to needless product delays.

Culprits behind excessive testing are often “code cowboys,” or “engineers going off the rails by throwing out code,” says DeBry, whose 20 years of industry experience also includes visual effects development for a string of Hollywood movies.

The “code cowboys” can become so driven by their own needs for meeting defined engineering processes that they lose sight of why the software is being developed in the first place. 

"Unit tests. Smoke tests. Regression tests. The list goes on,” DeBry says. “There's constant feature improvement. You're unable to move forward, and testing can cause other things to happen." 

When in “cowboy mode,” engineers try to eliminate technical debt, or what results when software teams expedite the delivery of a piece of functionality or a project that later needs to be refactored. In other words, they balk at prioritizing speedy delivery over perfect code. Cowboys want to “pay back the tech debt now,” according to DeBry.

Yet teams should instead focus on the people who will use the software, DeBry maintains. "Users are humans, and engineers maybe not so much. You don't just write code. Efforts need to be organized around what to test and where. What’s important to do? What are the users’ priorities? There's the argument that you don't spend time testing for something if the user doesn't care about it."

Tip #4: Don’t neglect the negative

“Only testers who do nothing make no mistakes,” echoes Bartek Nowakowski, test lead at career website Zety. “When testing big apps there are a lot of connections between functionalities, so it is easy to miss some of them.”

Zety has run into complications by not performing negative testing, a form of testing that checks how software behaves under unexpected conditions, such as when a user types letters instead of numbers into a phone number field.

“So even if functionality works as intended there can be circumstances in which incorrect or unpredicted user behavior creates an edge case situation,” says Nowakowski. “These can occur for months before someone realizes there's an issue.”
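As a concrete illustration, a negative test deliberately feeds the software the input the happy path never covers. The sketch below uses pytest with a hypothetical validate_phone_number() helper; Zety’s actual fields and validators aren’t public, so every name here is illustrative.

```python
# Hypothetical negative tests for a phone number field, written with pytest.
import re
import pytest

def validate_phone_number(value: str) -> bool:
    """Accept digits, spaces, and dashes, with an optional leading '+'."""
    return bool(re.fullmatch(r"\+?[\d\- ]{7,15}", value))

def test_valid_number_is_accepted():
    # Positive case: the path most suites already cover.
    assert validate_phone_number("+1 555-123-4567")

@pytest.mark.parametrize("bad_input", [
    "abcdefg",          # letters instead of numbers
    "555-CALL-NOW",     # mixed letters and digits
    "",                 # empty submission
    "phone: 5551234",   # a label pasted in along with the number
])
def test_unexpected_input_is_rejected(bad_input):
    # Negative cases: incorrect or unpredicted user behavior.
    assert not validate_phone_number(bad_input)
```

Each parametrized case is cheap to add, and each one documents an edge case that would otherwise surface only in production, possibly months later.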

Tip #5: Doing live testing? Warn your users!

A variety of industry best practices can and do get overlooked in the heat of the moment during a software crisis.

“After some changes to our database, we wanted to test the new elements. Unfortunately, there was an error that severed the connection between the backend and frontend,” says Reuben Yonatan, CEO of GetVoIP. “We couldn't connect to the database. It took a while, but our IT team found the bug and fixed it. However, we suffered disruption to our services.” 

To GetVoIP, the incident underscored the need to always be prepared with a backup when running a software test. “That way, if something goes wrong, you can shift to the backup system and avoid service interruption,” notes Yonatan. 

What else was learned? “If you’re testing software that is live, informing your users of a possible delay in access to services will help you to avoid losing customers.”

One way to avoid mistakes is to ensure that your software testing suite is up to date and scales appropriately in every environment. Why not take a few moments to explore the Functionize product? We believe we can help.

by Jacqueline Emigh

Jacqueline Emigh (pronounced “Amy”) is an award-winning journalist specializing in technologies used by enterprises, small businesses, and consumers. She has worked full time as an editor for TechTarget, BetaNews, and Ziff Davis. Her stories have also appeared in dozens of other major tech publications, including CIO, Linux Planet, and PC World. Jacqueline holds a B.S. degree in mass communications from Emerson College, with a journalism concentration. From 2017 to the present, she has served as a contributing editor to SD (Software Development) Times.