NLP – Teaching A Computer To Understand Test Plans

Learn how you can teach a computer to understand test plans by using natural language processing. Here is how Functionize does NLP-based test automation!

April 1, 2019
Tamas Cser

How it works and how you can help improve it

Functionize NLP uses natural language processing to convert a test plan into functional test automation. In this blog, we give more detail on how this works and present our vision for the future of NLP.


We introduced our Natural Language Processing engine last year. Since then, customers often ask us how they should write their tests for NLP. Should they use structured language with keywords or is natural language OK? The answer to this isn’t completely straightforward. So, in this blog, we will attempt to explain where we are and where we are planning to go with natural language.

What is Natural Language Processing?

We use natural language processing and machine learning to take your test plan written in English and convert it to a functional test script. When we first released NLP, we suggested that plans written using keywords would work best. For instance, a typical structured plan for NLP might look like this:

  • Click on ‘Buy Now’ button.

This is because our system is based on machine learning. During training, we taught the system to understand and recognize certain keywords like “verify”, “input”, “click”, etc. As a result, when it encounters those, it already knows what to do. However, we have a vision for NLP that goes far beyond this. Our aim is to have a truly autonomous test agent that understands anything and everything you say to it.
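To make the keyword idea concrete, here is a minimal sketch of how a structured step could be mapped to an action. It is not Functionize's implementation, which is learned rather than hand-coded. The `KEYWORD_ACTIONS` table, the `Step` type, and `parse_step` are hypothetical names chosen for illustration only.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword table (an assumption): leading verb -> action name.
KEYWORD_ACTIONS = {
    "click": "click",
    "verify": "verify",
    "input": "input",
}


@dataclass
class Step:
    action: str
    target: str


def parse_step(line: str) -> Optional[Step]:
    """Return a Step if the line starts with a known keyword, else None."""
    words = line.strip().rstrip(".").split(maxsplit=1)
    if not words:
        return None
    action = KEYWORD_ACTIONS.get(words[0].lower())
    if action is None:
        # Unknown leading word: a keyword-only parser cannot handle this step.
        return None
    rest = words[1] if len(words) > 1 else ""
    # Prefer a quoted element name (curly or straight double quotes) if present.
    quoted = re.search(r"[‘“\"](.+?)[’”\"]", rest)
    target = quoted.group(1) if quoted else rest
    return Step(action, target)


print(parse_step("Click on ‘Buy Now’ button."))
print(parse_step("Make sure sensible results appear."))
```

The second call returns `None` because "Make" is not in the keyword table. That is exactly the limitation that motivates moving beyond keywords.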

Using unstructured text

NLP can already take in unstructured test plans that look more like:

  1. Check the correct logo and menu items are shown (Summer, Birthday, Sympathy, Occasions, Flowers, Plants, and Gift Baskets & Food).
  2. Search for “emoticon gifts”.
  3. Make sure sensible results appear and click on the first one.
  4. Check you go to the purchase page.
  5. Click “Don’t know the recipient’s delivery address? Click here”.
  6. Now enter your details …

But our goal is for you to be able to say “Go to the site, check it’s working, and order an emoticon gift to an unknown delivery address”. This may seem like a utopian dream, but we believe it is well within reach. The thing is, in order to reach this goal, we need our system to be fed with more training data. This means we are happy when customers upload unstructured English test plans.

Of course, the problem with such plans is there will be ambiguity, and this is where we ask for your help. If you upload a plan that the system is confused by, it will ask you to verify what you really meant. By doing this, you are training it to understand how you talk about your system, as well as improving its overall understanding. Let’s take a step back and look at why language can be so ambiguous.

The ambiguity of language

Language is a complex beast. English is especially complex, given its roots in both Anglo-Saxon and Medieval French. The result is that English can be extremely ambiguous and abstract. The higher the level of abstraction, the greater the ambiguity. Let’s look at an example relevant to testing. If you have an experienced test engineer, you can tell her “test the login flow”. Using her experience, she will know what is expected. Now imagine that you have a new engineer in training. For him, you might need to explain things a bit more:

“Go to the homepage, find the registration form and create a test user. Try using an invalid email, then a valid one. Also, check that the password security check works by setting a simple password. Now use this test user to try logging in. Remember to test what happens when you enter the wrong credentials.”

As you can see, you have to explain things in much more depth. This is because “test the login flow” doesn’t include the detail that you need to test all the ways that flow can break. An experienced engineer knows that; the junior engineer doesn’t.

Now, imagine that you are trying to explain this to your Grandma who isn’t familiar with using computers. What extra detail do you need to add now? Maybe you need to explain where the registration and login buttons are. You definitely need to explain what makes an email address invalid. Likewise, you will need to explain about the password security and how to test that.

Natural versus programming languages

This ambiguity has long been an issue for developers. Carefully hand-written machine and assembly code can produce the most efficient programs in terms of size and speed. However, even the most skilled assembly programmer can’t write code very fast. As a result, higher and higher levels of abstraction have been added as programming has evolved. For imperative code, C is more abstract than assembly code and C++ is more abstract still. The aim at each stage has been to move towards something that is easier for humans to understand, and to then use the compiler, assembler, and linker to generate the underlying machine code.

Natural language processing

NLP is actually one of the oldest branches of computer science. People have been working in the field for at least 70 years. However, for much of that time, human language was simply too complex for the computers of the day to understand. Over recent years that has changed dramatically. This has been driven in large part by the massive improvements in computing power brought about by the cloud.

NLP works by first parsing the text, then trying to extract the underlying meaning. This is a multi-step process. It includes identifying the part of speech of each word, such as noun, adjective, or verb, along with any tenses or cases. It then uses a set of grammar rules it has learned to work out how the words are associated. Finally, it uses a machine learning model to work out the most likely meaning of the sentence.
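The three stages described above can be sketched in a few lines. This is a toy illustration under heavy assumptions: a hand-written `LEXICON` stands in for a trained part-of-speech tagger, a single hard-coded rule stands in for learned grammar, and the `MEANING_SCORES` table stands in for a machine learning model. All names and numbers are hypothetical.

```python
# Toy lexicon standing in for a trained part-of-speech tagger (an assumption).
LEXICON = {
    "click": "VERB", "search": "VERB", "check": "VERB", "enter": "VERB",
    "the": "DET", "a": "DET",
    "first": "ADJ", "blue": "ADJ",
    "button": "NOUN", "result": "NOUN", "logo": "NOUN", "page": "NOUN",
    "on": "ADP", "for": "ADP",
}

# Hypothetical scores a trained model might assign to each reading of a verb.
MEANING_SCORES = {
    "click": {"click": 1.0},
    "check": {"verify": 0.8, "tick_checkbox": 0.2},
}


def tag(sentence):
    """Stage 1: split into tokens and label each with a part of speech."""
    tokens = sentence.lower().strip(".").split()
    return [(tok, LEXICON.get(tok, "UNK")) for tok in tokens]


def extract_action(tagged):
    """Stage 2: one simple grammar rule. The first verb is the action;
    adjectives and nouns after it form the object phrase."""
    verb = None
    obj = []
    for tok, pos in tagged:
        if verb is None:
            if pos == "VERB":
                verb = tok
        elif pos in ("ADJ", "NOUN"):
            obj.append(tok)
    return verb, " ".join(obj)


def most_likely_meaning(verb):
    """Stage 3: pick the highest-scoring reading of the verb, if any."""
    scores = MEANING_SCORES.get(verb, {})
    return max(scores, key=scores.get) if scores else None


def interpret(sentence):
    verb, obj = extract_action(tag(sentence))
    return most_likely_meaning(verb), obj


print(interpret("Click on the first result."))
print(interpret("Check the logo"))
```

Note how "check" has two plausible readings in the scores table. A real system would estimate such scores from training data rather than hard-code them.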

NLP systems can keep improving through a feedback-driven form of machine learning, in the spirit of reinforcement learning. With this, the system is constantly learning from what it gets right and what it gets wrong. This is much the same way children learn language. Probably the best example to show how this works is Amazon Alexa. In the early days, Amazon asked users for feedback when Alexa made a mistake. Users could go to the app, indicate that she had made a mistake, and then provide details of what they actually said. The recording of the command was then returned to the Alexa team along with the corrections. This was then used to improve the service. Nowadays, Alexa has become more advanced and her models are pretty well trained. However, you can still indicate when Alexa was right or wrong in the app. This is then used to further refine the models being used.

So how does this impact Functionize NLP?

As we said above, our aim is to reach the point where our NLP engine will understand any commands you pass to it. We want to reach the stage where you can just say “Test the login flow”. Clearly, this will take a bit of time and we are not there yet. Just like autonomous driving, we still need a human behind the wheel to direct things from time to time.

In order to improve, we need much more training data and need your feedback to power the reinforcement learning. In other words, we still need a bit of human supervision from time to time. So, we would encourage you to submit test plans that are written in plain English. Sometimes, the NLP engine may struggle to understand your plan. In this instance, it will ask you to confirm what you meant. Each time you correct it, the engine learns more, both about your specific UI and about how tests are described in general. We have already taught it to be better than your elderly grandma. We’d say currently it is about at the level of understanding of a junior test engineer. But over time, we hope it will become a seasoned test engineer, or intelligent test agent, that just understands everything you ask of it.
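The confirm-and-learn loop described above can be sketched as follows. This is a simplified illustration, not the actual engine: it just counts human-confirmed readings per phrase. The `FeedbackInterpreter` class, the `ask_user` callback, and the `CONFIDENCE_THRESHOLD` value are all assumptions made for the example.

```python
from collections import defaultdict

CONFIDENCE_THRESHOLD = 0.75  # hypothetical cut-off below which we ask the user


class FeedbackInterpreter:
    def __init__(self):
        # counts[phrase][action] = times a human confirmed this reading
        self.counts = defaultdict(lambda: defaultdict(int))

    def predict(self, phrase):
        """Return (best_action, confidence) from the confirmations so far."""
        seen = self.counts.get(phrase, {})
        total = sum(seen.values())
        if total == 0:
            return None, 0.0
        best = max(seen, key=seen.get)
        return best, seen[best] / total

    def interpret(self, phrase, ask_user):
        """Use the learned reading if confident; otherwise ask and learn."""
        action, confidence = self.predict(phrase)
        if action is None or confidence < CONFIDENCE_THRESHOLD:
            action = ask_user(phrase)          # human confirms the meaning
            self.counts[phrase][action] += 1   # the correction becomes training data
        return action


# Usage: the first call has to ask; the second reuses what was learned.
questions = []

def ask_user(phrase):
    questions.append(phrase)
    return "verify"

engine = FeedbackInterpreter()
engine.interpret("make sure", ask_user)
engine.interpret("make sure", ask_user)
print(len(questions))
```

The design choice here is that the system only interrupts the user when it is unsure, so the amount of supervision needed should shrink as confirmations accumulate.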


Functionize NLP is already able to parse quite complex test plans and use these to generate tests. However, we want to reach the stage where it takes raw user journeys and creates a fully functional test suite. To get to this stage we need your input. We need you to help us as we train our intelligent test agent. The more training data it gets, the nearer we can get to our goal.