Selectors allow automated tests to interact with your UI. However, they cause most test failures. Here, we show why Functionize has moved beyond selectors.
Ever since Jason Huggins invented Selenium, test engineers have relied on selectors to interact with UIs under test. However, AI and the onset of modern intelligent test agents spell the death of the selector. In this blog, we see how selectors have evolved, becoming more reliable and complex than XPaths or CSS selectors. In the second part of this mini-series, we will explain why we aren’t writing selectors off altogether.
What is a selector?
Selenium was the first successful general-purpose UI testing framework. What made it so successful was separating the creation and scripting of the test from the actual UI interaction. The former is handled by your favorite programming language. The latter is handed over to the Selenium WebDriver. Conceptually, this separation was brilliant. It means all the test logic is left to the test script, while the WebDriver has just two simple tasks: locate and interact with elements on the web page, then report back the results. But how does it do that? How can you actually locate the correct element? How does WebDriver know the difference between the “buy now” button and the “log out” button? The answer is, by using selectors.
What types of selector exist?
There are numerous different selectors you can use in your scripts. Selenium officially supports the following list of selectors:
Name and ID-based selectors
- Id. This lets you locate an element by its HTML ID.
- Name. If you know the name attribute of an element, you can use this to select the element.
- Tag name. This locates an element by its tag name.
- Link text. This is used when you know the link text used within an anchor tag.
- Partial link text. As a variation on the above, you can use this when you know some of the text within an anchor tag.
- Class name. Here we rely on knowing the class attribute name of the element we want.
- CSS selector. This locates the first element matching a given CSS selector. More on this below.
- XPath. XPaths are used to locate elements within XML. More on this later.
The name and attribute-based selectors are relatively obvious. They are also, arguably, the least useful form of selector. The complex XPath and CSS selectors are far more flexible and useful.
XML, or extensible markup language, is a human-readable markup format closely related to HTML. XML documents are highly structured, but potentially quite complex. XPath queries allow you to locate any element within the document. In effect, the query lists the set of steps (or path) you need to follow to reach the correct location. The XPath might be absolute (starting from the document root). But more often, you use relative XPaths. That is, ones that start from the current location within the document structure.
XPaths follow a pretty simple syntax:
SelectedElement = //tagname[@attribute='value']
- ‘SelectedElement’ is a variable name you assign the element;
- ‘tagname’ is the type of HTML element you are looking for;
- ‘attribute’ is the way you are asking XPath to identify the element;
- ‘value’ is the specific value you want to match.
Let’s look at a simple example. Imagine you want to select a username field in your login form. This field has an ID ‘email’ and is of type input. So, your XPath looks like:
UserID = //input[@id='email']
XPaths can be much more complex than this. You can string together multiple elements, use logical operators, even navigate up the document structure using ‘parent’. In short, you can always create an XPath that will uniquely identify any element in your UI. This is true even if you are trying to locate 3rd party embedded content.
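As a rough illustration of the username example, the sketch below evaluates that XPath with Python's standard-library xml.etree.ElementTree instead of Selenium. The login form markup is invented for this example. ElementTree only supports a subset of XPath and expects paths relative to the element being searched, which is why the expression starts with a dot.

```python
import xml.etree.ElementTree as ET

# Hypothetical login form, written as well-formed XML so ElementTree can parse it.
LOGIN_FORM = """
<html>
  <body>
    <form id="login">
      <input id="email" type="text" name="username"/>
      <input id="password" type="password" name="password"/>
    </form>
  </body>
</html>
""".strip()

root = ET.fromstring(LOGIN_FORM)

# ElementTree wants a path relative to the element it searches from,
# so //input[@id='email'] is written as .//input[@id='email'].
user_id = root.find(".//input[@id='email']")
print(user_id.get("name"))

# Appending /.. steps up to the parent, here the enclosing form.
parent_form = root.find(".//input[@id='email']/..")
print(parent_form.tag, parent_form.get("id"))
```

In a real Selenium script the same expression would be handed to WebDriver, which evaluates it against the live page rather than a static string.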
CSS allows you to define the style of any element in your UI. It does this using CSS selectors to define elements. These selectors can also be used to choose elements for Selenium testing. CSS selectors come in three flavors.
Simple selectors include HTML elements like h1 and p.
Combinators allow you to specify relationships between elements. For instance the child operator: ul > li; or the adjacent sibling operator: h1 + img.
Pseudo-selectors include pseudo-classes, such as :active, and pseudo-elements, such as p::first-line.
The syntax is pretty straightforward:
tagname[attribute=value]
As with XPath, ‘tagname’ is the type of HTML element you are looking for; ‘attribute’ is the way you are asking CSS to identify the element; ‘value’ is the specific value you want to match.
For example, to find the log in button named ‘loginbutton’ your CSS selector is: input[name=loginbutton].
CSS selectors are great for finding elements within the current DOM. They are also the best choice for selecting transient elements, such as tooltips that only appear when the mouse is hovering over an element.
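To make the parallel with XPath concrete, here is a small sketch that translates a selector of the form tagname[attribute=value] into the equivalent XPath and runs it with Python's standard library. The helper css_attribute_to_xpath and the form markup are hypothetical and handle only this one selector shape. A real Selenium test would pass the CSS selector straight to WebDriver.

```python
import re
import xml.etree.ElementTree as ET


def css_attribute_to_xpath(css: str) -> str:
    """Translate tagname[attribute=value] into the matching relative XPath."""
    match = re.fullmatch(r"(\w+)\[(\w+)=(\w+)\]", css)
    if match is None:
        raise ValueError(f"unsupported selector: {css!r}")
    tag, attribute, value = match.groups()
    return f".//{tag}[@{attribute}='{value}']"


# Hypothetical form containing the login button from the example above.
form = ET.fromstring(
    '<form>'
    '<input type="text" name="username"/>'
    '<input type="submit" name="loginbutton" value="Log in"/>'
    '</form>'
)

xpath = css_attribute_to_xpath("input[name=loginbutton]")
button = form.find(xpath)
print(xpath)
print(button.get("type"))
```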
What is the problem with selectors?
Selectors are great. Right up to the point you start to create complex test scripts for your UI. At that point, you will start to run into some of the classic issues with selectors:
Every selector change breaks the script
Scripts are really dumb. You give your script a pattern to find and it will just return the first occurrence in your UI. Or it returns an array of all occurrences, but you have no idea which one is the one you need. This is a problem. Every time you change your UI, you can change the order in which elements appear. Or to put it another way, any change to the UI changes your selectors. This is sometimes referred to as brittleness of the selectors. It means all your test scripts need to be debugged and updated whenever you make a change. Unfortunately, this issue can’t be avoided with conventional selectors.
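The brittleness described above is easy to reproduce. In this minimal sketch, a positional selector picks the first button on a made-up page. Adding one unrelated button ahead of it is enough to make the same selector return the wrong element. The page markup and the function name first_button_text are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def first_button_text(page_source: str) -> str:
    """Return the label of the first button found by a positional selector."""
    root = ET.fromstring(page_source)
    return root.find(".//button[1]").text


# The page as it looked when the test was written.
before = "<body><div><button>Buy now</button><button>Log out</button></div></body>"

# The same page after a designer adds a Help button at the front.
after = (
    "<body><div><button>Help</button><button>Buy now</button>"
    "<button>Log out</button></div></body>"
)

print(first_button_text(before))
print(first_button_text(after))
```

Nothing about the "Buy now" button itself changed, yet a test that clicks the first button would now click "Help".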
Some elements can’t be accessed
Most modern websites use 3rd party content, nested DOMs and embedded widgets. These cause particular problems when you are trying to choose a selector. Selenium runs within the parent DOM (document object model) of your website. However, the embedded content is within its own DOM. Sometimes, these DOMs can be nested several elements deep. CSS selectors can only reliably select elements in the parent DOM. XPaths can traverse the DOM structure, for instance:
Not only is this really complex, but you have no control over the embedded content. Any change in any of the nested content will break the selector. So, your tests may well break even if you don’t make any changes!
As we saw from the examples above, selectors are just blocks of code. They are often quite complex, and every detail matters. Unfortunately, selectors are often poorly selected and badly written. Developers tend to take shortcuts, such as using Chrome’s developer tools to copy a selector. But even the best developer in test may lack the information needed to create a good selector. That is, one that will reliably work without becoming too brittle.
Some alternatives to selectors
So, what alternatives do you have to traditional selectors? Well, over recent years, artificial intelligence has opened up new options. Nowadays, techniques like machine learning, image recognition, and natural language processing offer robust alternatives.
Machine learning (ML) underpins many forms of AI. Here at Functionize, ML forms the bedrock of our technology. ML allows us to create intelligent models of your site that are constantly evolving as your site grows and changes. Selenium selectors are effectively one-dimensional—you just give WebDriver a single XPath or CSS selector. By contrast, the Functionize ML approach creates rich multi-dimensional fingerprints for each element in your UI. Every time we run a test, we record millions of items of data. These are used to uniquely identify each and every element in the UI. This means that when an element changes, the ML model is able to work out what changed and update the test accordingly.
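The difference between a single selector and a multi-attribute fingerprint can be sketched in a few lines. This is a toy illustration, not Functionize's actual model: the attribute names, the similarity score, and the sample elements are all assumptions. It compares a recorded fingerprint against every candidate on the changed page and keeps the closest one.

```python
def similarity(fingerprint: dict, candidate: dict) -> float:
    """Fraction of recorded attributes that the candidate still matches."""
    matches = sum(1 for key, value in fingerprint.items() if candidate.get(key) == value)
    return matches / len(fingerprint)


def best_match(fingerprint: dict, candidates: list) -> dict:
    """Pick the candidate element that is most similar to the fingerprint."""
    return max(candidates, key=lambda candidate: similarity(fingerprint, candidate))


# Hypothetical fingerprint recorded for the login button on an earlier run.
recorded = {"tag": "button", "id": "login", "text": "Log in",
            "css_class": "btn primary", "parent": "form"}

# Elements on the new version of the page: the login button's ID has changed.
new_page = [
    {"tag": "button", "id": "logout", "text": "Log out",
     "css_class": "btn", "parent": "nav"},
    {"tag": "button", "id": "signin-btn", "text": "Log in",
     "css_class": "btn primary", "parent": "form"},
]

found = best_match(recorded, new_page)
print(found["id"], similarity(recorded, found))
```

A selector based only on the ID would have failed here, while the fingerprint still finds the button because four of its five recorded attributes are unchanged.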
Natural language processing
Natural language processing (NLP) is the technology behind virtual voice assistants. It involves teaching a computer to parse and understand natural (i.e. human) language. UIs are designed to be understood by humans. Most UI elements are labeled or associated with some text. Buttons and CTAs have clear instructions on them. Form-fields have descriptors and help text. Images have alt-text and captions. Even the IDs and underlying code are (usually) designed for a human to understand! As a result, we can use NLP to provide another layer of information that can be fed into the ML model. With NLP, the test will cope if you suddenly change your “log-in” button to a “sign-in” button.
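As a very rough stand-in for what NLP contributes, the sketch below treats two button labels as equivalent when they normalize to the same phrase. A real system would rely on a trained language model. The hand-written synonym table and the function names here are assumptions made purely for illustration.

```python
# Hypothetical synonym table mapping label variants to one canonical phrase.
CANONICAL = {
    "sign in": "log in",
    "signin": "log in",
    "login": "log in",
}


def normalize(label: str) -> str:
    """Lower-case the label, drop hyphens, and map known synonyms."""
    cleaned = " ".join(label.lower().replace("-", " ").split())
    return CANONICAL.get(cleaned, cleaned)


def same_intent(label_a: str, label_b: str) -> bool:
    """True when both labels normalize to the same phrase."""
    return normalize(label_a) == normalize(label_b)


print(same_intent("Log-in", "Sign-in"))
print(same_intent("Log-in", "Log out"))
```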
Image recognition is the process of teaching a computer to understand what is shown in an image. Nowadays, deep learning models have become so powerful that, in some studies, they have matched or outperformed human experts at spotting breast cancer in mammograms. Image recognition is a really good way to increase the robustness of selectors. However, it is not reliable without other ML techniques to back it up. This is because each time a page renders, there’s a chance an element may render differently. This can be down to screen size, browser version, updates, the execution order of parallel threads, etc.
However, with a bit of additional intelligence, image recognition can be a powerful tool. For instance, at Functionize we couple it with template recognition. This means we can identify that a certain field always contains the day of the week and the date. So, we know to ignore this in any test verifications. You can even specify the percentage of visual match you want for any given element.
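The idea of a percentage visual match that ignores known-variable regions can be sketched with plain lists of pixel values. This is a simplified illustration under stated assumptions, not how Functionize implements it: the tiny "screenshots", the visual_match function, and the ignore set are all hypothetical.

```python
def visual_match(expected, actual, ignore=frozenset()):
    """Percentage of compared pixels that are identical, skipping ignored (x, y) cells."""
    total = 0
    same = 0
    for y, (row_expected, row_actual) in enumerate(zip(expected, actual)):
        for x, (pixel_expected, pixel_actual) in enumerate(zip(row_expected, row_actual)):
            if (x, y) in ignore:
                continue
            total += 1
            if pixel_expected == pixel_actual:
                same += 1
    return 100.0 * same / total


# Two tiny grayscale "screenshots"; the middle of the second row shows a date.
baseline = [[0, 0, 1, 1],
            [0, 5, 5, 1]]
today = [[0, 0, 1, 1],
         [0, 7, 8, 1]]

# Cells known to hold the date, as a template might identify them.
date_region = frozenset({(1, 1), (2, 1)})

print(visual_match(baseline, today))
print(visual_match(baseline, today, ignore=date_region))
```

A test could then pass when the score is at or above whatever threshold you choose for that element.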
Selectors are dead, long live selectors!
By now, you might have concluded that the sooner we get rid of selectors, the better. However, selectors still very much have a place in test automation. And, truth be told, they will probably have a place for the foreseeable future. In the next blog in this mini-series, we will see why selectors still have an important role to play in Functionize tests.