Automated visual regression testing might sound like a mouthful at first, but the concept behind it is fairly simple. You have a picture of what your user interface (UI) should look like to the user, and you automatically run tests against the current UI to see if any “regression” has taken place.
Some people call it visual validation, UI testing, or just visual testing, but they all refer to essentially the same process: comparing the pixels of two pictures. Kubernetes’ declarative model is a useful analogy, since Kubernetes continuously reconciles the cluster’s current state with a declared picture of what it should look like. Similarly, visual tests run continuously to make sure the current UI isn’t straying from its reference screenshots. The only difference is that instead of configuration files, we literally have pictures of what we want our UI to look like.
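At its core, “comparing pixels from two pictures” can be sketched in a few lines. Here is a minimal, illustrative version in Python: screenshots are modeled as flat lists of RGB tuples rather than real PNGs, which an actual tool would decode from browser captures.

```python
# Minimal sketch: compare a "current" screenshot with a baseline,
# pixel by pixel, and report the mismatch percentage.
# Screenshots are modeled as flat lists of RGB tuples; a real tool
# would decode PNG captures from the browser instead.

def mismatch_percentage(baseline, current):
    """Percentage of pixels that differ between two equally sized images."""
    if len(baseline) != len(current):
        raise ValueError("screenshots must have the same dimensions")
    differing = sum(1 for a, b in zip(baseline, current) if a != b)
    return 100.0 * differing / len(baseline)

# 100-pixel "images": one pixel turns red in the current screenshot.
baseline = [(255, 255, 255)] * 99 + [(0, 0, 0)]
current = [(255, 255, 255)] * 98 + [(200, 0, 0), (0, 0, 0)]

print(mismatch_percentage(baseline, current))  # prints 1.0
```

Real engines are usually fuzzier than this exact-equality check (anti-aliasing alone would trip it), but the pass/fail decision ultimately reduces to a number like this one.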
How often have you opened an app and tried to access a feature, only to be blocked by overlapping text or an ad covering half the page? The first thing you wonder when that happens is: how do errors that directly affect the end-user experience get past the people in charge of quality? The answer is usually that someone made a code change with visual side effects that no one in quality control has caught yet.
This is why visual tests are critical, especially in today’s mobile world, where hundreds of possible combinations of devices and operating systems all do the “code-to-pixel” conversion a little differently. That’s why the same page can look, and even respond, differently on two devices: no one visually tested every combination. Different screen sizes complicate matters further when what you’re after is uniformity.
Function vs. Design
While it’s pretty clear that you need to test your application for functional regression, a lot of people make the mistake of assuming that this covers the visual element as well. There is a clear distinction between the two: functional regression testing checks whether new code has broken a page’s behavior, while visual testing looks specifically for deviations in appearance.
Understanding this distinction is imperative because a visually “altered” page might still function correctly and pass every functional test. The difference becomes even more obvious when we take browser rendering and responsive design into account, where pages are automatically resized and certain elements are hidden, shrunk, or enlarged to keep them looking good and uniform across screen sizes. Functional tests don’t pick up these subtler design changes or how they affect the end-user experience, which can range from mildly inconvenient to downright obstructive.
Automating visual testing
Automating visual regression testing isn’t as straightforward as we would like: while the goal is to find bugs in the layout, it often takes a human to decide which variations are acceptable. Sites with graphical functionality like drop-down menus, clickable charts, and interactive dashboards can have their elements thrown off completely by a very minor variation in pixels. At the same time, sites with a more minimal design (like the Google homepage, for example) will look pretty much the same irrespective of visual bugs, considering most of the pixels on that page are white.
This unavoidable human involvement notwithstanding, there are a number of tools built on testing platforms like Selenium, PhantomJS, and XULRunner that automate visual testing to a degree. These include Wraith, WebdriverCSS, Huxley, Needle, and more, and most run on a screenshot-comparison engine that automatically takes screenshots and compares them at intervals.
While all of the above tools drive the application under test, take screenshots, compare them against baseline images, and report the differences, they share one requirement: configuring how much variation is acceptable. If you’re using WebdriverIO with WebdriverCSS on Selenium, for example, this is usually done with an assertion on the isWithinMisMatchTolerance flag of each comparison result. The first time you run your test script, it will always pass, since you’re effectively just creating the baseline image. The second time, however, it will report the mismatch percentage and fail if it exceeds the tolerance threshold you previously set.
The default threshold is typically 0.5%, and there are a number of more advanced configuration options as well. You can also write additional “helper” functions with different, more customized thresholds, and even zero-tolerance functions where the slightest deviation causes the test to fail.
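A tolerance check with such helpers might look like the sketch below. The names (`assert_within_tolerance`, `assert_pixel_perfect`, `DEFAULT_TOLERANCE`) are illustrative rather than any library’s actual API; they only mirror the pass/fail behavior described above.

```python
# Illustrative tolerance helpers; names are made up for this sketch.
DEFAULT_TOLERANCE = 0.5  # percent, the common default mentioned above


def assert_within_tolerance(mismatch_pct, tolerance=DEFAULT_TOLERANCE):
    """Fail the test if the screenshot mismatch exceeds the tolerance."""
    assert mismatch_pct <= tolerance, (
        f"visual regression: {mismatch_pct:.2f}% mismatch "
        f"exceeds {tolerance}% tolerance"
    )


def assert_pixel_perfect(mismatch_pct):
    """Zero-tolerance helper: the slightest deviation fails the test."""
    assert_within_tolerance(mismatch_pct, tolerance=0.0)


assert_within_tolerance(0.3)    # passes: under the 0.5% default
# assert_within_tolerance(2.1)  # would raise AssertionError
```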
What you see is what you get
Seeing is believing, and that’s pretty much the idea behind automated visual regression testing: you cut out all the middlemen and directly verify that what your end-users are seeing is in fact exactly what you want them to see. As opposed to just testing functionality and then hoping it all looks right as well, visual tests are a reliable way to keep your UI on track.
In conclusion, just as with Kubernetes clusters, where the end state of the cluster is what matters and the steps taken to get there are unimportant, visual testing is about confirming pixels, regardless of how they were produced. With the enterprise’s noticeable shift left, especially with GitOps and a more developer-centric approach, automated visual testing is crucial to keep changes from playing havoc with your UI.