
Questions that Every QA Director must ask themselves #3: What to do about Flaky Tests?


Flaky tests are worse than useless.

If they were simply useless, we could at least ignore them. What makes them worse than useless? For one, you can't depend on the automated test result, so your team often has to re-run the test manually. So in the end, was it worth it? You took the time and effort to automate that test, but you didn't get the benefit of the automated run – plus you didn't save any effort on manual testing.

Even worse, flaky tests undermine the credibility of all your automated tests. If you have a suite that isn't reliable, why should your stakeholders (or even you) believe the other test results?

If the term "flaky tests" is new to you, it means an automated test that gives different results for the same configuration. The test might fail, then pass the next time you execute it – without changing anything.

One of my worst experiences with flaky tests was when I was leading the Test Automation team for a financial application. My team was responsible for the automated lab, the tooling, and some of the common code used in all the automated tests. The Feature teams owned the actual test cases. We had a suite of automated tests, roughly 1,500 of them, that ran every night. And, guess what?

Every night we had some failures. That was probably to be expected, since the Feature teams were pushing a lot of changes. But the bad news was that roughly 60% of the time, a test failure was not a bug in the product, but a flaw in the test or the testing infrastructure.

Every day for the next three months, at 7:00 am, I was in a daily stand-up meeting to review the overnight automated tests and decide what to do with the results. Not fun. Also, my team was considered guilty of every failure until we proved our innocence (and when we were guilty, we had to re-run those tests manually).

Over the years, I've learned a few tips for tackling flaky tests and avoiding these problems.

1. Make sure your app is testable for automated tests.
2. Address technical debt in your test code.
3. Don't do so much work in the UI (with your test scripts).
4. Empower Feature teams to run (and own) the tests.
5. Provide your stakeholders with consolidated results (manual + automated together).

1. App Testability

The technology used to build your apps can be one of the most important factors in preventing or eliminating flaky tests, but it's often the most difficult factor for a QA director to affect. The technology, architecture, and design choices that impact testability are often made well before your automated test program begins. However, all is not lost.

One source of flaky tests, especially UI-driven tests, is the locators used by the automation framework to find the UI elements. Testers often have to use a locator based on XPath, which can change as your UI changes. Instead, if the UI elements are all identified by a unique ID, that reduces the chances that a UI improvement will break tests. Ask your developers to use unique identifiers for all UI elements.
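To see why ID-based locators are more robust, here is a minimal sketch with no real browser: the "page" is just a toy list of (xpath, element id) pairs standing in for a rendered DOM, and the element names are hypothetical.

```python
# Toy model: two snapshots of the same login form. In the second, a
# redesign wraps the form in a <div>, so every positional XPath shifts,
# but the developer-assigned ids are unchanged.
page_v1 = [
    ("/html/body/form/input[1]", "email-input"),
    ("/html/body/form/input[2]", "password-input"),
]
page_v2 = [
    ("/html/body/div/form/input[1]", "email-input"),
    ("/html/body/div/form/input[2]", "password-input"),
]

def find_by_xpath(page, xpath):
    """Return the element id at an exact XPath, or None if not found."""
    return next((eid for xp, eid in page if xp == xpath), None)

def find_by_id(page, element_id):
    """Return the element id if it exists anywhere on the page."""
    return next((eid for _, eid in page if eid == element_id), None)

# The XPath that worked on v1 finds nothing on v2 -- a "flaky" failure
# caused by a harmless layout change, not a product bug.
assert find_by_xpath(page_v1, "/html/body/form/input[1]") == "email-input"
assert find_by_xpath(page_v2, "/html/body/form/input[1]") is None

# The id-based lookup works on both versions of the page.
assert find_by_id(page_v1, "email-input") == "email-input"
assert find_by_id(page_v2, "email-input") == "email-input"
```

The same principle applies in any real framework: a `By.ID` (or equivalent) locator survives layout refactors as long as developers keep the ID stable.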

Another testability area that can cause flaky tests is setting up test data. In order to check functionality in your app, the automated test needs some data already set up in your system. Without a way to set up the data reliably, testers often use the automation framework to drive the UI to create the data – which increases the odds that the test will fail in the setup stage, instead of in the feature being tested.

If this is your situation, ask your developers for help. Perhaps there is a "developers API" they use for internal testing that can be repurposed.
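The shape of that setup code might look something like the sketch below. Everything here is hypothetical – `FakeDataClient` stands in for whatever internal API your developers expose – but it shows the pattern: seed data with direct calls, so the test only exercises the UI for the feature actually under test.

```python
# Hypothetical sketch: seeding test data through an API client instead of
# driving the UI. FakeDataClient stands in for an HTTP client hitting an
# internal "developers API"; substitute whatever your team actually has.

class FakeDataClient:
    """In-memory stand-in for an internal data-seeding API."""

    def __init__(self):
        self.accounts = {}

    def create_account(self, name, balance):
        account_id = len(self.accounts) + 1
        self.accounts[account_id] = {"name": name, "balance": balance}
        return account_id

def setup_transfer_test(client):
    # One API call per account: fast, deterministic, and immune to UI
    # changes -- versus dozens of scripted UI steps that can each fail.
    src = client.create_account("alice", balance=100)
    dst = client.create_account("bob", balance=0)
    return src, dst

client = FakeDataClient()
src, dst = setup_transfer_test(client)
assert client.accounts[src]["balance"] == 100
```

If the setup fails here, the failure is clearly labeled as a data problem, not misreported as a failure of the feature being tested.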

2. Technical Debt in Test Code

Just like production code, test code is prone to suffer from technical debt – and often, this causes flaky tests later on. This type of technical debt is caused by automation engineers taking shortcuts to get the tests working quickly, then leaving those shortcuts in place while moving on to additional tests.

Hard-coded values are often used to quickly verify that a test is working, with the intent to modify the test later to pull the information from a data source, like an Excel sheet. For example, if you are testing an e-commerce app and calculating the sales price with sales tax applied, the initial tests may hard-code the tax rate. Later, the tax rate might change – causing the test to fail. That might be OK for one test, but you may be checking prices in many tests.

The solution to hard-coded values is usually to pull your important data from a data source outside of the test code, perhaps an Excel sheet. Or, consolidate your values in a single place in the code, making them easier to update.
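A minimal sketch of the "single place" approach, using the sales-tax example (the 8.5% rate and the config structure are illustrative assumptions):

```python
# One shared source of truth for environment-specific values. In practice
# this dict might be loaded from an Excel sheet, CSV, or config file;
# the 8.5% rate is an example value, not a real requirement.
TEST_CONFIG = {"sales_tax_rate": 0.085}

def expected_total(subtotal, config=TEST_CONFIG):
    """Compute the price a test should expect, tax included."""
    return round(subtotal * (1 + config["sales_tax_rate"]), 2)

# Every price-checking test derives its expectation from the same config,
# so a tax-rate change is a one-line update instead of a mass failure.
assert expected_total(100.00) == 108.50
assert expected_total(100.00, {"sales_tax_rate": 0.10}) == 110.00
```

When the rate does change, you update `TEST_CONFIG` (or the sheet behind it) once, and every test picks up the new expectation.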

A frequent cause of flaky tests is fixed time delays inserted to wait for the app to complete some action. For example, a test case that performs a search on a website might insert a 5-second delay before checking for the search results. You can see where this is going – sometimes the search takes longer than 5 seconds to complete and your test fails. Even if your requirement is for all searches to complete within 5 seconds, a better test design would be to separate those concerns into distinct tests (one for performance, one for search functionality).

Instead of a fixed delay, most test frameworks have the ability to wait for a condition to occur. In our search test, for example, we might wait for the app to say "x results have met your criteria", then proceed with the next step. So, if the search only takes 1 second, your test runs faster this time. But if it takes 10 seconds, your test will still check the search results when they are ready.
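Frameworks like Selenium ship this as an explicit-wait facility, but the underlying idea is just a polling loop, which can be sketched in a few lines (the timeout and poll interval shown are arbitrary example values):

```python
import time

def wait_for(condition, timeout=10.0, poll=0.25):
    """Poll until condition() is truthy, or raise after `timeout` seconds.

    Returns whatever truthy value the condition produced, so callers can
    use the result (e.g. the list of search results) directly.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(poll)
    raise TimeoutError(f"condition not met within {timeout:.1f}s")

# The condition becomes truthy on the third poll -- the wait returns
# as soon as that happens, rather than sleeping a fixed 5 seconds.
calls = {"n": 0}

def search_results_ready():
    calls["n"] += 1
    return calls["n"] >= 3

assert wait_for(search_results_ready, timeout=5.0, poll=0.01) is True
```

The test is now as fast as the app on good days and as patient as the timeout allows on slow days, which removes the race condition entirely.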

For these types of code-related technical debt, I usually ask the testers to fix the code when they encounter the issues. For instance, if the tax rate changes and causes multiple tests to fail, I ask the tester to fix the design, not merely update the hard-coded values. Alternatively, you could dedicate some time in each sprint to "code hygiene" tasks.

3. Using the UI Too Much

I already mentioned one example of using the UI too much in the App Testability section. It's usually better to load the data into your system using an API or other means rather than relying on UI scripts to pre-populate the data. This practice reduces the amount of exposure your test scripts have to UI changes.

Another opportunity is to make sure you test at the right level of your technology stack. A classic stack has a UI that presents the user experience, a "back end" of business logic, and, further back, a database or persistence layer.

You should consider testing the business logic directly, instead of through the UI. Then, use the UI tests to check the user experience or the end-to-end flow. Testing the business logic directly usually means a set of API tests, which have the added benefits of running faster, being less prone to false alarms, and making it easier to extend coverage across permutations.

For example, when testing an e-commerce site, the total amount will vary based on the number of items, discounts, shipping charges, taxes, and perhaps other factors. I would look for an opportunity to create an API-driven test that can check all the permutations of these factors, instead of trying to automate every permutation through the UI. The API tests would focus on making sure the "math" is right, while the UI test ensures everything is connected end to end and the user experience meets expectations.
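The permutation idea can be sketched as follows. The pricing formula here (`compute_total` and its inputs) is a made-up stand-in for a real business-logic call; the point is that a grid of two dozen cases runs in milliseconds as direct calls, where the same grid through the UI would be slow and fragile.

```python
from itertools import product

def compute_total(items, discount, shipping, tax_rate):
    """Hypothetical business-logic call: discounted subtotal + shipping,
    then tax applied to the whole amount."""
    subtotal = sum(items) * (1 - discount)
    return round((subtotal + shipping) * (1 + tax_rate), 2)

# Sweep every combination of cart, discount, shipping, and tax:
# 2 x 3 x 2 x 2 = 24 cases, checked in one fast loop.
cases = product(
    [[10.0], [10.0, 25.0]],   # carts
    [0.0, 0.10, 0.25],        # discounts
    [0.0, 4.99],              # shipping charges
    [0.0, 0.085],             # tax rates
)
for items, discount, shipping, tax_rate in cases:
    total = compute_total(items, discount, shipping, tax_rate)
    # Cheap invariants; a real suite would assert exact expected values
    # pulled from a data source.
    assert total >= 0
    assert total >= shipping * (1 - 0.0)  # total can never drop below shipping

assert compute_total([10.0], 0.0, 0.0, 0.0) == 10.0
```

A single UI test can then cover the end-to-end path with one representative cart, leaving the combinatorial "math" coverage to this fast layer.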

4. Empower Feature Teams

As QA Director, you might have an Automation team that is separate from the Feature teams. This is quite common – you want to jump-start automation with the people possessing the right skills to create the code that drives your automation. Also, you often have a lot of catching up to do in terms of creating automated tests. Meanwhile, the features might be developed by separate teams (or scrum teams, squads, swimlanes, etc.).

If the first time new feature code goes through the automated tests is when your Automation team runs them, you're asking for flaky tests. The Feature teams are often innovating on the UI layer of your app, the very place you're using for test input. When the automated tests run and fail because of a change in the product UI, your tests can get a bad reputation for being incorrect. An alternative approach is to empower the Feature teams to run the tests before merging to your trunk branch or handing off code for system test. This way, the Feature teams can adjust the tests to match the product changes.

5. Consolidated Outcomes

Often, we get our test results from different places. The manual tests are recorded in a test case management tool, while automation results might come from the Continuous Integration platform. This doesn't cause flaky tests, but it can create the perception of flaky tests. If you have the same feature covered by both manual and automated tests, and the results differ, your stakeholders won't know which result to believe. Both results could be true, but that's hard to communicate with a status metric.

At Testlio, we've found that it's better to give stakeholders a consolidated view of the test results, where we combine the automated and manual test results into a single source of truth. The human, in this case, turns the automated test result into real, solid information.

I hope these tips have been helpful.

Recently, I attended a software testing conference where the audience was polled: more than 80% were currently automating some portion of their tests, but fewer than 10% had a consistently green dashboard – there was flakiness everywhere.

This is part of our series of articles titled "Questions every QA Director must ask themselves". Do you have a question for us? Or would you like to hear more about how Testlio can help you with flaky tests? Drop us a line at [email protected] or [email protected]

About the author