Why test clones mess with your test quality – And how to avoid them

I recently reviewed a manual test suite of one of our customers. One of the first things I check in a review is the number of clones (i.e. duplicated parts of a test suite, usually created by copy and paste). In this recent case, I discovered that nearly 70% of the test suite was duplicated. That means that if I pick an arbitrary test step, there is a 70% chance that it is a 1:1 copy of another step. At the top of the post is a tree map that visualizes the amount of clones I found. Each rectangle represents a test; the redder a rectangle is, the more cloning the test contains.

In my experience, cloning is the biggest problem for the maintainability of a test suite. Cloning causes considerable costs, as the effectiveness of the test suite decreases and the maintenance effort skyrockets. In this post I take a closer look at cloning in test suites. I show you an example to illustrate what clones can look like and explain where clones come from. Later, I give you good reasons why you should care about clones in tests and discuss strategies you can employ to avoid, or at least deal with, clones.

What is Cloning?

Look at the following example of tests for an imaginary piece of software for securely storing passwords (such as KeePass). The tests target the functionality to create a new password entry in a certain category. The first test prescribes entering a password manually, the second test demands generating a password. Of course, both tests are very similar: they differ only in Steps 7 and 11. Steps 1 to 6 are completely identical, and Steps 8 to 11 are the same apart from a small difference.
Hence, in the two tests below we have two clones: one exact clone (Steps 1 to 6) and one clone with a slight difference (Steps 8 to 11).

Test 1: Create new entry with custom password

1 Start the Password Store.
2 Right-click the category panel and select the entry “Create new category” in the context menu.
3 Enter the category name “test” and confirm.
4 Select the newly created category in the category panel.
5 Press the button “Create New Entry” in the toolbar.
6 Enter the entry name “test entry”.
7 Enter the password “abc123%&öäü()”.
8 Press the button “Save”.
9 Check if the new entry shows up in the entry list with the name “test entry”.
10 Open the new entry.
11 Check if the saved password is “abc123%&öäü()” and the name is “test entry”.

Test 2: Create new entry with generated password

1 Start the Password Store.
2 Right-click the category panel and select the entry “Create new category” in the context menu.
3 Enter the category name “test” and confirm.
4 Select the newly created category in the category panel.
5 Press the button “Create New Entry” in the toolbar.
6 Enter the entry name “test entry”.
7 Press the button “Generate Password” and write down the generated password.
8 Press the button “Save”.
9 Check if the new entry shows up in the entry list with the name “test entry”.
10 Open the new entry.
11 Check if the saved password is the generated password and the name is “test entry”.
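
To make the idea of clone coverage concrete, here is a minimal Python sketch that counts how many steps of the two tests above have an exact copy elsewhere. The function names (normalize, clone_coverage) and the matching rule (exact match after collapsing whitespace and case) are assumptions made for illustration. Real clone detection tools typically look at longer sequences of steps and tolerate small differences.

```python
from collections import Counter


def normalize(step: str) -> str:
    """Collapse whitespace and case so trivial edits do not hide a clone."""
    return " ".join(step.lower().split())


def clone_coverage(tests: dict[str, list[str]]) -> float:
    """Fraction of all steps that have an identical copy somewhere in the suite."""
    counts = Counter(normalize(s) for steps in tests.values() for s in steps)
    total = sum(counts.values())
    cloned = sum(n for n in counts.values() if n > 1)
    return cloned / total if total else 0.0


test_1 = [
    "Start the Password Store.",
    'Right-click the category panel and select the entry "Create new category" in the context menu.',
    'Enter the category name "test" and confirm.',
    "Select the newly created category in the category panel.",
    'Press the button "Create New Entry" in the toolbar.',
    'Enter the entry name "test entry".',
    'Enter the password "abc123%&öäü()".',
    'Press the button "Save".',
    'Check if the new entry shows up in the entry list with the name "test entry".',
    "Open the new entry.",
    'Check if the saved password is "abc123%&öäü()" and the name is "test entry".',
]

# Test 2 is a copy of Test 1 with Steps 7 and 11 changed,
# which is exactly how clones arise in practice.
test_2 = list(test_1)
test_2[6] = 'Press the button "Generate Password" and write down the generated password.'
test_2[10] = 'Check if the saved password is the generated password and the name is "test entry".'

coverage = clone_coverage({"Test 1": test_1, "Test 2": test_2})
print(f"Clone coverage: {coverage:.0%}")
```

With exact matching, 18 of the 22 steps (about 82%) have a copy in the other test. Only Steps 7 and 11 of each test are unique, and these are the steps that carry the actual purpose of the tests.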

Where do Clones Come From?

The test suite I talked about in the beginning is not an isolated case. In fact, a clone coverage of 70% is not even record-breaking. I have seen test suites with a clone coverage far above 80%.
But why do we have so much cloning in tests? The top three reasons are:

  • Same function, different data: Very often we need to test the same functionality with different data. The result is a set of extremely similar test cases that differ only in the test data.
  • Same function, different test levels: In practice, I often find the same piece of functionality being tested once in isolation and again as part of an end-to-end test (e.g. in a regression test suite) together with other features. That leads to clones in tests as well.
  • Test setup: Usually a test needs to bring the system into a certain state before performing the actual test procedure. Creating a certain category for password entries would be an example. Such setup steps are often the same for a whole range of tests. The result is a bunch of new clones.

Why should I care?

You should care about clones in your test suites, because they cost you money! First, clones drive up the cost of maintaining the tests. The more cloning you have, the higher the effort to make a change, as you have to change every copy of a clone. Search and replace may help you out in some situations, but only if the clone instances have not been changed individually (which they will be, believe me!). Consequently, you cannot be sure to find every occurrence of a certain part of a test.

In my experience, clones inevitably start to differ over time, as people don’t remember all the places where they used copy and paste, and they change individual instances of a clone without considering the others. The test quality thus declines as some tests become outdated. Finally, bugs find their way into the production system because they are no longer caught by the tests.

Even before that happens, clones in a test suite obscure the purpose of a test. The purpose of an individual test lies not in its similarities to other tests (such as the setup), but in the differences. That is the core, the real purpose of an individual test. In the past, I found myself several times using a compare tool to find the difference between two tests. Without the differences I was unable to figure out what the test was actually aiming at.

How and When to Avoid Clones?

The most effective way to avoid clones is to avoid copy and paste. Instead of copying and pasting, try to employ the reuse mechanisms that are available in your test design tool. Most professional tools provide at least basic mechanisms for reuse. For example, in HP ALM this mechanism is called Template. If Microsoft Team Foundation Server is your tool of choice, try Shared Steps. Often, such reuse components can be parameterized (e.g. in the case of the template mechanism in HP ALM), which makes them even more versatile.
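
The idea behind parameterized reuse components can be sketched in a few lines of Python. This is not the API of HP ALM or Team Foundation Server; the functions setup_category, create_and_verify_entry and build_test are hypothetical names that only illustrate how the two password tests from above could be assembled from shared, parameterized building blocks.

```python
def setup_category(category: str) -> list[str]:
    """Shared setup steps: bring the system into the required state."""
    return [
        "Start the Password Store.",
        'Right-click the category panel and select the entry "Create new category" in the context menu.',
        f'Enter the category name "{category}" and confirm.',
        "Select the newly created category in the category panel.",
    ]


def create_and_verify_entry(name: str, password_step: str, expected_password: str) -> list[str]:
    """Shared steps for creating an entry; the password handling is a parameter."""
    return [
        'Press the button "Create New Entry" in the toolbar.',
        f'Enter the entry name "{name}".',
        password_step,
        'Press the button "Save".',
        f'Check if the new entry shows up in the entry list with the name "{name}".',
        "Open the new entry.",
        f'Check if the saved password is {expected_password} and the name is "{name}".',
    ]


def build_test(password_step: str, expected_password: str) -> list[str]:
    """Assemble a complete test from the two reusable components."""
    return setup_category("test") + create_and_verify_entry(
        "test entry", password_step, expected_password
    )


# Each test now states only what makes it different from the other.
test_1 = build_test('Enter the password "abc123%&öäü()".', '"abc123%&öäü()"')
test_2 = build_test(
    'Press the button "Generate Password" and write down the generated password.',
    "the generated password",
)

for number, step in enumerate(test_2, start=1):
    print(number, step)
```

A change to the setup, such as a renamed menu entry, now has to be made in exactly one place, and the definition of each test shows its purpose at a glance.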

When time frames are tight and deadlines are approaching, it often seems like overhead to fiddle around with reuse components when copy and paste (and adapt) is easy and fast. Extracting reusable components can be done later, you think. But most likely this will never happen if you don’t systematically control for clones. There are tools around (and Qualicen Test Scout is one of them) that do that for you: they find the clones that slipped through when designing the tests.

However determined you are to get rid of those clones, there are situations where clones cannot be avoided at reasonable cost. Only recently, I met with a prospective customer, ready to explain all the tools to detect and avoid clones in test suites, only to learn that in this case cloning was less of a problem, as we were talking about a one-shot test suite that would not be maintained. Moreover, the customer used a spreadsheet tool to design the tests. Spreadsheet tools are not designed to create reuse components that can be smoothly included in test cases. Hence, avoiding clones in this setting is difficult.

When you cannot avoid all clones, you have to deal with them. Your main lever here is to create awareness. When you know about the clones, at least some of the trouble (e.g. introducing inconsistencies) can be avoided. Automated clone detection will help you here, too.
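
As a rough illustration of how automated detection can create this awareness, the following Python sketch flags steps that are almost, but not exactly, identical. These are the copies that have probably drifted apart. It uses difflib.SequenceMatcher from the Python standard library; the function name find_drifted_clones, the threshold of 0.8 and the sample steps are assumptions chosen for the example, not the way any particular tool works.

```python
from difflib import SequenceMatcher


def find_drifted_clones(steps_a, steps_b, threshold=0.8):
    """Return (index_a, index_b, similarity) for steps that are nearly identical.

    Exact copies (similarity 1.0) are skipped: they are consistent clones.
    Pairs below the threshold are treated as unrelated steps.
    """
    drifted = []
    for i, a in enumerate(steps_a):
        for j, b in enumerate(steps_b):
            ratio = SequenceMatcher(None, a, b).ratio()
            if threshold <= ratio < 1.0:
                drifted.append((i, j, ratio))
    return drifted


# Hypothetical example: the category name was changed in one copy only.
test_a = [
    'Enter the category name "test" and confirm.',
    'Press the button "Save".',
    "Open the new entry.",
]
test_b = [
    'Enter the category name "tests" and confirm.',
    'Press the button "Save".',
    "Close the application.",
]

drifted = find_drifted_clones(test_a, test_b)
for i, j, ratio in drifted:
    print(f"Step {i + 1} of test A and step {j + 1} of test B differ slightly ({ratio:.2f}):")
    print("  A:", test_a[i])
    print("  B:", test_b[j])
```

A report like this does not remove the clone, but it tells you where an inconsistency may have crept in, so you can decide whether the difference is intended.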

Summary

In my experience, clones are one of the biggest maintenance problems, especially in manual system tests. Duplicating things in tests is somewhat natural, as repeatedly executing a system is what we do when we test. However, duplicating with copy and paste comes with a lot of problems and will eventually harm the maintainability and effectiveness of your tests. Most tools come with reuse mechanisms that help you get rid of clones. Use them! Automated clone detection helps you to control clones in your tests and to create awareness of clones that are already present.

Feel free to contact me for information on how we detect and control clones in tests over at Qualicen.