Choosing The Right Test Data Generation Tool
by Gregg Bolinger on May 29, 2013

In our last blog post I talked about what test data generation is and why every engineering team should care about it.  I closed that article with 5 questions to ask yourself about your test data.  I’d like to expand on these 5 questions a bit.  Test Data Generation is a relatively new market, especially when talking about the tools and platforms being developed to make the process better.  These 5 questions can help you determine what the best tool might be for your needs.

Does my test data take up gigabytes of space when unused?

Whatever your current test data generation strategy is, if you even have one, chances are the test data gets created and then stored on a server somewhere.  This might be a database or a file server filled with XML files, CSV files, or any number of other formats.  And there it sits.  Waiting to be used, waiting to be copied, just waiting.  Sure, storage is cheap.  But consider wading through gigabytes of data trying to find exactly what you need for your test.  What if your offshore team needs the data on a different server?  How long are they waiting for transfers to complete?

Your Test Data Generation tool should provide you with test data on demand.  It should create the data when you need it, then be disposable when you’re done.  This cycle should be repeatable any number of times for any number of scenarios.
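
As a rough illustration of that create, use, and dispose cycle, here is a minimal Python sketch.  It is not how any particular tool works; the names generate_users and disposable_dataset, the CSV layout, and the record fields are all made up for this example.  The data is produced only when a test asks for it, a seed makes the run repeatable, and the file disappears when the test is done.

```python
import csv
import random
import tempfile
from contextlib import contextmanager
from pathlib import Path


def generate_users(count, seed=0):
    """Yield `count` hypothetical user records; the same seed gives the same data."""
    rng = random.Random(seed)
    for user_id in range(1, count + 1):
        yield {
            "id": user_id,
            "name": f"user{user_id}",
            "age": rng.randint(18, 90),
        }


@contextmanager
def disposable_dataset(count, seed=0):
    """Write generated users to a temporary CSV file that is deleted on exit."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "users.csv"
        with path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=["id", "name", "age"])
            writer.writeheader()
            writer.writerows(generate_users(count, seed))
        # The caller's `with` block runs here; the directory is removed afterwards.
        yield path
```

A test would wrap its work in `with disposable_dataset(1000, seed=42) as path:` and read the file inside that block.  Nothing is left on disk afterwards, and the same call recreates the same data the next time it is needed.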

Does my test data become stale when the rules change?

You’ve just spent a week creating thousands of rows of XML data in hundreds of files.  You’ve copied all the data to the test server.  Feeling accomplished and, quite honestly, relieved that you’re finally finished, you go for some coffee.  Just as you begin to relax, you receive an email describing a change to the structure of the relationships in your data.  It’s been 10 minutes and your test data is already out of date!

Your Test Data Generation tool should provide you with an easy way of modifying the rules of your data.  And since your tool should also be generating your data on demand, picking up the change is as simple as running your tests with the new rules in place.
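
To make the idea concrete, here is a small Python sketch of rules kept separate from the data they produce.  The rule names and the company, department, and user structure are assumptions chosen for illustration, and the loops stand in for what a tool would do internally so that you don't have to write them yourself.

```python
import random

# Hypothetical rule set: how many child records each parent gets.
RULES = {
    "companies": 2,
    "departments_per_company": 3,
    "users_per_department": 4,
}


def generate(rules, seed=0):
    """Build a nested company -> department -> user structure from the rules."""
    rng = random.Random(seed)
    companies = []
    for c in range(rules["companies"]):
        departments = []
        for d in range(rules["departments_per_company"]):
            users = [
                {"name": f"user-{c}-{d}-{u}", "role": rng.choice(["admin", "member"])}
                for u in range(rules["users_per_department"])
            ]
            departments.append({"name": f"dept-{c}-{d}", "users": users})
        companies.append({"name": f"company-{c}", "departments": departments})
    return companies


# The relationship changes: every department now needs six users.
# Only the rule is edited; the data is simply generated again.
updated_rules = {**RULES, "users_per_department": 6}
fresh_data = generate(updated_rules)
```

When the email about the new structure arrives, the only thing that changes is one entry in the rules.  No stored files have to be found and rewritten.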

Do I write foreach loops and conditional statements to create my test data?

Remember all the XML you spent a week writing?  How many conditional statements did you have to write to get all the rules working correctly?  How many iteration loops did you have to write to get the number of Users, Addresses, Companies, Roles, and Departments for your test?  Worse yet, did you have to update those iterations depending on the test?

Your Test Data Generation tool should provide you with the means to model and generate your data without writing any code.  It should let you control iterations with conditional logic specific to each scenario, which in turn gives you autonomy and fine-grained control over the data generated for each test.

Am I the only one with access to the test data?

Bill needs your test data for some functional testing.  Nancy needs your test data for some integration testing.  It’s all sitting on your computer.  Gigabytes of it.  It’s brittle.  It only works for your tests.  It took you weeks to produce.  It’s your prized possession.

Your Test Data Generation tool should provide you with the ability to easily share test data with anyone, anywhere.  If your tool is cloud-based, anyone with the right credentials can simply log in, download a scenario, and generate test data when running their tests, so sharing is as easy as telling Nancy which scenario she needs.  Nancy should even be able to create her own test scenario specific to her needs, if she is using the right tool.

Do product releases create havoc for my test data?

Release 1.0 is out the door.  It was difficult getting here, but you did it!  Now you can finally start on all those cool features for 1.5.  And then the bugs start coming in.  You need your gigabytes of test data to be modified to support Release 1.5 but you have to run these tests to fix 1.0.  Copy this data over here, modify it over there.  This folder is for Release 1.1 and this folder is for Release 1.2.  But you need some of the changes from Release 1.3 in Release 1.1.

Your Test Data Generation tool should provide you with an easy way to version your test data.  The data itself shouldn’t matter.  The only thing that matters is the rules for how the data gets generated.  The tool should provide an easy way to switch from one version to another so it’s easy to maintain previous releases while continuing to move forward with new releases.
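
One way to picture versioned rules is the short Python sketch below.  The release numbers come from the story above, but the rule contents, field names, and the generate_for_release function are invented for illustration and do not describe any specific product.

```python
# Hypothetical rules per release: 1.5 adds a "department" field to each user.
RULES_BY_RELEASE = {
    "1.0": {"users": 3, "fields": ["id", "name"]},
    "1.5": {"users": 3, "fields": ["id", "name", "department"]},
}


def generate_for_release(release):
    """Generate user rows shaped by the rules of the given release."""
    rules = RULES_BY_RELEASE[release]
    rows = []
    for user_id in range(1, rules["users"] + 1):
        full_record = {
            "id": user_id,
            "name": f"user{user_id}",
            "department": f"dept{user_id % 2}",
        }
        # Keep only the fields that this release knows about.
        rows.append({field: full_record[field] for field in rules["fields"]})
    return rows


bugfix_data = generate_for_release("1.0")   # data for fixing 1.0 bugs
feature_data = generate_for_release("1.5")  # data for new 1.5 work
```

Because only the rules are versioned, they can live in source control next to the code for each release, and switching releases means picking a different rule set rather than copying folders of data around.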

Choosing the right Test Data Generation tool is as important to your team as choosing the right IDE, framework, database, or language.  The right tool should help you get as close to bug-free software as possible.  It should allow you to test your system completely.  If you can answer “No” to all 5 questions above, you’ve chosen the right Test Data Generation tool.