
How to Provision Test Data for Continuous Testing by David Zwicker on Mar 19, 2019



It seems like every QA organization is implementing test automation as part of its transition to continuous testing. Continuous delivery (CD) is the “new normal” for software development, and CD all but mandates the use of automation tools to replace manual testing processes. A survey of more than 1,600 QA professionals in 60 countries, conducted by QA Intelligence (commissioned by PractiTest), found that 85% of test organizations have introduced test automation into their operations in an effort to replace manual testing.



However, introducing a technology is not the same thing as deploying it across the organization and throughout the continuous delivery process. The same research found that QA teams are struggling mightily with full-scale deployment of test automation. Only 23% of organizations have automated at least half of their test cases and just 4% have automated more than 90% of their testing. The average level of deployment across all testing categories remains under 20% of test cases, and that number is not appreciably changing from year to year.


Current Test Data Provisioning is not Suited for Continuous Testing

The friction encountered during the deployment of test automation stems from an outdated approach to provisioning test data. Traditional Test Data Management (TDM) is too cumbersome, costly and complicated. It forces testers to spend too much time:

  • Copying, subsetting and retaining useful copies of production data
  • Masking data for security purposes and managing test data set versions
  • Maintaining data quality and consistency across multiple test operations

As a result, many companies (perhaps most) are creating test data manually in the form of Excel spreadsheets. This gives testers control over the data they need for a given test. However, manual creation is labor intensive and limits the volume and variety of test data that can reasonably be provisioned for a continuous testing model. It will never provide full coverage of the code and its data input possibilities.

The use of manual test data erodes the value of test automation’s ability to perform highly repetitive tests in volume, with multiple data variations, in a fraction of the time it would take to perform those tests manually. Test automation requires a reliable data source that provides highly controlled and secure test data at scale, on-demand.


Thinking Differently About Test Data Provisioning

Test Data Generation (TDG) has been advanced as an alternative to TDM and manual creation. It provides synthetic data in place of the Personally Identifiable Information (PII) found in production data. It’s easier than traditional TDM provisioning and much faster than creating test data manually.

The problem with most TDG solutions is their lack of sophistication, which introduces other limitations on provisioning test data for continuous testing. For many vendors, TDG tools are an afterthought or an add-on to a traditional TDM platform. By and large, most TDG solutions:

  • Only create randomized data versus controlled, patterned and conditioned data
  • Lack the data formatting options needed for testing a broad spectrum of data interfaces
  • Fail to ensure that test data is stateful (returned to its original state after testing)
  • Fail to ensure data consistency when used for multiple test operations
  • Fail to ensure referential integrity between parent/child/sibling data tables
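
To make the contrast between randomized and controlled data concrete, here is a minimal sketch in plain Python. It does not use any vendor's API; the function names, field names and value ranges are assumptions made up for illustration. It shows two of the properties listed above: data that is repeatable and patterned (a fixed seed and a cycling value list) and child rows that always reference an existing parent row.

```python
import itertools
import random


def generate_customers(count, seed=42):
    """Return `count` parent rows with sequential IDs and patterned values."""
    rng = random.Random(seed)  # fixed seed: the same rows on every run
    tiers = itertools.cycle(["bronze", "silver", "gold"])  # repeating pattern
    return [
        {
            "customer_id": i,
            "email": f"user{i:04d}@example.test",
            "tier": next(tiers),
            "credit_limit": rng.randrange(500, 5001, 500),  # 500..5000 in steps of 500
        }
        for i in range(1, count + 1)
    ]


def generate_orders(customers, orders_per_customer=2):
    """Return child rows whose customer_id always points at an existing parent."""
    order_ids = itertools.count(1)
    return [
        {"order_id": next(order_ids), "customer_id": c["customer_id"], "quantity": n}
        for c in customers
        for n in range(1, orders_per_customer + 1)
    ]


customers = generate_customers(3)
orders = generate_orders(customers)
```

Because the child rows are derived from the parent rows rather than generated independently, every foreign key resolves, and rerunning the script yields identical data for every test that consumes it.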


GenRocket’s TDG platform reimagines the test data provisioning process. It was designed from the ground up to perform as a reliable data source for continuous testing and removes these limitations. It provides controlled and consistent test data in high volume, on-demand.

The GenRocket TDG platform has a very simple mission statement: If you can imagine the test data you need, you can generate it, instantly and at high speed. All test data is governed by the data model used by the application under test, and the platform provides data generators and receivers that can produce data structured in any output format. It allows testers to control data patterns and permutations to support all forms of positive, negative and edge case testing. And GenRocket generates test data in real-time at a rate of 10,000 rows of data per second.
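
As a rough illustration of what controlling "patterns and permutations" can mean in practice, the sketch below builds positive, negative and edge case inputs for a made-up validation rule. It is plain Python rather than GenRocket's own tooling, and the rule, the value lists and the names (`is_valid`, `build_cases`) are all assumptions for the example.

```python
import itertools

# Hypothetical rule: quantity must be between 1 and 100 inclusive,
# and the country code must be one of the supported markets.
QUANTITY_MIN, QUANTITY_MAX = 1, 100
SUPPORTED_COUNTRIES = {"US", "DE", "JP"}

quantities = {
    "positive": [1, 50, 100],   # includes both boundary (edge) values
    "negative": [0, -1, 101],   # just outside each boundary
}
countries = {
    "positive": ["US", "DE"],
    "negative": ["", "XX"],
}


def is_valid(quantity, country):
    """Stand-in for the validation logic of the application under test."""
    return QUANTITY_MIN <= quantity <= QUANTITY_MAX and country in SUPPORTED_COUNTRIES


def build_cases():
    """Combine every quantity with every country and label the expected result."""
    cases = []
    kinds = itertools.product(quantities.items(), countries.items())
    for (q_kind, q_values), (c_kind, c_values) in kinds:
        for quantity, country in itertools.product(q_values, c_values):
            expected = q_kind == "positive" and c_kind == "positive"
            cases.append(
                {"quantity": quantity, "country": country, "expect_valid": expected}
            )
    return cases


cases = build_cases()
```

Each case carries its own expected outcome, so an automated test can loop over `cases` and compare `is_valid(...)` against `expect_valid` without any hand-written spreadsheet.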

Most importantly, the GenRocket TDG platform enables a fully integrated test automation environment that works seamlessly with CI/CD pipelines (e.g., Jenkins) and test automation tools (e.g., Selenium).



In the architecture diagram above, CI/CD pipeline and test automation tools are represented on the left and the TDG platform is on the right. The process flow for using these platforms as an integrated environment for continuous testing can be summarized in four stages:

  1. Automate Testing: Define the test cases and test data requirements for the application in order to perform functional testing, regression testing, performance testing, etc.
  2. Integrate Platforms: A batch file or shell script can invoke a test data scenario in the GenRocket engine, or an API can call for test data via a scripting or compiled language.
  3. Generate Test Data: GenRocket will generate test data in real-time and at run-time using pre-determined test data specifications and in the output format required.
  4. Validate Code: Your application can now be continuously tested with comprehensive integration testing to identify and remedy defects before they reach production.
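
The stages above can be sketched as a single script that a pipeline job might run. This is a simplified stand-in under stated assumptions, not GenRocket's interface: `generate_rows` plays the role of the data generator, `apply_fee` is a hypothetical function under test, and CSV is chosen arbitrarily as the output format.

```python
import csv
import io

FIELDS = ["account_id", "balance", "currency"]


def generate_rows(count):
    """Stand-in generator (hypothetical): yields patterned account rows."""
    for i in range(1, count + 1):
        yield {"account_id": i, "balance": i * 100, "currency": "USD"}


def write_csv(rows, stream):
    """Stage 3: emit the data at run time in the format the test expects."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def apply_fee(balance, fee=5):
    """Hypothetical function under test."""
    return balance - fee


def run_checks(stream):
    """Stage 4: read the generated data back and validate the code against it."""
    failures = []
    for row in csv.DictReader(stream):
        balance = int(row["balance"])
        result = apply_fee(balance)
        # The fee must reduce the balance without ever making it negative.
        if not 0 <= result < balance:
            failures.append(row["account_id"])
    return failures


buffer = io.StringIO()          # an in-memory file keeps the sketch self-contained
write_csv(generate_rows(1000), buffer)
buffer.seek(0)
failures = run_checks(buffer)
```

In a real pipeline the script would typically write to a file or database instead of an in-memory buffer and exit with a non-zero status when `failures` is not empty, so that the CI/CD job is marked as failed.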


Together, these tools form a fully integrated environment that enables full-scale deployment of test automation across the QA organization and throughout the continuous delivery development cycle.

To learn more about GenRocket’s integration with CI/CD pipelines, read this GenRocket Case Study on CI/CD Pipeline Integration for Insurance Applications.