How Synthetic Data Enables Fully Automated Test Data Delivery
by admin on Feb 08, 2023
Synthetic data is the hottest topic in AI and machine learning, contributing to explosive market growth now projected at 35% per year and reaching $3.5 billion by 2031 (Allied Market Research). But synthetic data has another critically important role to play outside of AI and ML.
Synthetic data is gradually replacing masked production data as a means to fully automate the delivery of data into continuous delivery pipelines. It can be provisioned far faster, with controlled data variety, in any volume or format, and without exposing sensitive PII or PHI, as a fully automated and integrated process. When properly blended with production data, it improves significantly on the traditional test data management (TDM) approach by delivering data with superior volume, variety, validity and velocity.
The 2022-2023 World Quality Report recommends that organizations explore the automation of data provisioning through synthetic data generation as part of a global test data provisioning strategy. Its annual survey found that 31% of organizations have defined an enterprise-wide test data provisioning strategy, but only 20% have fully implemented one.
And while 49% of organizations have automated the process of provisioning test data, that process is not integrated within their CI/CD pipelines. For 42% of organizations, manual data provisioning remains a barrier to automating the delivery of test data.
GenRocket has developed an approach that fully automates the delivery of synthetic test data into any CI/CD pipeline. A data model combined with a well-defined set of rules for data generation is all that's needed. Just import a database schema or the metadata that defines the target data model, define the test data requirements, and configure the volume, variety, and output format of the data needed for each test case. Synthetic data can then be generated on demand and in real time, simply by embedding a small instruction set into an automated test case so that each test run gets fresh, accurate synthetic data. Read our latest article to learn more about the delivery of synthetic data into CI/CD pipelines.
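To make the idea concrete, here is a minimal sketch in plain Python of rule-driven test data generation embedded in a test case. It is not GenRocket's API or instruction format. The `SCHEMA` dictionary, the rule types, and the `generate_rows` and `render` helpers are all hypothetical names invented for illustration. It assumes a data model can be reduced to a column-to-rule mapping, with volume and output format chosen per test.

```python
import csv
import io
import json
import random

# Hypothetical data model: column name -> generation rule.
# In practice this could be derived from an imported database schema.
SCHEMA = {
    "customer_id": {"type": "sequence", "start": 1000},
    "status": {"type": "choice", "values": ["active", "suspended", "closed"]},
    "balance": {"type": "float", "min": 0.0, "max": 5000.0},
}


def generate_rows(schema, volume, seed=None):
    """Return `volume` synthetic rows that follow the rules in `schema`."""
    # A fixed seed gives repeatable data; seed=None gives fresh data per run.
    rng = random.Random(seed)
    rows = []
    for i in range(volume):
        row = {}
        for column, rule in schema.items():
            if rule["type"] == "sequence":
                row[column] = rule["start"] + i
            elif rule["type"] == "choice":
                row[column] = rng.choice(rule["values"])
            elif rule["type"] == "float":
                row[column] = round(rng.uniform(rule["min"], rule["max"]), 2)
            else:
                raise ValueError(f"unknown rule type: {rule['type']}")
        rows.append(row)
    return rows


def render(rows, schema, output_format):
    """Serialize rows as JSON or CSV text."""
    if output_format == "json":
        return json.dumps(rows)
    if output_format == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=list(schema))
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"unsupported output format: {output_format}")


def test_every_customer_has_a_valid_status():
    # The "instruction set" embedded in the test: volume, seed, and format.
    rows = generate_rows(SCHEMA, volume=50, seed=42)
    payload = render(rows, SCHEMA, "json")
    for record in json.loads(payload):
        assert record["status"] in SCHEMA["status"]["values"]


if __name__ == "__main__":
    test_every_customer_has_a_valid_status()
    print(render(generate_rows(SCHEMA, volume=3, seed=42), SCHEMA, "csv"))
```

Because the test requests its own data, nothing has to be copied or masked from production before the pipeline runs. A real tool would add referential integrity across tables and many more rule types than this sketch covers.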