Fully Integration Testing Feeds – Why Is It So Challenging? by Admin on Feb 29, 2016

Accelerating Feeds Testing


Let’s discuss the challenges that come with integration testing feeds.  And what are feeds, you may ask?  Feeds are data contained in files or real-time streams that are transferred between two or more endpoints for processing.  Feeds can exist in many formats and may contain anything from small to huge quantities of data.  Organizations that import and export feeds as part of their daily processing may do so for hundreds to thousands of feeds, and the processing of any one feed can come with many challenges and pain points.

First, we should all give high praise to QA engineers who have taken the time to implement code and create test data and hundreds, even thousands, of tests to ensure their feeds are tested and as bug-free as humanly possible.  However, QA engineers are often reluctant to modify the tests they have written over the years for their many complex feeds, because doing so is very hard.

This testing challenge plays out all too often when it comes to managing integration testing for a large number of feeds.  Take inbound feeds for example; inbound feeds can be very challenging to process precisely because they are coming from an outside source and the consumer of the inbound feed has little to no control over the quality of the data being delivered.  An outside source may do its best to deliver valid data in a feed, but it’s not guaranteed the feed data will be clean; so, it’s truly up to the feed consumer to validate the feed data before processing.  Look at some of the challenges and pain that can come with validating an inbound feed:

  • Required data values may be missing
  • Formatting of data values (e.g. date format) may be incorrect
  • Required constraints for values may not be met…
    • Not within the set of enumerated values
    • Not within the specified date range
    • Not unique for values that must be unique
  • The format of the feed may not be well formed
  • The data may be orphaned and not related to an appropriate parent
  • The feed itself may be incorrectly named and contain the data for an entirely different feed.

For any one feed, on average, there may be ten or more validation checks run on its data to ensure that the business logic takes the correct action to correct or reject the data before it is consumed.  Integration tests that exercise these checks with invalid data are called negative tests, and they are just as important as positive tests.
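The checks listed above can be sketched in a few lines of Python. This is only an illustration: the field names, the allowed status values, the date range and the `validate_record` function are all assumptions made up for this example, not part of any real feed specification.

```python
from datetime import date, datetime

# Hypothetical rules for an illustrative coupon feed (all names are assumptions).
REQUIRED_FIELDS = ("coupon_id", "store_id", "expires_on", "status")
ALLOWED_STATUS = {"ACTIVE", "EXPIRED", "REDEEMED"}
DATE_RANGE = (date(2016, 1, 1), date(2016, 12, 31))


def validate_record(record, known_store_ids, seen_coupon_ids):
    """Return a list of validation errors for one feed record (empty if clean)."""
    errors = []

    # Required data values may be missing.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            errors.append(f"missing required value: {field}")

    # Formatting of the date value, then the date-range constraint.
    raw_date = record.get("expires_on")
    if raw_date:
        try:
            expires = datetime.strptime(raw_date, "%Y-%m-%d").date()
        except ValueError:
            errors.append(f"bad date format: {raw_date}")
        else:
            if not DATE_RANGE[0] <= expires <= DATE_RANGE[1]:
                errors.append(f"date out of range: {raw_date}")

    # Value must be within the set of enumerated values.
    status = record.get("status")
    if status and status not in ALLOWED_STATUS:
        errors.append(f"status not in enumerated values: {status}")

    # Value must be unique across the feed.
    coupon_id = record.get("coupon_id")
    if coupon_id:
        if coupon_id in seen_coupon_ids:
            errors.append(f"duplicate coupon_id: {coupon_id}")
        seen_coupon_ids.add(coupon_id)

    # Data must not be orphaned: the parent store has to exist.
    store_id = record.get("store_id")
    if store_id and store_id not in known_store_ids:
        errors.append(f"orphaned record, unknown store_id: {store_id}")

    return errors
```

Each branch corresponds to one negative test: feed the function a record that breaks exactly one rule and assert that the matching error is reported.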

Now, take that average of one positive test plus ten negative tests per inbound feed and multiply that by 100 feeds and you get 1,100 integration tests that need to be run; multiply the same by 1,000 feeds and there are 11,000 integration tests that need to be run; this is not a minor testing challenge.

Each one of these integration tests may need its own unique test feed containing the necessary test data to successfully run its test. This adds to the challenge and complexity of creating and maintaining hundreds to thousands of inbound feeds that also may exist in many different feed formats.

The data within a feed normally does not stand on its own; for example, an inbound grocery coupon feed may have data that must be related to a store, user, coupon, product and vendor, which requires referential integrity across parent-child relationships. There may also be other data values to process within the feed, such as date, time, price and quantity, and any of these values may require special validations.  In this example, there are at least 10 negative integration tests that can be run on the grocery coupon feed.

In at least one positive integration test, this same inbound grocery coupon feed may insert or update data in multiple tables in a back-end database.  This is why integration testing should be used to validate every aspect of a feed process to ensure that all affected components are fully tested.
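As a rough sketch of such a positive test, the example below loads a two-row coupon feed into an in-memory SQLite database and returns what ended up in each table. The schema, the audit table, and the `load_coupon_feed` and `run_positive_test` functions are assumptions invented for illustration; a real feed processor and back end will look different.

```python
import csv
import io
import sqlite3

# Hypothetical back-end schema: a parent table, a child table and an audit table.
SCHEMA = """
CREATE TABLE store  (id TEXT PRIMARY KEY);
CREATE TABLE coupon (id TEXT PRIMARY KEY,
                     store_id TEXT NOT NULL REFERENCES store(id),
                     price REAL NOT NULL);
CREATE TABLE coupon_audit (coupon_id TEXT NOT NULL REFERENCES coupon(id),
                           action TEXT NOT NULL);
"""


def load_coupon_feed(conn, feed_text):
    """Insert or update coupons from a CSV feed and write one audit row per record."""
    for row in csv.DictReader(io.StringIO(feed_text)):
        params = (row["store_id"], float(row["price"]), row["coupon_id"])
        exists = conn.execute(
            "SELECT 1 FROM coupon WHERE id = ?", (row["coupon_id"],)
        ).fetchone()
        if exists:
            conn.execute("UPDATE coupon SET store_id = ?, price = ? WHERE id = ?", params)
            action = "UPDATE"
        else:
            conn.execute("INSERT INTO coupon (store_id, price, id) VALUES (?, ?, ?)", params)
            action = "INSERT"
        conn.execute(
            "INSERT INTO coupon_audit (coupon_id, action) VALUES (?, ?)",
            (row["coupon_id"], action),
        )
    conn.commit()


def run_positive_test():
    """Load a clean feed and return the resulting coupon and audit rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce parent-child relationships
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO store (id) VALUES ('S1')")
    conn.execute("INSERT INTO coupon (id, store_id, price) VALUES ('C1', 'S1', 1.00)")

    # C1 already exists and should be updated; C2 is new and should be inserted.
    load_coupon_feed(conn, "coupon_id,store_id,price\nC1,S1,0.75\nC2,S1,2.50\n")

    coupons = conn.execute("SELECT id, price FROM coupon ORDER BY id").fetchall()
    audit = conn.execute(
        "SELECT coupon_id, action FROM coupon_audit ORDER BY coupon_id"
    ).fetchall()
    conn.close()
    return coupons, audit
```

The point of the sketch is that a single feed touches more than one table, so the test has to check each of them rather than only the first table written.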

These are the minimum integration tests required to fully test this feed.  Just think about the amount of time and effort that one or more QA Engineers may have to expend to manually create the test feeds, test data, implement code to load and store the test data and write tests to specifically work with the test feeds and test data.   Then think about the amount of time and effort it would take to replicate this for hundreds of feeds.

This is why it is almost impossible to fully integration test tens to hundreds of feeds in any reasonable time frame.  Even more so, the time and effort it takes to maintain all of the components affecting the test data and the execution of the tests become unsustainable as the code continues to change and new features are added.

It takes a lot of time to create test feeds and test data, which are often painstakingly created by hand and in very specific formats (e.g. delimited, XML, JSON). It can take even more time to implement the code to load and store the test data so that it can be used in a test; this process may have to be repeated hundreds to thousands of times to implement enough tests to confidently test an entire set of feeds.

Test data that is stored in static files goes stale over time as updates and new features alter the data structures and business logic of the feeds. This makes managing testing of the feeds more difficult as the number of tests grows, more static test data is created and the feeds continue to change:

  • old feeds and back-end feed code logic get modified
  • new feeds and back-end feed code logic get implemented
  • old tests and static test data files must be updated
  • new tests and test data files must be implemented
  • code supporting old test data must be updated
  • code supporting new test data must be implemented
  • multiple versions of the feeds, test data and tests may have to be maintained.

Thus, over time the tests start to diverge from the changing back-end feed code base and can no longer accurately validate the feeds they were intended to test.

This is also why QA Engineers are reluctant to modify their tests: they have spent hundreds of painstaking hours creating hundreds of static test feed files and test data files, implementing hundreds of classes to load the test data, and writing the tests that use it.

So what needs to change in order for QA Engineers to keep up with their ever-changing code base and feed integration testing challenges? In order for QA Engineers to be successful at integration testing hundreds to thousands of feeds, four major tasks need to be automated:

  1. Creating the supporting test data
  2. Creating the code to load the test data
  3. Creating the code to store the test data (including referential integrity)
  4. Creating the inbound test feeds

If these four major tasks were automated instead of being manually managed by a team of QA Engineers, the process for integration testing hundreds to thousands of feeds would be streamlined and the time and resources required would be drastically reduced. This is a complete paradigm shift for feed integration testing, and it is where GenRocket changes the rules to make that shift possible.

Creating the Supporting Test Data

With GenRocket, test data is not statically stored in files. Instead, GenRocket Scenarios determine the type of data that is generated and ensure referential integrity between test data.

With the GenRocket Real Time Engine, test data can be generated and loaded in real time, eliminating the need for manual creation and storing of static test data files.
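To make the general idea concrete, here is a minimal plain-Python sketch of generating related test data on demand instead of keeping it in static files. It is not GenRocket code and does not use any GenRocket API; the `generate_stores`, `generate_coupons` and `to_csv_feed` functions and all field names are hypothetical.

```python
import csv
import io
import random


def generate_stores(count):
    """Generate parent records first so that children have something to reference."""
    return [{"store_id": f"S{i}"} for i in range(1, count + 1)]


def generate_coupons(count, stores, seed=0):
    """Generate child records whose store_id always points at a generated store."""
    rng = random.Random(seed)  # seeded so that a test run is repeatable
    return [
        {
            "coupon_id": f"C{i}",
            "store_id": rng.choice(stores)["store_id"],
            "price": f"{rng.randint(25, 500) / 100:.2f}",
        }
        for i in range(1, count + 1)
    ]


def to_csv_feed(rows):
    """Render generated rows as a delimited feed, header line first."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

For example, `to_csv_feed(generate_coupons(10, generate_stores(3)))` produces a fresh ten-row feed in which every coupon refers to one of the three generated stores, so nothing has to be edited by hand when the number of rows or stores changes.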


Creating Code to Load and Store Test Data

With GenRocket, we not only have Receivers that morph test data into usable formats; we have also developed intelligent Receivers that can generate code that loads the test data and stores it with referential integrity.

With GenRocket, it is now possible to fully automate the testing of hundreds to thousands of feeds using real-time test data with full referential integrity of parent-child relationships.


Creating the Inbound Test Feeds

With GenRocket, our intelligent Receivers can also use the Factory design pattern to read from feed description files and automatically generate fully formatted inbound test feeds. These Receivers can also be used to generate the test feeds in real time.
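For readers unfamiliar with the Factory design pattern, the sketch below shows the general shape of the idea in plain Python: a feed description selects which writer class formats the rows. It is not GenRocket Receiver code; the description layout (`format`, `delimiter`, `fields`), the writer classes and `create_feed_writer` are assumptions made for illustration, and the description is shown as an in-memory dictionary where a real one would be read from a file.

```python
import csv
import io
import json


class DelimitedFeedWriter:
    """Formats rows as a delimited text feed with a header line."""

    def __init__(self, description):
        self.fields = description["fields"]
        self.delimiter = description.get("delimiter", ",")

    def write(self, rows):
        buf = io.StringIO()
        writer = csv.DictWriter(
            buf, fieldnames=self.fields, delimiter=self.delimiter, lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        return buf.getvalue()


class JsonFeedWriter:
    """Formats rows as a JSON array of objects, keeping only the described fields."""

    def __init__(self, description):
        self.fields = description["fields"]

    def write(self, rows):
        return json.dumps([{f: row[f] for f in self.fields} for row in rows])


# Registry used by the factory: feed format name -> writer class.
WRITERS = {"delimited": DelimitedFeedWriter, "json": JsonFeedWriter}


def create_feed_writer(description):
    """Factory function: pick the writer class named by the feed description."""
    try:
        writer_cls = WRITERS[description["format"]]
    except KeyError:
        raise ValueError(
            f"unsupported feed format: {description.get('format')!r}"
        ) from None
    return writer_cls(description)
```

Supporting a new feed format then means adding one writer class and one registry entry; the code that asks for a feed does not change.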

With GenRocket, it is now possible to fully integration test hundreds of feeds.


With GenRocket, outdated static test data files and old untenable test code paradigms can be readily replaced by a GenRocket solution while increasing the percentage of code being tested at the same time.

So, ask yourself the following questions:

  • What percentage of your feeds are not being fully tested?
  • How much time and effort is lost trying to fully test even one feed?
  • How many bugs do your feeds encounter when run in production?
  • How much time is lost to bug fixes and data cleanup after running a bad feed that was not fully integration tested?

If you’re not happy with your answers, perhaps it’s time to take a look at GenRocket.