How a Single Bug Can Trigger a Massive Outage

by Louie Flores on Jun 16, 2021

The Guardian recently reported a massive Internet outage that affected many prominent websites, including Amazon, CNN, Hulu, and The New York Times, as well as its own online news site. The outage lasted for one hour and left millions of online visitors with nothing more than an obscure message: “Error 503 service unavailable”. For major online retailers like Amazon, the financial impact is measured in thousands of dollars in lost business for each second of downtime.


Massive Internet Outage


Quality assurance professionals will recognize this as a combinatorial testing problem. When software is tested against different combinations and permutations of data values, missing just one of them can have unforeseen consequences.

The best practice is to inject both valid and invalid data in all combinations and permutations, so that errors in the code are identified before it is released to production. In this article, we describe how the use of real-time synthetic test data could have prevented this massive Internet outage.
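
To make the idea of exhaustive combinatorial testing concrete, here is a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the field names, the value lists, and the `apply_config` function with its planted bug are assumptions, not a description of the software involved in the outage. The sketch mixes valid and invalid values for each field, generates every combination, and reports the ones that crash the code instead of being cleanly accepted or rejected.

```python
from itertools import product

# Hypothetical input domains. Each field mixes valid and invalid values.
FIELD_VALUES = {
    "region": ["us-east", "eu-west", ""],   # "" is invalid
    "cache_ttl": [60, 0, -1],               # -1 is invalid
    "compression": ["gzip", "br", None],    # None is invalid
}


def apply_config(config):
    """Hypothetical system under test with one planted bug."""
    if not config["region"]:
        return "rejected"
    if config["cache_ttl"] < 0:
        return "rejected"
    if config["compression"] not in ("gzip", "br"):
        return "rejected"
    if config["compression"] == "br":
        # Planted bug: a TTL of 0 is accepted above but divides by zero here.
        _refreshes_per_hour = 3600 // config["cache_ttl"]
    return "accepted"


def all_combinations(field_values):
    """Yield one dict per combination of the listed field values."""
    names = list(field_values)
    for values in product(*(field_values[name] for name in names)):
        yield dict(zip(names, values))


def find_crashes(field_values, system_under_test):
    """Run every combination and collect the ones that raise an exception."""
    crashes = []
    for combo in all_combinations(field_values):
        try:
            system_under_test(combo)
        except Exception as exc:
            crashes.append((combo, type(exc).__name__))
    return crashes


if __name__ == "__main__":
    for combo, error in find_crashes(FIELD_VALUES, apply_config):
        print(f"{error}: {combo}")
```

In this sketch, only the pairing of `cache_ttl` equal to 0 with `compression` equal to `"br"` crashes. Each of those values passes validation on its own, so a test suite that checks the fields one at a time would not catch the failure. Exhaustive enumeration grows quickly with the number of fields and values, which is one reason generated test data is attractive for this kind of testing.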