The test pyramid is a familiar concept to many in the software field. Initially created in the early 1990s for aircraft manufacturers to measure the quality of parts constructed from composite materials, it was adapted for software engineering in Mike Cohn’s 2009 book Succeeding with Agile, where it was referred to as the “test automation pyramid” (Cohn, 2009; Vocke, 2018).
According to this version, the bottom level of the pyramid consists of automated unit tests that require small amounts of code, are more easily written and maintained by developers, and make up the bulk of the testing effort. In the middle, automated service tests for APIs make up a much smaller portion of the testing. At the top, a small number of user-facing tests covers the end-to-end workflow of the entire system.
While it has been just over a decade since Cohn’s proposal, the world has since undergone a digital revolution that requires us to reconsider how the test pyramid can continue to offer value. In 2016, Forrester identified the “insights-driven business” (an organization that supports data science with larger budgets and with platforms that unify technology and infrastructure) as a type of organization that would earn $1.2 trillion in revenue by 2020, a prediction that has largely come to pass (Microstrategy, 2020).
At the heart of the insights-driven business is data.
Software testing is usually conducted on the assumption that the data supporting it is correct. However, this is not always true. Data can be compromised by the application of too many testing tools, by conflicting requirements among diverse stakeholders, and by changing business requirements. Data therefore needs to be analyzed, verified, and pre-qualified independently, before or during the software testing process. The insights-driven business derives its insights from data, which means that the quality and integrity of that data are vital to the differentiation these businesses hope to achieve. By incorporating ETL (Extract, Transform, and Load) testing, (Big) Data testing, testing for machine learning (ML), and data pipeline tests into Cohn’s test pyramid, we can create a more holistic approach that provides quality assurance across both traditional software and emerging data-intensive systems.
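As an illustration of what pre-qualifying data independently of the software under test might look like, here is a minimal sketch in Python. The field names (customer_id, amount), the two rules, and the helper names Rule and validate are hypothetical and chosen only for this example; they do not come from the article or from any particular tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """A named data-quality check applied to one record (hypothetical helper)."""
    name: str
    check: Callable[[dict], bool]


def validate(rows, rules):
    """Return (row_index, rule_name) for every check that a row fails."""
    failures = []
    for i, row in enumerate(rows):
        for rule in rules:
            if not rule.check(row):
                failures.append((i, rule.name))
    return failures


def is_non_negative_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and value >= 0
    )


# Assumed rules for an assumed record layout.
RULES = [
    Rule("customer_id present", lambda r: r.get("customer_id") not in (None, "")),
    Rule("amount is a non-negative number",
         lambda r: is_non_negative_number(r.get("amount"))),
]

rows = [
    {"customer_id": "C1", "amount": 10.0},
    {"customer_id": "", "amount": 5},      # missing identifier
    {"customer_id": "C3", "amount": -2},   # negative amount
]

failures = validate(rows, RULES)
for index, rule_name in failures:
    print(f"row {index}: failed '{rule_name}'")
```

Checks of this kind can run before the functional tests start, so that a failing software test is less likely to be caused by bad input data rather than by the code itself.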
Poor data quality presents a risk to the software systems that depend on that data. A new test pyramid for data quality validation therefore requires risk-based thinking at every stage of the testing process to mitigate threats that could appear at any point in the information chain.
Our proposed holistic test pyramid for validating data quality captures the increasing dependence of software systems on underlying data, incorporates risk-based thinking at every level of the system, and doesn’t isolate ML testing or Big Data testing activities from software testing. It provides for unit testing at all levels of the pyramid to supplement functional tests. Further, by not distinguishing between a user that is human and a user that is a system, the pyramid acknowledges the uncertainty that is always present as a result of the complexity of data-driven systems.
This new model for bridging the gap between data and traditional software testing will be the key element that differentiates the insights-driven businesses of the future from those that rely on traditional, siloed methods that validate data quality separately from software testing. The COVID-19 pandemic has already provided fertile ground for applying this method. One organization has significantly mitigated its risks by conducting a high-level FMEA (Failure Mode and Effects Analysis) on several key business processes, collecting data on internal and external failures, and then examining how risk priorities would change by adding tests at any of the four levels of the holistic test pyramid.
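To make the idea of shifting risk priorities concrete, the sketch below uses the conventional FMEA risk priority number (RPN = severity × occurrence × detection, each rated from 1 to 10). This is an assumption: the article does not say which scoring scheme the organization used, and the failure modes, ratings, and helper names here are invented purely for illustration. Adding a test is modeled as lowering the detection rating, since a failure that is caught earlier is easier to detect.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FailureMode:
    name: str
    severity: int    # 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # 1 (rare) .. 10 (almost certain)
    detection: int   # 1 (almost always caught) .. 10 (almost never caught)

    def rpn(self):
        """Conventional risk priority number."""
        return self.severity * self.occurrence * self.detection


def with_added_test(mode, new_detection):
    """Return a copy whose detection rating reflects an added test.

    The rating never gets worse: the lower of the two values is kept.
    """
    return replace(mode, detection=min(mode.detection, new_detection))


# Hypothetical failure modes and ratings.
modes = [
    FailureMode("stale reference data loaded", severity=7, occurrence=4, detection=8),
    FailureMode("duplicate records in pipeline output", severity=6, occurrence=3, detection=5),
]

before = sorted(modes, key=FailureMode.rpn, reverse=True)

# Assume a new data-level test improves detection of the first failure mode.
improved = [with_added_test(modes[0], new_detection=3), modes[1]]
after = sorted(improved, key=FailureMode.rpn, reverse=True)

for label, ranking in (("before", before), ("after", after)):
    print(label, [(m.name, m.rpn()) for m in ranking])
```

With these invented ratings, the first failure mode drops from an RPN of 224 to 84, so the second one (RPN 90) becomes the top priority. Repeating the comparison for candidate tests at each level of the pyramid shows where an added test reduces risk the most.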
Even before the onset of the incredible societal, technological, and economic changes brought about by the COVID-19 pandemic, the insights-driven business was set to disrupt the marketplace and draw a line in the sand between those organizations that know how to support data quality and those that don’t. The businesses that survive and thrive in this new world will be those that derive deep insights from validated data, which will require holistic approaches to data, pipelines, models, services, interfaces, and risk-based thinking. For a deeper look at these issues, read the article “Reframing the Test Pyramid for Digitally Transformed Organizations” in the September 2020 issue of ASQ’s Software Quality Professional.
Additional Reading
Torres, R. (2021, January 7). Poor software quality cost businesses $2 trillion last year and put security at risk. CIO Dive. Available from https://www.ciodive.com/news/poor-software-quality-report-2020/593015/
Radziwill, N. M. & Freeman, G. (2020). Reframing the Test Pyramid for Digitally Transformed Organizations. Software Quality Professional.
Vocke, H. (2018, February 26). The Practical Test Pyramid. Available from https://martinfowler.com/articles/practical-test-pyramid.html. Retrieved July 3, 2020.