Test Wars – Episode VII: Test Coverage Rebels
TL;DR: Test-coverage numbers feel comforting, but they can hide mission-critical gaps. What you test matters far more than how much you test, so shift your focus to end-to-end, revenue-driving scenarios: exactly the kind of tests AI can automate.
Introduction
Ask any engineering team how “good” their tests are and someone will quote a percentage. Line-coverage dashboards offer an easy-to-grasp score that dazzles CEOs and calms CTOs. But coverage is only a proxy for quality, not a guarantee. When vanity metrics become goals, they distort priorities, leaving the very flows your customers (and your top line) depend on unchecked.
Picture the scene: a CTO is in a board meeting, presenting a dashboard that glows with a reassuring "Test Coverage: 95%." Confidence is high. Hours later, the illusion shatters. A key enterprise customer is unable to complete a core transaction, and that beautiful 95% coverage means absolutely nothing. This isn't hypothetical. The recent CrowdStrike software bug that cost Delta Air Lines an estimated $380 million wasn't a failure of a single, isolated component; it was a failure of the system as a whole—the exact kind of disaster that code coverage metrics are fundamentally incapable of predicting.
1. Why Coverage Fetishes Persist
The allure of a single number is powerful, but it often masks a dangerous reality. Leaders naturally crave simple, quantifiable metrics, but this simplicity is a flaw when it abstracts away the immense complexity of modern software.
| The Allure | The Reality |
| --- | --- |
| Simple scorecard executives can track. | A number is not evidence that the right behaviours are tested. |
| Gamification: teams race to 80% or 90%. | Metrics can be inflated by trivial tests that execute code without asserting behaviour. |
| Tooling built into IDEs and CI pipelines. | Easy instrumentation ≠ meaningful assurance. |
As Marc-G. G. warns: “Having a high coverage gives you a sense of security, possibly making you blind to deeper issues.”
Martin Fowler echoes the warning: coverage is “useful for finding untested parts… but of little use as a numeric statement of how good your tests are.”
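The "trivial tests" trap is easy to see in code. Here is a minimal, hypothetical sketch: both tests below give `apply_discount` 100% line coverage, but only one of them would ever catch a regression.

```python
def apply_discount(price: float, percent: float) -> float:
    """Return the price after applying a percentage discount."""
    return price * (1 - percent / 100)

def test_executes_but_asserts_nothing():
    # Runs every line of apply_discount, so the coverage tool reports
    # 100% -- yet this passes even if the function returns nonsense.
    apply_discount(100.0, 20.0)

def test_asserts_actual_behaviour():
    # Identical coverage number, but this one pins down the behaviour.
    assert apply_discount(100.0, 20.0) == 80.0
```

Both tests count equally toward the dashboard; only the second one is worth anything.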
2. Evidence: High Coverage ≠ Fewer Bugs
The data is clear: a high coverage percentage does not reliably lead to more stable software.
- Empirical research: A 2015 study of two large open-source systems found only a weak correlation between higher statement coverage and real-bug detection.
- Eclipse crash data: Analysing 2 million crash reports, researchers discovered that unit-tested code crashed about as often as untested code.
- Industry post-mortems: GitLab’s infamous 2017 database outage wiped six hours of customer data; unit and component tests all passed, yet an end-to-end restore path was never rehearsed.
- Real-world outages: The July 2024 CrowdStrike software update grounded airlines worldwide. The failure wasn't in a single, isolated component but in the system as a whole—a blind spot for unit test coverage that cost some companies hundreds of millions.
3. Where Coverage Falls Short
Code coverage fails because it cannot see the most critical sources of failure in modern, distributed systems.
- Inter-service seams: Microservices magnify the “gaps between the boxes.” Most unit tests stub network calls, so failures in contracts, timeouts, or auth headers go unnoticed until production. Test Wars Episode V dives into this problem in detail.
- Critical user journeys: The checkout path that drives 80% of revenue may involve five services, two external APIs, and a feature flag. Line coverage doesn’t know how valuable that path is. Test Wars Episode VI was all about this. For an e-commerce business, this is: "Can a new user find a product, add it to their cart, and pay?" For a SaaS platform: "Can an enterprise admin invite a new team member and see the charge on their bill?" A metric that can't distinguish between a cosmetic flaw and a revenue-killing catastrophe is not a metric that can guide strategic decisions.
- Data-dependent edge cases: Tests that merely touch a line cannot assert behaviour across weird encodings, leap-second timestamps, or country-specific tax rules.
Steve Grunwell sums it up: “100% code coverage, but no protection against the things most likely to cause issues.”
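Grunwell's point about data-dependent blind spots can be shown in a few lines. The VAT lookup below is hypothetical, but the pattern is common: a single happy-path test touches every line, while the behaviour that actually bites in production only surfaces with country-specific data.

```python
# Hypothetical VAT table; rates and country codes are illustrative only.
VAT_RATES = {"DE": 0.19, "FR": 0.20, "IE": 0.23}

def price_with_vat(net: float, country: str) -> float:
    # Latent bug: unknown countries silently default to 0% VAT.
    return round(net * (1 + VAT_RATES.get(country, 0.0)), 2)

# One happy-path test executes every line above: 100% coverage...
assert price_with_vat(100.0, "DE") == 119.0

# ...but only a data-driven case exposes the unknown-country default:
assert price_with_vat(100.0, "XX") == 100.0  # 0% VAT: is that intended?
```

Line coverage is identical with or without the second assertion; only the data varies, and the data is where the risk lives.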
4. The Rebel Strategy: From Code Coverage to Business Risk Coverage
The rebellion against code coverage is not a call for anarchy; it is about having the right metrics. We must shift our focus from a purely technical, inside-out view to a business-centric, outside-in perspective.
| From | To |
| --- | --- |
| Counting executed lines | Covering end-to-end scenarios users rely on |
| Chasing an arbitrary % target | Mapping business risk to test depth |
| Writing more unit mocks | Generating production-like data & environments |
To lead this rebellion, you must replace the old, misleading dashboard with one that reflects business reality. The following table provides a clear guide for translating the language of the old empire into the language of the rebel alliance.
The Executive's Guide to QA Metrics: From Vanity to Value
| Vanity Metric (The Old Empire) | Value Metric (The Rebel Alliance) | The Business Question It Answers |
| --- | --- | --- |
| Code Coverage Percentage | Critical User Journey (CUJ) Pass Rate | "Is our code being exercised by tests?" vs. "Can our most important customers achieve their most important goals right now?" |
| Total Number of Tests | Bug Escape Rate to Production | "Is our team busy writing tests?" vs. "Is our QA process effective at preventing customer-facing pain?" |
| Unit Test Pass Rate | Change Failure Rate (DORA Metric) | "Do individual components work in a lab?" vs. "Can we ship new value to customers quickly and safely?" |
| Test Execution Speed | Mean Time to Resolution (MTTR) for a CUJ Failure | "How fast do our tests run?" vs. "When a critical workflow breaks, how fast can we fix it to protect revenue?" |
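The value metrics in the table are simple to compute once the underlying events are recorded. The sketch below uses made-up records and field names (they are assumptions, not a real schema) to show how CUJ pass rate and DORA change failure rate fall out of plain counts.

```python
# Hypothetical records; journey names and fields are illustrative only.
journey_runs = [
    {"journey": "checkout", "passed": True},
    {"journey": "checkout", "passed": True},
    {"journey": "invite_member", "passed": False},
    {"journey": "invite_member", "passed": True},
]
deployments = [
    {"id": 1, "caused_incident": False},
    {"id": 2, "caused_incident": True},
    {"id": 3, "caused_incident": False},
    {"id": 4, "caused_incident": False},
]

def cuj_pass_rate(runs: list[dict]) -> float:
    """Share of critical-user-journey runs that passed (0.0 to 1.0)."""
    return sum(r["passed"] for r in runs) / len(runs)

def change_failure_rate(deploys: list[dict]) -> float:
    """DORA change failure rate: share of deployments causing an incident."""
    return sum(d["caused_incident"] for d in deploys) / len(deploys)

print(f"CUJ pass rate:       {cuj_pass_rate(journey_runs):.0%}")
print(f"Change failure rate: {change_failure_rate(deployments):.0%}")
```

The point is not the arithmetic but the inputs: these numbers are driven by customer-visible outcomes, not by which lines happened to execute.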
Desplega’s approach auto-generates real-world flows spanning APIs, UIs, data stores, and message queues, so you validate what actually happens in production. Instead of asking “How many lines did we hit?”, we aim to show you which revenue-critical journeys are green.
5. Practical Steps for Tech Leaders
- Inventory value journeys. Identify the top five user journeys or ops processes that directly impact revenue or carry SLA penalties; this is where your QA effort should concentrate.
- Instrument observability first. Trace each journey; the trace becomes the spec for an end-to-end test.
- Automate scenario generation. Figure out how to replay traces with production-like data, spinning up ephemeral environments when needed.
- Track “Scenario Coverage.” Report the percentage of high-value flows exercised on every build and fail pipelines when coverage slips.
- Use code coverage as a smoke detector, not a KPI. Let low coverage highlight untouched code, but never treat the number itself as success.
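Step 4 above ("fail pipelines when coverage slips") can be a very small CI gate. This is a minimal sketch under assumptions of my own: the journey names, the 80% threshold, and the idea that each build reports which journeys it exercised are all illustrative.

```python
# Hypothetical list of high-value flows; replace with your own inventory.
CRITICAL_JOURNEYS = {"signup", "checkout", "invite_member",
                     "export_report", "cancel_plan"}
THRESHOLD = 0.8  # illustrative: require 80% of critical journeys per build

def scenario_coverage(exercised: set[str]) -> float:
    """Fraction of critical journeys exercised in this build."""
    return len(CRITICAL_JOURNEYS & exercised) / len(CRITICAL_JOURNEYS)

def gate(exercised: set[str]) -> bool:
    """Return True if the pipeline may proceed, False if it should fail."""
    cov = scenario_coverage(exercised)
    print(f"Scenario coverage: {cov:.0%} (threshold {THRESHOLD:.0%})")
    return cov >= THRESHOLD

# In CI you would translate the boolean into an exit code, e.g.:
# sys.exit(0 if gate(journeys_ran_this_build) else 1)
ok = gate({"signup", "checkout", "invite_member", "export_report"})
```

Because the gate counts journeys rather than lines, it fails loudly when a revenue-critical flow stops being tested, even if line coverage stays flat.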
A final note
Coverage percentages belonged to a simpler time. Modern systems fail at the integrations and edge cases that numbers can’t see. By pivoting from how much you test to what you test, you protect customer trust, prevent seven-figure outages, and keep revenue flowing.
As a leader, you set the strategy. The next time you review your engineering dashboard, resist the allure of the simple percentage. Instead, ask the questions that truly matter: "What percentage of our critical, revenue-generating user journeys are we testing with every release?" and "When a critical journey fails, how quickly can we resolve it?"
The rebels have spoken: may your tests be with the user, not the metric.
References
- Marc-G. G., “Code Coverage Is a Vanity Metric,” 2015 — marcgg.com
- Martin Fowler, “Test Coverage,” 2012 — martinfowler.com
- P. S. Kochhar, F. Thung, D. Lo, “Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs,” SANER 2015 — doi.org
- E. Chioteli, I. Batas, D. Spinellis, “Does Unit-Tested Code Crash? A Case Study of Eclipse,” arXiv 2019 — arxiv.org
- GitLab Engineering, “Postmortem of the 31 Jan 2017 Database Outage,” 2017 — about.gitlab.com
- Steve Grunwell, “The True Meaning of Code Coverage,” 2025 — stevegrunwell.com
- Delta Air Lines September Quarter 2024 Results. Delta Investor Relations, Oct 2024 — delta.com