May 12, 2026

The Gentrification of Unit Tests: When Clean Code Kills Fast Feedback

Your test suite looks professional. It also quietly murders your iteration speed.

Drake meme: 30 unit tests mocking internal functions vs one integration test covering the public API

TL;DR: Most indie test suites are not slow because there are too many tests — they're slow because each test pays a fixed startup tax (ts-jest compilation, Prisma client regeneration, dependency injection wiring) that nobody profiled. Replace ts-jest with @swc/jest or move to Vitest, mock the network instead of your own modules, and keep one persistent Postgres container for the whole run. Sub-second feedback returns. Coverage doesn't suffer — it gets honest.


A few weeks ago I sat down with a founder in Barcelona who was convinced TDD was "over." He didn't hate tests. He hated his tests. Every save triggered a 14-second Jest run on his MacBook Air. He'd stopped running them on save eight months ago. The badge said 92% coverage, yet most of those tests hadn't been executed locally in three sprints. CI still ran them, though: 11 minutes per push, with flaky DB factory tests retrying on every PR.

We opened his repo together. Each test file imported a custom TestModule with NestJS-style DI, a Prisma client mock generated by a factory, and a 2,000-line Faker-based seed helper. Every test took ~90ms before a single assertion ran. The pattern was beautiful. It was also gentrifying him out of his own dev loop.

This is the same disease the Clean Code crowd keeps lobbying for, just one floor down from the abstraction debate. The unit-test version of premature abstraction is the sophisticated test harness — fancy factories, custom matchers, in-memory pseudo-databases and module isolation so total that nothing real ever gets tested. The neighborhood looks polished. The original residents — the developers who used to run tests every save — have moved out.

Why does "clean" test infrastructure destroy fast feedback?

Tests that mock everything pay startup cost on every file. Tests that hit a real boundary pay it once. Premature factories and DI containers move feedback latency from milliseconds to seconds — the threshold where humans give up on TDD.

The classic argument for unit tests was speed. Kent Beck's Test Driven Development: By Example (2002) explicitly framed the cycle as red → green → refactor, fast enough that you stay in the IDE the whole time. That feedback loop is what unit testing was for. Once a test suite takes longer than the cognitive interrupt threshold — Stuart Card's classic 10-second context-switch finding and the 1-second "flow" ceiling Jakob Nielsen cited in Usability Engineering (1993) — most developers stop running tests on save. The suite is now technically there. Functionally it's gone.

The kicker: in the indie codebases I've reviewed in 2025, most slow Jest suites are not slow because of the assertions. They're slow because of:

  • ts-jest cold compilation. Every worker process recompiles your TypeScript on first import. Anecdotally, ts-jest is the largest single contributor to startup time we see — and the ts-jest docs themselves recommend switching to a transpiler-only mode if you want speed.
  • Prisma client regeneration per test file. If your jest.setup.ts calls prisma generate or instantiates a new client per worker, every test file pays the cost.
  • Module mocks that pull in real modules. jest.mock() evaluates the real module first to inspect its shape (per the Jest docs); a heavy module costs the same to mock as to use.
  • Factory graphs. A "simple" user factory that creates an org, a workspace, three feature flags, and a billing account is not simple. It's a slow fixture pretending to be a fast unit test.
  • One Postgres container per test, not per run. Docker startup is ~300ms + your image. Multiply by 400 tests and your "unit suite" is now an integration suite in a unit suite's clothing.

These are all fixable. None of them require deleting tests. They require honest profiling and a willingness to admit the harness — not the assertions — is the bottleneck.
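Before reaching for a real profiler, a crude stopwatch around the setup path is often enough to show where the fixed tax lives. A minimal sketch; buildHeavyFixture is a hypothetical stand-in for whatever your jest.setup.ts and factory graph actually do:

```typescript
import { performance } from 'node:perf_hooks';

// Wraps any setup function and reports how many milliseconds it burns
// before a single assertion gets a chance to run.
function timeSetup<T>(fn: () => T): { result: T; ms: number } {
  const start = performance.now();
  const result = fn();
  return { result, ms: performance.now() - start };
}

// Hypothetical stand-in for a factory graph. Swap in your real fixture
// builder (or the body of your setup file) to get an honest number.
function buildHeavyFixture(): { rows: number[] } {
  const rows: number[] = [];
  for (let i = 0; i < 10_000; i++) rows.push(i);
  return { rows };
}

const { result, ms } = timeSetup(buildHeavyFixture);
console.log(`setup: ${ms.toFixed(2)}ms for ${result.rows.length} fixture rows`);
```

If the number printed here dwarfs your assertion time, the harness, not the tests, is what needs fixing.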

Example 1: The over-engineered "unit" test that pays for everything

Here's the kind of pattern I keep finding in vibe-coded Next.js codebases. The intent is good: isolate createOrder() from the database, from Stripe, from the email service. The result is a test that takes ~90ms to set up before any behavior is exercised.

// src/orders/__tests__/createOrder.test.ts  (the heavy version)
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { PrismaClient } from '@prisma/client';
import { OrderService } from '../OrderService';
import { StripeClient } from '../../billing/StripeClient';
import { EmailService } from '../../email/EmailService';
import { FeatureFlagService } from '../../flags/FeatureFlagService';
import { buildUserFactory } from '../../../test/factories/user';
import { buildProductFactory } from '../../../test/factories/product';

vi.mock('@prisma/client');
vi.mock('../../billing/StripeClient');
vi.mock('../../email/EmailService');
vi.mock('../../flags/FeatureFlagService');

describe('OrderService.createOrder', () => {
  let prisma: PrismaClient;
  let stripe: StripeClient;
  let email: EmailService;
  let flags: FeatureFlagService;
  let service: OrderService;

  beforeEach(() => {
    prisma = new PrismaClient();
    stripe = new StripeClient('sk_test_x');
    email = new EmailService('re_test_x');
    flags = new FeatureFlagService();
    service = new OrderService(prisma, stripe, email, flags);

    // Re-mock per test for "isolation"
    vi.mocked(flags.isEnabled).mockReturnValue(true);
    vi.mocked(stripe.charge).mockResolvedValue({ id: 'ch_1', status: 'succeeded' });
    vi.mocked(email.send).mockResolvedValue({ status: 'sent' });
    vi.mocked(prisma.order.create).mockResolvedValue({
      id: 'o_1',
      userId: 'u_1',
      total: 1000,
      status: 'paid',
    } as never);
  });

  it('charges Stripe with the order total', async () => {
    const user = buildUserFactory();
    const product = buildProductFactory({ price: 1000 });
    await service.createOrder({ userId: user.id, productId: product.id });
    expect(stripe.charge).toHaveBeenCalledWith(
      expect.objectContaining({ amount: 1000 }),
    );
  });

  it('persists an Order row with status=paid', async () => {
    const user = buildUserFactory();
    const product = buildProductFactory({ price: 1000 });
    await service.createOrder({ userId: user.id, productId: product.id });
    expect(prisma.order.create).toHaveBeenCalledWith(
      expect.objectContaining({ data: expect.objectContaining({ status: 'paid' }) }),
    );
  });
});

This test does not exercise createOrder(). It exercises my mocks of the things createOrder calls. If I rename prisma.order.create to prisma.order.insert, the assertion breaks even though behavior is identical. If a Stripe API change ships, neither test notices — the mock still returns { id: 'ch_1' }.

And the cost: ~90ms of setup per test (measured locally with vitest --reporter verbose), times roughly four similar describe blocks per module, times forty modules. You're paying 14 seconds of overhead before a single character of business logic runs. That's the whole iteration budget gone before the loop even starts.
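The budget math above is worth making explicit. A throwaway calculation using the post's own numbers:

```typescript
// Harness overhead = per-test setup cost × describe blocks × modules.
// All three inputs are the figures quoted in the paragraph above.
const setupMsPerTest = 90;        // measured setup cost per describe block
const describeBlocksPerModule = 4;
const modules = 40;

const totalOverheadMs = setupMsPerTest * describeBlocksPerModule * modules;
console.log(`${(totalOverheadMs / 1000).toFixed(1)}s of pure harness overhead`);
// 90 × 4 × 40 = 14,400ms, which matches the founder's 14-second runs.
```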

The Hidden Cost

The polished test harness gives a false sense of safety: 92% line coverage, every dependency "isolated." But you're testing the wiring you wrote five minutes ago, not the contract that production has to satisfy. The real Stripe error codes, the real Prisma P2002 race, the real email unsubscribe state — none of those are in the mock graph. So the bugs ship.
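To make the P2002 point concrete: Prisma surfaces a unique-constraint violation as a known error with code 'P2002', and mapping that race to a "duplicate" outcome is exactly the behavior a module-mock test never exercises. A minimal sketch with hypothetical names (classifyCreateOrderError is illustrative, not from the post's codebase):

```typescript
// Minimal shape of a known database error; the real Prisma class is
// PrismaClientKnownRequestError, which carries a string `code` field.
interface KnownDbError {
  code: string;
}

// P2002 = unique constraint violated, e.g. two requests racing on the
// same idempotency key. Map it to a 'duplicate' outcome instead of
// retrying the charge. Any other code is surfaced as unknown.
function classifyCreateOrderError(err: KnownDbError): 'duplicate' | 'unknown' {
  return err.code === 'P2002' ? 'duplicate' : 'unknown';
}

console.log(classifyCreateOrderError({ code: 'P2002' })); // the race case
```

Only a real database can ever throw this at your code, which is why the boundary test below catches it and the mocked version cannot.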

Example 2: The honest integration test that survives refactors

Replace the four mocked dependencies with one real Postgres container (shared across the run), one network-level Stripe mock via MSW, and one in-memory queue. The test now actually exercises the code path that runs in production.

// src/orders/__tests__/createOrder.integration.test.ts
import { describe, it, expect, beforeAll, beforeEach, afterAll } from 'vitest';
import { setupServer } from 'msw/node';
import { http, HttpResponse } from 'msw';
import { PostgreSqlContainer, StartedPostgreSqlContainer } from '@testcontainers/postgresql';
import { execSync } from 'node:child_process';
import { PrismaClient } from '@prisma/client';
import { createOrder } from '../createOrder';

// Shared across the whole test run (configured in vitest.setup.ts in real code).
let container: StartedPostgreSqlContainer;
let prisma: PrismaClient;

// Boundary mocks — only the network. Local modules are real.
const stripeCalls: Array<{ amount: number; idempotencyKey: string }> = [];
const server = setupServer(
  http.post('https://api.stripe.com/v1/charges', async ({ request }) => {
    const body = await request.text();
    const params = new URLSearchParams(body);
    stripeCalls.push({
      amount: Number(params.get('amount')),
      idempotencyKey: request.headers.get('idempotency-key') ?? '',
    });
    return HttpResponse.json({ id: 'ch_test', status: 'succeeded' });
  }),
);

beforeAll(async () => {
  container = await new PostgreSqlContainer('postgres:16-alpine').start();
  process.env.DATABASE_URL = container.getConnectionUri();
  // One-time schema setup. We do NOT regenerate the Prisma client per test.
  execSync('npx prisma migrate deploy', { stdio: 'inherit' });
  prisma = new PrismaClient();
  server.listen({ onUnhandledRequest: 'error' });
});

afterAll(async () => {
  server.close();
  await prisma.$disconnect();
  await container.stop();
});

beforeEach(async () => {
  stripeCalls.length = 0;
  // Truncate in dependency order. Faster than re-migrating.
  await prisma.$executeRawUnsafe('TRUNCATE TABLE "Order", "User", "Product" CASCADE');
});

describe('createOrder (integration)', () => {
  it('charges Stripe exactly once for a given idempotency key', async () => {
    const user = await prisma.user.create({ data: { id: 'u1', email: 'a@b.c' } });
    const product = await prisma.product.create({ data: { id: 'p1', price: 1500 } });

    const key = 'idem-1';
    const a = await createOrder({ userId: user.id, productId: product.id, idempotencyKey: key });
    const b = await createOrder({ userId: user.id, productId: product.id, idempotencyKey: key });

    expect(a.status).toBe('paid');
    expect(b.status).toBe('duplicate');
    expect(stripeCalls).toHaveLength(1);
    expect(stripeCalls[0].amount).toBe(1500);
  });

  it('does not charge if the product price has changed mid-flight', async () => {
    const user = await prisma.user.create({ data: { id: 'u2', email: 'a@b.c' } });
    await prisma.product.create({ data: { id: 'p2', price: 1500, version: 1 } });

    // Caller pinned price; product moved underneath them.
    await prisma.product.update({ where: { id: 'p2' }, data: { price: 2000, version: 2 } });

    const r = await createOrder({
      userId: user.id,
      productId: 'p2',
      idempotencyKey: 'idem-2',
      expectedPriceCents: 1500,
      expectedVersion: 1,
    });
    expect(r.status).toBe('price-changed');
    expect(stripeCalls).toHaveLength(0);
  });
});

Look at what the second test catches that the mocked version cannot: an optimistic concurrency bug where the product's price changed between the client's "view price" and "click buy" events. That class of bug is invisible when you mock prisma.order.create — and it's exactly the kind of thing that ships to production and surprises you on a Friday.

Startup cost: ~3 seconds (one container boot, one migration), shared across all tests in the file. Per-test cost: ~25ms (truncate + a handful of inserts). For 40 tests the integration run is ~4 seconds total — faster than the mocked version's setup, and the bugs it catches are real.
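The per-test reset can be kept honest with a tiny helper instead of a hand-maintained SQL string. A sketch assuming the three tables from the example schema; note that plain TRUNCATE ... CASCADE leaves identity sequences alone unless you opt into RESTART IDENTITY:

```typescript
// Builds one TRUNCATE statement for the whole fixture set. CASCADE follows
// foreign keys; RESTART IDENTITY additionally resets sequences (opt-in,
// since tests that assert on auto-incremented IDs then depend on run order).
function buildTruncateSql(tables: string[], restartIdentity = false): string {
  if (tables.length === 0) throw new Error('no tables to truncate');
  const quoted = tables.map((t) => `"${t}"`).join(', ');
  const identity = restartIdentity ? ' RESTART IDENTITY' : '';
  return `TRUNCATE TABLE ${quoted}${identity} CASCADE`;
}

// Matches the raw SQL used in the beforeEach above.
console.log(buildTruncateSql(['Order', 'User', 'Product']));
```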

Example 3: The 60-second harness fix that gives you the loop back

Before you delete any tests, profile the harness. Most slow Jest suites can be cut by more than half just by replacing the TypeScript compiler. ts-jest does full type-checking on every test transform; @swc/jest and Vitest's esbuild-based transform don't. Move strict type-checking to a one-shot tsc --noEmit pre-commit step and your feedback loop snaps back to where it was in 2019.

// 1. Replace ts-jest with @swc/jest in jest.config.ts
// (per https://swc.rs/docs/usage/jest)
import type { Config } from 'jest';

const config: Config = {
  testEnvironment: 'node',
  transform: {
    '^.+\\.(t|j)sx?$': [
      '@swc/jest',
      {
        jsc: {
          parser: { syntax: 'typescript', tsx: true, decorators: true },
          transform: { react: { runtime: 'automatic' } },
          target: 'es2022',
        },
      },
    ],
  },
  // Cap workers so factories don't thrash on a 4-core CI box.
  maxWorkers: process.env.CI ? 2 : '50%',
  // Reuse a single Postgres container per worker — see globalSetup below.
  globalSetup: '<rootDir>/test/globalSetup.ts',
  globalTeardown: '<rootDir>/test/globalTeardown.ts',
  // Avoid re-evaluating mocked heavy modules.
  resetModules: false,
};
export default config;

// 2. globalSetup boots one container for the whole run.
// (this saves ~300ms × file_count of Docker boot cost.)
import { PostgreSqlContainer } from '@testcontainers/postgresql';
import { execSync } from 'node:child_process';

export default async function globalSetup() {
  const container = await new PostgreSqlContainer('postgres:16-alpine').start();
  // Persist for teardown.
  (globalThis as any).__pg = container;
  process.env.DATABASE_URL = container.getConnectionUri();
  execSync('npx prisma migrate deploy', { stdio: 'inherit' });
}

// 3. tsc runs once in CI / pre-commit, not in the test runner.
// package.json
// {
//   "scripts": {
//     "test": "jest --no-coverage --silent",
//     "test:cov": "jest --coverage",
//     "typecheck": "tsc --noEmit -p tsconfig.json"
//   },
//   "lint-staged": {
//     "*.{ts,tsx}": ["eslint --fix", "bash -c 'tsc --noEmit -p tsconfig.json'"]
//   }
// }
// Gotcha: lint-staged appends the staged filenames to each command, and tsc
// refuses file arguments combined with -p. The bash -c wrapper swallows the
// filenames, so the whole project is type-checked once per commit.

On the Barcelona founder's repo, applying just these three changes — @swc/jest, a shared globalSetup container, and moving tsc out of the test command — took watch-mode iteration from 14s to ~1.8s. No tests were deleted in this step. The harness was the bottleneck.

The exact numbers will differ for your codebase. The point is to profile before you delete, because Ham Vocke's "Practical Test Pyramid" is right about shape but doesn't say anything about runtime overhead. You can have a perfect pyramid and a 14-second feedback loop. Both must be fixed.

Cost vs. payoff: where each kind of test actually earns its slot

Patterns aren't universally good or bad. They have a cost (latency, brittleness, false confidence) and a payoff (real bugs caught, refactor safety). Here's the heuristic I use for indie / small-team work after auditing roughly fifty repos:

| Test kind | Per-test cost | Catches real bugs? | Survives refactor? | Pre-PMF verdict |
| --- | --- | --- | --- | --- |
| Pure function unit test | <5ms | For pure logic, yes | Yes | Use freely — almost free |
| Module-mock unit test | 50–150ms | Rarely | No (mocks pin internals) | Default to deleting |
| MSW network-boundary test | 20–50ms | Yes (provider contract) | Yes | Use everywhere external HTTP exists |
| TestContainers integration test | ~25ms (shared boot) | Yes (DB races, constraints) | Yes | 3–10 per high-risk module |
| Playwright e2e | 1–5s | Yes (real browser bugs) | Mostly | 1–3 golden paths only |
| Snapshot test (UI) | 10–80ms | Catches accidents, not intent | No | Skip — visual regression is cheaper |
| Internal-mock factory test | 80–200ms | Almost never | No | Delete — these are the gentrifiers |

Kent Beck's 2019 essay Test Desiderata lays out the real properties you want from a test (deterministic, fast, isolated, behavioral, structure-insensitive). Most "clean" unit tests pass three or four of those at best. Boundary tests pass all five. Pick the shape, not the dogma.

Edge cases & gotchas the testing books don't warn you about

  • Postgres truncate is faster than re-migrating, but watch dependency order. TRUNCATE ... CASCADE is the fastest reset, and adding RESTART IDENTITY resets sequences too — which makes tests flaky if you assert on auto-incremented IDs. Use explicit UUIDs in fixtures, or stick with the default (CONTINUE IDENTITY) and leave sequences alone.
  • Jest describe.each + factories = exponential blowup. Each row in the table creates a fresh factory graph. If your factory walks five relations, a 6×6 table boots 180 rows of fixture data. Use static fixtures for parameterized cases.
  • Vitest's --isolate default is slow. Vitest defaults to a fresh module graph per file. For pure tests, you can set isolate: false in vitest.config.ts and recover ~30% on large suites — at the cost of leakage if any test mutates a singleton. Profile first.
  • MSW v2 changed the import paths. If a Stack Overflow snippet imports { rest } from msw, it's the old API — the v2 syntax (http, HttpResponse) is in the v1→v2 migration guide.
  • Jest fake timers + Promise.resolve = silent hang. jest.useFakeTimers() doesn't flush microtasks on its own: with modern fake timers (the default since Jest 27), you need await jest.advanceTimersByTimeAsync(n) so queued promises get a chance to resolve. Otherwise the await never resolves and the test times out at 5,000ms with no useful diagnostic.
  • Module-mock hoisting and TS interop. vi.mock() / jest.mock() are hoisted above imports. Referencing a top-level const inside the factory throws Cannot access X before initialization. Use vi.hoisted() to declare anything the factory needs, or define those values inside the factory itself.
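For the --isolate trade-off above, the knob lives in vitest.config.ts. A sketch of the setting, assuming a suite of pure tests with no mutable module-level singletons:

```typescript
// vitest.config.ts — opting out of per-file isolation for pure suites.
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    // Reuse one worker environment across files instead of booting a fresh
    // module graph per file. A large win on big pure suites, but any test
    // that mutates a module-level singleton can now leak into its neighbors.
    isolate: false,
  },
});
```

Profile a before/after run on your own suite; the gain depends entirely on how heavy your module graph is.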

Troubleshooting: how to diagnose a slow suite in 30 minutes

Six symptoms I see repeatedly when I audit indie test suites. Each one's diagnosis is fast; each fix is a sprint at most.

  • Symptom: Watch-mode iteration over 3 seconds even on tiny diffs. Cause: ts-jest recompiling on every worker, or a heavy jest.setup.ts imported by every file. Fix: Time one file in isolation (time npx jest path/to/one.test.ts) to separate transform cost from assertion cost. Replace ts-jest with @swc/jest first; then hunt for global imports in jest.setup.ts.
  • Symptom: "100% coverage" but production bugs every release. Cause: Coverage is over your mocks. Fix: Generate coverage with the unit suite, then re-run with mocks disabled (boundary tests only). Compare. The delta is your real coverage.
  • Symptom: Tests fail intermittently with connection refused from Postgres. Cause: One container per worker, race on first migration. Fix: Move container boot and migration to globalSetup; share via DATABASE_URL env var; isolate workers via schema, not container.
  • Symptom: A test passes alone but fails in the suite. Cause: Shared state leak — a singleton mutated by a previous test. Fix: Find the singleton (usually a logger, a cache, or a Prisma client with connection pool). Reset in afterEach or pass via DI.
  • Symptom: CI is 11 minutes for 400 tests, local is 12 seconds. Cause: CI runners have fewer cores; Jest defaults to (cores − 1) workers, so a 2-core box runs effectively serially. Fix: Set --maxWorkers=2 explicitly, and shard by file via --shard=<index>/<total> across matrix jobs.
  • Symptom: Adding one feature breaks 20 unrelated tests. Cause: Tests are coupled to internal structure, not behavior. Fix: Replace internal-module mocks with MSW (HTTP boundary) and a real DB. Most of the 20 will disappear when you delete the mocks.
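The sharding fix in the last bullet is easy to reason about with a toy version. Jest's real TestSequencer distributes files differently, so this round-robin sketch shows only the shape of the idea, not its implementation:

```typescript
// Toy shard assignment: deterministically split a sorted test-file list
// across CI matrix jobs, mirroring the shape of --shard=<index>/<total>.
function filesForShard(files: string[], shardIndex: number, shardTotal: number): string[] {
  if (shardIndex < 1 || shardIndex > shardTotal) {
    throw new Error(`shard ${shardIndex} out of range 1..${shardTotal}`);
  }
  // Sort first so every matrix job agrees on the split without coordination.
  return [...files].sort().filter((_, i) => i % shardTotal === shardIndex - 1);
}

// Shard 1 of 2 gets a.test.ts and c.test.ts; shard 2 gets the rest.
console.log(filesForShard(['b.test.ts', 'a.test.ts', 'c.test.ts', 'd.test.ts'], 1, 2));
```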

The vibe-coder's manifesto on unit tests

Unit tests are not over. The neighborhood is being gentrified by sophisticated harnesses that look professional and feel safe but actively prevent you from running tests on save. That's the whole point of unit tests; that's what we're losing.

In our experience, the fastest indie codebases in 2025 share a small number of habits: boundary-level integration tests over module mocks, one shared Postgres container per run, an esbuild- or swc-based transformer, and tests written against observable behavior rather than internal calls. None of those are exotic. They're what the test pyramid was already shaped like before the harness layer started accreting.

Kent Beck's line from the original TDD book still applies — write a failing test, make it pass, refactor. The implicit precondition is that the cycle stays under a few seconds. If your harness doesn't allow that, the harness is the bug. Fix it before you delete a single test.

Your customers don't care about your coverage badge. They care that your code does what you said it does in production. Optimize for that, and your tests will earn their slot in the pipeline — instead of squatting in it.

Ready to ship your next project faster?

Desplega.ai helps indie hackers and solopreneurs ship and test faster with AI-powered QA that grades behavior, not architecture.


Frequently Asked Questions

How do I know if my test suite is too slow?

If you stop running tests on save, you have crossed the threshold. In our experience, the inflection point is around 3 seconds — once feedback exceeds it, developers context-switch and TDD quietly dies.

Should I remove TypeScript from my test files to get speed?

No. Swap ts-jest for @swc/jest or vitest with esbuild. Move strict type-checking to a separate tsc --noEmit step in CI, keep the watch loop fast on the JS path. You get both.

What's the right number of unit tests per feature?

Aim for behavior coverage, not line coverage. One integration test through the public API often finds more real bugs than thirty shallow unit tests pinning internal function names in place.

Can I keep my fancy test utilities and still ship fast?

Only if each utility runs under ~50ms. Anything that walks a factory graph, hits a real container, or does Prisma generation per test will multiply runtime linearly. Reuse and isolate.

How do I convince my team to simplify our tests?

Measure save-to-result latency for a representative test on the average dev laptop and show the number. Most teams are shocked when they discover the median is over 8 seconds.
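A one-file way to get that number: shell out to your test command and time the wall clock. A minimal sketch; the command shown is a placeholder, so swap in your real invocation (e.g. a single representative test file):

```typescript
import { execSync } from 'node:child_process';
import { performance } from 'node:perf_hooks';

// Times a shell command's wall-clock duration in milliseconds.
function timeCommand(cmd: string): number {
  const start = performance.now();
  execSync(cmd, { stdio: 'ignore' });
  return performance.now() - start;
}

// Placeholder command so the sketch runs anywhere; replace with something
// like 'npx jest src/orders/__tests__/createOrder.test.ts --silent'.
const ms = timeCommand('node -e "0"');
console.log(`save-to-result latency: ${ms.toFixed(0)}ms`);
```

Run it on the average dev laptop, not your own machine, and put the number in the meeting notes.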