Does Windsurf Cascade retry a failed MCP server?

Cascade retries the process spawn a couple of times, but it does not retry a malformed JSON-RPC response. A bad frame is dropped silently and the turn continues without that context.

Why does my MCP server work in the terminal but not in Windsurf?

Windsurf spawns servers with a minimal environment and its own working directory. Missing PATH entries, unset API keys, or relative paths that worked in your shell break under Cascade.

Can console.log really break an MCP server?

Yes. Stdio transport uses stdout for JSON-RPC frames. Any console.log writes to that same stream, corrupting the next frame so Windsurf cannot parse it and disables the server.

How do I know if Cascade actually used my MCP context?

Ask Cascade to name the tool it called, or watch the MCP panel for a call count. If the count stays at zero after a relevant prompt, the context never reached the model.

Is SSE transport more reliable than stdio for MCP?

SSE and streamable HTTP avoid stdout pollution and give real HTTP status codes, which makes failures louder. Stdio is simpler but hides errors unless you proxy the stream.

Debugging Windsurf Cascade: Diagnosing MCP Server Silent Failures

You asked Cascade to refactor a function using your team's internal API conventions. It produced something confident, plausible, and completely wrong — it ignored the convention entirely. Your first instinct is to blame the model. "Windsurf got dumber today." It didn't. What actually happened is that the MCP server you wired up to feed Cascade that context failed, returned garbage, or timed out — and Cascade shrugged and answered anyway.

This is the single most frustrating class of bug when you're vibecoding with an AI-native IDE: the failure is invisible. The UI shows a green-ish status dot. No error toast. No red banner. The model just quietly operates with less context than you think it has. For a solo builder shipping fast, that's a silent tax on every prompt. This post is a field guide to finding those failures and killing them.

Why does Windsurf Cascade fail silently?

Cascade treats MCP tool output as optional context. If a server crashes or returns malformed JSON-RPC, Cascade skips it and answers the prompt anyway — no error surfaced.

The design choice is reasonable: an agent shouldn't hard-crash a coding session because one of six configured tools is unavailable. But the consequence is that partial context degradation looks identical to a normal response. The Model Context Protocol (MCP), the open standard Windsurf uses to connect to external tools, defines a clear error channel — but Cascade's job is to keep going, so it absorbs those errors instead of escalating them to you.

To debug this, you need to understand the three layers where a failure can hide: the process layer (the server didn't start), the transport layer (the JSON-RPC stream got corrupted), and the application layer (the server started and talks fine but returns an error for your specific call). Each layer has a different fingerprint.

How MCP context retrieval actually works

Under the hood, an MCP server is a separate process. Windsurf launches it (for the common stdio transport) and speaks JSON-RPC 2.0 over the process's standard input and output. The handshake is strict and ordered:

Windsurf spawns the process and sends an initialize request with its protocol version and capabilities.
The server replies with its own capabilities, then Windsurf sends an initialized notification.
Windsurf calls tools/list to discover what the server offers. If this fails or returns nothing, the server shows up empty.
During a Cascade turn, the model decides to call a tool. Windsurf sends tools/call and waits for the result.

The critical detail: every one of those messages is a newline-delimited JSON object. Requests arrive on the server's stdin, and every response leaves on its stdout. The transport assumes stdout is a clean, exclusive channel for protocol frames. The moment anything else writes to stdout (a stray print(), a dependency's startup banner, a progress bar), the stream is corrupted and Windsurf can no longer parse it. In our experience this is the single most common cause of silent MCP failures, and it's worth seeing in code.
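To make the framing concrete, here is a minimal sketch of how a line-oriented reader might split a chunk of server stdout into frames. The helper name classifyFrames is an assumption for illustration, not Windsurf's actual parser. It only shows why a single stray log line turns into an unparseable frame.

```javascript
// Hypothetical helper: split a chunk of server stdout into lines and
// classify each one the way a newline-delimited JSON-RPC reader would.
function classifyFrames(chunk) {
  return chunk
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      try {
        const msg = JSON.parse(line);
        // A JSON-RPC 2.0 message is an object carrying jsonrpc: "2.0".
        const ok = msg !== null && typeof msg === 'object' && msg.jsonrpc === '2.0';
        return { ok, line };
      } catch {
        return { ok: false, line }; // not JSON at all: a polluted stream
      }
    });
}

const clean = '{"jsonrpc":"2.0","id":1,"result":{"tools":[]}}\n';
const polluted = '[docs-server] handling call: search\n' + clean;

console.error(classifyFrames(clean).map((f) => f.ok));    // [ true ]
console.error(classifyFrames(polluted).map((f) => f.ok)); // [ false, true ]
```

The log line and the valid response sit on separate lines, yet the reader still hits a frame it cannot parse. What a client does next (skip the line, or disable the server) is up to the client.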

Failure #1: stdout pollution kills the JSON-RPC stream

Here is a realistic MCP server that retrieves documentation snippets for Cascade. It looks fine. It runs fine when you test it by hand. It is also completely broken inside Windsurf — and the bug is on line 14.

// docs-server.js  —  BROKEN: writes logs to stdout
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { CallToolRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import { searchDocs } from './lib/docs.js';

const server = new Server(
  { name: 'docs-server', version: '1.0.0' },
  { capabilities: { tools: {} } },
);

server.setRequestHandler(CallToolRequestSchema, async (req) => {
  // BUG: console.log writes to STDOUT — the same stream as JSON-RPC.
  console.log('[docs-server] handling call:', req.params.name);
  const hits = await searchDocs(req.params.arguments.query);
  return { content: [{ type: 'text', text: JSON.stringify(hits) }] };
});

const transport = new StdioServerTransport();
await server.connect(transport);

The handshake might even succeed. The break comes on your first tool call, when console.log writes [docs-server] handling call: ... to stdout just ahead of the response frame. Windsurf reads that line, tries JSON.parse, fails, and disables the server. From your side, the tool simply stopped existing. No toast. The fix is to send every diagnostic to stderr, which the stdio transport does not treat as protocol data, and to wrap the handler so a thrown error becomes a proper MCP error result rather than an unhandled rejection.

// docs-server.js  —  FIXED: stderr-only logging + structured errors
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
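import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import { searchDocs } from './lib/docs.js';

// console.error writes to STDERR, which is safe. Never console.log in a stdio server.
const log = (...args) => console.error('[docs-server]', ...args);

const server = new Server(
  { name: 'docs-server', version: '1.0.0' },
  { capabilities: { tools: {} } },
);

// Without a tools/list handler the server advertises nothing and Cascade sees zero tools.
// The tool name "search_docs" is illustrative; use whatever name your server exposes.
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [{
    name: 'search_docs',
    description: 'Search internal documentation. Input: a plain-text "query" string.',
    inputSchema: {
      type: 'object',
      properties: { query: { type: 'string', description: 'Search terms' } },
      required: ['query'],
    },
  }],
}));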

server.setRequestHandler(CallToolRequestSchema, async (req) => {
  const { name, arguments: args } = req.params;
  log('handling call:', name);

  // Edge case: the model can call a tool with missing/empty args.
  const query = (args && typeof args.query === 'string') ? args.query.trim() : '';
  if (!query) {
    return {
      isError: true,
      content: [{ type: 'text', text: 'Error: "query" argument is required and must be a non-empty string.' }],
    };
  }

  try {
    const hits = await searchDocs(query);
    // Edge case: a valid query with zero results is NOT an error —
    // tell the model explicitly so it does not hallucinate a source.
    if (hits.length === 0) {
      return { content: [{ type: 'text', text: 'No documentation found for: ' + query }] };
    }
    return { content: [{ type: 'text', text: JSON.stringify(hits, null, 2) }] };
  } catch (err) {
    // Return the failure THROUGH the protocol so Cascade can see it,
    // instead of throwing and letting the whole turn degrade silently.
    log('searchDocs failed:', err);
    return {
      isError: true,
      content: [{ type: 'text', text: 'Doc search failed: ' + (err instanceof Error ? err.message : String(err)) }],
    };
  }
});

const transport = new StdioServerTransport();
await server.connect(transport).catch((err) => {
  log('fatal: transport connect failed:', err);
  process.exit(1);
});

The difference is more than logging hygiene. The fixed version uses isError: true — an MCP-level signal that the tool ran but failed. Cascade can see that and react (retry, ask you, pick a different tool). An unhandled throw, by contrast, often just collapses into "no result" and the model proceeds blind. If you want a deeper tour of how servers expose tools and resources, see our MCP server architecture deep dive.
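If a server exposes several tools, the same pattern can be factored out once. The sketch below is one possible shape, assuming a hypothetical helper named withToolErrors. It is not part of the MCP SDK, just a plain higher-order function around whatever handler you already have.

```javascript
// Hypothetical helper: wrap any tool handler so a thrown error is returned
// as an MCP tool result with isError: true instead of escaping the handler.
function withToolErrors(handler, log = console.error) {
  return async (req) => {
    try {
      return await handler(req);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      log('[tool-error]', message); // console.error by default, so stderr and never stdout
      return { isError: true, content: [{ type: 'text', text: 'Tool failed: ' + message }] };
    }
  };
}

// Usage: the handler throws, yet the caller still receives a structured result.
const flaky = withToolErrors(async () => { throw new Error('index unavailable'); });
flaky({ params: {} }).then((result) => console.error(result.isError, result.content[0].text));
```

The trade-off is that every failure now looks the same to the model, so keep the message specific enough for Cascade to decide whether a retry makes sense.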

How do I see MCP server logs in Windsurf?

Open the Cascade MCP panel, expand the server, and check its status dot. For real detail, tee stdio through a proxy — Windsurf hides raw stderr from you.

Windsurf surfaces a coarse status (enabled, error, number of tools) but not the JSON-RPC traffic itself. When a server is misbehaving, you want to see exactly what bytes crossed the wire. The cleanest way to do that without touching the server's source is to slip a transparent proxy between Windsurf and the real process. The proxy forwards stdin and stdout untouched while copying both streams to a log file.

#!/usr/bin/env node
// mcp-proxy.js  —  transparent stdio interceptor for debugging MCP servers
// Usage in mcp_config.json:
//   "command": "node",
//   "args": ["/abs/path/mcp-proxy.js", "node", "/abs/path/docs-server.js"]
import { spawn } from 'node:child_process';
import { createWriteStream } from 'node:fs';
import { join } from 'node:path';
import { tmpdir } from 'node:os';

const [cmd, ...cmdArgs] = process.argv.slice(2);
if (!cmd) {
  process.stderr.write('mcp-proxy: no target command given\n');
  process.exit(2);
}

const logPath = join(tmpdir(), 'mcp-proxy-' + process.pid + '.log');
const logFile = createWriteStream(logPath, { flags: 'a' });
const stamp = () => new Date().toISOString();
const note = (line) => logFile.write('[' + stamp() + '] ' + line + '\n');
note('SPAWN ' + cmd + ' ' + cmdArgs.join(' '));

// Inherit the parent env so the wrapped server sees the same vars Windsurf set.
const child = spawn(cmd, cmdArgs, { env: process.env });

// Windsurf -> server (requests). Tee a copy into the log.
process.stdin.on('data', (chunk) => {
  note('--> ' + chunk.toString('utf8').trim());
  child.stdin.write(chunk);
});

// server -> Windsurf (responses). Forward verbatim, log a copy.
child.stdout.on('data', (chunk) => {
  note('<-- ' + chunk.toString('utf8').trim());
  process.stdout.write(chunk);
});

// The smoking gun: anything the server printed to stderr.
child.stderr.on('data', (chunk) => {
  note('ERR ' + chunk.toString('utf8').trim());
});

// Edge case: server dies mid-session. Record the exit code so you can tell
// a crash (non-zero) from a clean shutdown, then propagate it.
child.on('exit', (code, signal) => {
  note('EXIT code=' + code + ' signal=' + signal);
  logFile.end(() => process.exit(code ?? 1));
});

child.on('error', (err) => {
  note('SPAWN-ERROR ' + err.message); // e.g. ENOENT: command not on PATH
  logFile.end(() => process.exit(127));
});

// Edge case: stdin closes (Windsurf disabled the server) — shut the child down.
process.stdin.on('end', () => child.kill('SIGTERM'));

Point Windsurf at this wrapper instead of the server directly, restart Cascade, run a prompt, then read the log file from your temp directory. You will see one of three things: a clean request/response pair (the server is fine — look elsewhere), a request with no matching response (the server hung or crashed — check the ERR and EXIT lines), or a response line with non-JSON text spliced in (stdout pollution, exactly as above). This single tool resolves the large majority of MCP mysteries.
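Reading a long log by eye gets tedious, so a small analyzer can sort a session into the three outcomes described above. This is a rough sketch that assumes the exact line format mcp-proxy.js writes ([timestamp] TAG payload). The function name analyzeProxyLog and the report shape are hypothetical, and a frame split across two chunks will be misread as pollution.

```javascript
// Hypothetical analyzer for the log format written by mcp-proxy.js.
// It pairs request ids with response ids and flags non-JSON on the server's stdout.
function analyzeProxyLog(text) {
  const report = { unanswered: [], pollution: [], stderr: [], exit: null };
  const open = new Map(); // request id -> method
  let direction = null;   // tag of the most recent prefixed line

  for (const raw of text.split('\n')) {
    const line = raw.trim();
    if (!line) continue;
    // Prefixed lines look like: [ISO timestamp] TAG payload
    const m = line.match(/^\[[^\]]+\] (-->|<--|ERR|EXIT|SPAWN-ERROR|SPAWN) ?(.*)$/);
    // One chunk can hold several frames, so unprefixed lines continue the last tag.
    const payload = m ? m[2] : line;
    if (m) direction = m[1];
    if (!payload) continue;

    if (direction === 'ERR') report.stderr.push(payload);
    if (direction === 'EXIT') report.exit = payload;
    if (direction !== '-->' && direction !== '<--') continue;

    let msg;
    try { msg = JSON.parse(payload); } catch { msg = null; }
    if (direction === '-->' && msg && msg.id !== undefined) open.set(msg.id, msg.method);
    if (direction === '<--') {
      if (msg === null) report.pollution.push(payload); // non-JSON on stdout
      else if (msg.id !== undefined) open.delete(msg.id); // request answered
    }
  }
  for (const [id, method] of open) report.unanswered.push({ id, method });
  return report;
}
```

An empty report means the transport is healthy and the problem is elsewhere. Entries under unanswered point at a hang or crash, and entries under pollution point at a stray write to stdout.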

Failure #2: the handshake never completes

If tools/list comes back empty or the server shows zero tools, the problem is usually upstream of any tool logic — the initialize handshake itself failed. To isolate this, talk to the server the way Windsurf does, by hand. This health-check script spawns the server, performs the full handshake, and reports precisely where it broke.

#!/usr/bin/env node
// mcp-healthcheck.mjs  —  manual MCP handshake validator
// Run: node mcp-healthcheck.mjs node ./docs-server.js
import { spawn } from 'node:child_process';

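const [cmd, ...args] = process.argv.slice(2);
if (!cmd) {
  console.error('usage: node mcp-healthcheck.mjs <command> [args...]');
  process.exit(2);
}
const HANDSHAKE_TIMEOUT_MS = 8000;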

const child = spawn(cmd, args, { env: process.env, stdio: ['pipe', 'pipe', 'pipe'] });

let buffer = '';
const pending = new Map(); // id -> { resolve, reject }

child.stdout.on('data', (chunk) => {
  buffer += chunk.toString('utf8');
  let nl;
  while ((nl = buffer.indexOf('\n')) !== -1) {
    const line = buffer.slice(0, nl).trim();
    buffer = buffer.slice(nl + 1);
    if (!line) continue;
    let msg;
    try {
      msg = JSON.parse(line);
    } catch {
      // This is the failure mode: non-JSON on stdout = polluted stream.
      console.error('NON-JSON ON STDOUT (stdout pollution!):', line);
      continue;
    }
    const waiter = pending.get(msg.id);
    if (waiter) {
      pending.delete(msg.id);
      msg.error ? waiter.reject(new Error(JSON.stringify(msg.error))) : waiter.resolve(msg.result);
    }
  }
});

// Surface server stderr — usually where the real crash reason lives.
child.stderr.on('data', (c) => process.stderr.write('[server stderr] ' + c));
child.on('error', (e) => { console.error('SPAWN FAILED:', e.message); process.exit(127); });

let nextId = 1;
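function rpc(method, params) {
  const id = nextId++;
  const payload = JSON.stringify({ jsonrpc: '2.0', id, method, params }) + '\n';
  child.stdin.write(payload);
  return new Promise((resolve, reject) => {
    // Edge case: server accepts the request but never answers. Fail loud.
    const timer = setTimeout(() => {
      if (pending.has(id)) {
        pending.delete(id);
        reject(new Error('Timeout: no response to "' + method + '" in ' + HANDSHAKE_TIMEOUT_MS + 'ms'));
      }
    }, HANDSHAKE_TIMEOUT_MS);
    // Clear the timer once the request settles; a live timer would keep
    // the process running for the full timeout after a successful check.
    pending.set(id, {
      resolve: (value) => { clearTimeout(timer); resolve(value); },
      reject: (err) => { clearTimeout(timer); reject(err); },
    });
  });
}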

try {
  await rpc('initialize', {
    protocolVersion: '2025-06-18',
    capabilities: {},
    clientInfo: { name: 'mcp-healthcheck', version: '1.0.0' },
  });
  console.log('OK  initialize');

  // The 'initialized' notification has no id and expects no reply.
  child.stdin.write(JSON.stringify({ jsonrpc: '2.0', method: 'notifications/initialized' }) + '\n');

  const tools = await rpc('tools/list', {});
  const names = (tools.tools ?? []).map((t) => t.name);
  if (names.length === 0) {
    console.error('WARN  tools/list returned ZERO tools — server starts but exposes nothing.');
    process.exitCode = 1;
  } else {
    console.log('OK  tools/list ->', names.join(', '));
  }
} catch (err) {
  console.error('FAIL ', err.message);
  process.exitCode = 1;
} finally {
  child.kill('SIGTERM');
}

Run it and the diagnosis is immediate. SPAWN FAILED: ENOENT means the command isn't on the PATH Windsurf gave the process. NON-JSON ON STDOUT is the pollution bug. A Timeout means the server accepted the request but never replied — typically a synchronous dependency (a database driver, an auth SDK) that blocks the event loop during startup. tools/list returned ZERO tools means the process is healthy but your registration code never ran. Each maps to a different fix, and you found it in under ten seconds instead of guessing.

Failure #3: the environment Windsurf gives you is not your shell

The most maddening variant of "works on my machine" is the server that runs perfectly in your terminal and dies under Cascade. Windsurf does not launch MCP servers inside your interactive shell, so none of your .zshrc or .bashrc setup applies. No nvm shims, no exported API keys, no PATH additions. You must declare everything explicitly in mcp_config.json.

{
  "mcpServers": {
    "docs-server": {
      "command": "/Users/you/.nvm/versions/node/v20.11.1/bin/node",
      "args": ["/Users/you/projects/tools/docs-server.js"],
      "env": {
        "DOCS_API_KEY": "sk-live-...",
        "DOCS_INDEX_PATH": "/Users/you/projects/tools/.index",
        "NODE_ENV": "production"
      }
    }
  }
}

Three rules save hours here. Use an absolute path for command — a bare node resolves against Windsurf's PATH, not yours, and nvm users almost always have a different default version there. Use absolute paths in args and env — the working directory is not your project root, so a relative ./.index points nowhere. And pass every secret explicitly in the env block; an unset API key usually surfaces as a confusing 401 deep inside the server rather than an obvious startup error.
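One way to turn those rules into an early, obvious failure is a preflight check that runs before the transport connects. The sketch below is an assumption about how you might structure it. The function checkEnv is hypothetical, the variable names come from the config example, and the absolute-path test is a deliberately simple regex, not a full path parser.

```javascript
// Hypothetical preflight for a stdio MCP server: verify the environment
// before connecting the transport, and report problems on stderr only.
function checkEnv(env, spec) {
  const problems = [];
  for (const name of spec.required) {
    if (!env[name]) problems.push('missing env var: ' + name);
  }
  for (const name of spec.absolutePaths) {
    const value = env[name];
    // Simplified absolute-path test (POSIX "/" or a Windows drive letter).
    if (value && !/^(\/|[A-Za-z]:[\\/])/.test(value)) {
      problems.push(name + ' should be an absolute path, got: ' + value);
    }
  }
  return problems;
}

// Variable names reuse the mcp_config.json example.
const problems = checkEnv(process.env, {
  required: ['DOCS_API_KEY', 'DOCS_INDEX_PATH'],
  absolutePaths: ['DOCS_INDEX_PATH'],
});
for (const p of problems) console.error('[docs-server] preflight:', p);
// In a real server you would likely call process.exit(1) here when problems.length > 0.
```

A non-zero exit at startup shows up as an EXIT line in the proxy log, which is far easier to spot than a 401 buried in a later tool call.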

Transport comparison: where each one hides its failures

MCP supports more than one transport, and your choice changes how loud a failure is. Stdio is the default and the simplest, but it is also the easiest to corrupt. Here is how the three common options compare for debuggability.

Transport	Error channel	Common silent failure	Best for
stdio	stderr (hidden by Windsurf)	stdout pollution corrupts frames	Local single-user tools
SSE (legacy)	HTTP status + event stream	Dropped connection, no auto-resume	Remote servers, older clients
Streamable HTTP	HTTP status codes	CORS or 401 swallowed as "no tools"	Shared / hosted servers

The takeaway for solo builders: prototype with stdio because it's zero-config, but the moment a server becomes load-bearing for your daily flow, consider streamable HTTP. A real HTTP status code is a failure you cannot miss; a corrupted stdout line is one you will.
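If you do move to streamable HTTP, the status code only helps if something surfaces it. Here is a minimal probe, assuming Node 18+ for the global fetch. The endpoint URL is a placeholder and the explainStatus wording is our own interpretation, not anything the MCP specification defines.

```javascript
// Hypothetical mapping from an HTTP status to a debugging hint.
function explainStatus(status) {
  if (status >= 200 && status < 300) return 'reachable';
  if (status === 401 || status === 403) return 'auth rejected: check the token or headers the client sends';
  if (status === 404) return 'wrong endpoint path';
  if (status >= 500) return 'server-side failure: read the server logs';
  return 'unexpected status ' + status;
}

// Hypothetical probe: POST an initialize request and report the raw status.
async function probe(url) {
  const res = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // Streamable HTTP clients accept either a JSON body or an SSE stream.
      Accept: 'application/json, text/event-stream',
    },
    body: JSON.stringify({
      jsonrpc: '2.0',
      id: 1,
      method: 'initialize',
      params: {
        protocolVersion: '2025-06-18',
        capabilities: {},
        clientInfo: { name: 'mcp-probe', version: '1.0.0' },
      },
    }),
  });
  return { status: res.status, verdict: explainStatus(res.status) };
}

// Example (placeholder URL, not executed here):
// probe('http://localhost:3000/mcp').then((r) => console.error(r));
```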

Troubleshooting checklist: matching symptoms to causes

When Cascade ignores context it should have, work down this table. It is ordered by how often each cause shows up — in our experience stdout pollution and environment mismatches account for the bulk of MCP support questions we field from indie builders.

Symptom	Likely cause	How to confirm
Server shows 0 tools	Handshake failed or registration skipped	Run mcp-healthcheck.mjs
Tool exists, never gets called	Vague tool description; model ignores it	MCP panel call count stays at 0 after a relevant prompt
Tool worked, then stopped mid-session	A log line hit stdout on a code path	Look for non-JSON in mcp-proxy log
Works in terminal, not in Windsurf	Missing env var or wrong node version	Log process.env keys and process.version to stderr, then diff against your shell
Long pause, then degraded answer	Tool timed out; Cascade moved on	Request logged, no response logged

Edge cases and gotchas worth knowing

The empty-result trap. A tool that returns nothing for a valid query is not an error — but if you return an empty content array, the model may hallucinate a source. Always return an explicit "no results found" text, as the fixed docs-server.js does.
Restart, don't reload. Editing mcp_config.json does not always hot-reload. After config changes, fully refresh the MCP server list or restart Windsurf, or you will debug a stale process.
Large payloads. Returning a 200KB blob of context burns the model's context window and slows every turn. Cap tool output and paginate; a tool that returns too much is its own kind of silent failure.
Buffering. Some runtimes buffer stdout when it is not a TTY. If responses arrive in bursts or get truncated, flush explicitly or disable buffering — under Windsurf, stdout is a pipe, not a terminal.
Crash loops. If a server exits non-zero immediately, Windsurf may respawn it a few times and then give up. The MCP panel settles on "error" with no history — your EXIT lines in the proxy log are the only record.
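The empty-result and large-payload gotchas can be handled by one small helper. The sketch below assumes a hypothetical function pageHits and an arbitrary budget of 8000 characters. Tune the number to your own tools, and note that the cursor is a plain array index, not a protocol feature.

```javascript
// Hypothetical helper: return as many hits as fit in a character budget,
// plus a cursor the model can pass back to fetch the next page.
function pageHits(hits, cursor = 0, maxChars = 8000) {
  const page = [];
  let used = 0;
  let i = cursor;
  for (; i < hits.length; i++) {
    const size = JSON.stringify(hits[i]).length;
    // Always emit at least one hit so an oversized entry cannot stall paging.
    if (page.length > 0 && used + size > maxChars) break;
    page.push(hits[i]);
    used += size;
  }
  const nextCursor = i < hits.length ? i : null;
  if (page.length === 0) {
    // Explicit text instead of an empty content array.
    return { text: 'No results found.', nextCursor: null };
  }
  return { text: JSON.stringify({ hits: page, nextCursor }, null, 2), nextCursor };
}

// Usage inside a tool handler (cursor would come from the tool arguments):
// const { text } = pageHits(hits, args.cursor ?? 0);
// return { content: [{ type: 'text', text }] };
```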

One more habit that pays off: build observability into the server from day one. A startup line to stderr (server ready, 4 tools registered) and a per-call stderr log cost nothing and turn a future silent failure into a two-minute fix. If you are wiring MCP into a broader toolchain, our guide to configuring Windsurf MCP servers walks through a hardened setup end to end.
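As a starting point, that observability can be a single helper. This is a sketch under assumptions: makeLogger and its timed method are hypothetical names, and the sink is injectable only so the logger can be exercised without touching the real stderr.

```javascript
// Hypothetical stderr logger for a stdio MCP server. The sink defaults to
// process.stderr and never touches stdout.
function makeLogger(name, sink = process.stderr) {
  const write = (level, msg) =>
    sink.write('[' + new Date().toISOString() + '] [' + name + '] ' + level + ' ' + msg + '\n');
  return {
    info: (msg) => write('INFO', msg),
    // Wrap a tool call and log its duration and outcome.
    timed: async (label, fn) => {
      const start = Date.now();
      try {
        const result = await fn();
        write('INFO', label + ' ok in ' + (Date.now() - start) + 'ms');
        return result;
      } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        write('ERROR', label + ' failed in ' + (Date.now() - start) + 'ms: ' + reason);
        throw err; // let the caller decide how to report the failure
      }
    },
  };
}

const logger = makeLogger('docs-server');
logger.info('server ready, 4 tools registered');
```

Calling logger.timed('search_docs', () => searchDocs(query)) inside a handler gives you a duration for every call, which makes the "long pause, then degraded answer" symptom easy to confirm.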

Ship with confidence, not vibes alone

Vibecoding is a superpower right up until your tools fail quietly and you spend an afternoon blaming the model. The fix is not to trust less — it's to make failures loud. Log to stderr, never stdout. Keep a proxy and a health-check script in your tooling folder. Treat a green status dot as a claim to verify, not a fact. Do that, and Cascade goes back to feeling like magic — because you can actually see when it isn't.

Your AI pair-programmer isn't getting dumber — your MCP server is failing without saying a word.