In which I finally understand Model Context Protocol…

I spent a day interrogating what an MCP server actually is. Here’s what I got wrong.

I came into this with a lot of assumptions. Some of them were reasonable intuitions from a decade of working with APIs, developer platforms, and integration ecosystems. Some of them were just wrong. All of them were worth pressure-testing before making a build-vs-skip decision.

This is the write-up of that interrogation: what I thought MCP was, what it actually is, where the real value is, and where a lot of the discourse is noise. If you’re a technical PM or engineering lead at a B2B SaaS company trying to decide whether to build an MCP server, this is written for you.

What I thought MCP was (and why I was confused)

My mental model going in: MCP is a documentation layer with a chatbot wrapper. You write some structured descriptions of your API capabilities, package them up in a server, and LLMs can ask it questions and get pointed in the right direction. Like LLMs.txt, but interactive. Like a developer portal, but with a natural language interface.

Wrong.

I also thought MCP might be some kind of redirect mechanism – the MCP server tells the agent “here’s where to find the API, here’s how to call it, go do it yourself.”

Also wrong.

The confusion is understandable because MCP is discussed in at least five different ways simultaneously, often by different advocates in the same conversation:

A protocol
A plugin architecture
A tool registry
An orchestration layer
A workflow guidance system

These aren’t the same thing, and conflating them makes the value proposition incoherent. It also makes dudes like me chase our tails trying to pin down one question: WHAT’S THE PROBLEM THIS UNIQUELY SOLVES?

What an MCP server actually is

After reasoning this out all day, I’ve concluded that an MCP server is best described as a hosted function library. It exposes a fixed set of named, typed functions (called tools) over a standardized JSON-RPC protocol. When an agent (I’ve started calling them “Agentic clients”) calls a tool, the MCP server executes the underlying API call and returns a structured result. The agent never constructs an HTTP request itself.

This makes MCP closer to a credentialed API proxy, or a hosted SDK, than to documentation or a chatbot. There is no LLM inside the MCP server. The intelligence – the natural language understanding, the planning, the “which tool do I need for this” reasoning – lives in whichever agent runtime is calling the MCP server. Claude, GPT, Gemini, whatever. The MCP server is dumb code, often Python, executing a fixed set of functions. It’s a hosted library for 2026.

The canonical interaction looks like this:

Agent runtime connects to MCP server and calls list_tools()
MCP server returns a typed manifest: tool names, natural language descriptions, and full JSON Schema for every input parameter
Agentic client’s LLM reads the manifest, decides which tool fits the current task, and constructs a fresh structured call
MCP server executes the underlying REST API call
Result comes back to the agent
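
To make those five steps concrete, here is a minimal in-process sketch of the exchange. It is not a real MCP server: there is no transport, no initialization handshake, and the `get_invoice` tool and its handler are made up for illustration. On the wire the two methods are named `tools/list` and `tools/call`; `list_tools()` is how client SDKs tend to surface the first one.

```python
import json

# Hypothetical registry with one tool, described the way a manifest describes it.
TOOLS = {
    "get_invoice": {
        "description": "Fetch a single invoice by ID. Read-only.",
        "inputSchema": {
            "type": "object",
            "properties": {"invoice_id": {"type": "string"}},
            "required": ["invoice_id"],
        },
        # Stand-in for the real REST call the server would make.
        "handler": lambda args: {"id": args["invoice_id"], "status": "paid"},
    }
}


def handle(request: dict) -> dict:
    """Dispatch one JSON-RPC 2.0 request the way an MCP server would."""
    method, params = request["method"], request.get("params", {})
    if method == "tools/list":
        # Steps 1-2: return names, descriptions, and input schemas.
        result = {
            "tools": [
                {
                    "name": name,
                    "description": tool["description"],
                    "inputSchema": tool["inputSchema"],
                }
                for name, tool in TOOLS.items()
            ]
        }
    elif method == "tools/call":
        # Steps 3-5: the agent sent a structured call; the server executes it.
        tool = TOOLS[params["name"]]
        payload = tool["handler"](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": json.dumps(payload)}]}
    else:
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {"code": -32601, "message": "Method not found"},
        }
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


manifest = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
call = handle({
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "get_invoice", "arguments": {"invoice_id": "inv_42"}},
})
print(manifest["result"]["tools"][0]["name"])
print(call["result"]["content"][0]["text"])
```

Notice that nothing in `handle` is smart. The only decision-making happens on the other side, when the agent’s LLM reads the manifest and picks `get_invoice`.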

The agent framework treats these tool definitions like typed function signatures. No document retrieval, no code generation, no format translation. That’s the concrete difference from LLMs.txt plus a well-maintained OpenAPI spec – not the semantic content, but the calling convention.

The correct analogy: MCP is what you get when you ask “what if a Python SDK had a standardized discovery protocol that any LLM agent runtime already knew how to speak?” That’s it. That’s the thing. Man, did I take all day thinking it was everything but this.

The two MCP patterns you’re probably conflating

Here’s the structural problem with most MCP conversations: two completely different deployment patterns share the name, and they have completely different cost profiles, security implications, and appropriate use cases.

Pattern one: the builder assistant

This one gets used solely during integration development. A developer, or an agent helping a developer, explores available capabilities, runs sample calls, and figures out how to automate a workflow. You should impose heavy rate limits and offer no SLA, and you’re under no obligation to cover every endpoint. It’s analogous to a sandbox environment or an AI-assisted Bruno (sorry to the remaining Postman advocates) collection.

This pattern is defensible. It reduces the time it takes to understand a complex API surface, and it compresses the exploration phase of building an integration. The LLM can probe available tools, test call patterns, and help a developer construct the workflow they’ll eventually hardcode into their application.

The value is front-loaded and short-lived. Once the workflow is understood and coded, you don’t need the MCP server anymore for that workflow. You’ve bootstrapped your way to a deterministic integration.

Pattern two: the runtime gateway

Wildly different. Sits in a production path. Executes API calls on behalf of agents at scale. Adds a network hop – Agentic client to MCP server to your API – plus JSON-RPC serialization overhead on every single call. Requires its own SLA, monitoring, scaling, and credential management. Essentially a microservice or a superset of an API gateway. Thought you were good with one? Not any longer.

This pattern is much harder to justify. There’s serious latency; there’s operational maintenance and support. And if the workflows running through it are known and repeatable – which most production integrations are – you’ve added infrastructure complexity without adding capability. A compiled SDK call is faster, cheaper, and more reliable than an LLM-mediated tool invocation through a proxy.
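
A back-of-the-envelope version of that latency argument, as a sketch. Every number below is an assumption I picked to make the arithmetic readable, not a measurement; plug in your own.

```python
# All figures are illustrative assumptions, in milliseconds.
DIRECT_API_CALL_MS = 80       # SDK call straight to the API
MCP_HOP_MS = 25               # extra network hop: client -> MCP server
JSON_RPC_OVERHEAD_MS = 5      # serialize/deserialize around the proxy
LLM_TOOL_SELECTION_MS = 1200  # model reads the manifest and emits a tool call


def per_call_latency_ms(via_mcp: bool, llm_in_loop: bool) -> int:
    """Add up the latency components that apply to one call."""
    total = DIRECT_API_CALL_MS
    if via_mcp:
        total += MCP_HOP_MS + JSON_RPC_OVERHEAD_MS
    if llm_in_loop:
        total += LLM_TOOL_SELECTION_MS
    return total


sdk = per_call_latency_ms(via_mcp=False, llm_in_loop=False)
gateway = per_call_latency_ms(via_mcp=True, llm_in_loop=True)

calls_per_day = 50_000  # assumed volume for a known, repeatable workflow
extra_hours = (gateway - sdk) * calls_per_day / 1000 / 3600

print(sdk, gateway)           # 80 1310
print(round(extra_hours, 1))  # 17.1 extra hours of cumulative wait per day
```

With these made-up inputs the proxy hop itself is small; the LLM in the loop dominates. That is the point: for a workflow you already understand, you are paying for a decision you no longer need made.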

For high-frequency, latency-sensitive production workflows, MCP is an architectural anti-pattern. No detection pipeline, no real-time alerting system, no high-throughput data processing workflow should have an LLM in the loop – let alone an MCP proxy sitting between the LLM and your API.

The list_tools() response and why description quality is everything

list_tools() is a standardized method (tools/list on the wire; list_tools() is how client SDKs typically surface it), not a vendor choice. The response is a list of tool objects. Each one has a name, a natural language description, and a JSON Schema defining every input parameter.

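What it does not guarantee: an output schema. Newer revisions of the spec let a tool declare an optional outputSchema, but nothing requires it, and plenty of servers don’t bother. In those cases the agent learns what comes back by calling the tool. That’s a meaningful contract gap compared to a well-maintained OpenAPI spec, which defines both request and response shapes.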

The implication is that the natural language description field is doing most of the cognitive heavy lifting. The LLM reads it and decides: is this the right tool for what I’m trying to do? If the description is vague, wrong, or missing important context about when not to use a tool, the LLM calls the wrong thing or constructs malformed inputs.

Someone has to write those descriptions carefully. That’s the same intellectual work as maintaining good API documentation – it just lives in a Python decorator instead of a YAML file. The work doesn’t disappear; it relocates. And you’re probably still publishing that content on your DevRel/API Docs site/Support Portal too. One more sink to add to the content pipeline.
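
Here is roughly what “documentation in a decorator” looks like, as a sketch. The `tool` decorator below is a hand-rolled stand-in, not any particular SDK’s API, and both tools are invented. The word-count check is a deliberately crude heuristic, there only to show that description quality is something you can at least lint for.

```python
import inspect

REGISTRY = []


def tool(fn):
    """Hypothetical decorator: register fn, using its docstring as the description."""
    REGISTRY.append({"name": fn.__name__, "description": inspect.getdoc(fn) or ""})
    return fn


@tool
def search(q: str):
    """Search."""


@tool
def search_closed_tickets(query: str, closed_after: str):
    """Full-text search over CLOSED support tickets only.

    Use for historical lookups. Do NOT use for open tickets; use
    list_open_tickets instead. closed_after is an ISO 8601 date.
    """


def is_vague(entry: dict, min_words: int = 8) -> bool:
    """Crude lint: flag descriptions too short to tell an LLM when to use the tool."""
    return len(entry["description"].split()) < min_words


flagged = [entry["name"] for entry in REGISTRY if is_vague(entry)]
print(flagged)  # ['search']
```

The second docstring is the kind of thing an LLM can actually act on: what the tool covers, when not to use it, and what format an argument expects. Somebody still has to write it.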

This is the core rebuttal to the “MCP reduces the documentation burden” argument. It doesn’t. It changes the format, and adds another step.

The security claims don’t hold up

A recurring MCP argument: you can mark destructive operations as dangerous, require confirmation, add safety metadata around sensitive tools. This sounds like a security feature. It isn’t. Take it from a dude with a couple of decades in cybersecurity.

Actual security boundaries live in OAuth scopes, RBAC, and server-side authorization on the API. A confirmation prompt in a tool definition is UX friction – it’s designed as a convenience affordance to give a human reviewer pause, not to enforce access control. A sufficiently goal-directed autonomous agent will confirm and proceed. That’s not a security failure; that’s the agent doing its job. And how many agents will subvert porous security theatre, not just aggressively but, in the worst cases, maliciously?
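
A small sketch of the distinction, with invented operation names and scopes. The `confirmed` flag stands in for whatever “are you sure?” metadata a tool carries; the scope check stands in for the authorization your API already does server-side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    subject: str
    scopes: frozenset


# Hypothetical mapping from operation to the OAuth scope it requires.
REQUIRED_SCOPE = {
    "get_project": "projects:read",
    "delete_project": "projects:delete",
}


class Forbidden(Exception):
    pass


def api_authorize(caller: Caller, operation: str) -> None:
    """The real boundary: runs inside the API, whatever the client claims."""
    needed = REQUIRED_SCOPE[operation]
    if needed not in caller.scopes:
        raise Forbidden(f"{caller.subject} lacks scope {needed}")


def call_tool(caller: Caller, operation: str, confirmed: bool) -> str:
    # `confirmed` is a UX signal only; authorization deliberately ignores it.
    api_authorize(caller, operation)
    return f"{operation} ok for {caller.subject}"


reader = Caller("agent-7", frozenset({"projects:read"}))
print(call_tool(reader, "get_project", confirmed=False))
try:
    call_tool(reader, "delete_project", confirmed=True)
except Forbidden as exc:
    print("blocked:", exc)
```

The agent can say “confirmed” as often as it likes. The delete fails because the scope is missing, which is the only thing that should ever make it fail.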

There’s a related security concern that gets less attention and is more important: the auth threading problem. An MCP server that calls your API as itself – using a service account or “god mode” cred – bypasses the per-caller authorization your API enforces. The MCP server becomes a privilege escalation surface. Every API call looks like it came from the same principal regardless of which customer or agent is actually initiating it. That can’t be audited, can’t be traced, and doesn’t conform to basic compliance requirements.

A well-built MCP implementation threads the caller’s bearer token through to the underlying API call. Most quick implementations don’t do this. It’s worth asking about any MCP server you’re evaluating, building, or inheriting.
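
A minimal sketch of what threading looks like at the point where the MCP server builds its upstream call. The base URL, path, and token are placeholders, and nothing is actually sent. Some deployments exchange the caller’s token for a downstream one rather than forwarding it verbatim; this only shows the simpler forwarding case, and the part that matters either way is the refusal to fall back to a shared credential.

```python
import urllib.request

API_BASE = "https://api.example.com"  # placeholder, not a real endpoint


def build_upstream_request(path: str, caller_token: str) -> urllib.request.Request:
    """Build the upstream API request using the caller's own credential."""
    if not caller_token:
        # No silent fallback to a service account: fail closed instead.
        raise PermissionError("no caller token; refusing to call the API as the server")
    return urllib.request.Request(
        f"{API_BASE}{path}",
        headers={
            "Authorization": f"Bearer {caller_token}",
            "Accept": "application/json",
        },
    )


req = build_upstream_request("/v1/projects", "tok_alice")
print(req.full_url)
print(req.get_header("Authorization"))
```

With this shape, the API sees one principal per caller, so its existing per-caller authorization and audit logging keep working.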

What MCP genuinely does solve

After stripping out the noise, a few things survive the interrogation:

Standardized cross-runtime compatibility. Before MCP, writing AI agent integrations meant maintaining separate tool definitions for LangChain, Claude, GPT function calling, and whatever inventive madness comes next. MCP is an attempt to write it once and have all the runtimes understand it. If that standardization holds, the ecosystem benefit is real and reduces per-runtime maintenance burden on the vendor side.

BUT this will only be true while tokens are relatively cheap. When that cost balloons even more than it has already, a runtime that is only relevant to token-burning AI tooling is going to be more lonely than the only Canadian who doesn’t like watching hockey.

Builder-phase exploration. For developers building agentic workflows against a complex API, having a structured capability manifest they can probe – with an LLM helping them figure out which calls to chain together – genuinely compresses the exploration phase. This is the DevRel value proposition in code: professional services in a box, available without a sales cycle.

High-friction workflow encapsulation. If your API requires a three-step choreography – start a job, poll for completion, download the result – wrapping that in a single tool call hides complexity that an agent developer would otherwise have to discover and implement themselves. That’s real value, concentrated on the endpoints where your API is hardest to use correctly.
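
The start/poll/download case, sketched. `FakeExportApi` is a made-up stand-in for three REST endpoints so the example runs on its own; `export_report` is the single function an MCP server would expose as one tool.

```python
import itertools
import time


class FakeExportApi:
    """Hypothetical stand-in for the start, status, and download endpoints."""

    def __init__(self, polls_until_done: int = 3):
        # Reports "running" a few times, then "done" forever.
        self._status = itertools.chain(
            ["running"] * (polls_until_done - 1), itertools.repeat("done")
        )

    def start_export(self, report_id: str) -> str:
        return f"job-{report_id}"

    def job_status(self, job_id: str) -> str:
        return next(self._status)

    def download(self, job_id: str) -> bytes:
        return f"contents of {job_id}".encode()


def export_report(api, report_id: str, poll_interval_s: float = 0.0, max_polls: int = 20) -> bytes:
    """One tool call that hides the start -> poll -> download choreography."""
    job_id = api.start_export(report_id)
    for _ in range(max_polls):
        if api.job_status(job_id) == "done":
            return api.download(job_id)
        time.sleep(poll_interval_s)
    raise TimeoutError(f"{job_id} not finished after {max_polls} polls")


print(export_report(FakeExportApi(), "r7"))  # b'contents of job-r7'
```

The agent developer sees one call with one argument. The polling loop, the timeout, and the job ID bookkeeping stay on the vendor’s side, which is where the knowledge of how to do them correctly already lives.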

AGAIN, only while tokens are cheap enough to burn all day every day in your operational tools.

Local developer tooling. An MCP server running on a developer’s laptop, as a VS Code or IDE extension, with no infrastructure footprint – no DNS record, no server to scale, no COGS – is a much more defensible pattern than a hosted gateway. The latency problem largely disappears. The operational burden is nil.

What most MCP advocacy is actually about

A lot of MCP momentum is ecosystem signaling: “we’re agentic-ready, look, we built the thing.” I understand the pressure; I’ve been sorely tempted to give in to it myself. Enterprise buyers are asking about AI strategy. Partners want to know you’re keeping up. The demo-ability of an MCP server is high.

But signaling is expensive when engineering capacity is finite. A server that exists to say “we’re ready” without solving a specific customer problem is performative AI engineering. It carries ongoing maintenance burden, creates security questions that have to be answered anyway, and may set customer expectations about capabilities that weren’t thoughtfully designed.

The alternative posture: be ready to build it when you know what it needs to do. The right question to ask any stakeholder who wants an MCP server is: what role does it need to play, where will it sit in the customer’s stack, and how much tolerance does the customer have for non-deterministic output? If those questions don’t have concrete answers, the business case isn’t ready.

The recommendation for most B2B SaaS teams

If you have complex API choreography that customers repeatedly struggle to implement, and you have customers or partners actively building agentic workflows, a builder-tool MCP is worth scoping – with aggressive rate limits, no SLA, and explicit positioning as a prototyping aid rather than a production integration path.

If you don’t have those conditions yet, the highest-ROI investment is in the foundation that makes MCP viable when you do build it: well-maintained OpenAPI specs, high-quality LLMs.txt, structured workflow documentation, and clear examples. That work is useful now, with or without MCP. The MCP server can follow when you have something specific it needs to do.

If someone in your organization wants to build an MCP server as a runtime gateway in a production path, ask them to walk you through the latency math, the auth threading model, and the customer workflow it serves. If those answers are sharp, the conversation is worth having. If they aren’t, you’ve saved yourself from building infrastructure for a problem you don’t have.

The customers who get durable value from production MCP gateway deployments are building multi-vendor agent orchestration over dynamic, unpredictable tool surfaces. Most B2B SaaS integrations don’t look like that. Most of them look like known workflows, known endpoints, and a strong preference for deterministic, auditable execution. That’s the hardcoded SDK end state, not the MCP end state.

Build toward the actual end state. Use MCP to explore the path.

Til then? Solidify the shit out of your OpenAPI docs. Build that LLMs.txt (and the sprawl of partitioned, task-oriented Markdown that it should be an index into). Keep your best practices, sample code and troubleshooting docs up to date.

These are still the essential foundations of every SaaS integration surface. Don’t build the infinity pool until the supports are in place.

If you’re working through a similar evaluation and want to compare notes, I’m on LinkedIn.
