Model Context Protocol
Module 8 · Model Context Protocol (MCP) in depth (tool.mcp)
Prerequisite: complete M6 (agents, tool calling) and understand the ReAct loop.
RAGorbit nodes: tool.mcp, agent.react (as MCP consumer)
Anchor templates: 01-airline-flight-change (PolicyRAG exposed as MCP server in the lab)
1. What MCP is and why it exists
1.1 The ad hoc integration problem
In M6 you learned tool calling: the LLM emits {"tool": "ReservationService", "arguments": {...}} and the framework runs the function. Each integration is custom:
LangChain Agent ──▶ ReservationService (custom HTTP, manual schema)
Cursor IDE ──▶ GitHub API (another schema, another auth)
Claude Desktop ──▶ Filesystem (another contract)
Each host (IDE, agent, copilot) reinvents how to discover tools, pass context, request permissions, and transport messages. The result: N hosts × M services = N×M adapters.
1.2 MCP as an open standard
The Model Context Protocol (MCP), started by Anthropic and adopted by the ecosystem (Cursor, Claude Desktop, VS Code, etc.), defines a single contract between:
┌─────────────┐ MCP ┌─────────────┐ API/biz ┌─────────────┐
│ MCP Host │ ◀──────────▶ │ MCP Server │ ─────────────▶ │ Service │
│ (IDE/agent) │ JSON-RPC │ (PolicyRAG) │ HTTP/DB/... │ real │
└─────────────┘ └─────────────┘ └─────────────┘
│
│ orchestrates
▼
┌─────────────┐
│ MCP Client │ ← library inside the host; speaks the protocol
└─────────────┘
- Host: the application the user runs (Cursor, your FastAPI agent).
- Client: library that implements MCP inside the host.
- Server: process that exposes capabilities (tools, resources, prompts).
An MCP server built once serves all compatible hosts — without rewriting integrations.
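The adapter arithmetic can be made concrete with a small sketch. The host and service counts are made-up illustration values, and the "one client per host, one server per service" model is a simplifying assumption.

```python
def adapters_without_mcp(hosts: int, services: int) -> int:
    """Every host needs its own custom adapter for every service."""
    return hosts * services


def adapters_with_mcp(hosts: int, services: int) -> int:
    """Assumption: each host embeds one MCP client, each service gets one MCP server."""
    return hosts + services


# Hypothetical organization: 3 hosts, 4 services.
print(adapters_without_mcp(3, 4))
print(adapters_with_mcp(3, 4))
```

Under these assumptions the integration work grows with N + M instead of N × M.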
1.3 MCP vs traditional tool calling
| Aspect | Tool calling (M6) | MCP |
|---|---|---|
| Contract | JSON Schema per framework | Standard protocol (JSON-RPC) |
| Discovery | You register tools manually in the agent | Client calls tools/list dynamically |
| Transport | In-process or ad hoc HTTP | Standardized STDIO or Streamable HTTP |
| Primitives | Tools only (functions) | Tools + resources (data) + prompts (templates) |
| Security | Framework guardrails (guardrail.confirm) | Sampling, roots, permissions at protocol level |
| Portability | Framework lock-in (LangChain, OpenAI) | One server serves multiple hosts |
They are not mutually exclusive: in production, a LangGraph agent can consume tools via the tool.mcp node — the LLM still does tool calling, but the tools come from an external MCP server.
1.4 When to use MCP / when NOT to
Use MCP when:
- You want to expose capabilities to multiple clients (IDE + internal agent + another team).
- Tools live in a separate process with its own lifecycle (independent deploy).
- You need dynamic discovery (the client does not know the tools in advance).
- You operate in regulated environments where explicit permissions are mandatory.
Do NOT use MCP when:
- You have 2–3 local Python functions used by a single agent → in-process @tool is enough (M6).
- The latency of an extra subprocess/HTTP is unacceptable (microsecond hot path).
- The service already exposes a mature REST API and only one client consumes it → tool.service is simpler.
- You are prototyping and the protocol complexity does not add value yet.
2. Protocol architecture
2.1 Layers
┌──────────────────────────────────────────────────────────────────┐
│ MCP ARCHITECTURE │
│ │
│ Application layer │
│ ────────────────── │
│ Tools · Resources · Prompts · Sampling · Roots · Permissions │
│ │
│ Protocol layer │
│ ───────────────── │
│ JSON-RPC 2.0 — methods: initialize, tools/list, tools/call, │
│ resources/list, resources/read, prompts/list, │
│ prompts/get, sampling/createMessage, ... │
│ │
│ Transport layer │
│ ────────────────── │
│ STDIO (local subprocess) · Streamable HTTP (network) │
└──────────────────────────────────────────────────────────────────┘
2.2 Session lifecycle
Client Server
│ │
│──── initialize ──────────────────▶│ handshake + capabilities
│◀─── {protocolVersion, serverInfo}─│
│ │
│──── tools/list ──────────────────▶│ discovery
│◀─── [{name, description, schema}]─│
│ │
│──── tools/call ──────────────────▶│ execution
│◀─── {content, structuredContent}──│
│ │
│──── tools/call (sensitive) ───────▶│ no permission → blocked
│◀─── {permission_required: true}──│
│ │
│──── permissions/respond ─────────▶│ user approves
│◀─── {status: approved}───────────│
│ │
│──── tools/call (with token) ──────▶│ executes action
│◀─── {status: captured}───────────│
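The first three exchanges of the lifecycle above (initialize, tools/list, tools/call) can be sketched as an in-memory JSON-RPC 2.0 router. This is a toy model, not the lab's MCPServer: the server name, the tool table, and the placeholder rule text are all assumptions made for illustration.

```python
import json
from typing import Optional

PROTOCOL_VERSION = "2024-11-05"  # version string the lab's scratch solution uses

# Hypothetical tool table, for illustration only.
TOOLS = [{
    "name": "policy_rag",
    "description": "Look up fare rules by fare_class and route_type.",
    "inputSchema": {
        "type": "object",
        "properties": {"fare_class": {"type": "string"}, "route_type": {"type": "string"}},
        "required": ["fare_class", "route_type"],
    },
}]


def _reply(request: dict, *, result: Optional[dict] = None, error: Optional[dict] = None) -> dict:
    """Wrap a result or an error in a JSON-RPC 2.0 response envelope."""
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    if error is not None:
        response["error"] = error
    else:
        response["result"] = result
    return response


def handle(request: dict) -> dict:
    """Route one request by method name."""
    method, params = request["method"], request.get("params", {})
    if method == "initialize":
        return _reply(request, result={
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {"tools": {}},
            "serverInfo": {"name": "toy-policy-server", "version": "0.1"},
        })
    if method == "tools/list":
        return _reply(request, result={"tools": TOOLS})
    if method == "tools/call":
        if params.get("name") != "policy_rag":
            # -32602 is the JSON-RPC 2.0 "Invalid params" code.
            return _reply(request, error={"code": -32602, "message": "Unknown tool"})
        args = params.get("arguments", {})
        text = f"Placeholder rule for {args['fare_class']}/{args['route_type']}"
        return _reply(request, result={"content": [{"type": "text", "text": text}], "isError": False})
    # -32601 is the JSON-RPC 2.0 "Method not found" code.
    return _reply(request, error={"code": -32601, "message": f"Method not found: {method}"})


def rpc(req_id: int, method: str, params: Optional[dict] = None) -> dict:
    """Build a request, round-trip it through JSON to mimic the wire, and handle it."""
    request = {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params or {}}
    return handle(json.loads(json.dumps(request)))


init = rpc(1, "initialize", {
    "protocolVersion": PROTOCOL_VERSION,
    "capabilities": {},
    "clientInfo": {"name": "toy-host", "version": "0.1"},
})
listing = rpc(2, "tools/list")
call = rpc(3, "tools/call", {
    "name": "policy_rag",
    "arguments": {"fare_class": "ECONOMY_FLEX", "route_type": "internacional"},
})
print(call["result"]["content"][0]["text"])
```

Every response reuses the id of its request, which is how a client matches replies to calls when several are in flight.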
2.3 The three primitives
| Primitive | Analogy | Airline example |
|---|---|---|
| Tool | Callable function | policy_rag(fare_class, route_type) |
| Resource | Readable data (URI) | policy://ECONOMY_FLEX/internacional → rule text |
| Prompt | Message template | flight_change_analysis(fare_class) → prompt for the LLM |
In RAGorbit:
- tool.mcp consumes server tools.
- The host uses resources and prompts directly (Cursor shows them to the user/LLM).
3. Transport: STDIO vs Streamable HTTP
3.1 STDIO — local communication
Host/Client
│
│ spawn subprocess
▼
┌─────────────────────────┐
│ python server.py │
│ stdin ◀── JSON-RPC │
│ stdout ──▶ JSON-RPC │
└─────────────────────────┘
- When: local development, Claude Desktop, Cursor, agents that launch the server as a child process.
- Advantage: no open ports, process isolation, simple.
- In the lab: solucion_scratch.py uses subprocess.Popen + pipes.
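A minimal sketch of the STDIO pattern, assuming newline-delimited JSON-RPC messages over the child's stdin/stdout. The inline child script is a stand-in that echoes the method name; it is not the lab's server.py, and the function name call_over_stdio is hypothetical.

```python
import json
import subprocess
import sys

# Inline child "server": answers every request line with a canned result.
CHILD_SOURCE = r'''
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    resp = {"jsonrpc": "2.0", "id": req["id"], "result": {"echo": req["method"]}}
    sys.stdout.write(json.dumps(resp) + "\n")
    sys.stdout.flush()
'''


def call_over_stdio(method: str, req_id: int = 1) -> dict:
    """Spawn the child, send one request line, and read one response line."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        request = {"jsonrpc": "2.0", "id": req_id, "method": method}
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()  # without the flush the child may never see the line
        return json.loads(proc.stdout.readline())
    finally:
        proc.stdin.close()  # EOF ends the child's read loop
        proc.stdout.close()
        proc.wait(timeout=5)


if __name__ == "__main__":
    print(call_over_stdio("tools/list"))
```

Flushing after each write matters on both sides: with buffered pipes, a missing flush looks exactly like a hung server.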
3.2 Streamable HTTP — network communication
Client (agent in K8s) ──HTTP POST/SSE──▶ MCP Server (Cloud Run)
https://mcp.airline.internal/mcp
- When: server shared by multiple agents, deploy on Docker/Cloud Run.
- Advantage: scalable, one server for the whole organization.
- Gotcha: requires auth (JWT, mTLS), rate limiting, and observability.
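As a sketch of what a client sends over Streamable HTTP, the following builds (but does not send) a JSON-RPC POST. The bearer-token scheme and the token value are assumptions; the function name build_mcp_post is hypothetical, and your gateway may require mTLS or another mechanism instead.

```python
import json
import urllib.request


def build_mcp_post(url: str, token: str, method: str, req_id: int) -> urllib.request.Request:
    """Build a JSON-RPC POST for a Streamable HTTP endpoint without sending it."""
    body = json.dumps({"jsonrpc": "2.0", "id": req_id, "method": method}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The server may answer with plain JSON or with an SSE stream.
            "Accept": "application/json, text/event-stream",
            # Assumed auth scheme: bearer token (e.g. a JWT).
            "Authorization": f"Bearer {token}",
        },
    )


req = build_mcp_post("https://mcp.airline.internal/mcp", "example-token", "tools/list", 1)
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it easy to unit test headers and payloads without a live server.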
3.3 Comparison
| | STDIO | Streamable HTTP |
|---|---|---|
| Latency | Low (local pipes) | Higher (network) |
| Deploy | Client subprocess | Independent service |
| Security | OS isolation | Network auth required |
| Multi-client | One client per process | Many concurrent clients |
| Example | Cursor + local server | K8s agent → MCP on Cloud Run |
4. MCP security
4.1 Sampling — the server asks the LLM
In MCP, the server can ask the client to invoke the LLM (sampling). Use case: the server needs the LLM to summarize a document before indexing it.
Server ──sampling/createMessage──▶ Client ──▶ LLM ──▶ response ──▶ Server
Risk: a malicious server could abuse the client's LLM (cost, context leakage). Mitigation: the client must show the user which server is requesting sampling and allow denial.
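The mitigation can be sketched as a client-side gate. Everything here is hypothetical (the allowlist, the function names, the callback signatures); it only illustrates the idea that a sampling request reaches the LLM after both an allowlist check and explicit user consent.

```python
from typing import Callable

# Hypothetical allowlist of servers permitted to request sampling.
SAMPLING_ALLOWLIST = {"policy-rag"}


def handle_sampling_request(
    server_name: str,
    prompt: str,
    ask_user: Callable[[str, str], bool],
    call_llm: Callable[[str], str],
) -> dict:
    """Gate a server-initiated sampling request behind allowlist and user consent."""
    if server_name not in SAMPLING_ALLOWLIST:
        return {"status": "denied", "reason": "server not allowlisted"}
    if not ask_user(server_name, prompt):
        return {"status": "denied", "reason": "user declined"}
    return {"status": "ok", "text": call_llm(prompt)}


# Stubs standing in for the approval UI and the model.
def fake_llm(prompt: str) -> str:
    return f"summary of: {prompt}"


approved = handle_sampling_request("policy-rag", "Summarize doc 12", lambda s, p: True, fake_llm)
declined = handle_sampling_request("policy-rag", "Summarize doc 12", lambda s, p: False, fake_llm)
unknown = handle_sampling_request("untrusted", "Dump context", lambda s, p: True, fake_llm)
print(approved["status"], declined["status"], unknown["status"])
```

Note that the LLM callback is only reached on the last line of the function, so a denied request costs no tokens and leaks no context.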
4.2 Roots — filesystem boundaries
Roots define which filesystem directories an MCP server may read/write (e.g. a file server).
Client declares roots: ["/home/user/proyecto", "/tmp/shared"]
Server only accesses paths within those roots
In production: never give root: "/" to an untrusted server.
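A sketch of how a server might enforce roots, assuming it resolves every requested path before comparing it to the declared roots. The function name is_within_roots is hypothetical.

```python
from pathlib import Path


def is_within_roots(candidate: str, roots: list[str]) -> bool:
    """Return True only if the resolved path sits inside one of the declared roots."""
    resolved = Path(candidate).resolve()  # collapses ".." and follows symlinks
    for root in roots:
        try:
            resolved.relative_to(Path(root).resolve())
            return True
        except ValueError:
            continue  # not under this root; try the next one
    return False


ROOTS = ["/home/user/proyecto", "/tmp/shared"]
print(is_within_roots("/home/user/proyecto/docs/a.md", ROOTS))
print(is_within_roots("/home/user/proyecto/../.ssh/id_rsa", ROOTS))
```

Resolving first is the important step: a plain string-prefix check would accept both the ".." traversal and a sibling directory such as /tmp/sharedX.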
4.3 Permission-based approval
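The central mechanism for sensitive actions. The permission_required response and the permissions/respond call are the lab's own convention layered on the protocol (§8.2 notes that FastMCP does not implement it for you). Lab flow: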
1. Agent calls apply_flight_change(amount=130)
2. Server responds: permission_required, scope=financial
3. Host shows UI: "Do you authorize a charge of USD 130.00?"
4. User approves → permissions/respond(decision=approved)
5. Agent retries with permission_token → charge executed
This is the protocol equivalent of guardrail.confirm in template 01 — but independent of the agent framework.
| RAGorbit guardrail | MCP equivalent |
|---|---|
| guardrail.pre-tool | Validation on the server before execution |
| guardrail.confirm | permission_required + approval UI |
| guardrail.idempotency | Logic on the server (composite key) |
| guardrail.resilience | Retry/circuit breaker on HTTP transport |
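The five-step lab flow in §4.3 can be sketched in a few lines. This is an in-memory illustration under stated assumptions: the token store, the single-use rule, and the function respond_permission are hypothetical, not the lab's exact implementation.

```python
import secrets
from typing import Optional

# token -> scope it was issued for (in-memory; a real server would persist and expire these)
_GRANTED: dict = {}


def apply_flight_change(amount: float, permission_token: Optional[str] = None) -> dict:
    """Sensitive tool: refuses to charge unless the token carries the 'financial' scope."""
    if _GRANTED.get(permission_token) != "financial":
        return {
            "permission_required": True,
            "scope": "financial",
            "message": f"Do you authorize a charge of USD {amount:.2f}?",
        }
    del _GRANTED[permission_token]  # assumption: tokens are single use
    return {"status": "captured", "amount": amount}


def respond_permission(scope: str, decision: str) -> dict:
    """Host-side reply after the user decides; issues a token only on approval."""
    if decision != "approved":
        return {"status": "denied"}
    token = secrets.token_hex(8)
    _GRANTED[token] = scope
    return {"status": "approved", "permission_token": token}


first = apply_flight_change(130)                                 # steps 1-2: blocked
grant = respond_permission(first["scope"], "approved")           # steps 3-4: user approves
second = apply_flight_change(130, grant["permission_token"])     # step 5: retry succeeds
print(first["message"], second["status"])
```

Because the token records the scope it was issued for, a token granted for read_policy cannot unlock the financial action.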
4.4 Security gotchas
- Trust the server like third-party code: an MCP server can exfiltrate data via tools or sampling.
- Server allowlist: in Cursor/Claude Desktop, only connect servers you audit.
- Granular scopes: separate read_policy (no approval) from financial (with approval).
- Do not mix permissions: a read_policy token must not authorize financial.
- Audit trail: log every tools/call with arguments and result (like observability.audit in template 01).
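The audit-trail gotcha can be sketched as a decorator that records every call. The in-memory AUDIT_LOG list and the fee value returned by the sample tool are placeholders; a real deployment would write to a durable sink.

```python
import functools
import json
import time

AUDIT_LOG: list = []  # stand-in for a real sink (file, database, log pipeline)


def audited(tool_fn):
    """Wrap a tool so each call records its name, arguments, and result or error."""
    @functools.wraps(tool_fn)
    def wrapper(**arguments):
        entry = {"ts": time.time(), "tool": tool_fn.__name__, "arguments": arguments}
        try:
            entry["result"] = tool_fn(**arguments)
            return entry["result"]
        except Exception as exc:
            entry["error"] = repr(exc)
            raise
        finally:
            # Runs on success and on failure, so no call goes unlogged.
            AUDIT_LOG.append(json.dumps(entry, default=str))
    return wrapper


@audited
def policy_rag(fare_class: str, route_type: str) -> dict:
    # Placeholder payload; the lab reads real rules from its policy data.
    return {"fare_class": fare_class, "route_type": route_type, "fee_usd": 130}


policy_rag(fare_class="ECONOMY_FLEX", route_type="internacional")
print(AUDIT_LOG[-1])
```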
5. MCP vs plugins and proprietary functions
See the full comparison in referencia/tecnologias-comparadas.md §10.
MCP OpenAI Plugins / Assistants
─── ───────────────────────────
Standard Open Vendor-locked
Discovery Dynamic tools/list Manual definition
Portability Claude, Cursor, custom OpenAI ecosystem only
Security Protocol permissions Variable
Practical rule: MCP wraps your business APIs in a portable contract. The APIs (tool.service, tool.http) remain the business layer; MCP is the presentation layer to the LLM.
6. Connecting to one and many servers
6.1 One server
ReAct Agent
│
└── tool.mcp ──STDIO──▶ PolicyRAG Server
Typical tool.mcp node configuration in RAGorbit:
{
"type": "tool.mcp",
"config": {
"server": "python /app/mcp/policy_server.py",
"transport": "stdio"
}
}
The agent discovers the server's tools at runtime and uses them like any other tool.
6.2 Multiple servers
ReAct Agent
├── tool.mcp ──▶ PolicyRAG Server (tools: policy_rag)
├── tool.mcp ──▶ Inventory MCP (tools: search_flights)
└── tool.mcp ──▶ Payment MCP (tools: charge_fee)
In FastMCP, Client can connect to multiple sources and prefix tools by server (policy_policy_rag, inventory_search_flights) to avoid name collisions.
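The prefixing idea is independent of any framework. A minimal sketch, assuming a hypothetical merge_tools helper and the server and tool names from the diagram above:

```python
def merge_tools(servers: dict) -> dict:
    """Map '<server>_<tool>' to (server, tool) so names from different servers cannot collide."""
    merged = {}
    for server_name, tool_names in servers.items():
        for tool_name in tool_names:
            merged[f"{server_name}_{tool_name}"] = (server_name, tool_name)
    return merged


catalog = merge_tools({
    "policy": ["policy_rag"],
    "inventory": ["search_flights"],
    "payment": ["charge_fee"],
})
print(sorted(catalog))
```

The agent sees the prefixed name, and the stored pair tells the client which server to route the call to and under which original name.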
6.3 Template 01 — before and after MCP
Before (M6): PolicyRAG embedded in the graph.
store.pgvector ──▶ tool.retriever "policy_rag" ──▶ agent.react
After (M8): PolicyRAG as a decoupled MCP service.
[MCP Server: policy-rag]
@tool policy_rag
@resource policy://{fare_class}/{route_type}
▲
│ STDIO or HTTP
│
tool.mcp ──▶ agent.react
Advantage: the policy team deploys and versions the MCP server without touching the agent graph. See examples/01-airline-flight-change/README.md.
7. RAGorbit tool.mcp node
{
"id": "policy_mcp",
"type": "tool.mcp",
"config": {
"server": "python mcp_servers/policy_rag.py",
"transport": "stdio",
"tool": "policy_rag"
}
}
- Output port: Tool → connects to agent.react.
- Codegen generates code that launches the MCP subprocess and translates tools/call into agent invocations.
Full reference: referencia/catalogo-nodos.md §tool.mcp.
8. Layer ③ explained: FastMCP from scratch
Prerequisite: implement layer ② of the lab (lab/solucion_scratch.py) or understand each piece you wrote by hand. Read this section in full before attempting to write lab/solucion_framework.py.
Environment: the course study machine has no pip or network. You will not be able to run this code here. The goal is that once you can run pip install fastmcp, you can write the framework solution yourself.
Cross-link: if you mastered tools and the ReAct loop in M6, connect with M6 §8 (Layer ③ explained: LangGraph from scratch). There you learned @tool and create_react_agent; here you learn @mcp.tool and Client.
8.1 Reminder: what you already know from M6
| M6 piece | What it is for in M8 |
|---|---|
| @tool + docstring → LLM description | @mcp.tool does the same for the MCP protocol |
| TOOLS = [...] manual registration | tools/list discovers tools automatically from the server |
| fake_llm decides which tool to call | The host (Cursor/agent) uses the LLM + discovered MCP tools |
| guardrail.confirm for sensitive actions | permission_required + user approval |
What's new in M8 is not "tools" in general — it is the standard protocol that transports, discovers, and protects those tools across processes.
8.2 Bridge table: your scratch → FastMCP
| What you did by hand (layer ②) | FastMCP piece (layer ③) | Where in the lab |
|---|---|---|
| TOOL_DEFINITIONS: manual registration with schemas | @mcp.tool generates schema + description from docstring | policy_rag, apply_flight_change |
| TOOL_HANDLERS dict name→function | Decorated functions are the handlers | Same functions |
| MCPServer.handle(): JSON-RPC router | FastMCP + mcp.run(): full protocol | mcp = FastMCP(...) |
| serve_stdio(): read stdin line by line | mcp.run(transport="stdio") | --server |
| MCPStdioClient._send(): write JSON-RPC | Client(server_script): automatic transport | demo_stdio_client() |
| Manual initialize handshake | Handled by Client on entering async with | async with Client(...) |
| Manual tools/list | await client.list_tools() | STDIO demo |
| Manual tools/call | await client.call_tool(name, args) | STDIO demo |
| Resource as JSON in tool result | @mcp.resource("policy://{fare_class}/{route_type}") | policy_resource |
| Hardcoded system prompt | @mcp.prompt: reusable template | flight_change_analysis |
| permission_required in tools/call | Same logic; FastMCP does not implement it for you, so you put it in the tool | apply_flight_change |
| subprocess.Popen + pipes | Client launches STDIO subprocess automatically | Client(server_script) |
| HTTP not implemented in scratch | mcp.run(transport="streamable-http") + Client(url) | --http |
Mental model: in scratch you are the protocol (JSON-RPC, transport, handshake). In FastMCP the framework is the protocol — you only declare tools/resources/prompts as Python functions.
8.3 FastMCP — create the server
from fastmcp import FastMCP

mcp = FastMCP(
    name="airline-policy-rag-mcp",
    instructions="MCP server for the airline's fare policies.",
)
- name → appears in serverInfo during handshake (your scratch: SERVER_NAME).
- instructions → context clients receive about the server's purpose.
Run the server:
if __name__ == "__main__":
    mcp.run(transport="stdio")  # local: Cursor, Claude Desktop
    # mcp.run(transport="streamable-http", port=8765)  # network
8.4 @mcp.tool — from Python function to MCP tool
In scratch you defined schemas manually:
TOOL_DEFINITIONS = [{
    "name": "policy_rag",
    "description": "Looks up fare rules...",
    "inputSchema": {"type": "object", "properties": {...}},
}]
In FastMCP:
@mcp.tool(annotations={"readOnlyHint": True})
def policy_rag(fare_class: str, route_type: str, query: str = "") -> dict:
    """
    Look up fare rules and penalties filtered by fare_class and
    route_type. Use it to determine whether charges apply.
    """
    ...
FastMCP derives automatically:
- Name: from the function (policy_rag).
- Description: from the docstring (same as LangChain @tool in M6 §8.3).
- JSON Schema: from type hints.
- Registration: the tool appears in tools/list with no extra code.
Annotations (readOnlyHint, destructiveHint) help the host decide whether to request permission before execution — similar to marking sensitive: True in your scratch.
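To see why a decorator can replace a hand-written TOOL_DEFINITIONS entry, here is a simplified sketch that derives one from a plain function using only the standard library. It is an illustration of the idea, not FastMCP's actual implementation; the helper tool_definition and the type mapping are assumptions, and unknown types fall back to "string".

```python
import inspect
from typing import get_type_hints

# Simplified Python type -> JSON Schema type mapping (assumption for this sketch).
_JSON_TYPES = {str: "string", int: "integer", float: "number",
               bool: "boolean", dict: "object", list: "array"}


def tool_definition(fn) -> dict:
    """Derive a tools/list entry from a function's name, docstring, and type hints."""
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the argument is mandatory
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }


def policy_rag(fare_class: str, route_type: str, query: str = "") -> dict:
    """Look up fare rules filtered by fare_class and route_type."""
    return {}


definition = tool_definition(policy_rag)
print(definition["inputSchema"]["required"])
```

The sketch also shows why docstrings and type hints matter: they are the only description of the tool that the LLM ever sees.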
8.5 @mcp.resource — data readable by URI
Resources are data the client reads (does not execute):
@mcp.resource("policy://{fare_class}/{route_type}")
def policy_resource(fare_class: str, route_type: str) -> str:
    """Full policy text for a given fare and route."""
    ...
The client accesses with:
contents = await client.read_resource("policy://ECONOMY_FLEX/internacional")  # list of content items
When to use resource vs tool:
- Resource: the data is idempotent and readable (policy, config, state). No side effects.
- Tool: the operation may have effects (charge, send email, modify DB).
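How a templated URI such as policy://{fare_class}/{route_type} maps to function arguments can be sketched with a regular expression. This is an illustrative stand-in for what a framework does internally; the helpers compile_template and match_policy_uri are hypothetical.

```python
import re


def compile_template(template: str) -> re.Pattern:
    """Turn 'policy://{fare_class}/{route_type}' into a regex with named groups."""
    # re.split with one capture group alternates: literal, name, literal, name, ...
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{pattern}$")


POLICY_URI = compile_template("policy://{fare_class}/{route_type}")


def match_policy_uri(uri: str):
    """Return the extracted parameters, or None if the URI does not match."""
    m = POLICY_URI.match(uri)
    return m.groupdict() if m else None


print(match_policy_uri("policy://ECONOMY_FLEX/internacional"))
```

The extracted dictionary has exactly the keyword arguments that the resource function expects.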
8.6 @mcp.prompt — message templates
@mcp.prompt
def flight_change_analysis(fare_class: str, route_type: str) -> str:
    """Template for analyzing the feasibility of a flight change."""
    return f"Analyze whether a passenger with fare {fare_class}..."
The client gets the rendered prompt:
prompt = await client.get_prompt("flight_change_analysis",
{"fare_class": "ECONOMY_FLEX", "route_type": "internacional"})
Useful for the host (Cursor) to offer predefined actions to the user without the LLM inventing the prompt.
8.7 Client — consume the server (STDIO)
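import asyncio
from fastmcp import Client

async def main():
    # Entering the block launches the server and performs the handshake.
    async with Client("solucion_framework.py") as client:
        tools = await client.list_tools()
        result = await client.call_tool("policy_rag", {
            "fare_class": "ECONOMY_FLEX",
            "route_type": "internacional",
        })

asyncio.run(main())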
What Client does for you (what you implemented by hand in scratch):
| Your scratch | FastMCP Client |
|---|---|
| subprocess.Popen([python, script, "--server"]) | Launches the server automatically |
| initialize + verify protocolVersion | Handshake on entering async with |
| _send("tools/list") | list_tools() |
| _send("tools/call", params) | call_tool(name, args) |
| Read stdout line by line | Buffer and error handling |
8.8 Client — consume the server (HTTP)
# Terminal 1: start server
mcp.run(transport="streamable-http", host="127.0.0.1", port=8765)
# Terminal 2: client (inside an async function)
async with Client("http://127.0.0.1:8765/mcp") as client:
    result = await client.call_tool("policy_rag", {...})
Same client API, different transport: Client picks the transport from its argument (a script path for STDIO, a URL for HTTP).
8.9 Multi-server
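# Assumes FastMCP 2.x, where a multi-server Client takes an MCP config dict.
servers = {"policy": {"command": "python", "args": ["policy_server.py"]},
           "inventory": {"command": "python", "args": ["inventory_server.py"]}}
async with Client({"mcpServers": servers}) as client:
    tools = await client.list_tools()
    # prefixed tools: policy_policy_rag, inventory_search_flights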
Useful pattern when each domain (policies, inventory, payments) is an independent MCP server — like the separate tool.service nodes in template 01.
8.10 Block-by-block walkthrough of lab/solucion_framework.py
Block 1 — Data loading (lines 17–34)
Identical to solucion_scratch.py. No surprises.
Block 2 — FastMCP server (lines 37–130)
mcp = FastMCP(name="airline-policy-rag-mcp", ...)
@mcp.tool
def policy_rag(...): ...
@mcp.tool
def apply_flight_change(...): ...
@mcp.resource("policy://{fare_class}/{route_type}")
def policy_resource(...): ...
@mcp.prompt
def flight_change_analysis(...): ...
Scratch bridge: TOOL_DEFINITIONS + TOOL_HANDLERS → four decorators. Business logic (read politica.json) is the same.
Block 3 — Permission gate in apply_flight_change (lines 85–115)
You implement the permission_required logic yourself — FastMCP has no built-in guardrail.confirm. Your scratch already does it in MCPServer._tools_call; here it goes inside the tool function.
Block 4 — STDIO client demo (lines 140–195)
async with Client(server_script) as client:
    tools = await client.list_tools()
    await client.read_resource("policy://...")
    await client.get_prompt("flight_change_analysis", {...})
    await client.call_tool("policy_rag", {...})
    # handle permission_required in apply_flight_change
Scratch bridge: equivalent to PolicyRAGAgent.run() but with async API and no manual JSON-RPC.
Block 5 — HTTP client and multi-server (lines 198–250)
Demonstrates that the same server serves over STDIO and HTTP — only transport in mcp.run() and the URL in Client change.
8.11 When to use each approach and final gotchas
| Situation | Use | Why |
|---|---|---|
| Local prototype, one IDE | STDIO + FastMCP | mcp.run(transport="stdio") (3 lines) |
| Shared server in the org | Streamable HTTP | One deploy, N agents |
| In-process tool, single agent | LangChain @tool (M6) | No protocol overhead |
| Point REST integration | tool.service / tool.http | You do not need MCP |
| Sensitive action | Permissions + guardrail.confirm | Defense in depth: protocol + guardrail |
Gotchas:
- Poor docstrings → misused tools (same as M6 §8.9).
- An MCP server is arbitrary code: audit it like any dependency.
- STDIO = one client per process: use HTTP for multi-client.
- FastMCP does not replace guardrails: combine permission_required with guardrail.confirm in production.
- Protocol versioning: verify protocolVersion in the handshake (your scratch: 2024-11-05).
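The protocol-versioning gotcha amounts to one check on the initialize result. A minimal sketch, assuming the client supports only the version the scratch solution pins; the names SUPPORTED_VERSIONS and check_handshake are hypothetical.

```python
SUPPORTED_VERSIONS = {"2024-11-05"}  # the version the lab's scratch solution pins


def check_handshake(initialize_result: dict) -> str:
    """Return the server's version, or raise if this client does not support it."""
    version = initialize_result.get("protocolVersion")
    if version not in SUPPORTED_VERSIONS:
        raise RuntimeError(f"Unsupported MCP protocolVersion: {version!r}")
    return version


print(check_handshake({"protocolVersion": "2024-11-05", "serverInfo": {"name": "demo"}}))
```

Failing fast at the handshake is cheaper than debugging a schema mismatch in the middle of a tools/call.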
8.12 Checklist before writing your solucion_framework.py
- Do you have @mcp.tool with docstrings that explain when to use each tool?
- Does apply_flight_change return permission_required without a token?
- Do you have at least one @mcp.resource and one @mcp.prompt?
- Does the STDIO Client list tools, call policy_rag, and handle permissions?
- (Challenge) Can you start the server over HTTP and connect a remote Client?
Next step: open lab/enunciado.md (Part B) and try to write the file yourself before looking at solucion_framework.py.
9. Checkpoint — You know it if you can…
- Explain the difference between MCP Host, Client, and Server.
- Describe how MCP differs from M6 tool calling (discovery, transport, primitives).
- Name the three MCP primitives (tools, resources, prompts) and give an example of each.
- Explain when to use STDIO vs Streamable HTTP.
- Describe the permission approval flow for a sensitive action.
- Explain what sampling is and why it can be a security risk.
- Explain what roots are and why they matter for filesystem servers.
- Draw the lab flow: handshake → tools/list → policy_rag → permission gate → apply_flight_change.
- Explain what @mcp.tool does and how it relates to LangChain @tool.
- Read template 01's flow.json and identify which node you would replace with tool.mcp.
- Argue when MCP is overkill and when tool.service is enough.
If you cannot: review §1–§4 (concept and security), §8 (FastMCP), and lab/enunciado.md. Run solucion_scratch.py and follow the trace line by line.