Agents II — multi-agent
Module 7 · Agents II — Multi-agent and Frameworks (agent, tool)
Prerequisite: M6 (ReAct agent, tool calling, basic LangGraph).
RAGorbit nodes: agent.fanout, agent.react, tool.service, tool.retriever
Anchor template: 10-logistics-disruption-rebooking (event-driven fan-out)
Table of contents
- When multi-agent vs a single agent?
- Multi-agent patterns
- Orchestration and stateless fan-out
- LangGraph multi-agent (supervisor, conditional edges, checkpoints)
- CrewAI (agents, tasks, crews, tools)
- AutoGen / AG2 (conversation between agents)
- BeeAI and Semantic Kernel (quick overview)
- Framework selection and combination
- Layer ③ explained: multi-agent frameworks from scratch
- RAGorbit nodes in this module
- Template 10 · Logistics
- Checkpoint — You know it if you can…
1. When multi-agent vs a single agent?
1.1 The limits of a single agent
In M6 you built a ReAct agent with several tools. That is enough when:
- A single conversational entity serves the user.
- The tools share the same session context.
- The flow is sequential (even if dynamic) toward one goal.
A multi-agent system adds value when:
| Signal | Why a single agent fails | Example (template 10) |
|---|---|---|
| Massive parallelization | 3,000 shipments do not fit in a sequential loop | Fan-out with concurrency=16 |
| Domain specialization | One LLM with 15 tools confuses tool descriptions | ProfileAgent vs PolicyAgent vs AlternativesAgent |
| Different routing policies | Simple cases should not pay tokens for complex ones | logic.rules → auto-confirm vs LLM |
| Stateless agents | State lives in Kafka/DB, not in sub-agent memory | Re-process after crash without losing context |
| Explicit supervision | You need to audit who decided what | Supervisor + audit trail |
1.2 Golden rule
Does a single LLM with N tools solve 80% in < 10 steps?
  YES → agent.react (M6)
  NO → evaluate multi-agent
Do you need to process > 100 identical items in parallel?
  YES → agent.fanout (M7)
Do you need distinct cognitive roles (researcher vs reviewer)?
  YES → CrewAI or LangGraph multi-node
Does the flow emerge from free conversation between agents?
  YES → AutoGen (prototype); LangGraph (production)
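The rule can be read as a small decision function. This is a minimal sketch: the function name, the parameter names, and the returned labels are illustrative assumptions; only the thresholds (fewer than 10 steps, more than 100 items) come from the rule itself. Each question is evaluated independently, so more than one option can come back.

```python
def choose_architecture(
    single_agent_solves_most: bool,   # one LLM with N tools covers ~80% of cases
    max_steps: int,                   # typical number of steps per request
    parallel_items: int = 0,          # identical items to process in parallel
    distinct_roles: bool = False,     # e.g. researcher vs reviewer
    free_conversation: bool = False,  # flow emerges from agent dialogue
) -> list[str]:
    """Return the options suggested by the golden rule (hypothetical helper)."""
    options = []
    if single_agent_solves_most and max_steps < 10:
        options.append("agent.react")
    else:
        options.append("evaluate multi-agent")
    if parallel_items > 100:
        options.append("agent.fanout")
    if distinct_roles:
        options.append("CrewAI or LangGraph multi-node")
    if free_conversation:
        options.append("AutoGen (prototype); LangGraph (production)")
    return options
```

For example, `choose_architecture(False, 20, parallel_items=3000)` suggests evaluating multi-agent and using `agent.fanout`, which is the situation of template 10.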
1.3 When NOT to use multi-agent
- Unnecessary overhead: 2 tools and one user → ReAct is enough.
- Critical audit without an explicit graph: conversational AutoGen is hard to trace.
- Strict latency: each hop between agents adds an LLM call.
- Cost: N agents × M steps × tokens = explosion if you do not segment first (see logic.rules in template 10).
2. Multi-agent patterns
2.1 Supervisor (central orchestrator)
A supervisor agent receives the task, decides which specialist to invoke, and consolidates the result.
               ┌──────────────┐
Input ────────▶│  SUPERVISOR  │
               └──────┬───────┘
      ┌───────────────┼───────────────┐
      ▼               ▼               ▼
┌────────────┐  ┌────────────┐  ┌────────────┐
│  Agent A   │  │  Agent B   │  │  Agent C   │
│ (profile)  │  │  (policy)  │  │ (routing)  │
└────────────┘  └────────────┘  └────────────┘
      │               │               │
      └───────────────┴───────────────┘
                      ▼
            Consolidated response
When to use: transactional flows with known steps but branching (rebooking, legal research).
In RAGorbit: agent.fanout acts as the supervisor of the per-shipment sub-agent; internally the sub-agent follows a mini-graph.
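The supervisor pattern can be sketched in plain Python as a dispatcher that calls specialists and merges their partial results. Everything here is an assumption made for illustration: the specialist functions, their return values, and the fact that the plan is passed in as a list (in a real supervisor an LLM or a rule set would choose it).

```python
from typing import Callable

# Hypothetical specialists: each takes the task and returns a partial result.
def profile_agent(task: dict) -> dict:
    return {"profile": {"shipment_id": task["shipment_id"], "tier": "premium"}}

def policy_agent(task: dict) -> dict:
    return {"policy": "rebook within 24h"}

def routing_agent(task: dict) -> dict:
    return {"alternatives": ["route-A", "route-B"]}

SPECIALISTS: dict[str, Callable[[dict], dict]] = {
    "profile": profile_agent,
    "policy": policy_agent,
    "routing": routing_agent,
}

def supervisor(task: dict, plan: list[str]) -> dict:
    """Invoke the specialists named in `plan` and consolidate their outputs."""
    consolidated: dict = {"trace": []}
    for name in plan:
        result = SPECIALISTS[name](task)
        consolidated.update(result)
        consolidated["trace"].append(name)  # audit trail: who contributed, in order
    return consolidated

response = supervisor({"shipment_id": "SHP-001"}, ["profile", "policy", "routing"])
```

The `trace` list is the simplest form of the audit trail mentioned in §1.1: it records which specialist ran and in what order.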
2.2 Hierarchical
The supervisor delegates to sub-supervisors that in turn coordinate specialists.
CEO Agent
├── Research Manager → Web Agent, Doc Agent
└── Writing Manager → Drafter, Editor
When to use: large teams (> 5 roles), multi-section reports.
Framework: CrewAI Process.hierarchical with manager_llm.
2.3 Collaborative (peer-to-peer)
Agents converse with each other without a fixed supervisor; the flow emerges from dialogue.
Agent A ◄────────────────► Agent B
   │                          │
   └──────────► Agent C ◄─────┘
When to use: brainstorming, coding agents, exploration.
Framework: AutoGen/AG2. Risk: hard to audit; few native guardrails.
2.4 Stateless fan-out
The same sub-agent is instantiated N times in parallel, once per item. No shared memory between instances.
Kafka Event Batch (3000 shipments)
                 │
                 ▼
┌─────────────────────────────────┐
│  agent.fanout (concurrency=16)  │
│  ┌─────┐ ┌─────┐ ┌─────┐        │
│  │Sub 1│ │Sub 2│ │Sub N│ ...    │
│  └──┬──┘ └──┬──┘ └──┬──┘        │
└─────┼───────┼───────┼───────────┘
      ▼       ▼       ▼
   notify  notify  notify
    audit   audit   audit
When to use: massive event-driven processing (logistics, fraud, alerts).
State: in the event log + DB, not in the agent heap. Kafka redelivery (at-least-once) + DB idempotency = effectively exactly-once processing.
2.5 Pattern comparison table
| Pattern | Control | Parallelism | Auditability | RAGorbit case |
|---|---|---|---|---|
| Supervisor | High | Medium | High | Rebooking sub-agent |
| Hierarchical | High | Medium | Medium-high | Multi-section reports |
| Collaborative | Low | Low | Low | AutoGen prototypes |
| Stateless fan-out | High (per item) | Maximum | High (per shipment_id) | Template 10 |
3. Orchestration and stateless fan-out
3.1 Template 10 pipeline
io.event-source ──▶ logic.rules ──▶ logic.router ──▶ agent.fanout
                         │                                │
                         │                        tool.service × 3
                         │                         tool.retriever
                         ▼                                ▼
                     P1/P2/P3                         io.notify
                  simple/complex                 observability.audit
3.2 Segmentation before the LLM (cost control)
logic.rules classifies without an LLM:
- P1 / complex: premium, connections_lost > 0, CRITICAL.
- P2 / simple: delivery_flexibility == flexible.
- P3 / simple: everything else.
Only the complex track invokes the sub-agent LLM. In a typical weather disruption, ~70% auto-confirm — 10–20× token savings.
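The segmentation step can be sketched as a pure function with no LLM call. The field names `tier` and `severity` are assumptions (the text only says "premium" and "CRITICAL"), as is the choice to combine the three P1 conditions with OR; `connections_lost` and `delivery_flexibility` are the names used above.

```python
def classify(event: dict) -> tuple[str, str]:
    """Return (priority, track) for one shipment event without calling an LLM."""
    is_p1 = (
        event.get("tier") == "premium"            # assumed field name
        or event.get("connections_lost", 0) > 0
        or event.get("severity") == "CRITICAL"    # assumed field name
    )
    if is_p1:
        return ("P1", "complex")
    if event.get("delivery_flexibility") == "flexible":
        return ("P2", "simple")
    return ("P3", "simple")
```

Only events whose track is `"complex"` would then be handed to the sub-agent LLM; the other two priorities go to auto-confirm.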
3.3 Fan-out in code (concept)
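# Generated by RAGorbit codegen (simplified)
import asyncio

async def fanout(events, concurrency=16):
    # The semaphore caps how many sub-agent calls run at the same time.
    sem = asyncio.Semaphore(concurrency)

    async def process_one(event):
        async with sem:
            # sub_agent: the per-shipment agent, defined elsewhere.
            return await sub_agent.invoke(event)

    return await asyncio.gather(*[process_one(e) for e in events])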
In the scratch workshop, SupervisorOrchestrator.fan_out simulates this sequentially but respects the batch concept.
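To see the concurrency cap in action without an LLM, the fan-out can be run against a stub. This is a sketch under assumptions: `StubSubAgent`, `run_demo`, and the sleep-based latency are hypothetical stand-ins, not part of the RAGorbit codegen output.

```python
import asyncio

class StubSubAgent:
    """Hypothetical stand-in for the per-shipment sub-agent (no LLM call)."""

    def __init__(self) -> None:
        self.active = 0  # calls currently in flight
        self.peak = 0    # highest number of simultaneous calls observed

    async def invoke(self, event: dict) -> dict:
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.001)  # simulate I/O latency
        self.active -= 1
        return {"shipment_id": event["shipment_id"], "status": "done"}

async def fanout(events: list[dict], sub_agent: StubSubAgent, concurrency: int = 16) -> list[dict]:
    sem = asyncio.Semaphore(concurrency)

    async def process_one(event: dict) -> dict:
        async with sem:
            return await sub_agent.invoke(event)

    # gather preserves the order of the input events in its result list.
    return await asyncio.gather(*(process_one(e) for e in events))

def run_demo(n_events: int = 100, concurrency: int = 16) -> tuple[list[dict], int]:
    agent = StubSubAgent()
    events = [{"shipment_id": f"SHP-{i:03d}"} for i in range(n_events)]
    results = asyncio.run(fanout(events, agent, concurrency))
    return results, agent.peak
```

Calling `run_demo(100, 16)` returns one result per event, and the recorded peak never exceeds the `concurrency` value because every call has to acquire the semaphore first.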
3.4 Idempotency + exactly-once
- Kafka exactlyOnce: true → atomic offset and audit.
- DB with key shipment_id → a second processing attempt returns the cached result.
- In scratch: self._processed: set[str].
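The idempotency idea can be sketched with an in-memory cache keyed by `shipment_id`. The class name, the `handler_calls` counter, and the result shape are assumptions for illustration; a real deployment would use a database table with a unique key so the cache survives restarts.

```python
class IdempotentProcessor:
    """Caches results by shipment_id so a redelivered event is not processed twice."""

    def __init__(self, handler):
        self._handler = handler
        self._results: dict[str, dict] = {}  # stands in for a DB table keyed by shipment_id
        self.handler_calls = 0

    def process(self, event: dict) -> dict:
        key = event["shipment_id"]
        if key in self._results:
            return self._results[key]  # duplicate delivery: return the cached result
        self.handler_calls += 1
        result = self._handler(event)
        self._results[key] = result
        return result

processor = IdempotentProcessor(lambda e: {"shipment_id": e["shipment_id"], "status": "rebooked"})
first = processor.process({"shipment_id": "SHP-001"})
second = processor.process({"shipment_id": "SHP-001"})  # simulated redelivery
```

The second call returns the stored result and the handler runs only once, which is the behaviour the scratch `self._processed` set approximates.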
4. LangGraph multi-agent
4.1 From ReAct to multi-node graph (M6 → M7 recap)
M6: agent ↔ tools graph (one agent).
M7: graph with several agent nodes + supervisor + conditional edges:
ENTRY → supervisor → profile → policy → alternatives
                                             │
                             ┌───────────────┴───────────────┐
                             ▼                               ▼
                        autoconfirm                   llm_specialist
                             │                               │
                            END                             END
4.2 Conditional edges
The router function returns a key, and the mapping passed to add_conditional_edges translates that key into the name of the next node:
def route_after_alternatives(state) -> str:
    if state["track"] == "complex":
        return "llm"
    return "autoconfirm"

builder.add_conditional_edges(
    "alternatives",
    route_after_alternatives,
    {"autoconfirm": "autoconfirm", "llm": "llm_specialist"},
)
Scratch equivalent: if track == "simple": autoconfirm else: llm_agent.analyze(...).
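The key-to-node resolution can be mimicked in plain Python to make the mechanics visible. This is an analogy, not LangGraph code: `EDGE_MAP`, `NODES`, and `run_after_alternatives` are hypothetical names, and the explicit `KeyError` is a choice of this sketch rather than a statement about how the library reacts to a mismatched key.

```python
def route_after_alternatives(state: dict) -> str:
    return "llm" if state["track"] == "complex" else "autoconfirm"

# Same mapping as in the text: router key -> node name.
EDGE_MAP = {"autoconfirm": "autoconfirm", "llm": "llm_specialist"}

# Hypothetical node implementations, each returning a partial state update.
NODES = {
    "autoconfirm": lambda state: {"handler": "autoconfirm"},
    "llm_specialist": lambda state: {"handler": "llm_specialist"},
}

def run_after_alternatives(state: dict) -> dict:
    key = route_after_alternatives(state)
    if key not in EDGE_MAP:
        raise KeyError(f"router returned {key!r}, which is not a key of the edge mapping")
    next_node = EDGE_MAP[key]
    # Merge the node's partial update into a copy of the incoming state.
    return {**state, **NODES[next_node](state)}
```

A complex shipment ends with `handler == "llm_specialist"` and a simple one with `handler == "autoconfirm"`; the sketch also shows why the router's return values and the mapping keys have to agree.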
4.3 Checkpoints in fan-out
- Conversational (M6): thread_id = user session.
- Fan-out (M7): thread_id = shipment_id (one checkpoint per shipment).
config = {"configurable": {"thread_id": event["shipment_id"]}}
graph.invoke(initial_state, config=config)
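If the worker crashes, re-invoke with the same thread_id (here the shipment_id) and the checkpointer restores the progress saved so far. This requires a persistent checkpointer: an in-memory one such as MemorySaver does not survive a process restart.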
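The resume behaviour can be illustrated with a toy checkpointer keyed by `thread_id`. This is a sketch of the idea only and not LangGraph's checkpointer: `ToyCheckpointer`, `run_steps`, the step list, and the `fail_at` switch are all hypothetical.

```python
class ToyCheckpointer:
    """In-memory illustration only: maps thread_id -> (completed steps, state)."""

    def __init__(self):
        self._store: dict[str, tuple[int, dict]] = {}

    def load(self, thread_id: str) -> tuple[int, dict]:
        return self._store.get(thread_id, (0, {}))

    def save(self, thread_id: str, step_index: int, state: dict) -> None:
        self._store[thread_id] = (step_index, dict(state))

def run_steps(steps, thread_id, checkpointer, initial_state, fail_at=None):
    """Run steps in order, saving after each one; resume after the last saved step."""
    done, saved_state = checkpointer.load(thread_id)
    state = dict(saved_state) if done else dict(initial_state)
    executed = []
    for i in range(done, len(steps)):
        name, fn = steps[i]
        if name == fail_at:
            raise RuntimeError(f"simulated crash in {name}")
        state.update(fn(state))
        executed.append(name)
        checkpointer.save(thread_id, i + 1, state)
    return state, executed

STEPS = [
    ("profile", lambda s: {"profile": "loaded"}),
    ("policy", lambda s: {"policy": "loaded"}),
    ("alternatives", lambda s: {"alternatives": ["route-A"]}),
]

cp = ToyCheckpointer()
try:
    run_steps(STEPS, "SHP-001", cp, {"event": "delay"}, fail_at="policy")
except RuntimeError:
    pass  # the worker "crashed" after completing the profile step
final_state, executed_on_retry = run_steps(STEPS, "SHP-001", cp, {"event": "delay"})
```

On the retry only `policy` and `alternatives` run, because the `profile` result was already saved under the same `thread_id`.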
4.4 Subgraphs
A node can be another compiled graph — useful to encapsulate the fan-out sub-agent:
sub_rebook_graph = build_rebook_subgraph()
builder.add_node("rebook", sub_rebook_graph)
5. CrewAI
5.1 Mental model
Crew = Agents + Tasks + Process
| Concept | What it is | Analogy |
|---|---|---|
| Agent | Role with goal, backstory, optional tools | Specialized employee |
| Task | Concrete work + expected_output | Jira ticket |
| Crew | Team that executes tasks | Sprint |
| Process | Execution order | Kanban / hierarchy |
5.2 Process.sequential vs hierarchical
# Sequential: task B receives context from task A
Crew(..., process=Process.sequential)
# Hierarchical: a manager delegates tasks to agents
Crew(..., process=Process.hierarchical, manager_llm=llm)
For per-shipment rebooking: sequential (classify → investigate → execute).
For massive fan-out: external loop for event in events: crew.kickoff(...).
5.3 When to use CrewAI
Yes: multi-role prototypes, reports (researcher + writer + reviewer), teams with fixed roles.
No: massive fan-out with strict audit (LangGraph + Kafka is better), flows with fine financial guardrails.
5.4 Gotchas
- Vague tasks → vague outputs. expected_output must be specific.
- Duplicate tools across agents → confusion; centralize in one researcher agent.
- Cost: 3 agents × 3 tasks = up to 9 LLM calls per shipment if you do not segment first.
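The cost gotcha is easy to check with arithmetic. The sketch below counts LLM calls (not tokens) using figures that appear in this module: 3,000 shipments, up to 9 calls per shipment, and roughly 70% of shipments auto-confirmed by the rules. The function name and its parameters are assumptions.

```python
def llm_calls(shipments: int, agents: int, tasks: int, simple_share: float = 0.0) -> int:
    """Upper bound on LLM calls when only non-simple shipments reach the crew."""
    complex_shipments = round(shipments * (1.0 - simple_share))
    return complex_shipments * agents * tasks

without_segmentation = llm_calls(3000, agents=3, tasks=3)
with_segmentation = llm_calls(3000, agents=3, tasks=3, simple_share=0.70)
```

Under these assumptions the upper bound drops from 27,000 calls to 8,100. Token savings can differ from call savings, since prompts for complex cases tend to be longer.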
6. AutoGen / AG2
6.1 Conversation between agents
AutoGen models agents that send messages to each other until they converge:
user_proxy = UserProxyAgent(name="user")
assistant = AssistantAgent(name="assistant", llm_config=...)
user_proxy.initiate_chat(assistant, message="Design the rebooking for SHP-001")
The flow is not in a graph — it emerges from dialogue.
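A toy loop shows what "emerges from dialogue" means and why a turn limit matters. Nothing here uses the AutoGen API: `run_chat`, the scripted `proposer` and `reviewer` functions, and the `TERMINATE` stop word are assumptions chosen for illustration.

```python
def run_chat(agent_a, agent_b, opening: str, max_turns: int = 6, stop_word: str = "TERMINATE"):
    """Alternate messages between two agents until stop_word appears or max_turns is hit."""
    transcript = [("a", opening)]
    speakers = [("b", agent_b), ("a", agent_a)]  # b answers the opening message first
    message = opening
    for turn in range(max_turns):
        name, agent = speakers[turn % 2]
        message = agent(message)
        transcript.append((name, message))
        if stop_word in message:
            break
    return transcript

# Hypothetical scripted agents standing in for LLM-backed ones.
def proposer(message: str) -> str:
    return "Revised plan: route-B via hub 2"

def reviewer(message: str) -> str:
    return "Approved. TERMINATE" if "Revised" in message else "Please revise the plan"

transcript = run_chat(proposer, reviewer, "Plan: route-A direct")
```

The order of speakers is fixed here, but the number of turns and the content depend entirely on what each agent says, which is what makes this style harder to audit than an explicit graph.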
6.2 When to use
- Coding agents (generate + execute + fix code).
- Design exploration with several simulated "experts".
- Quick prototypes without strict compliance.
6.3 When NOT to use
- Transactional services (payments, regulated rebooking).
- When you need exactly-once or an audit trail per step.
- Production without refactoring to LangGraph.
6.4 AG2 (AutoGen evolution)
AG2 adds better typing, agent groups, and explicit termination. The mental model remains conversational.
7. BeeAI and Semantic Kernel
7.1 BeeAI (IBM)
Modular framework oriented to IBM/watsonx enterprise:
- Agents with integrated governance and policies.
- Integration with watsonx.ai and Granite.
- Useful if your stack is already IBM; medium learning curve.
7.2 Semantic Kernel (Microsoft)
Plugins + Planners on .NET/Azure:
- Typed functions as plugins.
- Automatic planners that chain plugins.
- Ideal in Azure/OpenAI ecosystem; less common in pure Python.
7.3 Quick comparison (see also tecnologias-comparadas.md §9)
| Framework | Control | Fan-out | Enterprise |
|---|---|---|---|
| LangGraph | ★★★★★ | ★★★★★ | Production |
| CrewAI | ★★★☆☆ | ★★☆☆☆ | Prototypes |
| AutoGen/AG2 | ★★☆☆☆ | ★★☆☆☆ | Exploration |
| BeeAI | ★★★☆☆ | ★★★☆☆ | IBM stack |
| Semantic Kernel | ★★★★☆ | ★★★☆☆ | Azure/.NET |
8. Framework selection and combination
8.1 Decision tree
Massive event-driven processing?
└─ YES → LangGraph + Kafka fan-out (template 10)
Fixed roles like an "editorial team"?
└─ YES → CrewAI sequential/hierarchical
Exploration / coding / free dialogue?
└─ YES → AutoGen (prototype) → migrate to LangGraph
IBM watsonx stack?
└─ YES → BeeAI
Azure/.NET stack?
└─ YES → Semantic Kernel
8.2 Combining frameworks (hybrid pattern)
It is valid and common:
- CrewAI to generate offline report drafts.
- LangGraph for the transactional worker in production.
- AutoGen in the development sandbox.
What we do not recommend: two frameworks orchestrating the same flow in production — duplicates observability and failure points.
8.3 RAGorbit as a unifying layer
The flow.json abstracts the framework:
- agent.react → LangGraph ReAct (codegen).
- agent.fanout → asyncio + LangGraph subgraph.
- Tools → @tool / tool.service, independent of the orchestration framework.
9. Layer ③ explained: multi-agent frameworks from scratch
Prerequisite: you have implemented lab/solucion_scratch.py or understand each agent you wrote by hand. Read this section in full before lab/solucion_framework.py.
Environment: no pip/network in the course. The goal is that, with pip install crewai langgraph langchain langchain-anthropic, you can write the framework solution yourself.
9.1 Recap and cross-links
| Module | What you learned | Link |
|---|---|---|
| M1 §11 | LangChain base: ChatAnthropic, messages, invoke | M1 §11 |
| M6 §8 | @tool, create_react_agent, StateGraph, MemorySaver | M6 §8 |
| M7 | Multi-agent: supervisor, fan-out, CrewAI, conditional edges | This section |
What is new in M7: not a single tool or a single ReAct loop — it is orchestrating several agents that pass state and branch with conditional edges.
9.2 Bridge table: scratch → CrewAI / LangGraph
| What you did by hand (layer ②) | CrewAI (layer ③) | LangGraph (layer ③) |
|---|---|---|
| PriorityRulesAgent.classify() | Classifier Agent Task | supervisor node |
| ProfileAgent, PolicyAgent, … | Researcher Agent + @tool | profile, policy, alternatives nodes |
| if track == "simple": autoconfirm else: llm | Executor Task with instructions | add_conditional_edges after alternatives |
| SupervisorOrchestrator.fan_out() | for event: crew.kickoff(...) | for event: graph.invoke(...) |
| FakeLLMAgent.analyze() | Executor Agent with real LLM | llm_specialist node |
| self._processed (idempotency) | External cache / flag in task output | checkpointer + thread_id=shipment_id |
| Trace [profile_agent], [llm_agent] | verbose=True in Crew | Node stream / LangSmith |
9.3 CrewAI from scratch — APIs used by solucion_framework.py
Agent
from crewai import Agent

researcher = Agent(
    role="Rebooking researcher",  # role title
    goal="Gather profile, policy, and alternatives",
    backstory="Knows PolicyRAG and the routing services.",
    tools=[get_shipment_profile, get_alternatives],  # LangChain @tool
    llm=llm,
    verbose=True,
)
- role + goal + backstory ≈ the specialized system prompt from scratch.
- tools: the same @tool functions from M6.
Task
from crewai import Task

research_task = Task(
    description="For the shipment in {event_json}, call the necessary tools.",
    expected_output="JSON with profile, policy, and alternatives",
    agent=researcher,
    context=[classify_task],  # receives the output of earlier tasks
)
- context chains tasks like memory.append in scratch.
- expected_output guides the agent's internal evaluation.
Crew and Process
import json

from crewai import Crew, Process

crew = Crew(
    agents=[classifier, researcher, executor],
    tasks=[classify_task, research_task, execute_task],
    process=Process.sequential,
)
# The inputs dict fills the {event_json} placeholder in the task descriptions.
result = crew.kickoff(inputs={"event_json": json.dumps(event)})
- Process.sequential = fixed pipeline A → B → C (like your process_event).
- Process.hierarchical = manager LLM delegates (hierarchical pattern §2.2).
9.4 LangGraph multi-agent from scratch
Shared state
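from typing import Annotated, TypedDict

from langgraph.graph.message import add_messages

class RebookState(TypedDict):
    messages: Annotated[list, add_messages]  # appended to, not overwritten
    event: dict
    track: str
    profile: dict
    alternatives: list
    handler: str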
All nodes read/write fields of RebookState — equivalent to the dict you passed between agents in scratch.
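The `Annotated[list, add_messages]` field marks `messages` as merged by a reducer, while the other fields are simply overwritten by whatever a node returns. The plain-Python sketch below imitates that merge rule; `append_messages`, `REDUCERS`, and `apply_update` are hypothetical names, and the real `add_messages` does more than concatenate (it also handles message IDs).

```python
from typing import Callable

def append_messages(existing: list, new: list) -> list:
    """Reducer: accumulate messages instead of overwriting them."""
    return existing + new

# Fields listed here are merged with a reducer; all others are overwritten.
REDUCERS: dict[str, Callable[[list, list], list]] = {"messages": append_messages}

def apply_update(state: dict, update: dict) -> dict:
    """Merge one node's partial update into a copy of the shared state."""
    merged = dict(state)
    for key, value in update.items():
        if key in REDUCERS:
            merged[key] = REDUCERS[key](merged.get(key, []), value)
        else:
            merged[key] = value
    return merged

state = {"event": {"shipment_id": "SHP-001"}, "messages": [], "track": "complex"}
state = apply_update(state, {"profile": {"tier": "premium"}, "messages": ["Profile: premium"]})
state = apply_update(state, {"alternatives": ["route-A"], "messages": ["1 alternative found"]})
```

After two updates the state holds both messages in order, plus the `profile` and `alternatives` fields written by each node, while `track` is untouched.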
Specialist nodes
from langchain_core.messages import AIMessage

def node_profile_agent(state: RebookState) -> dict:
    profile = get_shipment_profile.invoke({"shipment_id": state["event"]["shipment_id"]})
    # Return only the fields this node updates; they are merged into the shared state.
    return {"profile": profile, "messages": [AIMessage(content=f"Profile: {profile['tier']}")]}

builder.add_node("profile", node_profile_agent)
Each node = one scratch agent class.
Supervisor + conditional edges
from typing import Literal

def route_after_alternatives(state) -> Literal["autoconfirm", "llm"]:
    return "llm" if state["track"] == "complex" else "autoconfirm"

builder.add_conditional_edges("alternatives", route_after_alternatives,
                              {"autoconfirm": "autoconfirm", "llm": "llm_specialist"})
This is the if track == "simple" of SupervisorOrchestrator.process_event.
Compile and run
graph = builder.compile()
final = graph.invoke({"event": event, "messages": [], ...})
9.5 Block-by-block walkthrough of solucion_framework.py
Block 1 — Data and shared @tool (lines 1–75)
Identical to scratch. Tools are the common interface between CrewAI and LangGraph.
Block 2 — CrewAI (lines 78–145)
| Fragment | Scratch equivalent |
|---|---|
| Classifier Agent | PriorityRulesAgent |
| Researcher Agent + tools | ProfileAgent + PolicyAgent + AlternativesAgent |
| Executor Agent | AutoConfirmAgent + FakeLLMAgent |
| Task with context=[...] | Call order in process_event |
| crew.kickoff(inputs={...}) | orchestrator.process_event(event) |
| Loop for event in events | fan_out() |
Block 3 — LangGraph multi-agent (lines 148–280)
| Fragment | Scratch equivalent |
|---|---|
| RebookState | Local fields of process_event |
| node_supervisor | PriorityRulesAgent.classify |
| node_profile_agent … node_alternatives_agent | Specialist agents |
| route_after_alternatives | Auto-confirm vs LLM branch |
| node_llm_specialist | FakeLLMAgent (with real LLM) |
| build_langgraph_multi_agent | SupervisorOrchestrator |
Block 4 — Comparative demo (lines 283–end)
Runs both frameworks on the same 6 events and prints CrewAI vs LangGraph table.
9.6 When to use each framework and gotchas
| Situation | Use | Why |
|---|---|---|
| Fan-out 3000 shipments + Kafka + audit | LangGraph | Explicit graphs, checkpoints, LangSmith |
| Prototype "team" researcher+executor | CrewAI | Less boilerplate, declarative roles |
| Explore free dialogue between agents | AutoGen | Emergent; migrate to LangGraph afterward |
| IBM watsonx enterprise | BeeAI | Native governance |
| Same problem, compare in the lab | Both CrewAI + LangGraph | See trade-offs in practice |
Gotchas:
- CrewAI without prior segmentation → 3 LLM agents per simple shipment = unnecessary cost. Replicate logic.rules before the crew.
- LangGraph without thread_id per shipment → you mix state between shipments in fan-out.
- Misnamed conditional edge → the graph ends without running autoconfirm. Dict keys must match the router's return values exactly.
- AutoGen in transactional production → unpredictable conversation; hard to meet exactly-once.
- Duplicating logic between CrewAI and LangGraph → extract shared tools (SHARED_TOOLS in the lab).
9.7 Checklist before writing solucion_framework.py
- Shared tools with docstrings that indicate when to use them?
- CrewAI: 3 agents + 3 tasks + Process.sequential?
- LangGraph: one node per specialist + conditional edge after alternatives?
- Does RebookState include track for the router?
- External loop per event to simulate fan-out?
- CrewAI vs LangGraph trade-offs table at the end?
Next step: lab/enunciado.md Part B — write the file before looking at the solution.
Beyond Lang*: besides LangGraph and CrewAI, the rebooking/flight-change case is covered in AutoGen/AG2, Pydantic-AI, and a native multi-agent loop (no framework) in ../referencia/agentes-sin-langchain.md. Also review the critiques of the LangChain/LangGraph/LangSmith stack before deciding between multi-agent, single agent, and native SDK.
10. RAGorbit nodes in this module
agent.fanout
Ports:
  → Event (from io.event-source / logic.router)
  → Tool (n): sub-agent tools
  ← Any: toward notify, audit, metrics
Config:
  concurrency: 16
  subAgentSystem: "stateless sub-agent instructions"
agent.react (in conversational sub-agents)
Still the node for one user; in template 10 the fan-out sub-agent uses it internally for complex cases.
tool.service + tool.retriever
Template 10 uses:
- ShipmentProfileService, AlternativesService, AutoConfirmService
- policy_rag (tool.retriever over store.pgvector)
11. Template 10 · Logistics
The most complete multi-agent fan-out template in RAGorbit.
Flow summary:
- Kafka shipment.disruption → logic.rules (P1/P2/P3).
- logic.router → agent.fanout (simple and complex to the same node).
- Sub-agent per shipment: tools + selective LLM.
- io.notify + observability.audit + OTLP metrics.
Key metrics in a crisis:
- rebooking_autoconfirm_total / rebooking_processed_total → efficiency.
- rebooking_duration_seconds by priority → P95 latency.
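Both metrics can be computed from raw numbers as follows. This is a sketch over in-memory samples with made-up values: in the template the figures would come from the metrics backend, and the nearest-rank percentile used here is one common definition, not necessarily the one your backend applies to histograms.

```python
def autoconfirm_ratio(autoconfirm_total: int, processed_total: int) -> float:
    """Share of processed shipments that were auto-confirmed (efficiency)."""
    return autoconfirm_total / processed_total if processed_total else 0.0

def p95(durations: list[float]) -> float:
    """Nearest-rank 95th percentile of raw duration samples, in seconds."""
    if not durations:
        raise ValueError("no samples")
    ordered = sorted(durations)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]

# Made-up samples grouped by priority, purely for illustration.
samples = {"P1": [float(s) for s in range(1, 21)], "P3": [0.2, 0.3, 0.25]}
p95_by_priority = {priority: p95(values) for priority, values in samples.items()}
```

Grouping the durations by priority before taking the percentile keeps slow P1 cases from being hidden by the many fast auto-confirmed ones.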
Full documentation: examples/10-logistics-disruption-rebooking/README.md and flow.json.
12. Checkpoint — You know it if you can…
- Explain when a single agent.react is enough and when you need multi-agent.
- Draw the 4 patterns (supervisor, hierarchical, collaborative, fan-out).
- Describe why logic.rules goes before the LLM in template 10.
- Build a StateGraph with supervisor and add_conditional_edges.
- Explain Agent / Task / Crew / Process in CrewAI.
- Compare AutoGen vs LangGraph for production auditability.
- Map each scratch class to its CrewAI and LangGraph node (table §9.2).
- Read template 10 flow.json and identify fan-out, rules, and tools.
- Complete the lab: 6 shipments, 3 auto-confirm, 3 LLM, idempotency.
- Justify framework choice for a new brief (tree §8.1).
If you cannot: review §2 (patterns), §9 (frameworks from scratch), and lab/enunciado.md.