AI AGENT SKILLS

Agentsop Agent Topology Selection

一个面向 Automation 场景的 Agent 技能。原始说明:Cross-framework enhancement overlay for choosing a multi-agent topology BEFORE writing any agent. A binary-question rubric — is single-agent + tools enough?...

SKILL.md

SKILL.md


name: agentsop-agent-topology-selection
version: 0.1.0
description: >-
Cross-framework enhancement overlay for choosing a multi-agent topology BEFORE writing any
agent. A binary-question rubric — is single-agent + tools enough? do agents need to know
about each other? does the output need one voice? — maps the answer to single-agent /
supervisor / swarm / sequential / hierarchical. Activates when a coder agent is tempted to
"split the work into roles" or reaches for a multi-agent framework. Encodes the *selection
rubric* that the per-framework skills assume but never surface. Search keywords: when to
use multi-agent, single vs multi agent, do I need multiple agents, supervisor vs swarm,
multi-agent vs single agent, agent team design.
overlay: true
cross_links: [crewai, langgraph, bounded-loop]


Multi-Agent Topology Selection · SOP (ENHANCE overlay)

Overlay posture: this skill decides whether and which topology. It does not

teach the API — descend to [[crewai]] or [[agentsop-langgraph]] for that. Every

load-bearing claim carries an inline source tag resolving in

references/R1-source-evidence.md.


1. 何时激活 (When to Activate)

Activate when any of the following fire:

  • The task description contains "team of agents", "researcher + writer + reviewer",

"manager agent", "agents that hand off", "split this into roles", or "multi-agent".

  • A coder agent is about to instantiate ≥2 agents (CrewAI Agent(...) × N,

LangGraph supervisor/swarm, OpenAI Swarm handoffs) and has not yet justified
why a single agent with tools is insufficient.

  • Someone is choosing between CrewAI Process.sequential vs Process.hierarchical,

or LangGraph supervisor vs swarm vs hierarchical-teams, and wants the rubric,
not the syntax.

  • A multi-agent system is over budget on tokens/latency and the question is "can we

collapse agents back into one?".

Do not activate for: a single LLM call, a one-shot RAG query, or a fixed
tool-call pipeline with no role separation. Those are the single-agent baseline
this skill defends.

Mental check: "An agent needs agency, otherwise it's just another script."

— João Moura, CrewAI founder [[crewai · §1.3]]. If you can write the control

flow in if/else, you do not need multiple agents — you need one agent (or a

graph) with explicit edges.


2. 核心心智模型 (Core Mental Model)

Most "multi-agent" problems are single-agent + tools. Add agents only when
context isolation or parallel expertise genuinely demands it.

"Single-agent is right for approximately 80% of cases; the trap is reaching for

multi-agent because it sounds more capable." [[crewai · DC-1]]

Two — and only two — forces justify a second agent:

  1. Context isolation. One agent's working context would pollute another's

(a critic that must not see its own draft's rationalisations; a tool-heavy
sub-task whose 40 intermediate tool calls should not bloat the main thread).
Splitting gives each agent a clean, bounded prompt.

  1. Parallel expertise. Two genuinely different skills run concurrently or in

strict sequence (research → write → review), where a single prompt provably
cannot hold both jobs without quality collapse [[crewai · DC-1]].

If neither force is present, a single agent with the union of tools wins
fewer hops, fewer tokens, no handoff failures. This is the baseline the rubric
must beat, not the default to escape.

The selection rubric (three binary questions)

Q0  Is single-agent + tools enough?
      (no context-isolation need, no parallel-expertise need)
        YES → single-agent + tools. STOP. Do not add agents.
        NO  → ↓

Q1  Do the agents need to KNOW ABOUT EACH OTHER (peer handoff)?
        NO  → one funnels through a coordinator → SUPERVISOR
              (or static order → SEQUENTIAL, if order is fixed)
        YES → ↓

Q2  Must the OUTPUT speak with ONE VOICE / single audit funnel?
        YES → SUPERVISOR (single user-facing persona, one funnel)
        NO  → SWARM (dynamic peer handoff, last-active agent remembered)

Scaling override: ≥6 specialists that group into teams → HIERARCHICAL
(supervisor-of-supervisors). Use only for grouping, not for routing.

The two questions that actually separate the patterns: (a) can sub-agents know
each other, (b) is one user-facing voice mandated.
Everything else is tuning
[[langgraph · Case 2]].


3. SOP 工作流 (Selection Protocol)

Walk top-down. Each gate can send you back down the ladder — collapsing agents
is as valid an answer as adding them.

Step 1 · Defend the single-agent baseline first

Ask Q0. Enumerate the would-be roles. For each, ask: *would merging it into one
agent's prompt + toolset actually degrade output?* If you cannot point to a
concrete failure mode (style drift, missed checklist, context bloat, parallel
latency), the honest answer is single-agent + tools. Exit here ~80% of the time
[[crewai · DC-1]].

Step 2 · If splitting, decide static vs dynamic routing

  • Order is fixed and known at design time (research always precedes write

precedes review) → SEQUENTIAL. Cheapest, most debuggable, 1× token baseline
[[crewai · §2.3]]. In CrewAI this is Process.sequential; in LangGraph it is
static edges A→B→C.

  • Routing must be decided at runtime (which specialist handles this query) →

you need a coordinator or peer handoff → continue to Step 3.

Step 3 · Coordinator (supervisor) vs peers (swarm)

Ask Q1 then Q2.

  • Agents that do not know each other and funnel through one router →

SUPERVISOR. Sub-agents are effectively tools the supervisor calls; the
supervisor "translates" their output back to the user — which is exactly why
it costs the most tokens [[langgraph · Step 4]].

  • Agents that do know each other and no single voice is mandated →

SWARM. Dynamic handoff, last-active agent stays active across turns, no
translation step → fewer tokens, slightly higher accuracy on the τ-bench retest
[[langgraph · §SOP Step 4]].

Step 4 · Apply the supervisor-default caveat (read this twice)

LangChain's own benchmark found swarm "slightly outperformed supervisor across
all scenarios" and supervisor "consistently uses more tokens than swarm" — yet
they still ship supervisor as the recommended default [[langgraph · Step 4]].
Why the nuance matters:

  • Supervisor is the safest with third-party / untrusted agents (single funnel,

single audit log, single place to enforce policy) [[langgraph · Step 4]].

  • Swarm is a bad fit for third-party agents — peers handing off to peers means

no central control point [[langgraph · Step 4]].

  • So: do not copy "default = supervisor" blindly. If your agents are internal

and trusted, swarm is often the better pick the default hides. Pick on the two
questions, not on the framework's default.

Step 5 · Verify the chosen topology can terminate

Any topology with runtime handoff (swarm, hierarchical, CrewAI delegation) can
loop. Bound it before shipping — cross-link [[agentsop-bounded-loop]]. Concretely:
default allow_delegation=False on workers, set per-agent max_iter, wrap an
outer timeout, and bake an explicit exit counter into state rather than trusting
the LLM to stop [[crewai · DC-5]] [[agentsop-bounded-loop]].

Step 6 · Reconsider before scaling agents up

≥6 specialists → group into HIERARCHICAL teams purely for navigability, not
to get free routing (see Anti-patterns). At >5 agents CrewAI starts hitting
coordination failure [[crewai · §6.1]]; that is a signal to group or collapse,
not to add more.


4. 操作模型 (Operation Models)

Format: Trigger → Action → Output → Evidence.

OP-1 · Defend the single-agent baseline

  • Trigger: a task is being described as "a team of agents".
  • Action: list the proposed roles; for each, name the concrete failure that

merging into one agent would cause (style drift / missed checklist / context
bloat / required parallelism). No nameable failure ⇒ single agent.

  • Output: single-agent + tools, OR a justified list of must-split roles.
  • Evidence: [[crewai · DC-1]] "single-agent right for ~80% of cases".

OP-2 · Single-vs-multi gate

  • Trigger: baseline defended and at least one role has a real split-justifying

failure mode.

  • Action: confirm the force is context isolation or parallel expertise

not "it sounds more capable". If only the latter, stay single.

  • Output: a yes/no on multi-agent with the force named in one sentence.
  • Evidence: [[crewai · §2.2]], [[crewai · DC-1]].

OP-3 · Static-order gate (→ sequential)

  • Trigger: multi-agent confirmed; the order of work is fixed at design time.
  • Action: choose SEQUENTIAL — list agents in execution order; pass dependencies

explicitly (CrewAI context=[...]), do not rely on implicit transfer.

  • Output: an ordered task list; CrewAI Process.sequential or LangGraph static

edges.

  • Evidence: [[crewai · §2.3]] (sequential = 1× tokens, low debug cost).

OP-4 · Peer-awareness gate (Q1 → supervisor vs swarm branch)

  • Trigger: routing must be decided at runtime.
  • Action: ask "do sub-agents know about each other?" — NO ⇒ supervisor branch;

YES ⇒ continue to OP-5.

  • Output: chosen branch.
  • Evidence: [[langgraph · Step 4]] decision tree.

OP-5 · Single-voice gate (Q2 → supervisor vs swarm)

  • Trigger: peers know each other (Q1=YES).
  • Action: ask "must output be one voice / single audit funnel?" — YES ⇒

SUPERVISOR; NO ⇒ SWARM.

  • Output: SUPERVISOR or SWARM.
  • Evidence: [[langgraph · Case 2]] (compliance ⇒ supervisor; UX continuity ⇒ swarm).

OP-6 · Apply the supervisor-default caveat

  • Trigger: supervisor was selected, OR you are about to accept a framework

default.

  • Action: check trust. Third-party/untrusted agents ⇒ supervisor is correct.

Internal/trusted ⇒ re-test whether swarm's lower token cost wins; if so, switch.

  • Output: a topology chosen on trust + the two questions, not on the default.
  • Evidence: [[langgraph · Step 4]] — swarm beats supervisor on bench, yet

supervisor remains the shipped default for third-party safety.

OP-7 · Tune-before-switch

  • Trigger: chosen topology is over token/latency budget.
  • Action: before changing paradigm, apply the published fixes to the current

one. For supervisor: remove handoff messages, add a forwarding-messages tool,
optimise tool naming — LangChain measured "nearly 50% increase in performance"
[[langgraph · Step 4]]. Switch paradigm only if still over budget.

  • Output: a tuned topology or a justified migration.
  • Evidence: [[langgraph · Case 2]].

OP-8 · Bound the topology

  • Trigger: any runtime-handoff topology before ship.
  • Action: allow_delegation=False on workers, per-agent max_iter, outer

timeout, explicit state-based exit counter.

  • Output: a topology that provably terminates.
  • Evidence: [[crewai · DC-5]] (delegation ping-pong), [[agentsop-bounded-loop]].

5. 困境决策案例 (Dilemma Cases)

Case 1 · "Swarm beats supervisor on the benchmark — why ship supervisor?"

  • 困境: LangChain's own multi-agent benchmark shows swarm "slightly

outperformed supervisor across all scenarios" and supervisor "consistently uses
more tokens" (the telephone-game translation overhead). Yet LangChain's
recommended default is still supervisor [[langgraph · Step 4]]. A coder
agent copying the default would leave accuracy and tokens on the table.

  • 约束:
  • Want the benchmark's efficiency (swarm).
  • But may integrate third-party / untrusted agents later.
  • Need a single auditable funnel for tool calls (compliance).
  • 决策步骤:
  1. Read the default's reason, not the default. Supervisor wins on **safety with

third-party agents and single audit funnel** — not on accuracy [[langgraph · Step 4]].

  1. If all agents are internal and trusted and no single-voice mandate →

pick swarm; the default does not apply to you.

  1. If a single auditable funnel is mandated → keep supervisor, then apply the

three fixes (remove handoff messages, forwarding-messages tool, tool-name
tuning) for the measured ~50% bump before concluding it is too slow
[[langgraph · Case 2]].

  1. Re-measure tokens; migrate to swarm only if still over budget and audit is

tolerant.

  • 结果: Topology chosen on trust + the two questions. The benchmark's "swarm >

supervisor" is true and the supervisor default is rational — for a different
constraint (third-party safety) than the one the benchmark measured (accuracy/cost).

  • 可提取的操作: OP-6. **A framework default encodes the framework author's

worst-case constraint, not yours. Decode the reason; re-derive for your case.**

Case 2 · "CrewAI hierarchical for simple routing is structurally broken"

  • 困境: 5 agents, order roughly fixed, but the system should route — skip

irrelevant specialists based on the query. Process.hierarchical looks like it
"should auto-route". In practice the manager executes all tasks and the last
task's output overwrites the rest — it does not skip on triage [[crewai · DC-2]].

  Query: "Why is my laptop overheating?" (pure technical)
  Expected:  triage → technical_agent → done
  Hierarchical reality: triage → technical → billing → … → last output wins
  • 约束: routing is required; default manager_llm is unreliable; latency/cost

matter.

  • 决策步骤:
  1. Recognise hierarchical-for-routing is the wrong tool: CrewAI hierarchical is a

coordination topology, not a router [[crewai · DC-2]].

  1. If routing logic fits in ~5 lines of Python → use a CrewAI Flow (@router

/ @listen) or a LangGraph conditional edge — explicit branch, then call
one small crew / single agent per branch [[crewai · DC-2]].

  1. If routing genuinely needs LLM semantic judgement → hierarchical **with a

custom manager_agent** carrying an explicit branching backstory — never
the bare default manager_llm [[crewai · OP-4]].

  1. Bound it: workers allow_delegation=False, outer timeout [[agentsop-bounded-loop]].
  • 结果: Routing handled by explicit control flow; hierarchical reserved for

true coordination of trusted teams, never for if/else routing.

  • 可提取的操作: **Hierarchical ≠ router. Routing is control flow — make it

explicit (Flow / conditional edge), don't delegate it to a manager LLM that will
run everything.** Compare with d-query-routing-skill for the routing-specific rubric.


6. 反模式与边界 (Anti-patterns & Boundaries)

  • Multi-agent theater. Splitting into roles because it "sounds more capable"

with no context-isolation or parallel-expertise force. Symptom: agents that just
pass a string along, each adding a paragraph. Fix: collapse to one agent + tools
[[crewai · DC-1]].

  • Hierarchical for simple routing. Using Process.hierarchical (or a

supervisor) to get free if/else routing. It runs everything; routing must be
explicit control flow (Flow / conditional edge) [[crewai · DC-2]].

  • Copying the supervisor default blindly. The benchmark says swarm is better on

accuracy and tokens; supervisor is the default for third-party safety. Internal
trusted agents should reconsider swarm [[langgraph · Step 4]].

  • Agent-count explosion (>5). Coordination failure, token blow-up, debug pain.

≥6 ⇒ group into hierarchical teams for navigability, or merge near-duplicate
roles [[crewai · §6.1]].

  • Unbounded delegation. allow_delegation=True on every agent ⇒ ping-pong

loops. Default off on workers; bound with max_iter + timeout [[crewai · DC-5]],
[[agentsop-bounded-loop]].

  • Picking on aesthetics. Choosing supervisor/swarm/sequential by which "feels

cleaner" instead of the two binary questions (peer-awareness, single-voice)
[[langgraph · Case 2]].

Hard boundaries (this rubric does NOT decide):

  • Which framework — see [[crewai]] vs [[agentsop-langgraph]] ecosystem sections.
  • Routing by query kind — that is d-query-routing-skill.
  • Bounding the loop — that is [[agentsop-bounded-loop]].
  • Latency < 200ms / single LLM call — no multi-agent framework at all

[[langgraph · 反模式]].


7. 跨框架对照 (Cross-Framework Mapping)

| Topology | CrewAI | LangGraph | OpenAI Swarm | Choose when |
|---|---|---|---|---|
| Single-agent + tools | one Agent + tools (skip Crew) | create_react_agent | one routine | Q0=YES — ~80% of cases [[crewai · DC-1]] |
| Sequential | Process.sequential + context=[...] | static edges A→B→C | linear handoffs | order fixed at design time [[crewai · §2.3]] |
| Supervisor | Process.hierarchical + custom manager_agent | supervisor pattern (sub-agents as tools) | central routine dispatching | peers don't know each other; one voice / third-party safety [[langgraph · Step 4]] |
| Swarm | (no native; Flow + handoff funcs) | swarm pattern (dynamic handoff) | handoff between agents | peers know each other; no single-voice mandate; internal/trusted [[langgraph · Step 4]] |
| Hierarchical teams | nested crews via Flow | supervisor-of-supervisors / subgraphs | n/a | ≥6 specialists needing grouping [[crewai · §6.1]] |

Notes:

  • CrewAI frames agents as role-playing teammates; hierarchical is coordination,

not routing — the default manager_llm runs all tasks [[crewai · DC-2]].

  • LangGraph frames topology as routing logic over typed state; supervisor is the

shipped default for third-party safety despite swarm winning the bench
[[langgraph · Step 4, Case 2]].

  • OpenAI Swarm is the minimal handoff baseline — OpenAI labels it experimental;

use as reference, not production [[langgraph · 生态对照]].

  • Single-agent + tools is the baseline every topology above must beat. Defend it

first (OP-1).

Pick the smallest topology that fits the two questions; promote upward only when a

named force demands it, and collapse back down when the force disappears.


附录: 引用 (Citations)

Inline tags resolve to source-skill sections in references/R1-source-evidence.md:

  • [[crewai]] = /Users/5imp1ex/Desktop/Skill-Workplace/output/crewai-sop-skill/SKILL.md
  • [[agentsop-langgraph]] = /Users/5imp1ex/Desktop/Skill-Workplace/output/langgraph-sop-skill/SKILL.md
  • The benchmark: LangChain, Benchmarking Multi-Agent Architectures

(www.langchain.com/blog/benchmarking-multi-agent-architectures), surfaced via
[[langgraph · Step 4 / Case 2]].