Claude Agent SDK
Prompt: Explain the Claude Agent SDK. What it is, how the agent loop works, the relationship to Claude Code and MCP, the tools and permissions model, hooks, subagents, sessions, and a worked example. Include diagrams.
A library that takes the autonomous loop powering Claude Code and lets you embed it in your own programs. Same loop, your harness.
If you've used Claude Code, you've used an agent: it reads files, runs commands, edits things, iterates, and stops when the task is done. The Claude Agent SDK is that same machinery, exposed as a library you can call from Python or TypeScript. You hand it a prompt and some options, and it runs the loop inside your program instead of inside a terminal.
The point is to spare you from rebuilding the agentic plumbing. With the raw Messages API, you implement the tool loop yourself: parse tool calls, execute them, feed results back, manage context, decide when to stop. The Agent SDK does all of that, and brings along the same tools, permission model, hooks, and subagent system that Claude Code uses.
1. What it actually is
Two packages, same engine:
claude-agent-sdkfor Python (3.10+)@anthropic-ai/claude-agent-sdkfor TypeScript / Node (18+)
Both wrap the Claude Code agent loop. The TypeScript SDK bundles a native Claude Code binary, so there's nothing extra to install. The product was originally called the "Claude Code SDK" and was renamed in late 2025 to make the framing clearer: it isn't a separate thing, it's Claude Code's agent capabilities as a library.
The contrast worth holding in mind: with the plain Messages API, your code is the orchestrator. With the Agent SDK, the SDK is the orchestrator, and your code is the harness around it.
| Messages API | Agent SDK | |
|---|---|---|
| Tool loop | You write it | Built in |
| Tool execution | You implement every tool | Lots of tools come for free |
| Context / compaction | You manage | Automatic |
| Permissions | None | Modes, allow / deny rules, hooks |
| Sessions | You store history | Resume / fork built in |
| Best for | Bespoke single-shot reasoning | Anything that takes more than one turn and touches the world |
2. The agent loop
The agent loop is the heart of the SDK and the thing worth understanding before anything else. Every "turn" looks like this:
- Claude receives the conversation so far (system prompt, prior turns, available tools).
- Claude emits an assistant message: some reasoning, possibly some tool calls.
- Each tool call is run by the SDK (after passing permission checks and hooks).
- The tool results are wrapped as a user message and appended to the conversation.
- Back to step 1. The loop ends when Claude produces an assistant message with no tool calls. That final message is the result.
maxTurns and maxBudgetUsd are safety belts that can cut it short.The same loop powers every Agent SDK use case, whether it's a one-line shell helper or a 30-turn refactor that touches 80 files. Simple tasks finish in two turns; involved ones take dozens. You don't choose how many turns; Claude decides when it's done, within the limits you set.
compact_boundary message so you can see where it happened.
3. Built-in tools and the toolbox
An Agent SDK agent isn't useful unless it can do things. By default it gets the same toolbox Claude Code has:
| Tool | What it does |
|---|---|
Read | Read any file |
Write | Create a new file |
Edit | Make precise string-level edits to an existing file |
Bash | Run shell commands (git, build tools, scripts) |
Glob | Find files by pattern, e.g. **/*.ts |
Grep | Search file contents with regex |
WebSearch | Search the web |
WebFetch | Fetch and parse a URL |
Agent | Spawn a subagent for a focused subtask |
AskUserQuestion | Prompt the user with multiple-choice options |
TodoWrite | Maintain a working todo list across turns |
You don't have to expose all of them. The allowedTools option narrows the toolbox down to what the agent actually needs, both as a safety measure and because a smaller toolbox is easier for Claude to reason about. A read-only research agent might be given just ["Read", "Glob", "Grep", "WebSearch", "WebFetch"].
4. MCP: the protocol for plugging in tools
The Model Context Protocol (MCP) is a small standard for exposing tools to a language model over a server interface. The Agent SDK speaks MCP fluently, which means there's a ready-made way to plug in:
- Off-the-shelf tools published by other people (Playwright for browser automation, GitHub, Sentry, Linear, databases, file systems...)
- Your own tools, defined in your codebase and run in-process
- Hosted tool servers behind an HTTP boundary
Tools loaded over MCP appear to Claude under the name mcp__<server>__<tool>, which is why you'll see names like mcp__playwright__navigate or mcp__github__create_issue when an agent uses them.
5. Custom tools
The fastest way to add a tool is to write a function, decorate it with a schema, and bundle it into an in-process MCP server:
from claude_agent_sdk import tool, create_sdk_mcp_server, query, ClaudeAgentOptions
@tool("search_users", "Search the users table", {"q": str, "limit": int})
async def search_users(args):
rows = await db.search(args["q"], limit=args["limit"])
return {"content": [{"type": "text", "text": json.dumps(rows)}]}
server = create_sdk_mcp_server(name="db", tools=[search_users])
async for msg in query(
prompt="Find users whose email contains 'acme'",
options=ClaudeAgentOptions(
mcp_servers={"db": server},
allowed_tools=["mcp__db__search_users"],
),
):
... The schema ({"q": str, "limit": int}) becomes the tool's input contract; the SDK validates inputs before calling your function. Your handler returns a content payload Claude can read. If you raise an exception, the loop stops; if you return isError: true, the loop continues and Claude sees the error and can react.
6. Permissions: what the agent is allowed to do
An autonomous loop that can edit files and run shell commands is powerful and slightly terrifying. The SDK has a layered permission model so you can dial that down. Each tool call passes through these gates in order:
The five permission modes set the default behaviour for tools you haven't explicitly listed:
| Mode | What it means | Use it for |
|---|---|---|
default | Anything unmatched routes through your canUseTool callback | Interactive apps where a human decides |
plan | Read-only tools only; the agent can explore but can't change anything | "Tell me what you'd do" before approval |
acceptEdits | File edits auto-approved; other tools follow the rules | Coding agents where edits are the whole point |
dontAsk | Unmatched tools denied silently | Locked-down agents that can only use a whitelist |
bypassPermissions | Everything auto-approved | Sandboxed CI / container only. Hands-off the wheel. |
An interactive setup typically pairs permissionMode: "default" with a canUseTool callback that pops up a UI prompt. A background CI job typically pairs permissionMode: "acceptEdits" with a tight allow-list and runs in a disposable container.
7. Hooks: intercepting the loop
Hooks are callbacks that fire at well-defined points in the loop. They let you inject behaviour without forking the SDK or rewriting the agent.
| Event | Fires | Used for |
|---|---|---|
PreToolUse | Before a tool runs | Block dangerous commands, validate inputs |
PostToolUse | After a tool returns | Audit logs, append context, side effects |
UserPromptSubmit | User sends a prompt | Inject system context, sanitise input |
Stop | Agent finishes | Save state, post results |
SubagentStart / Stop | Subagent lifecycle | Track parallel work |
PreCompact | Before context compaction | Archive the full transcript |
Notification | Status / progress messages | Forward to Slack, PagerDuty |
A hook can also reshape the call: a PreToolUse hook can return permissionDecision: "deny" to block it, updatedInput to rewrite the arguments, or systemMessage to inject extra instructions before the next turn.
async def block_rm_rf(input_data, tool_use_id, context):
if input_data["tool_name"] == "Bash":
cmd = input_data["tool_input"].get("command", "")
if "rm -rf" in cmd:
return {"permissionDecision": "deny",
"reason": "rm -rf is not allowed"}
return {}
options = ClaudeAgentOptions(
hooks={"PreToolUse": [HookMatcher(matcher="Bash", hooks=[block_rm_rf])]}
) 8. Subagents: delegating focused work
Long tasks suffer two problems: the context window fills with low-value chatter, and Claude has to keep too many concerns in mind at once. Subagents solve both. The main agent spawns a child agent for a specific subtask. The child gets its own fresh context, runs to completion, and reports back a single result. Only that result enters the parent's history.
Each subagent can have its own toolbox, its own system prompt, even its own reasoning effort level. A "security-reviewer" might be given effort: "high" and only read-only tools; a "test-runner" might have Bash for running test commands. Subagents can run in parallel, which is one of the easier wins for wall-clock time on big tasks.
9. Sessions, resume, fork
Every call to the SDK happens inside a session: the accumulated conversation, the files Claude has read, the decisions it has made. Sessions are stored locally (in ~/.claude/projects/ by default) and can be picked back up later.
| Pattern | What it does | When to use |
|---|---|---|
continue: true | Resume the most recent session in this directory | Same-process chat loop |
resume: <id> | Resume a specific past session | Cross-process, "open this conversation" |
fork_session: true | Branch from an existing session | "Try a different approach without losing the original" |
The session id arrives in the final ResultMessage; store it somewhere if you want to come back to it. Forking gives you tree-shaped exploration: try two refactor strategies from the same starting point, compare results, keep one.
10. A worked example
Putting it together: a small agent that reads a Python file, fixes any obvious bugs, and runs the test suite, with a hard cap on cost and turns. In Python:
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage, AssistantMessage
async def main():
options = ClaudeAgentOptions(
allowed_tools=["Read", "Edit", "Bash", "Glob", "Grep"],
permission_mode="acceptEdits",
max_turns=30,
max_budget_usd=2.00,
)
async for message in query(
prompt="Fix the failing tests in tests/test_payments.py and explain the fix.",
options=options,
):
if isinstance(message, AssistantMessage):
for block in message.content:
if hasattr(block, "text"):
print(block.text)
elif isinstance(message, ResultMessage):
print(f"\nDone. {message.subtype}. cost=$" \
f"{message.total_cost_usd:.4f} session={message.session_id}")
asyncio.run(main()) And the same shape in TypeScript:
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Fix the failing tests in tests/test_payments.py and explain the fix.",
options: {
allowedTools: ["Read", "Edit", "Bash", "Glob", "Grep"],
permissionMode: "acceptEdits",
maxTurns: 30,
maxBudgetUsd: 2.00,
},
})) {
if (message.type === "assistant") {
for (const block of message.message.content) {
if (block.type === "text") process.stdout.write(block.text);
}
} else if (message.type === "result") {
console.log(`\nDone. ${message.subtype}. cost=$${message.total_cost_usd.toFixed(4)}`);
}
} What's happening under the hood: the SDK assembles a system prompt with the tool descriptions, sends Claude the user message, parses the response, runs any tool calls (read the test file, run the tests, edit the source, run the tests again...), feeds results back, and keeps looping until Claude either says "I'm done" or hits one of the limits.
11. When to reach for it
The Agent SDK is the right tool when your task needs more than one round of reasoning and touches the outside world: files, shell, web, your own systems. Bug fixing, code review, refactors, deployment safety checks, SRE incident responders, research assistants, anything where "look, think, act, look again" is the natural shape.
It's overkill when you just want a single completion or structured output from a prompt; the bare Messages API is faster and simpler for that. It's also worth knowing about Managed Agents, Anthropic's hosted alternative: same agent loop, but run on Anthropic's infrastructure with a REST API, suited to long-running tasks where you don't want to manage a process. The Agent SDK runs in your own process and is the right fit when you want direct control and to bring your own tools.
The reason it's worth understanding rather than skipping is that the agent loop is becoming the default shape of AI applications. The single-shot prompt-and-response model handles a narrow slice of what's useful; the things people actually want a model to do (write code, run it, search the web, draft and send things) are loops. The Agent SDK is one of the cleaner takes on that loop, with the bonus that it's the same one Anthropic uses to build Claude Code itself.
Based on the official Agent SDK docs as of early 2026. The SDK is on GitHub for Python and TypeScript; the reference docs live at code.claude.com/docs/en/agent-sdk. Renamed from "Claude Code SDK" in late 2025.