Claude Agent SDK

Prompt: Explain the Claude Agent SDK. What it is, how the agent loop works, the relationship to Claude Code and MCP, the tools and permissions model, hooks, subagents, sessions, and a worked example. Include diagrams.

A library that takes the autonomous loop powering Claude Code and lets you embed it in your own programs. Same loop, your harness.

What it actually is
The agent loop
Built-in tools and the toolbox
MCP: the protocol for plugging in tools
Custom tools
Permissions: what the agent is allowed to do
Hooks: intercepting the loop
Subagents: delegating focused work
Sessions, resume, fork
A worked example
When to reach for it

If you've used Claude Code, you've used an agent: it reads files, runs commands, edits things, iterates, and stops when the task is done. The Claude Agent SDK is that same machinery, exposed as a library you can call from Python or TypeScript. You hand it a prompt and some options, and it runs the loop inside your program instead of inside a terminal.

The point is to spare you from rebuilding the agentic plumbing. With the raw Messages API, you implement the tool loop yourself: parse tool calls, execute them, feed results back, manage context, decide when to stop. The Agent SDK does all of that, and brings along the same tools, permission model, hooks, and subagent system that Claude Code uses.

1. What it actually is

Two packages, same engine:

claude-agent-sdk for Python (3.10+)
@anthropic-ai/claude-agent-sdk for TypeScript / Node (18+)

Both wrap the Claude Code agent loop. The TypeScript SDK bundles a native Claude Code binary, so there's nothing extra to install. The product was originally called the "Claude Code SDK" and was renamed in late 2025 to make the framing clearer: it isn't a separate thing, it's Claude Code's agent capabilities as a library.

Your code calls the SDK; the SDK runs an agent loop against Claude; Claude calls tools (built-in and yours) until the task is done.

The contrast worth holding in mind: with the plain Messages API, your code is the orchestrator. With the Agent SDK, the SDK is the orchestrator, and your code is the harness around it.

	Messages API	Agent SDK
Tool loop	You write it	Built in
Tool execution	You implement every tool	Lots of tools come for free
Context / compaction	You manage	Automatic
Permissions	None	Modes, allow / deny rules, hooks
Sessions	You store history	Resume / fork built in
Best for	Bespoke single-shot reasoning	Anything that takes more than one turn and touches the world

2. The agent loop

The agent loop is the heart of the SDK and the thing worth understanding before anything else. Every "turn" looks like this:

Claude receives the conversation so far (system prompt, prior turns, available tools).
Claude emits an assistant message: some reasoning, possibly some tool calls.
Each tool call is run by the SDK (after passing permission checks and hooks).
The tool results are wrapped as a user message and appended to the conversation.
Back to step 1. The loop ends when Claude produces an assistant message with no tool calls. That final message is the result.

The loop terminates naturally when Claude has nothing more to do. maxTurns and maxBudgetUsd are safety belts that can cut it short.

The same loop powers every Agent SDK use case, whether it's a one-line shell helper or a 30-turn refactor that touches 80 files. Simple tasks finish in two turns; involved ones take dozens. You don't choose how many turns; Claude decides when it's done, within the limits you set.

One turn vs many. A turn is one round-trip: Claude speaks, tools run. A "list the files in this folder" task is one or two turns. A "refactor this module and update its tests" task might be twenty. Long-running agents handle context filling up by automatic compaction: the SDK summarises old history into a shorter form when limits approach, and emits a compact_boundary message so you can see where it happened.

3. Built-in tools and the toolbox

An Agent SDK agent isn't useful unless it can do things. By default it gets the same toolbox Claude Code has:

Tool	What it does
`Read`	Read any file
`Write`	Create a new file
`Edit`	Make precise string-level edits to an existing file
`Bash`	Run shell commands (git, build tools, scripts)
`Glob`	Find files by pattern, e.g. `*/.ts`
`Grep`	Search file contents with regex
`WebSearch`	Search the web
`WebFetch`	Fetch and parse a URL
`Agent`	Spawn a subagent for a focused subtask
`AskUserQuestion`	Prompt the user with multiple-choice options
`TodoWrite`	Maintain a working todo list across turns

You don't have to expose all of them. The allowedTools option narrows the toolbox down to what the agent actually needs, both as a safety measure and because a smaller toolbox is easier for Claude to reason about. A read-only research agent might be given just ["Read", "Glob", "Grep", "WebSearch", "WebFetch"].

4. MCP: the protocol for plugging in tools

The Model Context Protocol (MCP) is a small standard for exposing tools to a language model over a server interface. The Agent SDK speaks MCP fluently, which means there's a ready-made way to plug in:

Off-the-shelf tools published by other people (Playwright for browser automation, GitHub, Sentry, Linear, databases, file systems...)
Your own tools, defined in your codebase and run in-process
Hosted tool servers behind an HTTP boundary

The SDK is the MCP client. Tool servers can be in your own process, a subprocess on the same machine, or somewhere over the network.

Tools loaded over MCP appear to Claude under the name mcp__<server>__<tool>, which is why you'll see names like mcp__playwright__navigate or mcp__github__create_issue when an agent uses them.

5. Custom tools

The fastest way to add a tool is to write a function, decorate it with a schema, and bundle it into an in-process MCP server:

from claude_agent_sdk import tool, create_sdk_mcp_server, query, ClaudeAgentOptions

@tool("search_users", "Search the users table", {"q": str, "limit": int})
async def search_users(args):
    rows = await db.search(args["q"], limit=args["limit"])
    return {"content": [{"type": "text", "text": json.dumps(rows)}]}

server = create_sdk_mcp_server(name="db", tools=[search_users])

async for msg in query(
    prompt="Find users whose email contains 'acme'",
    options=ClaudeAgentOptions(
        mcp_servers={"db": server},
        allowed_tools=["mcp__db__search_users"],
    ),
):
    ...

The schema ({"q": str, "limit": int}) becomes the tool's input contract; the SDK validates inputs before calling your function. Your handler returns a content payload Claude can read. If you raise an exception, the loop stops; if you return isError: true, the loop continues and Claude sees the error and can react.

6. Permissions: what the agent is allowed to do

An autonomous loop that can edit files and run shell commands is powerful and slightly terrifying. The SDK has a layered permission model so you can dial that down. Each tool call passes through these gates in order:

Hooks run first, then deny rules (the hardest "no"), then allow rules ("yes, no prompt"), then the mode-default behaviour kicks in for anything unmatched.

The five permission modes set the default behaviour for tools you haven't explicitly listed:

Mode	What it means	Use it for
`default`	Anything unmatched routes through your `canUseTool` callback	Interactive apps where a human decides
`plan`	Read-only tools only; the agent can explore but can't change anything	"Tell me what you'd do" before approval
`acceptEdits`	File edits auto-approved; other tools follow the rules	Coding agents where edits are the whole point
`dontAsk`	Unmatched tools denied silently	Locked-down agents that can only use a whitelist
`bypassPermissions`	Everything auto-approved	Sandboxed CI / container only. Hands-off the wheel.

An interactive setup typically pairs permissionMode: "default" with a canUseTool callback that pops up a UI prompt. A background CI job typically pairs permissionMode: "acceptEdits" with a tight allow-list and runs in a disposable container.

7. Hooks: intercepting the loop

Hooks are callbacks that fire at well-defined points in the loop. They let you inject behaviour without forking the SDK or rewriting the agent.

Event	Fires	Used for
`PreToolUse`	Before a tool runs	Block dangerous commands, validate inputs
`PostToolUse`	After a tool returns	Audit logs, append context, side effects
`UserPromptSubmit`	User sends a prompt	Inject system context, sanitise input
`Stop`	Agent finishes	Save state, post results
`SubagentStart` / `Stop`	Subagent lifecycle	Track parallel work
`PreCompact`	Before context compaction	Archive the full transcript
`Notification`	Status / progress messages	Forward to Slack, PagerDuty

A hook can also reshape the call: a PreToolUse hook can return permissionDecision: "deny" to block it, updatedInput to rewrite the arguments, or systemMessage to inject extra instructions before the next turn.

async def block_rm_rf(input_data, tool_use_id, context):
    if input_data["tool_name"] == "Bash":
        cmd = input_data["tool_input"].get("command", "")
        if "rm -rf" in cmd:
            return {"permissionDecision": "deny",
                    "reason": "rm -rf is not allowed"}
    return {}

options = ClaudeAgentOptions(
    hooks={"PreToolUse": [HookMatcher(matcher="Bash", hooks=[block_rm_rf])]}
)

8. Subagents: delegating focused work

Long tasks suffer two problems: the context window fills with low-value chatter, and Claude has to keep too many concerns in mind at once. Subagents solve both. The main agent spawns a child agent for a specific subtask. The child gets its own fresh context, runs to completion, and reports back a single result. Only that result enters the parent's history.

Subagents protect the parent's context window. Each runs to completion and gives back a tidy report.

Each subagent can have its own toolbox, its own system prompt, even its own reasoning effort level. A "security-reviewer" might be given effort: "high" and only read-only tools; a "test-runner" might have Bash for running test commands. Subagents can run in parallel, which is one of the easier wins for wall-clock time on big tasks.

9. Sessions, resume, fork

Every call to the SDK happens inside a session: the accumulated conversation, the files Claude has read, the decisions it has made. Sessions are stored locally (in ~/.claude/projects/ by default) and can be picked back up later.

Pattern	What it does	When to use
`continue: true`	Resume the most recent session in this directory	Same-process chat loop
`resume: <id>`	Resume a specific past session	Cross-process, "open this conversation"
`fork_session: true`	Branch from an existing session	"Try a different approach without losing the original"

The session id arrives in the final ResultMessage; store it somewhere if you want to come back to it. Forking gives you tree-shaped exploration: try two refactor strategies from the same starting point, compare results, keep one.

10. A worked example

Putting it together: a small agent that reads a Python file, fixes any obvious bugs, and runs the test suite, with a hard cap on cost and turns. In Python:

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage, AssistantMessage

async def main():
    options = ClaudeAgentOptions(
        allowed_tools=["Read", "Edit", "Bash", "Glob", "Grep"],
        permission_mode="acceptEdits",
        max_turns=30,
        max_budget_usd=2.00,
    )
    async for message in query(
        prompt="Fix the failing tests in tests/test_payments.py and explain the fix.",
        options=options,
    ):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if hasattr(block, "text"):
                    print(block.text)
        elif isinstance(message, ResultMessage):
            print(f"\nDone. {message.subtype}. cost=$" \
                  f"{message.total_cost_usd:.4f} session={message.session_id}")

asyncio.run(main())

And the same shape in TypeScript:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Fix the failing tests in tests/test_payments.py and explain the fix.",
  options: {
    allowedTools: ["Read", "Edit", "Bash", "Glob", "Grep"],
    permissionMode: "acceptEdits",
    maxTurns: 30,
    maxBudgetUsd: 2.00,
  },
})) {
  if (message.type === "assistant") {
    for (const block of message.message.content) {
      if (block.type === "text") process.stdout.write(block.text);
    }
  } else if (message.type === "result") {
    console.log(`\nDone. ${message.subtype}. cost=$${message.total_cost_usd.toFixed(4)}`);
  }
}

What's happening under the hood: the SDK assembles a system prompt with the tool descriptions, sends Claude the user message, parses the response, runs any tool calls (read the test file, run the tests, edit the source, run the tests again...), feeds results back, and keeps looping until Claude either says "I'm done" or hits one of the limits.

11. When to reach for it

The Agent SDK is the right tool when your task needs more than one round of reasoning and touches the outside world: files, shell, web, your own systems. Bug fixing, code review, refactors, deployment safety checks, SRE incident responders, research assistants, anything where "look, think, act, look again" is the natural shape.

It's overkill when you just want a single completion or structured output from a prompt; the bare Messages API is faster and simpler for that. It's also worth knowing about Managed Agents, Anthropic's hosted alternative: same agent loop, but run on Anthropic's infrastructure with a REST API, suited to long-running tasks where you don't want to manage a process. The Agent SDK runs in your own process and is the right fit when you want direct control and to bring your own tools.

The reason it's worth understanding rather than skipping is that the agent loop is becoming the default shape of AI applications. The single-shot prompt-and-response model handles a narrow slice of what's useful; the things people actually want a model to do (write code, run it, search the web, draft and send things) are loops. The Agent SDK is one of the cleaner takes on that loop, with the bonus that it's the same one Anthropic uses to build Claude Code itself.

Based on the official Agent SDK docs as of early 2026. The SDK is on GitHub for Python and TypeScript; the reference docs live at code.claude.com/docs/en/agent-sdk. Renamed from "Claude Code SDK" in late 2025.