← pllu.net

If you've used Claude Code, you've used an agent: it reads files, runs commands, edits things, iterates, and stops when the task is done. The Claude Agent SDK is that same machinery, exposed as a library you can call from Python or TypeScript. You hand it a prompt and some options, and it runs the loop inside your program instead of inside a terminal.

The point is to spare you from rebuilding the agentic plumbing. With the raw Messages API, you implement the tool loop yourself: parse tool calls, execute them, feed results back, manage context, decide when to stop. The Agent SDK does all of that, and brings along the same tools, permission model, hooks, and subagent system that Claude Code uses.

1. What it actually is

Two packages, same engine:

Both wrap the Claude Code agent loop. The TypeScript SDK bundles a native Claude Code binary, so there's nothing extra to install. The product was originally called the "Claude Code SDK" and was renamed in late 2025 to make the framing clearer: it isn't a separate thing, it's Claude Code's agent capabilities as a library.

Your application Slack bot, CLI, web app, CI job Claude Agent SDK runs the loop, calls tools Claude (Opus / Sonnet) decides what to do next Built-in tools Read, Edit, Bash, WebSearch... Your tools + MCP DB, browser, internal APIs
Your code calls the SDK; the SDK runs an agent loop against Claude; Claude calls tools (built-in and yours) until the task is done.

The contrast worth holding in mind: with the plain Messages API, your code is the orchestrator. With the Agent SDK, the SDK is the orchestrator, and your code is the harness around it.

Messages APIAgent SDK
Tool loopYou write itBuilt in
Tool executionYou implement every toolLots of tools come for free
Context / compactionYou manageAutomatic
PermissionsNoneModes, allow / deny rules, hooks
SessionsYou store historyResume / fork built in
Best forBespoke single-shot reasoningAnything that takes more than one turn and touches the world

2. The agent loop

The agent loop is the heart of the SDK and the thing worth understanding before anything else. Every "turn" looks like this:

  1. Claude receives the conversation so far (system prompt, prior turns, available tools).
  2. Claude emits an assistant message: some reasoning, possibly some tool calls.
  3. Each tool call is run by the SDK (after passing permission checks and hooks).
  4. The tool results are wrapped as a user message and appended to the conversation.
  5. Back to step 1. The loop ends when Claude produces an assistant message with no tool calls. That final message is the result.
User prompt + session, CLAUDE.md, tools Claude decides reason + maybe call tools Tool calls? yes → run them; no → done Permission + hook checks allow / deny / modify Execute tools read-only ones in parallel Tool results become a user message loop No tool calls → done final answer, cost, session id
The loop terminates naturally when Claude has nothing more to do. maxTurns and maxBudgetUsd are safety belts that can cut it short.

The same loop powers every Agent SDK use case, whether it's a one-line shell helper or a 30-turn refactor that touches 80 files. Simple tasks finish in two turns; involved ones take dozens. You don't choose how many turns; Claude decides when it's done, within the limits you set.

One turn vs many. A turn is one round-trip: Claude speaks, tools run. A "list the files in this folder" task is one or two turns. A "refactor this module and update its tests" task might be twenty. Long-running agents handle context filling up by automatic compaction: the SDK summarises old history into a shorter form when limits approach, and emits a compact_boundary message so you can see where it happened.

3. Built-in tools and the toolbox

An Agent SDK agent isn't useful unless it can do things. By default it gets the same toolbox Claude Code has:

ToolWhat it does
ReadRead any file
WriteCreate a new file
EditMake precise string-level edits to an existing file
BashRun shell commands (git, build tools, scripts)
GlobFind files by pattern, e.g. **/*.ts
GrepSearch file contents with regex
WebSearchSearch the web
WebFetchFetch and parse a URL
AgentSpawn a subagent for a focused subtask
AskUserQuestionPrompt the user with multiple-choice options
TodoWriteMaintain a working todo list across turns

You don't have to expose all of them. The allowedTools option narrows the toolbox down to what the agent actually needs, both as a safety measure and because a smaller toolbox is easier for Claude to reason about. A read-only research agent might be given just ["Read", "Glob", "Grep", "WebSearch", "WebFetch"].

4. MCP: the protocol for plugging in tools

The Model Context Protocol (MCP) is a small standard for exposing tools to a language model over a server interface. The Agent SDK speaks MCP fluently, which means there's a ready-made way to plug in:

Claude Agent SDK runs the loop, speaks MCP In-process MCP @tool functions in your code Local MCP server npx, docker, subprocess Remote MCP server HTTP / SSE
The SDK is the MCP client. Tool servers can be in your own process, a subprocess on the same machine, or somewhere over the network.

Tools loaded over MCP appear to Claude under the name mcp__<server>__<tool>, which is why you'll see names like mcp__playwright__navigate or mcp__github__create_issue when an agent uses them.

5. Custom tools

The fastest way to add a tool is to write a function, decorate it with a schema, and bundle it into an in-process MCP server:

from claude_agent_sdk import tool, create_sdk_mcp_server, query, ClaudeAgentOptions

@tool("search_users", "Search the users table", {"q": str, "limit": int})
async def search_users(args):
    rows = await db.search(args["q"], limit=args["limit"])
    return {"content": [{"type": "text", "text": json.dumps(rows)}]}

server = create_sdk_mcp_server(name="db", tools=[search_users])

async for msg in query(
    prompt="Find users whose email contains 'acme'",
    options=ClaudeAgentOptions(
        mcp_servers={"db": server},
        allowed_tools=["mcp__db__search_users"],
    ),
):
    ...

The schema ({"q": str, "limit": int}) becomes the tool's input contract; the SDK validates inputs before calling your function. Your handler returns a content payload Claude can read. If you raise an exception, the loop stops; if you return isError: true, the loop continues and Claude sees the error and can react.

6. Permissions: what the agent is allowed to do

An autonomous loop that can edit files and run shell commands is powerful and slightly terrifying. The SDK has a layered permission model so you can dial that down. Each tool call passes through these gates in order:

Tool call from Claude PreToolUse hooks your code first Deny rules hard blocks Allow rules pre-approved Permission mode + canUseTool default behaviour Execute Deny / ask user
Hooks run first, then deny rules (the hardest "no"), then allow rules ("yes, no prompt"), then the mode-default behaviour kicks in for anything unmatched.

The five permission modes set the default behaviour for tools you haven't explicitly listed:

ModeWhat it meansUse it for
defaultAnything unmatched routes through your canUseTool callbackInteractive apps where a human decides
planRead-only tools only; the agent can explore but can't change anything"Tell me what you'd do" before approval
acceptEditsFile edits auto-approved; other tools follow the rulesCoding agents where edits are the whole point
dontAskUnmatched tools denied silentlyLocked-down agents that can only use a whitelist
bypassPermissionsEverything auto-approvedSandboxed CI / container only. Hands-off the wheel.

An interactive setup typically pairs permissionMode: "default" with a canUseTool callback that pops up a UI prompt. A background CI job typically pairs permissionMode: "acceptEdits" with a tight allow-list and runs in a disposable container.

7. Hooks: intercepting the loop

Hooks are callbacks that fire at well-defined points in the loop. They let you inject behaviour without forking the SDK or rewriting the agent.

EventFiresUsed for
PreToolUseBefore a tool runsBlock dangerous commands, validate inputs
PostToolUseAfter a tool returnsAudit logs, append context, side effects
UserPromptSubmitUser sends a promptInject system context, sanitise input
StopAgent finishesSave state, post results
SubagentStart / StopSubagent lifecycleTrack parallel work
PreCompactBefore context compactionArchive the full transcript
NotificationStatus / progress messagesForward to Slack, PagerDuty

A hook can also reshape the call: a PreToolUse hook can return permissionDecision: "deny" to block it, updatedInput to rewrite the arguments, or systemMessage to inject extra instructions before the next turn.

async def block_rm_rf(input_data, tool_use_id, context):
    if input_data["tool_name"] == "Bash":
        cmd = input_data["tool_input"].get("command", "")
        if "rm -rf" in cmd:
            return {"permissionDecision": "deny",
                    "reason": "rm -rf is not allowed"}
    return {}

options = ClaudeAgentOptions(
    hooks={"PreToolUse": [HookMatcher(matcher="Bash", hooks=[block_rm_rf])]}
)

8. Subagents: delegating focused work

Long tasks suffer two problems: the context window fills with low-value chatter, and Claude has to keep too many concerns in mind at once. Subagents solve both. The main agent spawns a child agent for a specific subtask. The child gets its own fresh context, runs to completion, and reports back a single result. Only that result enters the parent's history.

Parent agent "Audit this PR end-to-end" style-reviewer Read, Glob, Grep own fresh context security-reviewer Read, Glob, Grep runs in parallel test-runner Read, Bash final report only one summary per child returns to parent
Subagents protect the parent's context window. Each runs to completion and gives back a tidy report.

Each subagent can have its own toolbox, its own system prompt, even its own reasoning effort level. A "security-reviewer" might be given effort: "high" and only read-only tools; a "test-runner" might have Bash for running test commands. Subagents can run in parallel, which is one of the easier wins for wall-clock time on big tasks.

9. Sessions, resume, fork

Every call to the SDK happens inside a session: the accumulated conversation, the files Claude has read, the decisions it has made. Sessions are stored locally (in ~/.claude/projects/ by default) and can be picked back up later.

PatternWhat it doesWhen to use
continue: trueResume the most recent session in this directorySame-process chat loop
resume: <id>Resume a specific past sessionCross-process, "open this conversation"
fork_session: trueBranch from an existing session"Try a different approach without losing the original"

The session id arrives in the final ResultMessage; store it somewhere if you want to come back to it. Forking gives you tree-shaped exploration: try two refactor strategies from the same starting point, compare results, keep one.

10. A worked example

Putting it together: a small agent that reads a Python file, fixes any obvious bugs, and runs the test suite, with a hard cap on cost and turns. In Python:

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage, AssistantMessage

async def main():
    options = ClaudeAgentOptions(
        allowed_tools=["Read", "Edit", "Bash", "Glob", "Grep"],
        permission_mode="acceptEdits",
        max_turns=30,
        max_budget_usd=2.00,
    )
    async for message in query(
        prompt="Fix the failing tests in tests/test_payments.py and explain the fix.",
        options=options,
    ):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if hasattr(block, "text"):
                    print(block.text)
        elif isinstance(message, ResultMessage):
            print(f"\nDone. {message.subtype}. cost=$" \
                  f"{message.total_cost_usd:.4f} session={message.session_id}")

asyncio.run(main())

And the same shape in TypeScript:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Fix the failing tests in tests/test_payments.py and explain the fix.",
  options: {
    allowedTools: ["Read", "Edit", "Bash", "Glob", "Grep"],
    permissionMode: "acceptEdits",
    maxTurns: 30,
    maxBudgetUsd: 2.00,
  },
})) {
  if (message.type === "assistant") {
    for (const block of message.message.content) {
      if (block.type === "text") process.stdout.write(block.text);
    }
  } else if (message.type === "result") {
    console.log(`\nDone. ${message.subtype}. cost=$${message.total_cost_usd.toFixed(4)}`);
  }
}

What's happening under the hood: the SDK assembles a system prompt with the tool descriptions, sends Claude the user message, parses the response, runs any tool calls (read the test file, run the tests, edit the source, run the tests again...), feeds results back, and keeps looping until Claude either says "I'm done" or hits one of the limits.

11. When to reach for it

The Agent SDK is the right tool when your task needs more than one round of reasoning and touches the outside world: files, shell, web, your own systems. Bug fixing, code review, refactors, deployment safety checks, SRE incident responders, research assistants, anything where "look, think, act, look again" is the natural shape.

It's overkill when you just want a single completion or structured output from a prompt; the bare Messages API is faster and simpler for that. It's also worth knowing about Managed Agents, Anthropic's hosted alternative: same agent loop, but run on Anthropic's infrastructure with a REST API, suited to long-running tasks where you don't want to manage a process. The Agent SDK runs in your own process and is the right fit when you want direct control and to bring your own tools.

The reason it's worth understanding rather than skipping is that the agent loop is becoming the default shape of AI applications. The single-shot prompt-and-response model handles a narrow slice of what's useful; the things people actually want a model to do (write code, run it, search the web, draft and send things) are loops. The Agent SDK is one of the cleaner takes on that loop, with the bonus that it's the same one Anthropic uses to build Claude Code itself.

Based on the official Agent SDK docs as of early 2026. The SDK is on GitHub for Python and TypeScript; the reference docs live at code.claude.com/docs/en/agent-sdk. Renamed from "Claude Code SDK" in late 2025.