OpenAI Agents SDK Guide (v0.0.17)¶
The OpenAI Agents SDK is a toolkit to help you build agentic AI applications easily. An agent here means a large language model (LLM) like ChatGPT, but with added instructions and tools. The SDK makes it easy to connect agents, tools, and rules (called guardrails) in code. It comes with:
- Agents: LLMs with instructions and tools.
- Handoffs: A way for one agent to hand off tasks to another agent.
- Guardrails: Checks that run alongside your agents to validate inputs or outputs.
The design goals of the SDK are:
- Powerful but simple: It has all the necessary features, but few basic concepts so it's quick to learn.
- Customizable: Works well out of the box, but you can tweak everything to fit your needs.
Key features include:
- Agent loop: Built-in logic that calls the LLM repeatedly, handles tool calls, and loops until done.
- Python-first design: Use normal Python code to chain and orchestrate agents.
- Handoffs: Easily coordinate between multiple agents.
- Guardrails: Run input/output validations in parallel with your agents to stop bad inputs early.
- Function tools: Turn any Python function into a tool, with automatic input schema and validation.
- Tracing: Built-in tracing to visualize, debug, and monitor your flows. (Works with OpenAI’s trace dashboard.)
Why use the Agents SDK¶
The SDK’s main idea is to make complex multi-agent applications easy to build. It’s like a "production-ready upgrade" of earlier experiments (e.g. Swarm). It comes with everything you need to express interactions between tools and agents without a steep learning curve.
Installation¶
Install the Agents SDK with pip:
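```bash
pip install openai-agents
```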
Make sure you also set your OPENAI_API_KEY
environment variable before running code that uses the SDK.
Hello World Example¶
The simplest example:
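A minimal sketch of that example:

```python
from agents import Agent, Runner

agent = Agent(name="Assistant", instructions="You are a helpful assistant")

result = Runner.run_sync(agent, "Write a haiku about recursion in programming.")
print(result.final_output)
```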
This code creates an agent with a name and instructions, then uses Runner.run_sync
to have the agent write a haiku about recursion. The result.final_output
is the haiku generated by the agent. Make sure to set the OPENAI_API_KEY
before running this.
Quickstart¶
This section shows step-by-step how to set up a project and run agents.
Create a Project and Virtual Environment¶
In a new project folder, set up a Python virtual environment:
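For example (the folder and environment names are just placeholders):

```bash
mkdir my_project
cd my_project
python -m venv .venv
```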
Activate the Virtual Environment¶
Activate it (on Unix):
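Assuming the environment was created as `.venv`:

```bash
source .venv/bin/activate
```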
On Windows, you might run:
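```text
.venv\Scripts\activate
```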
Install the Agents SDK¶
With the environment active, install the SDK:
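```bash
pip install openai-agents
```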
Set the OpenAI API Key¶
The SDK uses OpenAI’s API for the language models. You must provide an API key. The simplest way is to set the environment variable:
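```bash
export OPENAI_API_KEY=sk-...
```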
Alternatively, you can set the key in code using agents.set_default_openai_key
, but using the environment variable is easier for quickstart.
Create Your First Agent¶
In code, create an Agent object. Give it a name and some instructions. For example:
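A minimal sketch:

```python
from agents import Agent

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
)
```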
This creates an agent that will behave like a helpful assistant.
Add More Agents¶
You can create multiple agents. For example, imagine a Spanish-speaking agent and an English-speaking agent:
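A sketch of the two agents (instructions are illustrative):

```python
spanish_agent = Agent(
    name="Spanish agent",
    instructions="You only speak Spanish.",
)

english_agent = Agent(
    name="English agent",
    instructions="You only speak English.",
)
```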
Each agent has its own name and instructions.
Define Handoffs¶
Handoffs let one agent delegate work to another. Specify in an agent’s configuration the agents it can hand off to. For example, a triage agent that decides whether to use Spanish or English agent:
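A sketch of such a triage agent, reusing the two agents defined above:

```python
from agents import Agent, handoff

triage_agent = Agent(
    name="Triage agent",
    instructions="Route the conversation to the Spanish or English agent based on the language of the request.",
    handoffs=[spanish_agent, handoff(english_agent)],
)
```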
Here, handoffs
lists two options: spanish_agent
and a customized handoff to english_agent
. (We used handoff(english_agent)
to illustrate custom settings; with just english_agent
it would use defaults.)
Add Tools (Optional)¶
Agents can use tools to take actions (like web search, code execution, etc.). For example, you could add a weather tool:
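A sketch of such a tool (the canned return value stands in for a real weather lookup):

```python
from agents import function_tool

@function_tool
def get_weather(city: str) -> str:
    """Return a short weather report for the given city."""
    return f"The weather in {city} is sunny."
```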
Then assign tools to your agent:
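For example, a dedicated weather agent (this particular agent is illustrative):

```python
weather_agent = Agent(
    name="Weather agent",
    instructions="You answer questions about the weather.",
    tools=[get_weather],
)
```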
Now this agent can call get_weather
when it needs weather info.
Define a Guardrail (Optional)¶
Guardrails are checks that run alongside your agent to validate inputs or outputs. For example, you could write a quick guardrail that blocks math homework queries:
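A minimal sketch of such a guardrail (the function name and check are illustrative):

```python
from agents import Agent, GuardrailFunctionOutput, RunContextWrapper, input_guardrail

@input_guardrail
async def homework_guardrail(
    ctx: RunContextWrapper, agent: Agent, user_input
) -> GuardrailFunctionOutput:
    # Demonstration only: trip the guardrail if the input mentions "homework".
    is_homework = "homework" in str(user_input).lower()
    return GuardrailFunctionOutput(
        output_info={"is_homework": is_homework},
        tripwire_triggered=is_homework,
    )
```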
This guardrail will stop the agent if it sees "homework" in the input (demonstration only). See the Guardrails section in official docs for more details.
Run the Agent Workflow¶
Once you have agents, tools, handoffs, and guardrails set up, you can run them. Use the Runner
class:
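A minimal sketch, using the triage agent built above:

```python
import asyncio
from agents import Runner

async def main():
    result = await Runner.run(triage_agent, "Hola, ¿cómo estás?")
    print(result.final_output)

asyncio.run(main())
```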
This runs the triage_agent
on the given input. The Runner
handles calling the LLM, executing tool calls, running guardrails, and performing handoffs. The final answer appears in result.final_output
.
Put It All Together¶
A more complete example:
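A compact sketch combining the pieces from this quickstart (agent names, prompts, and the sample question are illustrative):

```python
import asyncio
from agents import Agent, GuardrailFunctionOutput, Runner, function_tool, input_guardrail

@function_tool
def get_weather(city: str) -> str:
    """Return a short weather report for the given city."""
    return f"The weather in {city} is sunny."

@input_guardrail
async def homework_guardrail(ctx, agent, user_input) -> GuardrailFunctionOutput:
    is_homework = "homework" in str(user_input).lower()
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=is_homework)

spanish_agent = Agent(name="Spanish agent", instructions="You only speak Spanish.")
english_agent = Agent(name="English agent", instructions="You only speak English.")
weather_agent = Agent(name="Weather agent", instructions="Answer weather questions.", tools=[get_weather])

triage_agent = Agent(
    name="Triage agent",
    instructions="Route the request to the Spanish, English, or Weather agent.",
    handoffs=[spanish_agent, english_agent, weather_agent],
    input_guardrails=[homework_guardrail],
)

async def main():
    result = await Runner.run(triage_agent, "What's the weather in Madrid?")
    print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())
```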
Here, triage_agent
could hand off the conversation to one of the language agents or the weather agent. The Runner
takes care of running the loop and choosing tools or agents as needed.
View Traces (Optional)¶
If you have tracing enabled (default), you can log into OpenAI’s Traces dashboard (platform.openai.com) to see the detailed steps of the agent run. This is useful for debugging your application.
If needed, you can disable tracing or control it via configuration (see Configuring the SDK). For now, just know that tracing is on by default and captures everything.
Next Steps¶
You’ve seen the basic workflow: set up agents with instructions, add tools and handoffs, and run with Runner.run
or Runner.run_sync
. In the following sections we explain each part in depth:
- Agents: How to configure agents and their properties.
- Running agents: The underlying loop of how agents are executed.
- Results: How to inspect the output of a run.
- Streaming: Getting updates from an agent run in real-time.
- REPL utility: A quick interactive console to test agents.
- Tools: How to use built-in, function, and agent tools.
- MCP: Integrate with external tool providers via the Model Context Protocol.
- Handoffs: How one agent can delegate to another.
- Tracing: How execution is traced and how to use it.
- Context management: Passing context (data or objects) through runs.
- Guardrails: Writing input/output checks for safety.
- Multi-agent: Patterns for orchestrating multiple agents in your app.
- Models: LLMs, including non-OpenAI models via LiteLLM.
- Configuration: SDK-level settings like API keys and logging.
- Visualization: Tools for visualizing agent graphs.
- Release process: How the SDK versions are managed.
Each section below covers these topics.
Agents¶
An Agent is like a virtual assistant powered by an LLM. You create an agent by giving it a name and instructions, and optionally tools or other settings. Here are the main points about agents:
- Name: A unique name to identify the agent.
- Instructions: A prompt or guidelines telling the model how to behave.
- Tools: A list of tools this agent can call.
- Handoffs: A list of other agents to which it can delegate tasks.
- Guardrails: Input/output checks for this agent.
- Context (optional): Type of context object if you want to pass custom data.
- Model: Optionally, which LLM to use (by default it uses OpenAI's GPT models).
Agents are defined by creating an Agent
object. For example:
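A minimal sketch (the tool and instructions are illustrative):

```python
from agents import Agent, function_tool

@function_tool
def get_weather(city: str) -> str:
    """Return a short weather report for the given city."""
    return f"The weather in {city} is sunny."

agent = Agent(
    name="Weather assistant",
    instructions="You are a helpful assistant. Use tools when they help you answer.",
    tools=[get_weather],
)
```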
Basic Configuration¶
When creating an Agent
, you can pass:
- `name`: The agent's name.
- `instructions`: A string (or list of messages) that tells the agent how to act.
- `tools`: A list of tools this agent can use.
- `handoffs`: Other agents to hand off tasks to.
- `input_type` / `output_type`: (Advanced) Specify the type of input/output expected.
- `guardrails`: Input or output guardrails for this agent.
- `model` / `model_settings`: Specify which LLM model to use (if different from the defaults).
For example, to make an agent always respond in French with GPT-4, you might write:
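A sketch of such an agent (the temperature setting is illustrative):

```python
from openai import AsyncOpenAI
from agents import Agent, ModelSettings, OpenAIChatCompletionsModel

french_agent = Agent(
    name="French agent",
    instructions="You only respond in French.",
    model=OpenAIChatCompletionsModel(model="gpt-4", openai_client=AsyncOpenAI()),
    model_settings=ModelSettings(temperature=0.5),
)
```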
This agent uses GPT-4 via the Chat Completions API and will speak only French.
Context¶
Agents can have access to a context object in your code. This is data you provide to each run. For example, you might want the agent to know the user’s name or have access to a database connection.
- Define any Python object (dataclass, Pydantic model, etc.) for your context.
- Pass this object in
Runner.run(..., context=...)
. - All tools and hooks get a
RunContextWrapper
that includescontext
.
Example:
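A sketch of what that example might look like:

```python
import asyncio
from dataclasses import dataclass
from agents import Agent, RunContextWrapper, Runner, function_tool

@dataclass
class UserInfo:
    name: str
    uid: int

@function_tool
async def greet(wrapper: RunContextWrapper[UserInfo]) -> str:
    """Greet the current user by name."""
    return f"Hello, {wrapper.context.name}!"

async def main():
    user_info = UserInfo(name="John", uid=123)
    agent = Agent[UserInfo](name="Assistant", tools=[greet])
    result = await Runner.run(agent, "Please greet me.", context=user_info)
    print(result.final_output)

asyncio.run(main())
```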
Here, UserInfo
is the local context type. The tool greet
reads wrapper.context.name
to personalize the greeting.
Important: The context object is not sent to the model. It’s only available in your Python code (tools, hooks). If you need data to go to the LLM, include it in instructions
or the prompt.
Dynamic Instructions¶
You can make an agent’s instructions dynamic using Python functions. Instead of a fixed string, instructions
can be a function that returns a string based on context.
For example:
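A sketch, reusing the `UserInfo` dataclass from the previous example:

```python
from agents import Agent, RunContextWrapper

def dynamic_instr(context: RunContextWrapper[UserInfo], agent: Agent[UserInfo]) -> str:
    return f"The user's name is {context.context.name}. Help them with their questions."

agent = Agent[UserInfo](
    name="Assistant",
    instructions=dynamic_instr,
)
```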
Now each time Runner
calls the model, it will call dynamic_instr
with the context object to get the system prompt.
Handoffs¶
A handoff is a way for one agent to transfer control to another agent. This is useful when different agents specialize in different tasks. In the agent’s configuration, you set handoffs
to a list of other agents or Handoff
objects.
For example:
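A sketch of those three agents (instructions are illustrative):

```python
from agents import Agent, handoff

sales_agent = Agent(name="Sales", instructions="Answer sales questions.")
support_agent = Agent(name="Support", instructions="Handle technical support issues.")

front_desk_agent = Agent(
    name="FrontDesk",
    instructions="Talk to the user and route them to the right department.",
    handoffs=[sales_agent, handoff(support_agent)],
)
```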
In the above, the FrontDesk
agent can hand off to either Sales
agent (using defaults) or Support
agent (we used handoff(support_agent)
to show how to customize it, though not necessary here).
Handoffs work as tools that the LLM can call. If the LLM calls transfer_to_Support
, the SDK will switch to that agent.
Lifecycle Events¶
Agents support lifecycle hooks that run at certain points during execution. You attach them by passing a `hooks` object (a subclass of `AgentHooks`) when creating the agent. The main hooks are:

- `on_start`: Runs when the agent becomes the active agent (e.g. for logging or pre-loading data).
- `on_end`: Runs after the agent produces its final output.
- `on_handoff`: Runs when another agent hands off to this agent.
- `on_tool_start`, `on_tool_end`: Run before/after each tool invocation.

These are advanced and help customize the run. For example, you might use `on_tool_start` to log that a tool is about to run.
Guardrails on Agents¶
You can attach input and output guardrails to an agent by passing `input_guardrails` and `output_guardrails` when you create it (see the Guardrails section for how to write them). Before the agent runs and after it produces output, the SDK checks these guardrails and can stop the run if a rule is violated.
Cloning/Copying Agents¶
You can copy an agent's configuration using the `clone()` method if you need a similar agent with small changes. For example:
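A minimal sketch:

```python
pirate_agent = Agent(
    name="Pirate",
    instructions="Write like a pirate.",
)

robot_agent = pirate_agent.clone(
    name="Robot",
    instructions="Write like a robot.",
)
```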
This avoids writing the same settings twice.
Forcing Tool Use¶
If you want to force the agent to use a tool instead of answering directly, set `tool_choice="required"` in the agent's `model_settings`. Be careful: forcing tool use can lead to loops of repeated tool calls, so pair it with clear instructions or limits.
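A minimal sketch, assuming a `get_weather` tool like the one defined earlier:

```python
from agents import Agent, ModelSettings

agent = Agent(
    name="Weather agent",
    instructions="Always look up the weather before answering.",
    tools=[get_weather],
    model_settings=ModelSettings(tool_choice="required"),
)
```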
Running agents¶
To execute (run) agents, use the Runner
class. There are three ways to run an agent:
- `Runner.run(agent, input)`: An asynchronous method. Returns a `RunResult`. You should `await` this in an async function.
- `Runner.run_sync(agent, input)`: A synchronous wrapper around `run()`. Useful if you're not using async.
- `Runner.run_streamed(agent, input)`: An asynchronous method that streams events as the LLM runs. Returns a `RunResultStreaming` which you can iterate over for updates (tokens, tool calls, etc.).
Example of using run
(async):
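A minimal sketch:

```python
import asyncio
from agents import Agent, Runner

async def main():
    agent = Agent(name="Assistant", instructions="You are a helpful assistant.")
    result = await Runner.run(agent, "Write a haiku about recursion in programming.")
    print(result.final_output)

asyncio.run(main())
```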
This prints the agent's final answer (here, a short haiku about recursion).
The Agent Loop¶
When you call Runner.run(...)
, the SDK performs this loop internally:
- Call the LLM for the current agent, giving it the current input and instructions.
- The LLM generates output. Then:
  - If the LLM returns a `final_output` (like a complete answer) and no tool calls, the loop ends and that output is returned.
  - If the LLM calls a handoff tool, the SDK switches to the new agent and updates the input, then continues the loop with the new agent.
  - If the LLM produces tool calls, the SDK executes those tools (in order), collects their outputs as new input, and continues the loop with the same agent.
- If the loop runs more times than `max_turns` (a limit you can set), a `MaxTurnsExceeded` error is raised.
In short, Runner
keeps calling agents and running tools or handoffs until an agent finishes with a final answer.
Streaming¶
If you use Runner.run_streamed()
, you get real-time events as the run proceeds. For example, you might stream partial text to a user.
There are two main kinds of streaming events:
- Raw response events (`RawResponsesStreamEvent`): Low-level tokens or data from the LLM as it generates text. You can use these to display each token as it comes.
- Run item / agent events (`RunItemStreamEvent`, `AgentUpdatedStreamEvent`): Higher-level events for when a message is finished, a tool call completed, or the current agent changed. These let you update your UI at logical steps ("Agent said this", "Tool ran", etc.), instead of on every token.
For example, to print each token as it’s generated:
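A sketch of token-by-token streaming (the agent and prompt are illustrative):

```python
import asyncio
from openai.types.responses import ResponseTextDeltaEvent
from agents import Agent, Runner

async def main():
    agent = Agent(name="Joker", instructions="You are a helpful assistant.")
    result = Runner.run_streamed(agent, input="Please tell me 5 jokes.")
    async for event in result.stream_events():
        # Print each text delta as the model produces it.
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            print(event.data.delta, end="", flush=True)

asyncio.run(main())
```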
Or, to handle higher-level events and skip raw tokens:
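A sketch that reacts to the higher-level events instead:

```python
import asyncio
from agents import Agent, ItemHelpers, Runner

async def main():
    agent = Agent(name="Assistant", instructions="You are a helpful assistant.")
    result = Runner.run_streamed(agent, input="Tell me a joke, then explain it.")
    async for event in result.stream_events():
        if event.type == "agent_updated_stream_event":
            print(f"Current agent: {event.new_agent.name}")
        elif event.type == "run_item_stream_event":
            if event.item.type == "message_output_item":
                print("Message:", ItemHelpers.text_message_output(event.item))
        # raw_response_event tokens are ignored here

asyncio.run(main())
```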
In this example, we ignore raw tokens and only print when the agent changes or a message is generated.
Run Configuration (run_config)¶
When running agents, you can pass a run_config
argument to control global settings. For example:
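A sketch (the workflow name and group id are placeholders):

```python
import asyncio
from agents import Agent, RunConfig, Runner

async def main():
    agent = Agent(name="Assistant", instructions="Be concise.")
    result = await Runner.run(
        agent,
        "Summarize the plot of Hamlet in two sentences.",
        run_config=RunConfig(
            workflow_name="Homework helper",      # meaningful name for the trace
            group_id="chat-thread-42",            # links traces across turns of one chat
            trace_include_sensitive_data=False,   # omit message text from traces
        ),
    )
    print(result.final_output)

asyncio.run(main())
```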
Some key run_config
options:
- `model`: Override the agent's model and use this one instead.
- `model_provider`: Set a custom model provider (default is OpenAI).
- `model_settings`: Override settings like temperature or top_p for all agents.
- `input_guardrails` / `output_guardrails`: Global guardrails to apply to every run.
- `handoff_input_filter`: A global filter for inputs when handoffs happen.
- `tracing_disabled`: Turn off tracing for this run.
- `trace_include_sensitive_data`: Whether to include the actual text in traces.
- `workflow_name`, `trace_id`, `group_id`: Set custom tracing metadata. `workflow_name` should be set to something meaningful; `group_id` can link traces across multiple turns (like a chat thread).
- `trace_metadata`: Additional metadata for traces.
These let you customize how agents run at a high level.
Conversations / Chat Threads¶
Each call to Runner.run
is like one turn in a conversation, even if multiple agents or tools ran internally. For example:
- User turn: User asks a question.
- Runner run: An agent answers (maybe using handoffs and tools).
- User sees answer.
If the user then asks a follow-up, you would call Runner.run
again. To keep context, use the result.to_input_list()
method to get a list of all messages from the last run, and then append the new user message. For example:
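A sketch of a two-turn conversation (the questions are illustrative):

```python
import asyncio
from agents import Agent, Runner

async def main():
    agent = Agent(name="Assistant", instructions="Reply very concisely.")

    result = await Runner.run(agent, "What city is the Golden Gate Bridge in?")
    print(result.final_output)

    # Next turn: reuse the previous items and append the new user message.
    new_input = result.to_input_list() + [{"role": "user", "content": "What state is it in?"}]
    result = await Runner.run(agent, new_input)
    print(result.final_output)

asyncio.run(main())
```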
The to_input_list()
takes all the items (user message, agent messages, tool outputs) from the run and makes them into a list of conversation messages. You can then add the next user prompt and run again.
Exceptions¶
The SDK raises exceptions for various error cases. Key ones include:
- `MaxTurnsExceeded`: The run used more than the allowed number of turns.
- `ModelBehaviorError`: The model returned something malformed (like broken JSON or an invalid tool call).
- `UserError`: Your code used the SDK incorrectly.
- `InputGuardrailTripwireTriggered`, `OutputGuardrailTripwireTriggered`: A guardrail's tripwire was hit.

All exceptions inherit from `AgentsException`. Check `agents.exceptions` for the full list.
Results¶
When you run an agent, the result is a RunResult
(or RunResultStreaming
for streaming). Both inherit from RunResultBase
. The main pieces of information in the result are:
- `final_output`: The final answer from the last agent that ran. If the last agent didn't specify an `output_type`, this will be a plain string. If `output_type` was set, `final_output` will be an object of that type. For example, if the agent outputs JSON parsed into a Pydantic model, `final_output` is that model object.
- `to_input_list()`: A method that gives you all the conversation items (original input, agent messages, tool outputs, etc.) as a list you can feed to the next run.
- `last_agent`: The `Agent` object that was the final one to run. Handy if you need to know which agent answered last.
- `new_items`: A list of `RunItem` objects for each new item generated during the run. There are different types of items:
  - `MessageOutputItem`: A message from the LLM (like the agent's response).
  - `ToolCallItem`: The LLM invoked a tool (before execution).
  - `ToolCallOutputItem`: A tool was called and returned output.
  - `HandoffCallItem`: The LLM called a handoff tool (before switching).
  - `HandoffOutputItem`: A handoff was executed and returned a result.
  - `ReasoningItem`: An internal reasoning or trace item (if your agent outputs something like that).

  Each of these has the raw item inside it, plus easy accessors. For example, for a `MessageOutputItem`, you can get the text with `ItemHelpers.text_message_output(item)`.
- Other info:
  - `input_guardrail_results` and `output_guardrail_results`: Results of any guardrails run. Useful for logging or analysis.
  - `raw_responses`: The raw model responses (each LLM API response) during the run.
  - `input`: The original input you passed. (Usually you won't need this.)
Example use:
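A sketch of inspecting a result (the helper function is illustrative):

```python
from agents import ItemHelpers, MessageOutputItem, Runner

async def inspect_run(agent, question: str) -> None:
    result = await Runner.run(agent, question)
    print("Final answer:", result.final_output)
    print("Answered by:", result.last_agent.name)
    for item in result.new_items:
        if isinstance(item, MessageOutputItem):
            print("Message:", ItemHelpers.text_message_output(item))
        else:
            print("Other item:", item.type)
```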
This will show the final answer, which agent gave it, and details about each message or tool used.
Streaming¶
You saw how to get streaming updates with run_streamed()
. In the Streaming section above, we covered how to subscribe to streaming events while an agent run is happening.
Recap:
- Call `Runner.run_streamed(agent, input)` to start streaming.
- `result.stream_events()` is an async iterator of `StreamEvent` objects.
- Event types:
  - `raw_response_event`: Contains raw LLM stream data (token deltas, etc.).
  - `run_item_stream_event`: High-level events for when a run item (message or tool output) is completed.
  - `agent_updated_stream_event`: When the current agent changes (e.g. after a handoff).
Use these events to provide live feedback in your application.
REPL Utility¶
The SDK has a quick REPL (read-eval-print loop) for interactive testing. It uses run_demo_loop
.
Example usage:
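A minimal sketch, assuming `run_demo_loop` is importable from the top-level `agents` package:

```python
import asyncio
from agents import Agent, run_demo_loop

async def main() -> None:
    agent = Agent(name="Assistant", instructions="You are a helpful assistant.")
    await run_demo_loop(agent)

if __name__ == "__main__":
    asyncio.run(main())
```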
When you run this script, it will prompt you to enter text in the console. It sends your input to the agent, and streams the model’s output back in real time. It keeps a conversation history automatically. To exit the loop, type quit
or exit
(or press Ctrl-D).
This is great for quick experiments.
Tools¶
Agents can use tools to interact with the outside world or run code. The SDK provides three categories of tools:
- Hosted tools: Built-in tools running on OpenAI's servers (with the `OpenAIResponsesModel`):
  - WebSearchTool: Search the web.
  - FileSearchTool: Search files in your OpenAI vector stores.
  - ComputerTool: Automate computer tasks.
  - CodeInterpreterTool: Run code in a sandbox.
  - HostedMCPTool: Expose tools from an MCP server (see MCP section).
  - ImageGenerationTool: Generate images from text.
  - LocalShellTool: Run shell commands locally.
Example of using hosted tools:
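A sketch using two hosted tools (the vector store ID is a placeholder):

```python
from agents import Agent, FileSearchTool, Runner, WebSearchTool

agent = Agent(
    name="Assistant",
    tools=[
        WebSearchTool(),
        FileSearchTool(max_num_results=3, vector_store_ids=["VECTOR_STORE_ID"]),
    ],
)

async def main():
    result = await Runner.run(agent, "What's in the news about coffee today?")
    print(result.final_output)
```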
- Function tools: Turn any Python function into a tool using the `@function_tool` decorator. The SDK automatically generates the tool schema from the function signature:
  - Tool name = function name (or override with `name_override`).
  - Description = function docstring (or provide one).
  - Input schema = generated from the function arguments (using Python types).
  - Input descriptions = taken from the docstring (unless disabled).
  - The function can be `async` or normal, and can take a `RunContextWrapper` as the first argument if you need context.
Example:
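A sketch of such a tool (the canned return value stands in for a real API call):

```python
from typing_extensions import TypedDict
from agents import function_tool

class Location(TypedDict):
    lat: float
    long: float

@function_tool
async def fetch_weather(location: Location) -> str:
    """Fetch the weather for a given location.

    Args:
        location: The location to fetch the weather for.
    """
    # In a real tool you would call a weather API here.
    return "sunny"
```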
The SDK will inspect `fetch_weather` and create a `FunctionTool` with a JSON schema for its inputs. For complex inputs, you can use Pydantic models or TypedDicts. The tool's `params_json_schema` is generated automatically from the function signature and docstring.
You can also create custom function tools manually by instantiating FunctionTool
and providing name
, description
, params_json_schema
and an on_invoke_tool
async function. But using @function_tool
is usually easier.
- Agents as tools: You can treat an agent itself as a tool. This is useful if you want one agent to call another without a full handoff. Use the
agent.as_tool()
method. For example:
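A sketch (the translator agents and descriptions are illustrative):

```python
from agents import Agent

spanish_agent = Agent(name="Spanish translator", instructions="Translate the user's message to Spanish.")
french_agent = Agent(name="French translator", instructions="Translate the user's message to French.")

orchestrator_agent = Agent(
    name="Orchestrator",
    instructions="Translate the user's message using the translation tools.",
    tools=[
        spanish_agent.as_tool(
            tool_name="to_spanish",
            tool_description="Translate the message to Spanish",
        ),
        french_agent.as_tool(
            tool_name="to_french",
            tool_description="Translate the message to French",
        ),
    ],
)
```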
Here, to_spanish
and to_french
are tools backed by running the respective agents under the hood.
You can customize agent-tools further:
- Use `custom_output_extractor` to process the sub-agent's output before returning it.
- For advanced use, you can manually call `Runner.run` inside a `@function_tool` instead of using `as_tool`.
- Error handling in tools: If a function tool raises an error, by default the LLM will be told an error occurred. You can supply a `failure_error_function` to provide a custom message to the LLM when a tool fails, or set it to `None` to let exceptions propagate (which will raise errors in your Python code). For custom-created `FunctionTool`s, you should catch errors inside `on_invoke_tool`.
Model Context Protocol (MCP)¶
MCP stands for Model Context Protocol, a standard for giving LLMs access to external tools and data sources. The Agents SDK supports MCP so you can integrate external servers.
MCP servers come in three types:
- Stdio servers: Run as subprocesses of your app (local).
- HTTP SSE servers: Remote, accessed via HTTP+Server-Sent Events.
- Streamable HTTP servers: Remote, using a streaming HTTP protocol.
The SDK has a class for each: `MCPServerStdio`, `MCPServerSse`, and `MCPServerStreamableHttp`.
You use them by connecting to an MCP server. For example, using the official MCP filesystem server via npm:
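A sketch of connecting to that server (the `./samples` directory is a placeholder):

```python
import asyncio
from agents.mcp import MCPServerStdio

async def main():
    # Serve files from a local directory using the official filesystem MCP server.
    async with MCPServerStdio(
        params={
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "./samples"],
        }
    ) as mcp_server:
        ...  # create and run an agent here (see below)

asyncio.run(main())
```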
Once you have an mcp_server
object, add it to an agent:
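A minimal sketch:

```python
from agents import Agent

agent = Agent(
    name="Assistant",
    instructions="Use the filesystem tools to help the user with their files.",
    mcp_servers=[mcp_server],
)
```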
The SDK will call list_tools()
on each MCP server at the start of every run. The agent will see those tools available. If the agent calls one of those tools, the SDK will call mcp_server.call_tool(...)
to execute it.
Caching: By default, list_tools()
is called on every run (which could be slow for remote servers). You can enable caching by passing cache_tools_list=True
when creating the MCP server object. This will reuse the same tool list each time. Only use caching if the tool list never changes. You can invalidate the cache manually with invalidate_tools_cache()
.
For examples, see the official examples repo (https://github.com/openai/openai-agents-python/tree/main/examples/mcp). Note that tracing automatically captures MCP calls too.
Handoffs¶
Handoffs are a way for one agent to hand off a task to another agent. This is useful when different agents have different specialties. For example, a support agent could hand off to a Refund agent
or a Sales agent
depending on the question.
When an agent does a handoff, it’s like it calls a special tool. If the agent named Refund agent
is configured as a handoff target, the tool would be called transfer_to_refund_agent
by default.
Creating a Handoff¶
In an agent’s configuration, the handoffs
parameter lists the possible handoff targets. You can just list agents, or use the handoff()
helper to customize. For example:
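A sketch (the agents and instructions are illustrative):

```python
from agents import Agent, handoff

refund_agent = Agent(name="Refund agent", instructions="Handle refund requests.")
sales_agent = Agent(name="Sales agent", instructions="Answer sales questions.")

support_agent = Agent(
    name="Support agent",
    instructions="Help the customer and hand off when appropriate.",
    handoffs=[refund_agent, handoff(sales_agent)],
)
```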
When listed as `handoffs=[refund_agent]`, the SDK will by default create a tool named `transfer_to_refund_agent` with a default description like "handoff to Refund agent".
Customizing with handoff()¶
The handoff()
function lets you fine-tune the handoff:
- `agent`: The `Agent` object to hand off to.
- `tool_name_override`: Provide a custom name for the tool (otherwise `Handoff.default_tool_name()` is used).
- `tool_description_override`: Custom description text for the tool.
- `on_handoff`: A callback function that runs when the handoff is invoked. It receives a context and optionally the LLM-provided input. Useful for triggering side effects or logging.
- `input_type`: The type (e.g. a Pydantic model) of data the LLM should supply when doing the handoff.
- `input_filter`: A function to filter or modify the conversation history passed to the new agent.
Example with custom settings:
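A sketch of a customized handoff (the names and callback are illustrative):

```python
from agents import Agent, RunContextWrapper, handoff

def on_handoff(ctx: RunContextWrapper[None]) -> None:
    print("Handoff to the support agent was invoked")

support_agent = Agent(name="Support agent", instructions="Handle support questions.")

handoff_obj = handoff(
    agent=support_agent,
    tool_name_override="custom_handoff_tool",
    tool_description_override="Transfer tricky questions to the support specialist.",
    on_handoff=on_handoff,
)
```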
Here handoff_obj
is a Handoff
with custom tool name and description, and an on_handoff
callback that prints something.
Handoff Inputs¶
Sometimes you want the user (via the LLM) to provide some data when calling a handoff. For instance, if handing off to an "Escalation agent", you might ask for a reason. You specify input_type
for this:
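A sketch using a Pydantic model for the handoff input:

```python
from pydantic import BaseModel
from agents import Agent, RunContextWrapper, handoff

class EscalationData(BaseModel):
    reason: str

async def on_handoff(ctx: RunContextWrapper[None], input_data: EscalationData) -> None:
    print(f"Escalation agent called with reason: {input_data.reason}")

escalation_agent = Agent(name="Escalation agent", instructions="Handle escalated issues.")

handoff_obj = handoff(
    agent=escalation_agent,
    on_handoff=on_handoff,
    input_type=EscalationData,
)
```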
Now the LLM should provide a JSON (or equivalent) matching EscalationData
when it calls this handoff. Your on_handoff
function gets that parsed data.
Input Filters¶
By default, when a handoff happens, the next agent sees the entire conversation history. If you want to change what the new agent sees, use an input_filter
. An input filter takes a HandoffInputData
(which has the context and previous items) and returns a new HandoffInputData
to use.
For common use-cases, the SDK provides filters. For example, to remove all tool calls from history when handing off to the FAQ agent:
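A minimal sketch using the built-in filter:

```python
from agents import Agent, handoff
from agents.extensions import handoff_filters

faq_agent = Agent(name="FAQ agent", instructions="Answer frequently asked questions.")

handoff_obj = handoff(
    agent=faq_agent,
    input_filter=handoff_filters.remove_all_tools,  # strip tool calls from the history
)
```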
This means the FAQ agent only sees the user’s messages, without any internal tool calls from before.
Recommended Prompts for Handoffs¶
To help LLMs handle handoffs properly, the SDK provides some prompt templates. You can either prepend agents.extensions.handoff_prompt.RECOMMENDED_PROMPT_PREFIX
to your instructions, or use prompt_with_handoff_instructions()
to automatically add instructions about handoff to your agent’s prompt.
For example:
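A minimal sketch using the helper function:

```python
from agents import Agent
from agents.extensions.handoff_prompt import prompt_with_handoff_instructions

billing_agent = Agent(
    name="Billing agent",
    instructions=prompt_with_handoff_instructions(
        "Help customers with billing questions."
    ),
)
```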
This ensures your agents know how the handoff system works.
Tracing¶
The Agents SDK automatically traces (logs) everything that happens during a run. This includes:
- LLM calls
- Tool calls
- Handoffs
- Guardrails
- etc.
Traces are sent to OpenAI’s Traces dashboard (platform.openai.com) so you can visualize and debug your workflows.
Tracing is on by default. You can disable it by:
* Setting the environment variable OPENAI_AGENTS_DISABLE_TRACING=1.
* Calling set_tracing_disabled(True) in code.
* For a specific run, setting RunConfig(tracing_disabled=True).
Traces and Spans¶
- Trace: Represents an entire operation (workflow) from start to finish. For example, one user conversation with possibly multiple agent calls could be one trace. Key properties of a trace:
  - `workflow_name`: Name of the workflow (e.g. "Customer Support").
  - `trace_id`: A unique ID. If you don't set one, it's generated automatically (like `trace_<32_chars>`).
  - `group_id`: Optional. You can use this to link related traces (for example, all turns in the same chat).
  - `disabled`: If true, nothing is recorded.
  - `metadata`: Extra data you can attach.
- Span: Represents a single timed operation within the trace. Spans are nested. Each span has:
  - `trace_id` (which trace it belongs to).
  - `parent_id` (which other span contains it, if any).
  - `start` and `end` timestamps.
  - `span_data`: Information about what happened (like an agent call, or a generation, etc.).
By default, the SDK creates traces and spans automatically:
- The `Runner.run` call is wrapped in a trace.
- Each agent run is an `agent_span`.
- Each LLM generation is a `generation_span`.
- Each function (tool) call is a `function_span`.
- Each guardrail check is a `guardrail_span`.
- Each handoff is a `handoff_span`.
- If using voice pipelines, there are spans for transcription (`transcription_span`) and speech output (`speech_span`), etc.
By default, the top-level trace is named "Agent workflow". You can change the name by using `with trace("Name"): ...` or via `RunConfig` (e.g. `workflow_name`).
Higher-level Traces¶
If you call Runner.run
multiple times but want them in one big trace, use the trace()
context manager:
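A sketch of grouping two runs under one trace (the joke workflow is illustrative):

```python
import asyncio
from agents import Agent, Runner, trace

async def main():
    agent = Agent(name="Joke generator", instructions="Tell funny jokes.")

    with trace("Joke workflow"):
        first_result = await Runner.run(agent, "Tell me a joke")
        second_result = await Runner.run(agent, f"Rate this joke: {first_result.final_output}")
        print(f"Joke: {first_result.final_output}")
        print(f"Rating: {second_result.final_output}")

asyncio.run(main())
```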
Because of with trace("Joke workflow")
, both run
calls are in the same trace.
Creating Traces Manually¶
If needed, you can manually start/finish traces:
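A sketch of manual trace management; the `mark_as_current` / `reset_current` flags shown here reflect the trace API as I understand it, so treat them as an assumption:

```python
from agents import trace

workflow_trace = trace("My workflow")
workflow_trace.start(mark_as_current=True)
try:
    ...  # run agents here
finally:
    workflow_trace.finish(reset_current=True)
```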
However, using with trace(...)
is recommended.
Spans¶
Normally you don’t need to create spans manually. But if you have a custom operation, use @custom_span()
decorator:
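A minimal sketch (the helper function is hypothetical):

```python
from agents import custom_span

def fetch_user_records():
    # Recorded as a span inside the currently active trace.
    with custom_span("fetch_user_records"):
        ...  # your custom work here
```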
This creates a span in the current trace. By default, new spans are children of the current open span.
Sensitive Data¶
By default, the trace includes inputs/outputs of generations and function calls (which may contain sensitive text). You can disable this:
- Set `RunConfig.trace_include_sensitive_data=False` to omit LLM and tool inputs/outputs.
- For voice pipelines, set `VoicePipelineConfig.trace_include_sensitive_audio_data=False` to omit audio data.
Custom Trace Processors¶
The SDK uses OpenAI’s trace exporter by default. You can add your own processors to send spans elsewhere:
- `add_trace_processor(your_processor)` adds an extra processor (so the original OpenAI export still happens).
- `set_trace_processors([proc1, proc2, ...])` replaces the default processors. If you do this, you must include an exporter if you still want to send traces to OpenAI (unless you intend to only use other backends).
The docs mention supported external processors like Weights & Biases, Arize, MLflow, etc., which you can plug in by adding their processor.
Context Management¶
“Context” can mean two things here:
- Local (Python) context: Data in your code that you want available in tools, hooks, etc. For example, a user object or a database client.
- Agent/LLM context: The text that the LLM sees as part of the conversation.
Local Context (RunContextWrapper)¶
Local context is passed to tools and hooks via the RunContextWrapper
. You provide a context object when calling run. All parts of the run share this context (tools, handoff callbacks, etc.).
How to use:
- Create any Python object (e.g. dataclass or Pydantic model) for context.
- Pass it to `Runner.run(..., context=my_context)`.
- Your tools or hooks can declare they accept `RunContextWrapper[MyContextType]`.
- Inside, use `wrapper.context` to access your data.
The `UserInfo` / `greet` example shown earlier in the Agents section is exactly this pattern.
Key points:
- Every part of the run must agree on the type of context.
- The context object is not sent to the LLM; it’s only for your code.
- Use it for any data or resources your code needs (user info, config, database handles, loggers, etc.).
Agent/LLM Context (Prompt, System Instructions, etc.)¶
The LLM only sees what you put in the conversation history: instructions, system messages, and input messages. To give the LLM new data:
- Instructions: You can put static data (e.g. current date, user name) in the agent's instructions (system prompt); `instructions` can also be a function that uses the context to produce the string.
- Input: You can prepend data to the user prompt when calling `Runner.run`.
- Function tools: Let the LLM fetch data on demand via tools.
- Retrieval/Web search tools: Use the built-in tools to fetch relevant documents or web info.
For example, to make user’s name available to the model, you might do:
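A minimal sketch (`user_name` is a hypothetical value from your own code):

```python
from agents import Agent

user_name = "Alice"  # hypothetical value pulled from your local context

agent = Agent(
    name="Assistant",
    instructions=f"The user's name is {user_name}. Address them by name.",
)
```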
Or have a @function_tool
that returns user profile and let the model call it when needed.
In short, any extra context for the LLM must come through the messages it sees.
Guardrails¶
Guardrails are like side-checks that run in parallel with your agent. They let you validate or sanitize inputs and outputs cheaply. For instance, you might run a quick check on user input to block harmful queries before using an expensive model.
There are two types of guardrails:
- Input guardrails: Run on the user’s initial input, before any agent.
- Output guardrails: Run on the final output of the last agent.
Each guardrail is essentially a function you write that returns a GuardrailFunctionOutput
. It contains:
- Some `output_info` (optional additional info).
- A `tripwire_triggered` boolean flag.
If tripwire_triggered
is True
, the SDK raises an exception (InputGuardrailTripwireTriggered
or OutputGuardrailTripwireTriggered
) and stops the run.
Input Guardrails¶
Input guardrails run in three steps:
- Receive the initial input (text or message list).
- Run your guardrail function to get a `GuardrailFunctionOutput`.
- Check `tripwire_triggered`. If it is true, an `InputGuardrailTripwireTriggered` exception is raised.
Write input guardrails with the @input_guardrail
decorator. For example:
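A sketch of such a guardrail (the function name and keyword check are illustrative):

```python
from agents import Agent, GuardrailFunctionOutput, RunContextWrapper, input_guardrail

@input_guardrail
async def math_homework_guardrail(
    ctx: RunContextWrapper, agent: Agent, user_input
) -> GuardrailFunctionOutput:
    # Demonstration only: trip the guardrail on obvious math-homework phrasing.
    looks_like_homework = "solve for x" in str(user_input).lower()
    return GuardrailFunctionOutput(
        output_info={"looks_like_homework": looks_like_homework},
        tripwire_triggered=looks_like_homework,
    )
```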
Attach it to an agent:
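```python
from agents import Agent

agent = Agent(
    name="Customer support agent",
    instructions="You help customers with their questions.",
    input_guardrails=[math_homework_guardrail],
)
```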
If a user asks something with "solve for x", the guardrail trips and the agent stops running the expensive model.
Note: Input guardrails only run for the first agent in a run, since they’re meant for initial user input.
Output Guardrails¶
Output guardrails are similar but run after the agent produces its final output:
- Receive the final output (as a string or object).
- Run the guardrail function to get a `GuardrailFunctionOutput`.
- If `tripwire_triggered` is true, raise `OutputGuardrailTripwireTriggered`.
Example output guardrail:
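A sketch (the guardrail name and the "secretcode" check are illustrative):

```python
from agents import Agent, GuardrailFunctionOutput, RunContextWrapper, output_guardrail

@output_guardrail
async def no_secrets_guardrail(
    ctx: RunContextWrapper, agent: Agent, output
) -> GuardrailFunctionOutput:
    leaked = "secretcode" in str(output).lower()
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=leaked)

agent = Agent(
    name="Assistant",
    instructions="Never reveal internal codes.",
    output_guardrails=[no_secrets_guardrail],
)
```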
If the model’s answer contains "secretcode", the guardrail trips and no output is returned.
Note: Output guardrails only run for the last agent, since they check the final answer.
Tripwires¶
When any guardrail tripwire is triggered, the run is immediately halted with the corresponding exception. You can catch these exceptions to handle them. For example:
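A sketch, assuming the guarded `agent` from the example above:

```python
import asyncio
from agents import InputGuardrailTripwireTriggered, Runner

async def main():
    try:
        result = await Runner.run(agent, "Can you solve for x: 2x + 3 = 11?")
        print(result.final_output)
    except InputGuardrailTripwireTriggered:
        print("The input guardrail blocked this request.")

asyncio.run(main())
```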
When a tripwire triggers, no output is returned for that run.
Implementing a Guardrail¶
Here’s a fuller example of an input guardrail implemented by running a mini-agent:
Here, math_guardrail runs a small helper agent to decide whether the input is a homework request; if it is, it triggers the tripwire. Output guardrails work the same way. See the official guardrails docs for another example: https://openai.github.io/openai-agents-python/guardrails/.
Orchestrating multiple agents¶
Orchestration means controlling which agents run and in what order to achieve your application’s goals. There are two main approaches:
- Via LLM (Autonomous Planning): Give an agent tools and have it decide the steps. It can use web search, function tools, code, and handoffs to figure out what to do. This works well for open-ended tasks. For success:
  - Write good prompts listing available tools and constraints.
  - Monitor your app and improve prompts as needed.
  - Let agents self-improve (e.g. by critiquing themselves).
  - Use specialized agents rather than one do-it-all agent.
  - Use evaluations (OpenAI Evals) to train and improve agents.

  Example: A research agent could have tools for web search, database lookup, code execution, and handoffs to writing agents. You prompt it to plan and use these tools to answer a query.

- Via code (Deterministic chaining): You control the flow in your Python code. This can be more predictable and efficient. Patterns include:
  - Asking one agent to categorize a task, then picking the next agent in code based on that category.
  - Chaining agents: Agent 1 output -> Agent 2 input -> Agent 3, etc. (e.g. outline -> draft -> critique).
  - Running feedback loops: have one agent answer and another critique it, repeating until the result is good enough.
  - Running agents in parallel (e.g. with `asyncio.gather`) for independent sub-tasks.
There are examples of these patterns in the official repo: https://github.com/openai/openai-agents-python/tree/main/examples/agent_patterns.
Mixing both methods is fine: you might have a high-level loop in Python that sometimes lets an agent plan its own steps and sometimes runs agents in a fixed sequence. Use whichever approach suits your task. Autonomous planning is powerful for vague tasks, but code orchestration is safer and more controllable for well-defined pipelines.
Models¶
By default, the SDK supports OpenAI models in two ways:
- OpenAIResponsesModel: Uses the new Responses API.
- OpenAIChatCompletionsModel: Uses the classic Chat Completions API.
We recommend using OpenAIResponsesModel
with OpenAI’s latest models when possible.
Non-OpenAI Models (LiteLLM)¶
You can use other LLM providers via the LiteLLM integration. First, install the extra dependencies:
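```bash
pip install "openai-agents[litellm]"
```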
Then you can use the litellm/
prefix with many models. For example:
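A sketch; the exact model identifiers are illustrative and depend on what LiteLLM supports:

```python
from agents import Agent

claude_agent = Agent(
    name="Claude agent",
    instructions="Be concise.",
    model="litellm/anthropic/claude-3-5-sonnet-20240620",
)

gemini_agent = Agent(
    name="Gemini agent",
    instructions="Be concise.",
    model="litellm/gemini/gemini-1.5-flash",
)
```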
This uses the LiteLLM library to call those models.
You can also integrate other LLMs in other ways:
- `set_default_openai_client(...)`: If an LLM provider has an OpenAI-compatible endpoint, you can give an `AsyncOpenAI(base_url=..., api_key=...)` client to the SDK as the default client.
- `ModelProvider`: At each run, you can pass a `model_provider` to use a different provider for that run.
- `Agent.model`: You can also give a specific agent a custom `Model` object (like an `OpenAIChatCompletionsModel` instance or a `LitellmModel`).
If you use non-OpenAI models, consider that:
- Not all providers support the new Responses API; many only support the Chat Completions API. If you get a 404, either call `set_default_openai_api("chat_completions")` or use `OpenAIChatCompletionsModel`.
- Not all providers support structured JSON outputs. If you try to use a JSON schema output on a provider that doesn't support it, you may get errors like `BadRequestError`.
- Providers may lack features: some don't support file search, web search, or vision. Make sure the features you use are supported, or avoid them.
Mixing and Matching Models¶
Within one app, you might use different models for different agents. For example, a light agent for routing and a big agent for deep tasks. To do this, set each agent’s model
:
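A sketch of mixing a small and a large model (the model names are examples):

```python
from agents import Agent

research_agent = Agent(
    name="Research agent",
    instructions="Answer in depth.",
    model="gpt-4o",          # larger model for the heavy lifting
)

triage_agent = Agent(
    name="Triage agent",
    instructions="Route the request to the right specialist.",
    model="gpt-4o-mini",     # small, fast model for routing
    handoffs=[research_agent],
)
```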
You can also give a ModelSettings(temperature=...)
to fine-tune each agent’s model parameters.
Note: Try to stick with one model type (Responses vs Chat) per workflow, because the SDK’s features and tools support may differ. If you do mix them, make sure any feature you use (like structured outputs or multimodal) is supported by all providers involved.
Common issues when using other providers:
- Tracing 401 error: Traces are sent to OpenAI, so if you don't have an OpenAI key you'll get a 401. Fix this by disabling tracing (`set_tracing_disabled(True)`) or by setting an OpenAI key just for traces (`set_tracing_export_api_key(...)`).
- Chat vs Responses API: By default the SDK uses the Responses API, which many providers don't support. If you get an error, switch to chat completions via `set_default_openai_api("chat_completions")` or use `OpenAIChatCompletionsModel`.
- Structured output errors: Some models support JSON output but not custom schemas. You might see a 400 error about `'response_format.type' not allowed`. This is a limitation of the provider; for now, avoid structured schema outputs on those providers.
- Feature differences: Different providers have different capabilities (e.g. image input, retrieval, special tools). Don't send unsupported tools or data; for example, don't send an image to a text-only model. If mixing providers, filter out unsupported features.
Using any model via LiteLLM¶
LiteLLM integration lets you pick from 100+ models easily via the LitellmModel
class.
Setup:
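```bash
pip install "openai-agents[litellm]"
```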
Then:
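A sketch (the model name and API key are placeholders):

```python
from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel

agent = Agent(
    name="Assistant",
    instructions="Be concise.",
    model=LitellmModel(
        model="anthropic/claude-3-5-sonnet-20240620",  # example model name
        api_key="YOUR_ANTHROPIC_API_KEY",              # placeholder
    ),
)

result = Runner.run_sync(agent, "What is the capital of France?")
print(result.final_output)
```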
When you use LitellmModel, you can pass any supported model name together with the matching API key. For example, you could use google/gemini-1.5-flash with your Google key, or anthropic/claude-3-5-sonnet-20240620 with your Anthropic key. LiteLLM supports many providers (see https://openai.github.io/openai-agents-python/models/litellm/); it wraps them under the hood so you can use them like an OpenAI model.
Configuring the SDK¶
API Keys and Clients¶
By default, the SDK looks for the OPENAI_API_KEY
environment variable for making model requests and for sending traces. This happens as soon as the SDK is imported. If you can’t set the environment variable early, you can manually set the key in code:
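```python
from agents import set_default_openai_key

set_default_openai_key("sk-...")  # replace with your key
```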
This will override the environment variable for model calls. By default, this key is also used for tracing. If you want to use a different key for tracing, see below.
Tracing¶
Tracing is on by default. It uses your OpenAI API key for sending traces. If you want to explicitly set the tracing key (maybe different from your model key):
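```python
from agents import set_tracing_export_api_key

set_tracing_export_api_key("sk-...")  # key used only for uploading traces
```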
To disable tracing entirely:
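```python
from agents import set_tracing_disabled

set_tracing_disabled(True)
```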
Client¶
You can also set a custom OpenAI client. The SDK uses an AsyncOpenAI
client by default. To use a custom one:
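A sketch (the base URL and key are placeholders for an OpenAI-compatible endpoint):

```python
from openai import AsyncOpenAI
from agents import set_default_openai_client

custom_client = AsyncOpenAI(base_url="https://example.com/v1", api_key="...")
set_default_openai_client(custom_client)
```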
Finally, you can switch which OpenAI API is used. By default it uses the Responses API. To force the classic Chat API:
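```python
from agents import set_default_openai_api

set_default_openai_api("chat_completions")
```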
Debug Logging¶
The SDK has built-in loggers (no handlers by default). Warnings/errors go to stdout, but other logs are hidden unless you enable them.
To turn on verbose logging to stdout:
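```python
from agents import enable_verbose_stdout_logging

enable_verbose_stdout_logging()
```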
Sensitive Data in Logs¶
Some log messages may include user data or model inputs/outputs. To disable logging this:
- Set
OPENAI_AGENTS_DONT_LOG_MODEL_DATA=1
in your environment to hide model prompts/responses. - Set
OPENAI_AGENTS_DONT_LOG_TOOL_DATA=1
to hide tool inputs/outputs.
Agent Visualization¶
You can visualize how your agents, tools, and handoffs connect using Graphviz. The draw_graph
function creates a directed graph:
- Agents are yellow rectangles.
- Tools are green ellipses.
- Handoffs are solid arrows between agent boxes.
- Tool calls are dotted arrows from agents to tools.
- There is a special start node
__start__
and an end node__end__
.
Installation¶
Install the optional visualization tools:
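```bash
pip install "openai-agents[viz]"
```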
Generating a Graph¶
Use draw_graph(root_agent)
to make the graph. Example:
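A sketch matching the description below (agent names and the tool are illustrative):

```python
from agents import Agent, function_tool
from agents.extensions.visualization import draw_graph

@function_tool
def get_weather(city: str) -> str:
    """Return a short weather report."""
    return f"The weather in {city} is sunny."

spanish_agent = Agent(name="spanish_agent", instructions="You only speak Spanish.")
english_agent = Agent(name="english_agent", instructions="You only speak English.")

triage_agent = Agent(
    name="triage_agent",
    instructions="Hand off to the appropriate agent based on language.",
    handoffs=[spanish_agent, english_agent],
    tools=[get_weather],
)

draw_graph(triage_agent)
```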
This will display an inline image (in a notebook or supported environment) showing triage_agent
connected to spanish_agent
and english_agent
, with get_weather
as a tool.
Understanding the Graph¶
In the generated graph:
- The start node (
__start__
) shows the entry point. - Yellow boxes are agents.
- Green ellipses are tools.
- Solid arrows show agent-to-agent handoffs.
- Dotted arrows show agent-to-tool calls.
- The end node (
__end__
) shows termination.
This helps you see at a glance how your agents and tools are structured.
Customizing the Graph¶
By default, draw_graph()
displays the graph inline. You can also open it in a separate window or save it to a file:
- To open in a window (for example, if running locally):
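```python
draw_graph(triage_agent).view()
```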
- To save to a file:
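```python
draw_graph(triage_agent, filename="agent_graph")
```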
This creates agent_graph.png
(or .pdf
if you specify) in your working directory.