Efficiency Bug
OpenAI Agents SDK Is Wasting Your Resources: Here's Why
If you're using the stop_on_first_tool behavior in the OpenAI Agents SDK with parallel tool calls enabled, there's a silent inefficiency you need to know about.
Let me break it down:
What Actually Happens Under the Hood
When you configure an agent with the stop_on_first_tool behavior and parallel_tool_calls=True, and the LLM selects multiple tools in one response (say, test_tool_two alongside another tool), here's what the agent actually does:
- The LLM chooses both tools in a single step
- The SDK uses asyncio.gather() to run all tool calls in parallel
- All tools execute to completion, even if test_tool_two is expensive or unnecessary
- Then it simply discards every result except tool_results[0]
- Your system never even uses the output of the other tools
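The flow above can be imitated with nothing but the standard library. This is a rough simulation, not the SDK's actual code: test_tool_one, calls_run, and run_stop_on_first_tool are hypothetical names invented for illustration, and the sleeps stand in for real work.

```python
import asyncio

calls_run = []  # hypothetical log of every tool that actually executed


async def test_tool_one() -> str:
    # Hypothetical quick tool
    await asyncio.sleep(0.01)
    calls_run.append("test_tool_one")
    return "result from tool one"


async def test_tool_two() -> str:
    # Hypothetical slower tool standing in for an expensive call
    await asyncio.sleep(0.05)
    calls_run.append("test_tool_two")
    return "result from tool two"


async def run_stop_on_first_tool() -> str:
    # Every tool call is awaited together and runs to completion...
    tool_results = await asyncio.gather(test_tool_one(), test_tool_two())
    # ...but only the first result is kept as the final output.
    return tool_results[0]


final_output = asyncio.run(run_stop_on_first_tool())
print(final_output)  # result from tool one
print(calls_run)     # ['test_tool_one', 'test_tool_two']
```

Both tools show up in calls_run even though only one result survives, which is exactly the waste described here.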
That Means: Tool Results Are Wasted
- API tokens spent on unused calls
- Time lost on long-running or blocking tools
- Side effects triggered even when their results are thrown away
- You're billed, they're ignored
If you're calling external APIs or hitting databases, this can get really expensive.
Why Does This Happen?
It's a design trade-off.
The current SDK chooses determinism over efficiency. It guarantees predictable behavior by always using tool_results[0], regardless of execution order or timing.
But it does not cancel remaining tasks once the first completes. It does not use asyncio.wait(..., return_when=FIRST_COMPLETED).
This means: you pay for all tools but use only one.
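To make the determinism point concrete, here is a small stdlib-only sketch, assuming the gather-based behavior described above. The tool helper and the names slow_tool and fast_tool are made up for illustration. asyncio.gather() returns results in argument order, so index 0 is the first-listed call even when another call finishes much sooner, and the total wait is set by the slowest call.

```python
import asyncio
import time


async def tool(name: str, delay: float) -> str:
    # Hypothetical tool: sleeps for `delay` seconds, then returns its own name
    await asyncio.sleep(delay)
    return name


async def main():
    start = time.perf_counter()
    # The slow call is listed first, the fast one second.
    results = await asyncio.gather(tool("slow_tool", 0.2), tool("fast_tool", 0.01))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results[0])                 # slow_tool: argument order, not completion order
print(f"waited {elapsed:.2f}s")   # roughly the slowest tool's duration
```

Predictable, yes, but you always wait for (and pay for) the slowest call.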
What Should You Do?
If you're using stop_on_first_tool, do one of the following:
- Only register one tool per step; don't give it choices
- Avoid parallel_tool_calls=True with stop_on_first_tool behavior
- Write a custom async executor that uses FIRST_COMPLETED and cancels the rest
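For the last option, one possible shape of such an executor is sketched below using only the standard library. Everything here (first_completed, fast_tool, slow_tool, finished) is a hypothetical name, and how you would wire it into the SDK is left open; treat it as a starting point under those assumptions rather than a drop-in fix.

```python
import asyncio
from typing import Any, Callable, Coroutine, Sequence

ToolCall = Callable[[], Coroutine[Any, Any, str]]


async def first_completed(tool_calls: Sequence[ToolCall]) -> str:
    """Run all calls concurrently, return the first finished result, cancel the rest."""
    tasks = [asyncio.create_task(call()) for call in tool_calls]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Give cancelled tasks a chance to unwind before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    # If several finished at the same moment, prefer the earliest-listed one.
    winner = next(task for task in tasks if task in done)
    return winner.result()


finished = []  # hypothetical log of tools that ran to completion


async def fast_tool() -> str:
    await asyncio.sleep(0.01)
    finished.append("fast_tool")
    return "fast result"


async def slow_tool() -> str:
    await asyncio.sleep(1.0)
    finished.append("slow_tool")  # not reached once the task is cancelled
    return "slow result"


result = asyncio.run(first_completed([slow_tool, fast_tool]))
print(result)    # fast result
print(finished)  # ['fast_tool']
```

Two caveats: this returns the first tool to finish rather than the first one listed, so you give up the determinism the SDK currently provides, and cancelling a task does not undo a request that has already reached an external service.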
What's Missing in the SDK?
The SDK should ideally support:
- Efficient first-response-wins behavior
- Tool cancellation on-the-fly
- Configurable execution modes for budget-conscious agents
Until then, it's up to us to optimize tool usage manually.
If you're building agents with expensive tools, keep this in mind, or you might be burning tokens and time for nothing.
To see a practical example of this: https://github.com/DanielHashmi/AgenticAIProjects/blob/main/openai_agents_sdk_parallel_tool_calls.py
If you found this helpful, star this repo! Keep Coding