agent-cost

Why tool-calling agents get expensive.

An agent resends the entire conversation every turn, so input tokens compound and total cost grows quadratically with the number of turns, not linearly. Adjust the inputs and watch where the money goes.


Inputs: USD in / 1M tokens, USD out / 1M tokens
Outputs: total cost, total tokens, paid to replay history

the math

At turn t the model receives the system prompt plus everything before it: input(t) = base + (t−1)·(output + tool_result) + tool_result. Summed over N turns, the replayed-history term (t−1)·(output + tool_result) adds up to (output + tool_result)·N(N−1)/2, so it grows like N². Output is billed once per turn.
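The formula above can be sketched in a few lines of Python. This is a minimal sketch, not the calculator's actual code: the function names, the token counts, and the prices in the example are all hypothetical, and "replayed" here means only the (t−1)·(output + tool_result) term.

```python
def input_tokens(t, base, output, tool_result):
    """Input tokens at turn t (1-indexed), following the formula in the text."""
    return base + (t - 1) * (output + tool_result) + tool_result


def run_cost(turns, base, output, tool_result, usd_in_per_m, usd_out_per_m):
    """Total tokens and cost for a run of `turns` turns (hypothetical helper)."""
    total_in = sum(
        input_tokens(t, base, output, tool_result) for t in range(1, turns + 1)
    )
    total_out = turns * output  # output is billed once per turn
    # Closed form of the replayed-history term: sum of (t-1) for t = 1..N.
    replayed = (output + tool_result) * turns * (turns - 1) // 2
    cost = (total_in * usd_in_per_m + total_out * usd_out_per_m) / 1_000_000
    return {
        "input_tokens": total_in,
        "output_tokens": total_out,
        "replayed_tokens": replayed,
        "cost_usd": cost,
    }


if __name__ == "__main__":
    # Assumed example values: 2,000-token system prompt, 300 output tokens and
    # a 1,500-token tool result per turn, priced at 3 USD in / 15 USD out per 1M.
    for n in (10, 20):
        print(n, run_cost(n, 2000, 300, 1500, 3.0, 15.0))
```

With these assumed numbers, doubling the run from 10 to 20 turns takes the replayed tokens from 81,000 to 342,000, a bit more than four times as many, which is the quadratic growth at work.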

the two levers

Turns make history compound, and large tool results get replayed on every later turn. Truncating tool payloads and enabling prompt caching, which bills the repeated prefix at roughly a tenth of the normal input price, are the highest-leverage fixes. Most surprise bills trace back to one of these two.
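To see how the two levers move the input bill, here is a rough sketch under simplifying assumptions. The function name and parameters are hypothetical. Truncation is modeled as a simple cap on tool-result tokens, and caching is modeled by billing the prefix already sent on the previous turn at `cached_rate` times the normal price. Real providers may add cache-write surcharges, expiry windows, and minimum prefix lengths, none of which are modeled here.

```python
def input_cost_usd(turns, base, output, tool_result, usd_in_per_m,
                   max_tool_tokens=None, cached_rate=1.0):
    """Input-side cost of a run, with optional truncation and caching.

    max_tool_tokens: cap applied to every tool result (assumed truncation model).
    cached_rate: price multiplier for the previously sent prefix; 1.0 means no
                 caching, 0.1 matches the "roughly a tenth" figure in the text.
    """
    if max_tool_tokens is not None:
        tool_result = min(tool_result, max_tool_tokens)

    billed_tokens = 0.0
    prev_input = 0  # tokens sent on the previous turn, i.e. the cacheable prefix
    for t in range(1, turns + 1):
        current = base + (t - 1) * (output + tool_result) + tool_result
        fresh = current - prev_input  # new tokens, billed at the full price
        billed_tokens += fresh + cached_rate * prev_input
        prev_input = current
    return billed_tokens / 1_000_000 * usd_in_per_m


if __name__ == "__main__":
    # Assumed example: 10 turns, 2,000-token prompt, 300 output tokens and a
    # 1,500-token tool result per turn, 3 USD per 1M input tokens.
    args = (10, 2000, 300, 1500, 3.0)
    print("baseline:        ", input_cost_usd(*args))
    print("truncated to 500:", input_cost_usd(*args, max_tool_tokens=500))
    print("cached at 0.1x:  ", input_cost_usd(*args, cached_rate=0.1))
    print("both:            ", input_cost_usd(*args, max_tool_tokens=500,
                                              cached_rate=0.1))
```

In this toy model the two levers stack: truncation shrinks the amount of history replayed each turn, and caching lowers the price paid for whatever is still replayed.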

Provider prices are entered in USD and editable. Currency totals use approximate live rates and are a client-side estimate, not a billing guarantee.