An agent resends the entire conversation every turn — so input tokens compound and cost grows quadratically with turns, not linearly. Adjust the inputs and watch where the money goes.
At turn t the model receives the system prompt plus everything before it: input(t) = base + (t−1)·(output + tool_result) + tool_result. Summed over N turns the replayed-history term grows like N². Output is billed once per turn.
Turns make history compound; large tool results get replayed on every later turn. Truncating tool payloads and enabling prompt caching — which bills the repeated prefix at roughly a tenth — are the highest-leverage fixes. Most surprise bills are one of these two.
Provider prices are entered in USD and editable. Currency totals use approximate live rates and are a client-side estimate, not a billing guarantee.