If you’ve ever woken up to a surprise $500 bill from OpenAI, you know the pain. Unlike traditional software, where costs are predictable, AI applications carry a hidden danger: a single bug or bad prompt can drain your budget in minutes.

Imagine this: your AI agent gets stuck in a loop, calling GPT-4 thousands of times. By the time you notice, the damage is done. Traditional usage alerts from LLM providers arrive too late, sometimes hours after the spending happened.

Aden acts as a speed bump between your code and the LLM API. It tracks every call in real time and can automatically take action before costs spiral out of control.
The SDK enforces budgets that you configure on the Aden control server. When your application starts, it connects to the server and receives the current policy. All enforcement happens locally—no per-request latency added.
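For context, here is the minimal setup the examples on this page assume, using the TypeScript SDK (the same calls appear in the full example further down):

```typescript
import { instrument } from "aden-ts";
import OpenAI from "openai";

// Connect to the Aden control server and fetch the current budget policy.
// From here on, every call through the instrumented OpenAI SDK is metered.
await instrument({
  apiKey: process.env.ADEN_API_KEY,
  serverUrl: process.env.ADEN_API_URL,
  sdks: { OpenAI },
});

const openai = new OpenAI();
```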
Once budgets are configured on the server, your SDK automatically enforces them:
```typescript
// At $0 spent (0% of $100 budget)
await openai.chat.completions.create({ model: "gpt-4o", ... });
// → Uses gpt-4o ✓

// At $50 spent (50% of budget)
await openai.chat.completions.create({ model: "gpt-4o", ... });
// → Uses gpt-4o, triggers onAlert callback with "warning"

// At $80 spent (80% of budget)
await openai.chat.completions.create({ model: "gpt-4o", ... });
// → Automatically uses gpt-4o-mini instead (degraded)

// At $95 spent (95% of budget)
await openai.chat.completions.create({ model: "gpt-4o", ... });
// → Uses gpt-4o-mini, request delayed 2 seconds (throttled)

// At $100+ spent (100% of budget)
await openai.chat.completions.create({ model: "gpt-4o", ... });
// → Throws RequestCancelledError (blocked)
```
The SDK caches the policy locally and syncs with the server periodically. This means zero latency overhead on each request while still getting real-time budget updates.
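To react programmatically when a threshold is crossed, you can hook into the onAlert callback mentioned above. This is a sketch under assumptions: the placement of onAlert in the instrument options and the shape of its payload may differ in your SDK version:

```typescript
await instrument({
  apiKey: process.env.ADEN_API_KEY,
  serverUrl: process.env.ADEN_API_URL,
  sdks: { OpenAI },
  // Assumption: onAlert is registered here and fires with a level such as
  // "warning" when a budget threshold (e.g. 50%) is crossed.
  onAlert: (alert) => {
    console.warn("Budget alert:", alert);
    // e.g. page the on-call channel, or flip a feature flag
  },
});
```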
Alerts are great, but what if you want the system to automatically slow down? Throttling adds delays to requests when approaching limits—your agent keeps working, just more slowly.
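Throttling is invisible to your code: the awaited call simply takes longer to resolve. A quick way to observe it, reusing the instrumented client from above (the ~2-second delay matches the throttled tier in the earlier example):

```typescript
const start = Date.now();
await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Still within budget?" }],
});
// Near the limit, expect roughly 2000ms of added latency from throttling.
console.log(`Request took ${Date.now() - start}ms`);
```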
This is one of the most powerful features. When approaching your budget, why pay premium prices? Aden can automatically switch to a cheaper model—your users still get answers, and you save money.
Thresholds are configurable; with a simple two-tier policy, for example:

- Budget at 0-50%: "gpt-4o" → Full power, highest quality
- Budget at 50-100%: "gpt-4o-mini" → Cheaper, still good for most tasks
- Budget exceeded: [blocked] → No more requests
The magic is that your code still asks for gpt-4o; Aden silently swaps in the cheaper model behind the scenes. Model degradation keeps your application running without any code changes: your users get answers, and your wallet stays happy.

Your code stays the same; the SDK handles the model swap automatically:
TypeScript
```typescript
import { instrument } from "aden-ts";
import OpenAI from "openai";

await instrument({
  apiKey: process.env.ADEN_API_KEY,
  serverUrl: process.env.ADEN_API_URL,
  sdks: { OpenAI },
});

const openai = new OpenAI();

// Early in the month (0% budget used)
await openai.chat.completions.create({
  model: "gpt-4o", // You ask for gpt-4o
  messages: [{ role: "user", content: "Quick question" }],
});
// → Uses gpt-4o ✓ (full quality)

// Later (50%+ budget used)
await openai.chat.completions.create({
  model: "gpt-4o", // You still ask for gpt-4o
  messages: [{ role: "user", content: "Another question" }],
});
// → Automatically uses gpt-4o-mini instead (cheaper!)
```
Python
```python
import os
import asyncio

from aden import instrument_async, uninstrument_async, MeterOptions


async def main():
    await instrument_async(MeterOptions(
        api_key=os.environ.get("ADEN_API_KEY"),
        server_url=os.environ.get("ADEN_API_URL"),
    ))

    from openai import AsyncOpenAI

    client = AsyncOpenAI()

    # Your code always asks for the premium model
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Help me with this task"}],
    )

    # Aden may have used gpt-4o-mini behind the scenes
    # Your app keeps working, costs stay under control

    await uninstrument_async()


asyncio.run(main())
```
Sometimes you need to just stop. When your budget is exhausted, Aden can block requests entirely. This is your emergency stop button—it stops the bleeding immediately.
When the budget hits 100%, any new LLM request fails with a RequestCancelledError. The request never reaches OpenAI/Anthropic, so you don’t get charged.
Make sure your application handles RequestCancelledError gracefully. Show users a friendly message like “You’ve reached your usage limit” rather than crashing.
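A minimal handling sketch in TypeScript, assuming RequestCancelledError is exported from aden-ts (check your SDK version’s exports) and reusing the instrumented openai client from earlier:

```typescript
import { RequestCancelledError } from "aden-ts"; // assumption: exported by the SDK

async function ask(userMessage: string): Promise<string> {
  try {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [{ role: "user", content: userMessage }],
    });
    return response.choices[0].message.content ?? "";
  } catch (err) {
    if (err instanceof RequestCancelledError) {
      // Budget exhausted: the request was blocked locally and never billed.
      return "You've reached your usage limit. Please try again later.";
    }
    throw err; // unrelated failures should still surface
  }
}
```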