Give every customer their own AI budget

Set a spending limit for every customer, or team, or agent. Stop overspend the moment it happens, not when the invoice lands. And see exactly what each one cost, so you can bill it back.

Request access View docs

The problem

Selling AI is easy. Controlling it isn't.

The provider sends one bill. Your customers are many. Everything you can't see or stop lives in that gap.

// can't see

You can't tell who costs what

The provider bill is a single number. You've no way to know what any one customer, team, or feature actually cost you.

// margin leak

One heavy user wipes the margin

A single customer on a flat plan can quietly burn the profit you made on ten others, and you find out when the invoice arrives.

// no real limit

Your plan limits aren't real

Your pricing says "500 AI actions a month." Nothing actually stops someone using 5,000 and handing you the bill.

How it works

Budget. Enforce. Bill.

No rebuild. Point your existing AI calls through us, and set limits on the customers you already have.

Budget

Give everyone a limit

Map a budget to each customer, team, or agent. Free gets £5, Pro gets £50, Enterprise whatever you choose — tied to the plans you already sell.

Enforce

Stop overspend in real time

We check every call against the budget and block it the instant the limit's reached — before the cost lands, not after the invoice.

Bill

Charge for what was used

Get the exact cost of every customer, ready to drop into an invoice. The number your provider's bill never breaks down for you.

Works with your existing OpenAI or Anthropic setup — one line of config, or our API. See the docs →

How strict

Soft caps, or hard caps. Your call.

Most teams are well served by soft caps. Turn on hard caps for the budgets where going over isn't an option.

Soft caps · every plan

Simple and instant

Spend is tracked as it happens, and calls are blocked the moment a budget is hit. Works with every model and workload.

Right for most teams. Included on every plan, free tier upward.

Hard caps · add-on

A limit that holds under load

For customer-billed AI, regulated spend, or anything high-stakes. We reserve each call's worst-case cost before it runs — like a hotel holding a deposit on your card at check-in — then settle to the real amount after. Concurrent calls can't collectively bust the limit.

Add to any paid plan. Included on Enterprise. How hard caps work →

What you get

What's in the box

Set up in an afternoon. Works with your existing providers.

Per-customer budgets

A budget for every customer, team, or agent — enforced automatically, in real time.

Real-time enforcement

Calls are blocked the moment a limit is hit. Decisions land in milliseconds, not minutes.

Per-customer usage core

See exactly what each customer used. Export it, bill on it, prove it.

Soft or hard caps

Block at the limit, or reserve cost up front for a guarantee that holds under load — per budget.

200+ models

OpenAI, Anthropic, Gemini, Mistral, Cohere, DeepSeek — one integration, pricing kept current.

Instant alerts

Slack, webhook, or email when a budget is hit or crosses 80%.

Loop protection

Velocity limits catch runaway loops before a bug burns a whole budget.

Full audit trail

Every allowed and blocked call, durably recorded — the record behind what you bill.

Prompts never stored

We record usage and cost, never the contents of your calls.

Ready to invoice

Every customer's cost, itemised

A real billing report — daily and monthly breakdowns, cost per call, CSV export. Not a mockup.

Billing report showing per-customer AI spend grouped by customer

Every row is durably recorded with timestamps. The same data you bill on is the data you defend a chargeback with — one record your finance team can audit on demand.

Pricing

Start free. Pay as you grow.

Simple monthly tiers. Hard caps available as an add-on whenever you need them.

Free

£0

1 agent. Try it on a real workload.

Starter

£25/mo

10 agents, alerts, full enforcement.

Growth

£52/mo

50 agents, all alerts, velocity limits.

Enterprise

£300+/mo

100 agents, priority support, hard caps included.

Hard caps are an add-on for Starter and Growth — for the budgets where going over isn't an option. Included on Enterprise.

Questions

The short version

The things people ask before trying it. The full detail lives in the docs.

Do I have to change my code?

Barely. Point your existing OpenAI or Anthropic setup at our proxy with one line of config, or keep calling the model yourself and report usage through our API. Either way you keep your current code and providers — most teams are running in an afternoon.

What's the difference between soft and hard caps?

A soft cap tracks spend and blocks calls the moment a budget is hit — simple, instant, and right for most teams. A hard cap reserves each call's worst-case cost before it runs, so concurrent calls can't collectively bust the limit — for budgets where going over isn't an option. Hard caps are an add-on for paid plans; included on Enterprise. How hard caps work →

Which models and providers do you support?

OpenAI and Anthropic are supported via proxy (one config change, no other code changes). Gemini, Mistral, Cohere, DeepSeek, and 200+ other models are supported via monitoring mode — your code calls the provider directly and reports token usage to us. A single budget can span more than one provider.

Is my data private?

Yes. We record usage and cost — model, tokens, timing, the allow-or-block decision — and never the contents of your prompts or responses.

What does it cost?

Start free on one agent. Paid plans run from £25/mo (Starter) and £52/mo (Growth) to £300+/mo (Enterprise). Hard caps are an add-on for Starter and Growth; included on Enterprise.

Can I bill my own customers from this?

That's the point. You get the exact cost of every customer, broken down by day and by call, with CSV export — the number a provider invoice never gives you, ready to drop into your own billing.

Stop guessing what each customer costs you

Set a budget for every customer, enforce it in real time, and bill on exactly what was used.

Request access View docs