Budget guide

What an AI engagement actually costs.

Five cost layers, three example total budgets, and the five places the numbers tend to surprise people. Written for the budget owner who has to justify this internally.

Run the ROI math →See pricing models

The five cost layers

1 · The build
$40–150K typical · $20–400K range
What the engineering team charges to design, build, eval, and ship. Fixed-fee Sprint for clear scope, hourly for fuzzy scope. Includes 1–2 weeks of paid discovery upfront.
- Mid-sized AI build with senior engineers: $50–90K
- Multi-agent or complex regulated build: $90–200K
- Tiny POC or quick win: $5–25K
2 · Model + API costs
$200–3K / month per workload
What you pay Anthropic, OpenAI, or open-source providers per token. Pay your own accounts; we don't markup. Caching + smaller-model tiering brings this down meaningfully.
- Light conversational workload (100 convos/day): ~$200/mo
- Heavy workload (10K convos/day): $1–3K/mo
- Voice / video / vision: passthrough + per-minute costs
3 · Infrastructure
$50–500 / month per workload
Vercel hosting, Postgres (Supabase / Neon), observability (Langfuse), email (Resend). Often under $100/mo for a single-agent workload, can scale up.
- Vercel hobby/pro: $0–20/mo
- Postgres + storage: $25–100/mo
- Observability + monitoring: $50–200/mo
4 · Ops + tuning (retainer)
$8–25K / month, optional
Ongoing engineering hours to keep the eval suite running, tune prompts as the product evolves, add new agents. Most clients renew here after the initial sprint.
- Light maintenance: $4–8K/mo (10–20 hrs)
- Active development: $10–20K/mo (40–80 hrs)
- Dedicated team: $20–35K/mo (1+ FTE equivalent)
5 · Your internal time
Often underestimated · plan for it
Your team's hours scoping, reviewing, evaluating, deploying. Plan on 5–15 hours/week from a named human during the sprint, then 2–5 hours/week ongoing.
- Engagement owner (PM-level): 5–10 hrs/wk during build
- Engineering reviewer: 2–4 hrs/wk during build
- Eval suite owner: 1–3 hrs/wk ongoing

Three example total budgets (year 1)

Engagement	Timeline	Build	Run + ops	Your time	Year-1 total
Quick win Standup digest, inbox triage, FAQ deflector	2 weeks build + 6 mo run	$10K	~$3K	~$8K	~$21K
Mid-market sprint Customer-support agent, lead qualifier, ERP query	6 weeks build + 12 mo run	$70K	~$12K	~$25K	~$107K
Enterprise multi-agent Manufacturing ops, KYC platform, multi-tenant SaaS feature	12 weeks build + retainer year	$140K	~$25K + retainer $120K	~$60K	~$345K

Engagement

Timeline

Build

Run + ops

Your time

Year-1 total

Quick win

Standup digest, inbox triage, FAQ deflector

2 weeks build + 6 mo run

$10K

~$3K

~$8K

~$21K

Mid-market sprint

Customer-support agent, lead qualifier, ERP query

6 weeks build + 12 mo run

$70K

~$12K

~$25K

~$107K

Enterprise multi-agent

Manufacturing ops, KYC platform, multi-tenant SaaS feature

12 weeks build + retainer year

$140K

~$25K + retainer $120K

~$60K

~$345K

Rough ranges, not commitments. Real quote depends on your data, integrations, compliance bar, and the specific use case.

Where budgets tend to surprise people

Run cost > build cost over 18 months
If you only budget the build and forget run cost, your AI feature looks profitable for 6 months and underwater after that. We surface the 18-month total in scoping so this doesn't surprise anyone.
Eval ownership eats more hours than you'd think
An eval suite that ships and then atrophies is the most common cause of AI quality decay. Plan on a named human putting 2–4 hours/week into it — it's not optional.
Your internal time isn't free
If the engagement owner is your VP of Eng, that's $300/hr loaded cost. 10 hours/week for 6 weeks is $18K in attention cost. Plan for it.
Token costs scale with conversation length, not user count
Common surprise: a few power users dominate cost. Cost per active user can be highly skewed. We surface this in observability from day one so you can manage it.
Mid-engagement scope changes
Once you see the agent working, you'll want to do more with it. That's healthy. Add it as a separate add-on, quoted up front, not folded silently into the sprint.