We built an AI agent for manufacturing: here's what it cost and what it does
Real numbers — total build cost, monthly run cost, time-to-value, and what we'd do differently. From our Multi-Agent Manufacturing case study.
- manufacturing
- agents
- case-study
The build, decoded
We've published a case study on the Multi-Agent Manufacturing System we shipped: a 31% downtime reduction in 90 days, among other results. The numbers there are real. This post unpacks what's underneath them: cost, complexity, what worked first try, what didn't.
What it actually does
Six specialized agents — Production Monitor, Maintenance Scheduler, Quality Anomaly, Inventory/Reorder, Energy Optimization, Shift Briefing — running 24/7 across a 40-machine sheet metal plant. They communicate through a shared event bus (Postgres). Every alert routes to a human via WhatsApp and the existing supervisor dashboard.
No agent takes physical action on its own. Humans always pull the trigger.
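To make the pattern concrete, here is a minimal sketch of an event table where agents only publish and a human decision closes every event. It is an illustration, not the production code: the table, column, and function names are hypothetical, and Python's built-in `sqlite3` stands in for Postgres so the example runs anywhere. A Postgres version would likely add something like `LISTEN`/`NOTIFY` to wake consumers, which is left out here.

```python
import json
import sqlite3

# Stand-in for the shared Postgres event bus (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        source_agent TEXT NOT NULL,
        kind TEXT NOT NULL,
        payload TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending_human'
    )
    """
)

def publish(source_agent, kind, payload):
    """An agent appends an event. It never acts on a machine itself."""
    cur = conn.execute(
        "INSERT INTO events (source_agent, kind, payload) VALUES (?, ?, ?)",
        (source_agent, kind, json.dumps(payload)),
    )
    conn.commit()
    return cur.lastrowid

def pending_for_human():
    """Events waiting on a supervisor (what would be routed to WhatsApp)."""
    rows = conn.execute(
        "SELECT id, source_agent, kind, payload FROM events "
        "WHERE status = 'pending_human' ORDER BY id"
    ).fetchall()
    return [(i, agent, kind, json.loads(p)) for i, agent, kind, p in rows]

def human_decision(event_id, approved):
    """Only a human decision moves an event out of the pending state."""
    status = "approved" if approved else "dismissed"
    conn.execute("UPDATE events SET status = ? WHERE id = ?", (status, event_id))
    conn.commit()
    return status

# Hypothetical alert from the Production Monitor agent.
event_id = publish(
    "production_monitor", "spindle_temp_high", {"machine": "press-07", "temp_c": 92}
)
queue = pending_for_human()
final_status = human_decision(event_id, approved=True)
```

The design point is that the `status` column only changes inside `human_decision`, so nothing an agent writes can skip the supervisor.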
Build cost
Fixed-fee Sprint: 12 weeks. Total build cost landed between $90K and $120K loaded. Three engagement-specific factors moved the number:
- Data integration work. SCADA, ERP, operator tablets — three systems with three different access patterns. The first month was almost entirely data plumbing.
- On-prem edge gateway. Latency and data-residency requirements meant we deployed an edge gateway. Hardware was minor (NUC-class), but the deployment work added 2 weeks.
- Six agents instead of one. Each agent is a separate LangGraph workflow with its own tools, eval suite, and observability hooks. The marginal cost of agents 2–6 was much lower than agent 1, but it's still real.
Run cost
Monthly operating cost (after launch) is roughly $1,800/month:
- Anthropic API spend: $1,200/month (varies with shift volume)
- Edge gateway compute + cloud reasoning: $400/month
- Observability + logging: $200/month
A Monthly Retainer ($X, call it 60 hours/month) was layered on top. It covered weekly model retraining, eval coverage as the plant added new machines, and two new agent additions in the second quarter.
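The run-cost arithmetic, as a quick sketch. The dictionary keys are just labels for the three line items above, and the retainer is left out because its price isn't stated here.

```python
# Monthly run-cost line items from this post, in USD.
monthly_run_cost = {
    "anthropic_api": 1200,
    "edge_gateway_and_cloud_reasoning": 400,
    "observability_and_logging": 200,
}

total_monthly = sum(monthly_run_cost.values())
annual_run_cost = total_monthly * 12
api_share = monthly_run_cost["anthropic_api"] / total_monthly

print(f"Monthly: ${total_monthly:,}")       # Monthly: $1,800
print(f"Annual:  ${annual_run_cost:,}")     # Annual:  $21,600
print(f"API share of run cost: {api_share:.0%}")  # API share of run cost: 67%
```

API spend is about two thirds of the total, so shift volume is the main lever on the monthly bill.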
What worked first try
- One agent per role. Mirroring the org chart turned out to be the right mental model. Operators understood it immediately.
- WhatsApp routing. Operators don't read email and don't sit at a dashboard. WhatsApp was the only channel that mattered.
- Postgres as the event bus. Kafka was overkill. Postgres was already in the stack. Skipping Kafka saved 2 weeks.
What we'd do differently
- Heavier discovery on shift handoffs. The Shift Briefing Agent was the dark-horse win, but we under-specified it. Two more weeks of discovery would've gotten the format right faster.
- Tighter eval on the Maintenance Scheduler. This agent makes calendar decisions; the eval signal was lagging. We bolted on a synthetic test harness in week 8 — should've been week 1.
- Skip LightGBM in v1. We layered in classical ML (LightGBM) for downtime prediction from the start. In hindsight, simple thresholding would've covered 80% of the value in v1, and ML could've come in v2 once the agent was trusted.
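For a sense of what "simple thresholding" can mean, here is a minimal sketch: flag a machine when a sensor reading stays above a limit for several consecutive samples. The function name, the vibration values, and the limit are all hypothetical and chosen for illustration; they are not taken from the plant's data.

```python
from collections import deque

def threshold_alerts(readings, limit, consecutive=3):
    """Return indices where the last `consecutive` readings all exceed `limit`."""
    window = deque(maxlen=consecutive)
    alerts = []
    for i, value in enumerate(readings):
        window.append(value > limit)
        # Requiring a full window of exceedances filters out one-off spikes.
        if len(window) == consecutive and all(window):
            alerts.append(i)
    return alerts

# Hypothetical vibration readings (mm/s) for one machine.
vibration_mm_s = [2.1, 2.3, 4.8, 2.2, 4.6, 4.9, 5.1, 5.3, 2.0]
alerts = threshold_alerts(vibration_mm_s, limit=4.5, consecutive=3)
print(alerts)  # [6, 7]
```

The single spike at index 2 is ignored, while the sustained run from index 4 onward triggers alerts. A rule like this is easy for a supervisor to read and argue with, which matters while trust in the agent is still being built.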
What to ask before you green-light a build like this
- Is your data accessible? SCADA + ERP + operator inputs all needed plumbing. If yours is locked behind vendors, double the timeline.
- Do you have a human-in-the-loop owner? The supervisor on shift is the one who acts on the agent's signals. If that ownership isn't clear, build a dashboard first, an agent later.
- Are you ready to run it? Monthly retraining and eval coverage are a 5–15 hour/month commitment. If no one owns that, the model drifts and you blame the agent.
- Is the downtime cost honest? Our client knew their hourly downtime cost to the penny. That made the ROI math obvious. If you can't ballpark yours within 20%, do that first.
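The ROI question in the last bullet reduces to a payback calculation, sketched below. The build cost, run cost, and 31% reduction come from this post. The hourly downtime cost and baseline downtime hours are hypothetical placeholders, so substitute your own. Retainer fees are not included.

```python
def payback_months(build_cost, monthly_run_cost, hourly_downtime_cost,
                   baseline_downtime_hours_per_month, downtime_reduction):
    """Months to recover the build cost, or None if monthly savings never cover run cost."""
    monthly_savings = (
        hourly_downtime_cost * baseline_downtime_hours_per_month * downtime_reduction
    )
    net_monthly = monthly_savings - monthly_run_cost
    if net_monthly <= 0:
        return None  # the math doesn't work
    return build_cost / net_monthly

# Midpoint of the $90K-$120K build range, $1,800/month run cost, 31% reduction.
# ASSUMPTION: $2,000/hour downtime cost and 40 downtime hours/month are made up.
months = payback_months(
    build_cost=105_000,
    monthly_run_cost=1_800,
    hourly_downtime_cost=2_000,
    baseline_downtime_hours_per_month=40,
    downtime_reduction=0.31,
)
print(f"Payback: {months:.1f} months")

# Same build, but a plant where downtime is cheap: the agent never pays for itself.
no_case = payback_months(105_000, 1_800, 100, 40, 0.31)
print(no_case)  # None
```

If your hourly downtime figure is off by more than 20%, the payback estimate can swing by months, which is why that number is worth pinning down first.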
What it'd cost for you
A scoped version of this — say 2–3 agents instead of 6, no on-prem requirement, cleaner data — runs $50K–$75K loaded over 8–10 weeks. With on-prem, multi-site, or a heavier ML component, plan on $90K–$150K.
If the math doesn't work, it doesn't work. We'll tell you on the call.