Non-obvious decisions

Forks in the road where we took the less-obvious way.

Ten decisions where the obvious answer would have been a mistake. Tech choices, engagement structure, hiring cadence. The context, the why, and what we learned.

Decision #1

Postgres over Kafka for the multi-agent event bus

Context

When we built the Multi-Agent Manufacturing System, the obvious answer for inter-agent communication was Kafka. We picked Postgres.

Why we made this call

The client already ran Postgres. Their ops team had years of muscle memory with it. Latency requirements were 'sub-second', which Postgres handles trivially. Kafka would have added a 4-week setup tax for zero functional gain.

What we learned

When the obvious answer adds operational complexity without buying performance you need, the boring answer is correct. We default to Postgres for event buses now and only reach for Kafka when message volume genuinely demands it.
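A minimal sketch of what a Postgres-backed event bus can look like, not the project's actual code. It assumes a single events table that consumers poll and claim with `FOR UPDATE SKIP LOCKED`. The table name `agent_events`, its columns, and the helpers `publish` and `claim_next` are hypothetical, and the `%s` placeholders assume a psycopg-style DB-API driver.

```python
import json
from typing import Any, Optional, Tuple

# Hypothetical schema: table and column names are illustrative only.
CREATE_EVENTS_SQL = """
CREATE TABLE IF NOT EXISTS agent_events (
    id         BIGSERIAL PRIMARY KEY,
    topic      TEXT        NOT NULL,
    payload    JSONB       NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    claimed_at TIMESTAMPTZ
)
"""

PUBLISH_SQL = (
    "INSERT INTO agent_events (topic, payload) "
    "VALUES (%s, %s::jsonb) RETURNING id"
)

# SKIP LOCKED lets several consumers poll the same topic without
# waiting on, or double-claiming, a row another consumer has locked.
CLAIM_SQL = """
UPDATE agent_events
SET claimed_at = now()
WHERE id = (
    SELECT id FROM agent_events
    WHERE topic = %s AND claimed_at IS NULL
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload
"""


def publish(conn: Any, topic: str, payload: dict) -> int:
    """Insert one event and return its id. `conn` is a DB-API connection."""
    cur = conn.cursor()
    try:
        cur.execute(PUBLISH_SQL, (topic, json.dumps(payload)))
        event_id = cur.fetchone()[0]
    finally:
        cur.close()
    conn.commit()
    return event_id


def claim_next(conn: Any, topic: str) -> Optional[Tuple[int, Any]]:
    """Claim the oldest unclaimed event on `topic`, or return None if idle."""
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_SQL, (topic,))
        row = cur.fetchone()
    finally:
        cur.close()
    conn.commit()
    return (row[0], row[1]) if row else None
```

In this sketch an event counts as delivered once it is claimed, so a consumer that crashes mid-task would need a separate retry policy. Polling once or twice a second fits a 'sub-second' latency budget, and `LISTEN`/`NOTIFY` is an option if polling turns out to be too chatty.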

Decision #2

Vendor shadcn/ui primitives rather than install

Context

shadcn/ui is a widely used set of React components for modern Next.js sites. The usual pattern is to pull each component in through the CLI (`npx shadcn@latest add button`). We hand-copied the source into our repo instead.

Why we made this call

We didn't want surprise updates to break component variants when a maintainer changes class names. Vendoring is 50 lines per primitive — trivial to maintain. We get full customization control and zero version-pinning anxiety.

What we learned

For surface-area components you'll customize heavily, vendoring beats installing. For deeply integrated frameworks (auth, routing), install. The line is 'do I want to customize the API itself?'

Decision #3

Server-Sent Events over WebSockets for streaming

Context

For streaming AI responses in 2026, the Vercel AI SDK uses SSE, while most production AI infra reaches for WebSockets in the name of 'real-time.' We default to SSE.

Why we made this call

SSE is half the complexity. One-way streaming is what AI generation actually needs. Reconnection is built into the protocol. Cleaner mental model for the team and for code reviewers from outside our team.

What we learned

Use the simpler protocol when the simpler protocol is enough. WebSockets pay off when you need bidirectional, low-latency communication. AI streaming isn't that.
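To make the 'reconnection is built into the protocol' point concrete, here is a minimal sketch of the server side of an SSE token stream. The function names `sse_event` and `stream_tokens` and the `done` event are hypothetical. The wire format (`id:`, `event:`, `retry:`, `data:` lines, blank-line terminator) follows the SSE specification, and a browser's `EventSource` sends the last id it saw back in a `Last-Event-ID` header when it reconnects.

```python
from typing import Iterable, Iterator, Optional


def sse_event(
    data: str,
    event: Optional[str] = None,
    event_id: Optional[str] = None,
    retry_ms: Optional[int] = None,
) -> str:
    """Format one Server-Sent Events frame as text."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    if retry_ms is not None:
        lines.append(f"retry: {retry_ms}")
    # Multi-line payloads become one `data:` line per line of text.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


def stream_tokens(
    tokens: Iterable[str], last_event_id: Optional[str] = None
) -> Iterator[str]:
    """Yield one frame per token, skipping tokens the client already has.

    `last_event_id` is the value of the Last-Event-ID request header, if any.
    Using the token index as the event id is an assumption of this sketch.
    """
    resume_from = int(last_event_id) + 1 if last_event_id is not None else 0
    for index, token in enumerate(tokens):
        if index < resume_from:
            continue
        yield sse_event(token, event_id=str(index))
    # Hypothetical end-of-stream marker so the client knows to close.
    yield sse_event("[DONE]", event="done")
```

A client that dropped after receiving event `0` reconnects with `Last-Event-ID: 0`, and `stream_tokens(tokens, last_event_id="0")` resumes at the second token. Neither side needs handshake logic of its own.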

Decision #4

Refusal patterns at the orchestration layer, not in the system prompt

Context

The common pattern is to put refusal instructions in the system prompt: 'refuse if X, refuse if Y...' We moved most of those to deterministic checks in code, before the model is even called.

Why we made this call

Prompt-based refusal is probabilistic. Code-based refusal is deterministic. For categories like 'never quote pricing,' you don't want a 95% refusal rate — you want 100%. The model decides intent; the code decides allow/deny.

What we learned

Use the model where it shines (intent classification, nuanced understanding). Don't use it where determinism matters more (categorical refusals, action allowlists).
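One way to picture that split is the sketch below, which makes several assumptions. The intent label `pricing_quote`, the keyword pattern, the refusal text, and the `classify_intent` and `generate` callables are all hypothetical stand-ins. The flow it shows is a deterministic pre-check before any model call, then a model-produced intent that plain code maps to allow or deny.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical rules; a real deployment would define its own categories.
BLOCKED_PATTERN = re.compile(r"\b(price|pricing|quote)\b", re.IGNORECASE)
DENIED_INTENTS = frozenset({"pricing_quote"})
REFUSAL_TEXT = "I can't help with that here. A team member will follow up."


@dataclass(frozen=True)
class Reply:
    text: str
    refused: bool


def handle(
    message: str,
    classify_intent: Callable[[str], str],
    generate: Callable[[str], str],
) -> Reply:
    """Route one user message through code-level refusal checks."""
    # 1. Deterministic pre-check: no model is called at all.
    if BLOCKED_PATTERN.search(message):
        return Reply(REFUSAL_TEXT, refused=True)

    # 2. The model decides intent (probabilistic)...
    intent = classify_intent(message)

    # 3. ...and the code decides allow/deny (deterministic).
    if intent in DENIED_INTENTS:
        return Reply(REFUSAL_TEXT, refused=True)

    # 4. Only allowed requests reach the generation call.
    return Reply(generate(message), refused=False)
```

Because the deny list lives in code, adding a refusal category becomes a reviewed diff with a unit test rather than a prompt edit whose effect has to be measured.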

Decision #5

We turn down 1 in 8 engagements that reach discovery

Context

Most agencies say yes to every engagement that pays. We say no when the ROI math doesn't support the build, when the data isn't there, or when the named owner doesn't exist.

Why we made this call

Fixed-fee economics break if you take on bad-fit work. One unhappy client costs more than five neutral ones in reputation alone. Saying no protects the fixed-fee model's integrity.

What we learned

Saying no in discovery is the kindest thing you can do for a client who'd otherwise burn $80K on a build that wouldn't ship. The retainer renewal economics make up for the lost top-line.

Decision #6

Eval suites in client repos, not ours

Context

Most agencies keep their eval suites internal. We put eval suites in the client's repo from day one.

Why we made this call

When we leave, we want the eval suite to keep running. If it lives in our infra, it dies with the engagement. In their repo, with their CI, owned by their named human — it's sticky.

What we learned

If you want a deliverable to outlive your engagement, put it where the client lives, not where you live. Even at the cost of more setup work upfront.
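For a sense of how little such a suite needs in order to live in a client's repo and CI, here is a minimal, dependency-free sketch. The `EvalCase` shape, the substring check, and the `run_evals` and `main` names are hypothetical, and `generate` stands in for whatever function calls the client's model.

```python
from typing import Callable, Iterable, List, TypedDict


class EvalCase(TypedDict):
    name: str
    prompt: str
    must_contain: str


def run_evals(
    cases: Iterable[EvalCase], generate: Callable[[str], str]
) -> List[str]:
    """Run every case and return the names of the ones that failed."""
    failures = []
    for case in cases:
        output = generate(case["prompt"])
        # Hypothetical pass criterion: case-insensitive substring match.
        if case["must_contain"].lower() not in output.lower():
            failures.append(case["name"])
    return failures


def main(cases: Iterable[EvalCase], generate: Callable[[str], str]) -> int:
    """Return a process exit code so CI fails the build on any regression."""
    failures = run_evals(cases, generate)
    for name in failures:
        print(f"FAIL {name}")
    return 1 if failures else 0
```

With no dependency on outside infrastructure, the client's CI can call `sys.exit(main(cases, generate))` from a script in their own repo, and the suite keeps running after the engagement ends.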

Decision #7

$25/hour as the public starting rate

Context

Standard practice for offshore AI agencies is to obscure rates and quote per-engagement. We publish the $25/hour rate on the site.

Why we made this call

It filters the funnel ruthlessly. Anyone shocked by $25/hour isn't our buyer. Anyone who's been quoted $150-250/hour by a US agency sees us as exceptional value. Public pricing collapses bad-fit conversations early.

What we learned

Public, specific pricing is a sorting mechanism. It costs a tiny number of high-budget deals we wouldn't have won anyway, and saves dozens of hours of mis-qualified discovery calls.

Decision #8

We hire when demand calls, not on a growth curve

Context

Standard agency playbook: hire ahead of pipeline. We hire when client demand commits us to it.

Why we made this call

Hiring ahead of demand requires you to win every pitch to feed the bench. That pressure leads to taking bad-fit work. Hiring against signed commitments removes that pressure entirely.

What we learned

Your hiring cadence is a strategic lever, not just a cost lever. Slower hiring means you can say no more. Saying no more means better client outcomes. Better outcomes mean better referrals. Compounds.

Decision #9

We don't claim AI quality SLAs — we claim audit-trail SLAs

Context

Clients sometimes ask for 'X% accuracy SLA' on AI features. We won't sign that. We will sign 'every output has a replayable audit trail within Y minutes' and 'every regression caught in production becomes a permanent eval case within Z business days.'

Why we made this call

Quality SLAs on probabilistic systems are dishonest. Audit-trail and response-time SLAs on the system around the AI are honest and measurable. Auditors actually prefer the latter.

What we learned

Build SLAs around the deterministic parts of your AI system, not the probabilistic ones. Auditors agree. Clients adapt to it quickly once you explain why.
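The deterministic parts can be made checkable in a few lines. The sketch below is an illustration under assumptions: the `AuditRecord` fields, the `fingerprint`, `within_sla`, and `to_eval_case` helpers, and the eval-case shape are hypothetical. It also reads 'within Y minutes' as the delay between generating an output and writing its audit record. The threshold stays a parameter because the actual Y is a contract term.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AuditRecord:
    request_id: str
    model: str
    prompt: str
    params: dict
    output: str
    generated_at: datetime
    recorded_at: datetime

    def fingerprint(self) -> str:
        """Stable hash of the inputs needed to replay this call."""
        blob = json.dumps(
            {"model": self.model, "prompt": self.prompt, "params": self.params},
            sort_keys=True,
        )
        return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def within_sla(record: AuditRecord, max_delay: timedelta) -> bool:
    """True if the audit record was written within the agreed delay."""
    return record.recorded_at - record.generated_at <= max_delay


def to_eval_case(record: AuditRecord, expected: str) -> dict:
    """Turn a production regression into a permanent eval case."""
    return {
        "name": f"regression-{record.request_id}",
        "prompt": record.prompt,
        "must_contain": expected,
    }
```

Both checks are plain comparisons over stored data, so an auditor can re-run them without touching the model at all.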

Decision #10

TypeScript on the frontend, Python on the backend, no GraphQL

Context

Common pattern: monorepo with GraphQL throughout. We separate cleanly: TS in the browser, Python where AI/ML happens, REST in between.

Why we made this call

GraphQL adds complexity that pays off for graph-shaped data, where REST clients end up making N+1 round trips to assemble a single view. Our workloads don't have that. REST + structured JSON keeps the team able to read each other's PRs without specialization.

What we learned

Pick technologies because they fit your shape of work, not because they're trending. Boring is reliable. Reliable wins compound. Compound interest funds the next clever choice when it actually matters.

Want to talk through a decision you're facing?

Book a 20-min call. We'll share our take and the trade-off we'd be watching for. No commitment beyond that.
