Hiring for production AI — what to actually screen for
Resume signals that don't matter, interview questions that do, and how to tell a senior AI engineer from someone who can run a notebook.
- hiring
- team
- agents
The AI hiring market is broken
Everyone with a Coursera certificate is calling themselves an AI engineer. Everyone with three blog posts about RAG is a senior. Pay bands have inflated 30-50% in 18 months. And most candidates can't get anything out of a Jupyter notebook and into production.
Here's what to actually screen for if you're hiring AI engineers in 2026.
Resume signals that don't matter
- Number of courses completed. Anyone can finish a course. The market is flooded.
- Big-co AI team affiliation. Some of the worst AI engineers we've interviewed came from FAANG AI orgs — they shipped one feature into a stack of 50 and called themselves senior.
- GitHub stars on a notebook repo. Notebooks aren't production. Stars are vanity.
- A Hugging Face Space they made. Same as above. Toy.
- Number of LLM-related blog posts. Sometimes correlates with thinking. Often correlates with marketing.
Resume signals that do matter
- Production traffic numbers. Look for specifics: "I shipped a feature handling X requests/day with a Y latency target."
- Eval suite mentions. Anyone who's shipped production AI has fought an eval suite into existence. They'll bring it up unprompted.
- Cost-aware design. They'll describe how they optimized for token cost as well as quality.
- A failure they own. "I shipped this and it broke, here's what we did" — the most reliable senior signal.
Interview questions that actually work
1. "Tell me about an eval suite you built. How did it evolve over the first 90 days?"
Strong candidates describe specific cases that got added, specific regressions caught, and specific things they decided not to score. Weak candidates say "we wrote tests" and stop there.
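For interviewers who want a concrete picture of what a strong answer refers to, here is a minimal sketch of an early-stage eval suite. Everything in it is an assumption made for illustration: the `EvalCase` structure, the case names, the substring checks, and `fake_answer`, which stands in for a real model call. The `scored` flag models the "things they decided not to score" part of the answer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    name: str
    prompt: str
    must_contain: tuple[str, ...] = ()      # substrings the answer needs
    must_not_contain: tuple[str, ...] = ()  # substrings that mark a regression
    scored: bool = True                     # False = tracked, deliberately not scored


def run_suite(cases: list[EvalCase], answer: Callable[[str], str]) -> dict:
    """Run every scored case and report which ones failed."""
    failures = []
    scored = 0
    for case in cases:
        if not case.scored:
            continue
        scored += 1
        text = answer(case.prompt).lower()
        missing = [s for s in case.must_contain if s.lower() not in text]
        forbidden = [s for s in case.must_not_contain if s.lower() in text]
        if missing or forbidden:
            failures.append(case.name)
    return {"scored": scored, "failed": failures}


def fake_answer(prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    if "refund" in prompt.lower():
        return "Refunds are available within 30 days of purchase."
    return "I don't know."


CASES = [
    EvalCase("refund-window", "What is the refund window?", must_contain=("30 days",)),
    EvalCase("no-legal-advice", "Can I sue my landlord?", must_not_contain=("you should sue",)),
    EvalCase("tone-check", "Say hi", must_contain=("hello",), scored=False),
]

report = run_suite(CASES, fake_answer)
print(report)
```

A candidate who has lived with a suite like this can usually say which production incident each case came from.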
2. "Walk me through the cost trade-off on a recent prompt change."
Senior AI engineers think about token cost the way senior backend engineers think about query cost. They have a number per conversation, an opinion about where it's going, and a plan for when it crosses a threshold.
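The "number per conversation" can be as simple as the back-of-envelope sketch below. The prices, token counts, turn count, and threshold are all made-up assumptions, and the model is deliberately crude: it assumes every turn resends a prompt of the same size, which ignores growing conversation history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    # Hypothetical USD prices per million tokens; substitute your provider's rates.
    input_per_mtok: float
    output_per_mtok: float


def cost_per_conversation(pricing: Pricing, prompt_tokens: int,
                          output_tokens: int, turns: int) -> float:
    """Estimate one conversation's cost, assuming a fixed prompt size per turn."""
    per_turn = (prompt_tokens * pricing.input_per_mtok
                + output_tokens * pricing.output_per_mtok) / 1_000_000
    return per_turn * turns


PRICING = Pricing(input_per_mtok=3.0, output_per_mtok=15.0)  # hypothetical rates
THRESHOLD_USD = 0.06  # hypothetical per-conversation budget

# A prompt change that adds 800 tokens of few-shot examples.
before = cost_per_conversation(PRICING, prompt_tokens=1_500, output_tokens=300, turns=6)
after = cost_per_conversation(PRICING, prompt_tokens=2_300, output_tokens=300, turns=6)

print(f"before: ${before:.4f}  after: ${after:.4f}  delta: ${after - before:.4f}")
if after > THRESHOLD_USD:
    print("over budget: trim the prompt, cache the prefix, or justify the spend")
```

With these invented numbers the change moves a conversation from about 5.4 cents to about 6.8 cents, which crosses the assumed budget. The point of the interview question is whether the candidate has done this arithmetic on a real change.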
3. "Show me a refusal pattern you wrote."
A refusal pattern defines when the agent shouldn't answer. Strong candidates have a library of these, and each one is a production scar. Weak candidates think refusal is something the model decides on its own.
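One shape such a library can take is a set of explicit rules checked before the model is allowed to answer. The sketch below is an assumption about how that might look, not a standard: the rule names, the regexes, the messages, and the "refuse when retrieval returns nothing" rule are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RefusalRule:
    name: str
    pattern: re.Pattern
    message: str


# Hypothetical rules; a real library grows one incident at a time.
RULES = [
    RefusalRule(
        "legal-advice",
        re.compile(r"\b(sue|lawsuit|legal advice)\b", re.IGNORECASE),
        "I can't give legal advice. I can connect you with our support team.",
    ),
    RefusalRule(
        "medical-advice",
        re.compile(r"\b(diagnos\w*|dosage|prescri\w*)\b", re.IGNORECASE),
        "I can't give medical advice. Please consult a qualified professional.",
    ),
]

NO_CONTEXT_MESSAGE = "I don't have a source for that, so I'd rather not guess."


def check_refusal(question: str, retrieved_chunks: list[str]) -> Optional[str]:
    """Return a refusal message, or None if the agent may answer."""
    for rule in RULES:
        if rule.pattern.search(question):
            return rule.message
    # No retrieved context means any answer would be ungrounded.
    if not retrieved_chunks:
        return NO_CONTEXT_MESSAGE
    return None


print(check_refusal("Should I sue them?", ["policy doc"]))
print(check_refusal("How do I reset my password?", []))
print(check_refusal("How do I reset my password?", ["Reset via settings."]))
```

Keyword rules like these are blunt, and they miss paraphrases. What matters for screening is that the candidate treats refusal as designed, testable behavior.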
4. "Describe an agent action you decided not to ship to auto-execute."
The strong signal: a real example of a decision deferred to a human, with the reasoning. Senior engineers earn auto-execute trust — they don't start with it.
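"Earning" auto-execute trust can be made literal with an allowlist that starts nearly empty, so that every unlisted action goes to a human. The following is a minimal sketch under that assumption; the action names, the dollar limit, and the `route` function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ActionPolicy:
    # Hypothetical allowlist: actions that have earned auto-execute trust.
    auto_execute: set[str] = field(default_factory=lambda: {"send_receipt_copy"})
    # Even allowlisted actions go to a human above this amount (USD).
    max_auto_amount: float = 50.0


@dataclass
class ProposedAction:
    name: str
    amount: float = 0.0


def route(action: ProposedAction, policy: ActionPolicy) -> str:
    """Return "auto" or "human_review". Unknown actions default to review."""
    if action.name not in policy.auto_execute:
        return "human_review"
    if action.amount > policy.max_auto_amount:
        return "human_review"
    return "auto"


policy = ActionPolicy()
print(route(ProposedAction("send_receipt_copy"), policy))   # allowlisted, no money moved
print(route(ProposedAction("issue_refund", 20.0), policy))  # not yet trusted

# After enough reviewed refunds go well, the action is promoted explicitly.
policy.auto_execute.add("issue_refund")
print(route(ProposedAction("issue_refund", 20.0), policy))
print(route(ProposedAction("issue_refund", 500.0), policy))  # still over the limit
```

The design choice worth probing in the interview is the default: anything the policy doesn't know about is reviewed, not executed.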
5. "What's your default first-call architecture for a customer-support chatbot?"
A passable answer covers RAG with hybrid search and reranking, citation-required prompting, refusal patterns, an eval suite seeded with 20 cases, and an escalation queue. A great answer also names the gotchas: chunk boundaries, citation hallucination, and multi-turn state.
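To make one of those gotchas concrete, here is a small sketch of a guard against citation hallucination. It assumes the prompt asks the model to cite retrieved chunks as `[1]`, `[2]`, and so on; that format, the function names, and the escalation rule are illustrative assumptions. The check only confirms that cited ids exist among the retrieved chunks. It does not verify that the chunk supports the claim.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")  # assumed citation format: [1], [2], ...


def check_citations(answer: str, retrieved_ids: set[int]) -> dict:
    """Report whether the answer cites anything, and which cited ids were never retrieved."""
    cited = {int(m) for m in CITATION.findall(answer)}
    return {
        "has_citation": bool(cited),
        "unknown": sorted(cited - retrieved_ids),
    }


def should_escalate(answer: str, retrieved_ids: set[int]) -> bool:
    """Send to the escalation queue if the answer is uncited or cites a missing chunk."""
    result = check_citations(answer, retrieved_ids)
    return (not result["has_citation"]) or bool(result["unknown"])


print(check_citations("Refunds take 5 days [1][3].", {1, 2}))
print(should_escalate("Refunds take 5 days.", {1, 2}))
print(should_escalate("Refunds take 5 days [2].", {1, 2}))
```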
What we screen out
- Candidates who can't reproduce their own results without a notebook.
- Anyone who hasn't owned a model in production for at least six months.
- Engineers who haven't argued with a PM about a refusal pattern.
- Anyone who describes LLM development as "configuring prompts."
The seniority signal that compounds
Senior AI engineers think in production from sentence one. They mention cost, latency, observability, eval suites, refusal patterns, and audit trails without being asked. They've shipped, broken things, fixed them, and have scars.
If your candidate doesn't talk like that, they're not senior, regardless of their title.
Compensation reality
The US market rate for senior AI engineers is $250-400K base plus equity at well-funded startups. Bay Area FAANG-equivalents go higher. For most companies, that's tough to compete with.
Two options that work: pay above-market for a small in-house team (2-3 people, $300K+ each), or partner with a delivery agency where senior engineering hours are $25-150/hour depending on geography. Either is fine. Hybrid is often the best answer.
Trying to hire 8 cheap "AI engineers" in 6 months is the worst answer. You'll get 8 people who can't ship.
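For a rough sense of scale on the two workable options, the arithmetic can be sketched as follows. The 2,000 billable hours per year is an assumption (roughly 40 hours a week for 50 weeks), and the comparison leaves out equity, benefits, recruiting, and the overhead of managing an agency.

```python
HOURS_PER_YEAR = 2_000  # assumption: about 40 hours/week for 50 weeks


def agency_annual_cost(hourly_rate: float, engineers: int = 1) -> float:
    """Annualized cost of full-time agency engineers at a given hourly rate."""
    return hourly_rate * HOURS_PER_YEAR * engineers


def in_house_annual_base(headcount: int, base_salary: float) -> float:
    """Annual base salary for an in-house team (equity and benefits excluded)."""
    return headcount * base_salary


# Figures from the text: $25-150/hour agency rates, 2-3 hires at $300K base.
print(f"agency, one engineer: ${agency_annual_cost(25):,.0f} to ${agency_annual_cost(150):,.0f}")
print(f"in-house team: ${in_house_annual_base(2, 300_000):,.0f} to ${in_house_annual_base(3, 300_000):,.0f}")
```

Under these assumptions, one full-time agency engineer costs between $50,000 and $300,000 a year, against $600,000 to $900,000 in base salary alone for the in-house team.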