The hype framing
The marketing version of an AI agent: an autonomous reasoning system that decomposes goals, plans actions, executes, reflects, and iterates. It sounds like a being.
The implementation version of the same agent: a while-loop around a model call, where each iteration may invoke one of a handful of tools, with a stop condition that fires when the model returns a final answer or hits a turn limit.
Both descriptions are accurate. The second one ships better.
Why the framing matters
Teams who think of agents as autonomous beings build them autonomously. Wide tool surface, open-ended prompts, "let it figure it out" instructions, no turn limit, no kill switch. The result is unreliable, expensive, and shippable mostly to conferences.
Teams who think of agents as scripts-with-model-calls build them like scripts. Tight scope, small tool set, explicit stop conditions, fallback paths, observable side effects. The result is what we ship to clients.
Same primitive. Same API. Different mental model. The implementation discipline downstream changes by an order of magnitude.
What "the real thing" looks like
Across last year's engagements, the agents we shipped breakdown roughly:
- "Single-step plus confirmation." Model decides what to do, calls one tool, returns the result. This is 40% of what gets called an "agent" and is really a function call with a model in the loop.
- "2-4 tool calls per task." Model gets the input, calls a search tool, looks at results, calls a fetch tool, formats the output. This is 50% of agents we ship.
- "5+ tool calls with conditional branching." The model genuinely orchestrates a multi-step workflow with decisions that depend on intermediate results. 8-10% of agents. The hardest ones to ship reliably.
- "Long-running with state." Multi-hour or multi-day agents that maintain working memory and run autonomously. <2% of what we ship and almost always for research-shaped use cases, not customer-facing ones.
The category-1 and category-2 agents are scripts with model calls. They work because they're simple. The category-3 agents are also scripts with model calls — just longer scripts. Category 4 is where "agent" earns the noun, and it's a vanishingly small part of the production landscape.
Why the simpler framing ships better
Five things you do automatically when you think of agents as scripts:
- You set a turn limit. Scripts have to terminate.
- You log every call. Scripts get instrumented.
- You write tests. Scripts get unit tests; agents-as-beings get demos.
- You think about failure paths. Scripts have try/except. Agents-as-beings "just keep trying."
- You version-control the prompt. Scripts get committed. Agents-as-beings get tweaked in a chat window.
Each of these alone is high-leverage. Together they're the difference between an agent that runs reliably for a year and one that breaks every Tuesday.
The reframe in client conversations
Buyer pitches us "an agentic AI system that handles X." We pitch back "a Python script that calls Sonnet in a loop, with these 4 tools, with a 6-turn limit, with this fallback when it can't decide." Same outcome, very different conversations about what we're building.
The buyer who hears "Python script with model calls" sometimes wonders if they're getting less than they paid for. The honest answer: the term "agent" is doing rhetorical work that obscures how the thing is built. Strip the rhetoric and you can have a real conversation about what will work.
When "agent" is the right word
Two cases where the noun is fine:
- Long-running autonomous workflows. Multi-hour Claude Code sessions, research agents that loop for an evening, the small slice of category-4 work above. These genuinely behave like agents in the cognitive sense.
- Marketing copy on the website. Buyers expect the word. Don't confuse them by refusing to use it. Just don't internalize it.
The summary
"Agent" is a good marketing word and a poor engineering word. Internally, frame your "agents" as scripts with model calls. The framing pulls toward the discipline that makes them reliable. The marketing word still works on the sales page. Neither needs to lie.
For the related buzzword graveyard, see the terms we won't use in client meetings. For the design rules behind the function calls themselves, see writing tools the model actually uses.