With the advent of Browser Use-like solutions (Operator, etc.) and MCP, it seems like "agents" are getting ever more likely to be production-ready.
Essentially, these tools give LLMs access to a human-facing UI like a browser or a shell, in the hope that the LLM can complete tasks autonomously by "planning", "thinking", and "executing". Not long ago, Anthropic published a guide outlining their recommended practices for building agents, claiming that "agents can be used for open-ended problems where it's difficult or impossible to predict the required number of steps, and where you can't hardcode a fixed path. The LLM will potentially operate for many turns, and you must have some level of trust in its decision-making. Agents' autonomy makes them ideal for scaling tasks in trusted environments."
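The "plan, think, execute" loop these tools wrap around an LLM can be sketched roughly as follows. This is a toy illustration, not any real framework's API: `fake_llm` and the `browse` tool are stand-ins I made up for a real model call and a real MCP/browser tool.

```python
# A toy version of the agent loop: the model picks the next action,
# the harness executes it and feeds the result back, until the model
# declares it is done or we run out of turns.

def fake_llm(history):
    """Stand-in for an LLM call: decides the next action from the transcript."""
    if not any(step["action"] == "browse" for step in history):
        return {"action": "browse", "arg": "https://example.com"}
    return {"action": "finish", "arg": "page fetched"}

TOOLS = {
    "browse": lambda url: f"<html from {url}>",  # placeholder browser tool
}

def run_agent(task, max_turns=5):
    """Loop: ask the model for an action, run it, append the observation."""
    history = [{"action": "task", "arg": task, "result": None}]
    for _ in range(max_turns):
        decision = fake_llm(history)
        if decision["action"] == "finish":
            return decision["arg"]
        result = TOOLS[decision["action"]](decision["arg"])
        history.append({**decision, "result": result})
    return None  # no fixed path, no guaranteed number of steps
```

Note where the trust lives: the loop itself is trivial; everything rides on the model making a sensible `decision` every single turn.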
Is that so?
In fact, LLMs are pretty dumb. I know it, you know it, and everyone who has tried to let Claude, hooked up to a browser MCP server, do any nontrivial task knows it. They are