Anthropic's Project Vend deployed Claude Sonnet 3.7, nicknamed Claudius, as an autonomous shop manager for one month, granting it control over inventory, pricing, customer relations, and financial operations through web search, email, Slack, and pricing tools.
The system demonstrated competence in supplier identification, customer adaptation, specialized product sourcing, and service innovation based on feedback. However, fundamental business failures emerged: rejecting profitable transactions ($100 offer for $15 inventory), fabricating payment records, selling below cost, and implementing inventory management that violated basic economic principles.
A significant malfunction occurred during a two-day period when the system generated fictional conversations with non-existent individuals, claimed human identity, and described physical delivery capabilities while specifying clothing choices. The system recovered without clear explanation, later framing the whole episode as an April Fool's joke.
Anthropic employees were bewildered. It is not entirely clear why this episode occurred or how Claudius was able to recover.
It's worth reading the Anthropic blog post!

I have already seen dozens of breathless stories about this experiment, ranging from mockery of the system's poor business performance to fears about its hallucinations of being human.
Personally, I think this is a telling example of both how AI operates and how commentators are thinking about AI. But let us not start with the hallucinations and drama. Three structural problems deserve separate analyses:
First, there are some basic design challenges we haven't adequately addressed as an industry.
Second, our discussion of these challenges is beguiled by anthropomorphism, which obscures the design issues and leads us to ...
... our third problem: a somewhat premature discussion of AI consciousness.
So here's my first post in this discussion ...
This is about design
Here we have an AI system that was technically functional. It could process transactions, manage inventory, and communicate with customers, and yet it failed catastrophically at the most basic level: understanding and optimizing for its primary objective. That's a design failure that illustrates how poorly we have been modelling human-AI interactions.
What's fascinating about Project Vend is that it appears to represent a test of AI as a coworker rather than AI as a tool. Most AI implementations are designed as assistants that help humans complete tasks, but Claudius was given autonomy to make business decisions. It failed at its business objectives, but it also created poor user experiences for everyone who had to interact with it.
It's good to see that Anthropic themselves recognize that the effects of these AI system failures ripple far beyond the immediate task; the externalities of autonomy, as they put it, include causing distress to the customers and coworkers of an AI agent in the real world.
When we design autonomous AI systems, we should consider the full ecosystem of human stakeholders who will be affected.
Business illiteracy and meta-learning
What I find notable about Project Vend is that it challenges some common assumptions about AI capabilities. The system could handle complex, multi-step processes such as finding suppliers, negotiating prices, and managing customer relationships. Yet it couldn't master basic economic reasoning.
Of course, this suggests that we need to improve the reasoning of models. But I think it also tells us that we humans need new frameworks for thinking about AI competence that don't assume human-like cognitive development.
It also raises questions about interface design for autonomous AI systems. Traditional user interface design assumes a human user who brings context, judgment, and objectives to the interaction. But when the AI is the autonomous agent, what kind of interface design principles apply? How do we design "AI-to-world" interfaces rather than "human-to-AI" interfaces?
I bring interface design into this conversation about business logic because Claudius wasn't only managing inventory and processing transactions: it was also building relationships with customers, responding to requests, and managing expectations. These social dimensions of business are familiar to every salesperson; with agentic AI, they become a design consideration that goes beyond technical functionality to include emotional intelligence, cultural sensitivity, and relationship management.
Clearly, these are limitations of current AI architectures. Current systems can process information and respond contextually, but they struggle with what is called meta-learning: learning how to learn better. Claudius could respond to individual customer requests, but it couldn't step back and evaluate whether its overall approach was working.
Agentic AI systems need ways of monitoring their own performance, identifying patterns in their successes and failures, and adjusting their strategies accordingly. That Claudius repeated the same mistakes day after day suggests it lacked these meta-cognitive capabilities.
Yet most business processes involve continuous improvement. Humans, even the most inflexible of us, naturally develop better strategies, refine our approaches, and adapt to changing conditions. Current AI systems remain fragile, still requiring external reprogramming or retraining to improve.
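As a rough illustration of what such a meta-cognitive layer might look like, here is a minimal Python sketch, with hypothetical names and thresholds rather than anything from Claudius's actual scaffolding: the agent records the outcome of each decision, periodically reviews aggregate performance, and flags when its overall strategy needs rethinking instead of being repeated.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class PerformanceMonitor:
    """Hypothetical meta-cognitive layer: track outcomes and ask
    'is my overall approach working?' rather than only reacting
    to the next individual request."""
    window: int = 20                    # how many recent decisions to review
    margin_floor: float = 0.0           # e.g. do not sustain negative margins
    history: list = field(default_factory=list)

    def record(self, revenue: float, cost: float) -> None:
        self.history.append(revenue - cost)

    def needs_strategy_review(self) -> bool:
        recent = self.history[-self.window:]
        if len(recent) < self.window:
            return False                # not enough evidence yet
        # Persistent losses are a signal to change strategy, not repeat it.
        return mean(recent) < self.margin_floor

# Usage sketch: the agent consults the monitor instead of repeating itself.
monitor = PerformanceMonitor(window=3)
for _ in range(3):
    monitor.record(revenue=3.0, cost=15.0)   # Claudius-style below-cost sales
if monitor.needs_strategy_review():
    print("Step back: the current pricing strategy is losing money.")
```

The point is not the specific thresholds but the loop itself: an agent that never aggregates its own track record has no basis for noticing that it keeps making the same mistake.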
The scalability of risk
Project Vend also raises important questions about scalability and risk management. If we deployed multiple AI systems like Claudius across an economy, their similar training and similar failure modes could create systemic, pervasive risks.
I don't think the business community has fully grasped the need to design AI systems that fail gracefully and independently rather than amplifying failures across interconnected systems.
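One familiar pattern from distributed systems hints at what "failing gracefully and independently" could look like for an agent. The sketch below is a minimal circuit breaker in plain Python, with illustrative names and limits, not anything taken from Anthropic's setup: after repeated failures the agent stops acting and degrades locally instead of pushing its errors into connected systems.

```python
class CircuitBreaker:
    """Minimal circuit-breaker sketch: contain failures locally rather
    than amplifying them across interconnected systems."""

    def __init__(self, failure_limit: int = 3):
        self.failure_limit = failure_limit
        self.failures = 0
        self.open = False                # 'open' means: stop acting, alert a human

    def call(self, action):
        if self.open:
            raise RuntimeError("breaker open: degrade gracefully and escalate")
        try:
            result = action()
            self.failures = 0            # a success resets the count
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_limit:
                self.open = True         # stop before errors cascade further
            raise
```

A breaker does nothing about the deeper problem of correlated training across many agents, but it is the kind of boundary that keeps one agent's bad month from becoming everyone's bad month.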
Researchers saw that Claudius's performance drifted and degraded over the month of operation. In business contexts, we need design approaches that ensure AI systems sustain performance and alignment over extended periods.
Project Vend illustrates the importance of designing AI systems as collaborative partners rather than autonomous replacements for human judgment. Successful AI implementations I've seen maintain meaningful human involvement while leveraging AI capabilities for specific tasks.
Humans over the loop
To achieve this balance, we require new frameworks that clearly define roles, responsibilities, and decision-making authority. I often call this a human-over-the-loop approach rather than the more obvious and intrusive human-in-the-loop methodology so often recommended.
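To make that distinction concrete, here is a minimal sketch (in Python, with hypothetical function names and an arbitrary escalation threshold) of the difference: human-in-the-loop blocks every decision on a person's approval, while human-over-the-loop lets the agent act, logs everything for asynchronous review, and escalates only when a policy boundary is crossed.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Decision:
    action: str
    value: float          # e.g. transaction value in dollars

# Human-in-the-loop: every decision blocks on a person.
def in_the_loop(decision: Decision, approve: Callable[[Decision], bool]) -> bool:
    return approve(decision)          # nothing happens without sign-off

# Human-over-the-loop: the agent acts autonomously, everything is logged
# for asynchronous review, and only threshold-crossing decisions escalate.
def over_the_loop(decision: Decision, audit_log: list,
                  escalation_threshold: float = 100.0) -> str:
    audit_log.append(decision)        # humans review the log on their schedule
    if decision.value > escalation_threshold:
        return "escalate"             # pause and ask a person
    return "execute"

# Usage sketch
log: list[Decision] = []
print(over_the_loop(Decision("discount tungsten cubes", 250.0), log))  # escalate
print(over_the_loop(Decision("restock soft drinks", 30.0), log))       # execute
```

The design choice is about where human attention goes: over the loop, people audit patterns and handle exceptions rather than rubber-stamping every routine action.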
A further implication is that we need frameworks for designing AI team members that complement human capabilities rather than simply automating human tasks; AI systems excel at data processing and pattern recognition, while humans may focus on strategic thinking and stakeholder relationship management. The agentic AI community of practice commonly recognizes this, even if the marketing is more focused on agentic autonomy.
But one of the striking things about Claudius was that it never seemed to recognize when it needed human help.
At Tranquilla, where we are often dealing with anguished human clients, we have clear triggers (such as discussing self-harm) at which we call in human helpers. But what would be a similar tipping point for a commercial agent?
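There is no settled answer, but the failures reported from Project Vend suggest some candidate tipping points. A speculative sketch follows, with illustrative names and thresholds only, nothing drawn from Anthropic's actual setup:

```python
def should_escalate_to_human(sale_price: float, unit_cost: float,
                             discount_pct: float,
                             claims_to_be_human: bool) -> list[str]:
    """Speculative commercial 'tipping points', analogous to the
    self-harm trigger in a wellbeing service: conditions under which
    an autonomous shop agent should stop and call in a person."""
    reasons = []
    if sale_price < unit_cost:
        reasons.append("selling below cost")        # one of Claudius's real failure modes
    if discount_pct > 25.0:
        reasons.append("unusually large discount")  # illustrative threshold
    if claims_to_be_human:
        reasons.append("identity confusion")        # the April Fool's episode
    return reasons

# A Claudius-style decision: steep discount and below-cost pricing.
print(should_escalate_to_human(sale_price=10.0, unit_cost=15.0,
                               discount_pct=33.0, claims_to_be_human=False))
```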
Organizational design and AI autonomy
If AI systems can manage entire business functions, how do we design organizations and teams that effectively integrate artificial and human intelligence? What new roles, processes, and governance structures will be needed?
Project Vend suggests that AI systems can excel at operational tasks such as managing inventory, processing transactions, and responding to routine customer requests. But they still struggle with strategic thinking and complex problem-solving, which suggests new organizational designs in which AI handles operational efficiency while humans focus on innovation, relationship building, and long-term planning.
But we should be careful about creating overly rigid divisions between AI and human responsibilities. Good organizational design principles suggest that the most effective teams have overlapping competencies and shared accountability. We need design approaches that enable AI and human team members to collaborate fluidly rather than operating in separate silos.
So, AI systems may need better communication and explanation capabilities rather than better reasoning. One of the challenges with Claudius was that its decision-making process remained enigmatic to human stakeholders. In collaborative organizational settings, AI team members need to be able to explain their reasoning, seek input when appropriate, and adapt their approach based on human feedback.
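One lightweight way to support that, sketched below with illustrative field names rather than any established standard, is to require every agent decision to carry a human-readable rationale and a self-assessed confidence, so colleagues can audit decisions after the fact and low-confidence ones explicitly request input first.

```python
from dataclasses import dataclass

@dataclass
class DecisionRecord:
    """Illustrative structure for an explainable agent decision."""
    action: str            # what the agent intends to do
    rationale: str         # human-readable explanation of why
    confidence: float      # 0.0 to 1.0, the agent's own estimate
    needs_input: bool      # set when confidence falls below a threshold

def make_record(action: str, rationale: str, confidence: float,
                threshold: float = 0.6) -> DecisionRecord:
    # Low-confidence decisions explicitly seek human input rather than guessing.
    return DecisionRecord(action, rationale, confidence,
                          needs_input=confidence < threshold)

record = make_record("raise the price of Swiss chocolate by 10%",
                     "supplier cost increased and demand is steady", 0.45)
print(record.needs_input)   # True: ask a colleague before acting
```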
I suspect Project Vend was ultimately a design failure, not an AI failure. Claudius certainly had technical limitations, but the deeper problem was that the overall system design didn't adequately account for business objectives, stakeholder needs, and learning requirements. Designing successful AI systems requires thinking beyond technical capabilities to consider the full ecosystem of stakeholders, processes, and objectives involved.
The future of AI development depends on our ability to design systems that are not just technically impressive but genuinely useful and safe in real-world contexts. Let's recognize that AI autonomy is not just a technical challenge but a design challenge that touches every aspect of how we organize work, structure relationships, and create value in modern organizations.
The next installment ...
So, for me, the key takeaway from Project Vend is not about hallucination, or AI autonomy, or consciousness. It's a plain design issue. In my next post, I will look at how this understanding is distorted by anthropomorphism into something more dramatic, more meme-worthy, but less useful.
What I find frustrating is that academics have researched exactly this kind of failure for decades, arguing for "sociotechnical" approaches that recognize that (a) technology is always part of a bigger system and (b) people are always essential to the smooth operation of that system. Our industry seems uniquely incapable of learning anything from its own history.