Anthropic has unveiled Claude Sonnet 4.6, a step-change improvement in models that can operate computers on behalf of users. The model can execute multi-step actions such as filling web forms and coordinating information across multiple browser tabs, moving beyond single-turn question answering toward sustained, tool-driven workflows.
Anthropic acknowledges that Sonnet 4.6 is “still behind the most skilled people” at complex computer tasks, but stresses the rapid pace of progress. The company also says Sonnet 4.6 shows stronger resistance to prompt injection attacks, a class of adversarial inputs that try to override an AI’s instructions by embedding malicious directives in text or web content.
Technically, Sonnet 4.6 belongs to a wave of models designed to use external tools and interfaces rather than rely only on text generation. By observing and manipulating user interfaces, these models can carry out chores that previously required custom automation scripts or human click-throughs. That capability reduces friction for end users and broadens the set of tasks AI can perform without bespoke engineering.
The commercial implications are immediate. Enterprises eyeing automation of repetitive workflows — from customer-support ticket handling to form-heavy back-office tasks — see a way to cut labour and accelerate processes. For developers, these agents lower the integration barrier: instead of building connectors to every web service, a sufficiently capable model can be taught to interact with existing UIs.
The security and governance dimension is more ambiguous. Improved resistance to prompt injection is progress, but enabling models to control browsers and applications expands the attack surface. Malicious actors could weaponise agentic behaviours to exfiltrate data, escalate privileges, or carry out fraudulent transactions unless robust sandboxing, verification and human-in-the-loop controls are instituted.
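To make those controls concrete, here is a minimal Python sketch of a gate that sits between an agent and the browser. It is an illustration under stated assumptions, not Anthropic's implementation or any vendor's API: the names `Action`, `ActionGate`, `ALLOWED_HOSTS` and `SENSITIVE_ACTIONS` are hypothetical, and the policy values are invented for the example. Each proposed action is checked against a host allowlist, sensitive actions are routed to a human for approval, and every decision is written to an audit log.

```python
from dataclasses import dataclass, field
from typing import Callable
from urllib.parse import urlparse

# Hypothetical policy values, chosen only for illustration.
ALLOWED_HOSTS = {"intranet.example.com", "support.example.com"}
SENSITIVE_ACTIONS = {"submit_payment", "download_file", "change_password"}


@dataclass
class Action:
    kind: str  # e.g. "click", "fill_form", "submit_payment"
    url: str   # page the agent wants to act on


@dataclass
class ActionGate:
    approve: Callable[[Action], bool]  # human-in-the-loop callback
    audit_log: list = field(default_factory=list)

    def check(self, action: Action) -> bool:
        """Return True if the action may proceed; record every decision."""
        host = urlparse(action.url).hostname or ""
        if host not in ALLOWED_HOSTS:
            # Sandboxing by allowlist: unknown hosts are refused outright.
            decision, reason = False, "host not on allowlist"
        elif action.kind in SENSITIVE_ACTIONS:
            # Sensitive actions need an explicit human decision.
            decision = self.approve(action)
            reason = "human approved" if decision else "human rejected"
        else:
            decision, reason = True, "routine action on allowed host"
        self.audit_log.append((action.kind, host, decision, reason))
        return decision
```

In a real deployment the `approve` callback would prompt an operator rather than return a fixed value, and the log would go to tamper-evident storage. The point of the sketch is only that the agent's output is treated as a proposal to be checked, not as a command to be executed.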
Anthropic’s announcement also comes amid fierce competition among AI providers to deliver not just better language understanding but dependable agentic behaviour. Cloud vendors and application platforms will need to decide how, and how fast, to offer these capabilities to paying customers while meeting compliance and security demands.
For regulators and corporate risk teams, Sonnet 4.6 is a reminder that safety claims must be backed by deployment safeguards. The technical advance makes certain productivity gains inevitable, but it also makes the questions about auditability, accountability and controlled rollout more urgent as models move from demonstrations into production.
