OpenAI has quietly rolled out GPT-5.4, a step change in the company’s pursuit of models that do more than answer questions: they now act. The headline capability is native computer control via the API and Codex interfaces, which lets the model read screenshots and issue mouse and keyboard commands to navigate applications, run multi-step workflows and operate productivity suites on behalf of users.
The release comes in two tiers. GPT-5.4 Thinking is available to paid ChatGPT subscribers (Plus, Team and Pro) and presents a visible plan of attack before it answers, allowing users to interrupt and steer execution mid-stream. GPT-5.4 Pro targets heavier users, namely enterprise customers and subscribers on the top paid tier, offering higher performance and the same native control features. The API supports a one‑million‑token context window, although inputs above 272,000 tokens are billed at a doubled overage rate.
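To make the overage rule concrete, here is a minimal Python sketch of how input cost might be estimated. The function name, the base rate and the `surcharge_whole_request` switch are all hypothetical. The article does not say whether the doubled rate covers the whole request or only the tokens beyond 272,000, so the sketch supports both readings and defaults to the second.

```python
def estimate_input_cost(
    input_tokens: int,
    base_rate_per_million: float,
    threshold: int = 272_000,          # threshold quoted in the article
    overage_multiplier: float = 2.0,   # "doubled" rate quoted in the article
    surcharge_whole_request: bool = False,  # assumption: billing model is unspecified
) -> float:
    """Estimate input cost in the same currency unit as base_rate_per_million."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")

    if input_tokens <= threshold:
        # At or below the threshold, every token is billed at the base rate.
        billable_units = float(input_tokens)
    elif surcharge_whole_request:
        # Reading 1: crossing the threshold doubles the rate for the whole input.
        billable_units = input_tokens * overage_multiplier
    else:
        # Reading 2: only the tokens beyond the threshold are billed at double.
        billable_units = threshold + (input_tokens - threshold) * overage_multiplier

    return billable_units * base_rate_per_million / 1_000_000
```

With a placeholder base rate of 1.00 per million tokens, a full one-million-token input costs 1.728 under the second reading and 2.00 under the first, so the difference between the two interpretations is worth confirming against the published price list before budgeting.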
The technical advances are tangible: the model can invoke Playwright-style libraries to script interactions, fetch and parse high-resolution screenshots and call out to configured tools under custom confirmation policies. Benchmarks cited by OpenAI show large gains in desktop and web navigation tests: success rates on OSWorld‑Verified rise from the mid‑40s to the mid‑70s in percentage terms, and performance is near‑human or better in several screenshot‑driven evaluations.
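The idea of a custom confirmation policy can be illustrated with a small Python sketch. It is not OpenAI's interface: the `Action` type, the `RISKY_KINDS` set and the `run_plan` helper are hypothetical names chosen for illustration, and the executor is a plain callback rather than a real browser driver.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Action:
    """One UI step proposed by a model (hypothetical shape)."""
    kind: str    # e.g. "click", "type", "delete_file"
    target: str  # e.g. a selector or a file name


# Assumption: the operator decides which action kinds need a human sign-off.
RISKY_KINDS = {"delete_file", "submit_payment"}


def needs_confirmation(action: Action) -> bool:
    return action.kind in RISKY_KINDS


def run_plan(
    actions: Iterable[Action],
    execute: Callable[[Action], object],
    confirm: Callable[[Action], bool],
) -> List[Action]:
    """Execute actions in order, skipping risky ones the confirmer rejects."""
    executed: List[Action] = []
    for action in actions:
        if needs_confirmation(action) and not confirm(action):
            continue  # rejected by the policy; nothing is executed
        execute(action)
        executed.append(action)
    return executed
```

In a real deployment `execute` would wrap whatever automation library drives the interface and `confirm` would prompt a human or consult a rules engine; keeping both as injected callbacks lets the policy be tested without touching a live application.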
GPT‑5.4’s perceptual improvements extend to document and image understanding. Tests on multimodal reasoning and document parsing show lower error rates and higher accuracy at lower latency. OpenAI also introduced raw and high‑detail image inputs with support for very large pixel counts, which improves click accuracy and location tasks in visual interfaces.
Operational efficiency has been improved as well. A new Tool Search mechanism lets the model fetch full tool definitions on demand rather than embedding every tool’s specification into prompts, cutting token consumption roughly in half in internal benchmarks. The model also reduces “tool yields” — the costly yield‑and‑wait cycles that inflate latency — enabling more parallelized tool use and faster end‑to‑end execution on multi‑step tasks.
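The on-demand lookup described above can be sketched in a few lines of Python. This is an illustration of the general pattern, not OpenAI's Tool Search API: the registry, the tool names and the helper functions are all hypothetical. The prompt carries only a short index of names and summaries, and a full definition is loaded only when a tool is actually selected.

```python
import json

# Hypothetical registry: a one-line summary plus a full JSON-schema-style definition.
TOOLS = {
    "get_weather": {
        "summary": "Look up the current weather for a city.",
        "definition": {
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                    "units": {"type": "string", "enum": ["metric", "imperial"]},
                },
                "required": ["city"],
            },
        },
    },
    "create_invoice": {
        "summary": "Create a draft invoice for a customer.",
        "definition": {
            "name": "create_invoice",
            "parameters": {
                "type": "object",
                "properties": {
                    "customer_id": {"type": "string", "description": "Customer identifier"},
                    "amount": {"type": "number", "description": "Invoice total"},
                    "currency": {"type": "string", "description": "ISO 4217 code"},
                },
                "required": ["customer_id", "amount"],
            },
        },
    },
}


def tool_index(tools: dict) -> list:
    """Compact listing that would go into the prompt instead of full definitions."""
    return [{"name": name, "summary": spec["summary"]} for name, spec in tools.items()]


def search_tools(tools: dict, query: str) -> list:
    """Return names of tools whose name or summary contains any query term."""
    terms = query.lower().split()
    return [
        name
        for name, spec in tools.items()
        if any(t in name.lower() or t in spec["summary"].lower() for t in terms)
    ]


def load_definition(tools: dict, name: str) -> dict:
    """Fetch the full definition only once a tool has been selected."""
    return tools[name]["definition"]


def serialized_size(obj) -> int:
    """Character count of the JSON form, used here as a rough proxy for tokens."""
    return len(json.dumps(obj))
```

Comparing `serialized_size(tool_index(TOOLS))` with the size of all full definitions gives a rough sense of the saving; character counts are only a stand-in for tokens, and the halving quoted above is OpenAI's own internal figure rather than something this toy registry reproduces.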
For business users the most immediate change is deeper integration with spreadsheets and financial data. ChatGPT for Excel and Google Sheets (beta) lets teams call GPT‑5.4 from cells to build models, refresh data and run analyses. OpenAI pairs the spreadsheet capability with commercial data sources such as FactSet, MSCI, Third Bridge and Moody’s, and with a set of reusable “Skills” for common finance workflows. Internal banking and modelling benchmarks cited by OpenAI show dramatic improvements in simulated analyst tasks.
Early testers and customers are emphatic about the difference. Corporate and developer beta users describe a model that executes complex tool‑dependent workflows reliably and with much lower token and latency costs. There are criticisms: some users prefer the front‑end polish of rival interfaces, and a few report occasional abrupt stops in long‑running tasks, but most observers say these are small frictions compared with the boost in practical automation.
OpenAI has wrapped these advances in a familiar safety frame: monitoring, access controls and asynchronous blocking of high‑risk zero‑data‑retention (ZDR) requests, plus research into controlling and observing chain‑of‑thought behaviour. The company argues that better performance and new reasoning mechanisms justify higher API prices; GPT‑5.4’s per‑million‑token rates rise relative to GPT‑5.2, though OpenAI maintains that the effective cost of comparable tasks may fall thanks to improved efficiency.
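The claim that a higher list price can still mean a lower bill comes down to simple arithmetic, sketched below in Python. The function names are hypothetical and the figures in the usage note are placeholders, not OpenAI's published prices.

```python
def cost_per_task(tokens_used: int, rate_per_million: float) -> float:
    """Cost of one task given its token count and a per-million-token rate."""
    return tokens_used * rate_per_million / 1_000_000


def is_cheaper_per_task(
    old_tokens: int, old_rate: float, new_tokens: int, new_rate: float
) -> bool:
    """True when the newer model finishes the same task for less money."""
    return cost_per_task(new_tokens, new_rate) < cost_per_task(old_tokens, old_rate)
```

As an illustration with invented numbers, a task that used 100,000 tokens at a rate of 1.00 costs 0.10. If the rate rises to 1.50 but the task now needs 50,000 tokens, the cost drops to 0.075. If it still needs 80,000 tokens, the cost rises to 0.12. The newer model is cheaper only when its token use falls by more than the rate rises.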
The practical implication is clear: the tug‑of‑war between conversational assistants and autonomous agents has tilted toward the latter. GPT‑5.4 converts potential into routine productivity by automating desktop workflows, spreadsheet modelling and multi‑tool research. For CIOs and compliance officers the model’s arrival changes procurement calculus: value now rests not only on model quality but on safe orchestration, governance and integration with existing enterprise data sources.
