OpenAI's GPT-5.5 and Workspace Agents Turn AI From Chatbot Into Autonomous Colleague

OpenAI shipped two things last week that matter more together than apart. On April 23, it released GPT-5.5, a model that tops the Artificial Analysis Intelligence Index by three points and uses roughly 40% fewer tokens than its predecessor to complete the same tasks. The day before, it launched workspace agents — Codex-powered bots that teams can build in minutes to automate lead qualification, weekly reporting, IT triage, and vendor risk screening, all running in the background even when you close your laptop.

The combined launch is OpenAI's clearest signal yet that it sees the future of business AI not as a smarter chatbot, but as a fleet of autonomous colleagues that can be built once, shared across an organisation, and improved over time. For business owners weighing whether to invest in AI tooling, this is the week the category shifted from "interesting experiment" to "operational infrastructure."

GPT-5.5: Smarter, More Efficient, More Expensive

GPT-5.5 is a full pretraining run, not a patch on an existing model, and was co-designed for NVIDIA's GB200 and GB300 systems. The headline benchmarks are strong: 82.7% on Terminal-Bench 2.0 for complex command-line workflows (up from 75.1% for GPT-5.4), 84.9% on GDPval, which tests real-world knowledge work across 44 occupations, and 78.7% on OSWorld-Verified for autonomous computer use.

The efficiency gains are where business buyers should pay attention. According to Artificial Analysis, GPT-5.5 uses approximately 40% fewer output tokens than GPT-5.4 to complete the same tasks. That partially offsets a doubling in per-token price, from $2.50/$15 to $5/$30 per million input/output tokens: paying twice the rate for about 60% of the output tokens works out to roughly 1.2 times the previous spend, a net cost increase of around 20%.
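The arithmetic behind that estimate can be checked in a few lines. The sketch below is illustrative only: the per-million-token prices come from the figures above, but the workload sizes are hypothetical, the dictionary keys are plain labels rather than API model identifiers, and it assumes input volume stays the same while output tokens fall by exactly 40%.

```python
# USD per million tokens, as quoted above. Keys are labels, not API model IDs.
PRICES = {
    "gpt-5.4": {"input": 2.50, "output": 15.00},
    "gpt-5.5": {"input": 5.00, "output": 30.00},
}


def task_cost(model, input_tokens, output_tokens):
    """USD cost of one workload at the quoted per-million-token prices."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def cost_ratio(input_tokens, output_tokens, output_reduction=0.40):
    """GPT-5.5 cost divided by GPT-5.4 cost for the same workload.

    Assumes input volume is unchanged and output tokens shrink by
    `output_reduction` (an assumption based on the roughly 40% figure).
    """
    old = task_cost("gpt-5.4", input_tokens, output_tokens)
    new = task_cost("gpt-5.5", input_tokens, output_tokens * (1 - output_reduction))
    return new / old


# Hypothetical workloads, chosen only to show how sensitive the ratio is.
output_only = cost_ratio(input_tokens=0, output_tokens=1_000_000)
input_heavy = cost_ratio(input_tokens=1_000_000, output_tokens=100_000)
print(f"output-only workload: {output_only:.2f}x")
print(f"input-heavy workload: {input_heavy:.2f}x")
```

Under these assumptions, the 20% figure holds when spend is dominated by output tokens. A workload that sends far more tokens than it receives sees less benefit from the shorter outputs, so teams with long prompts or large document inputs may want to run their own numbers.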

The model isn't without caveats. Artificial Analysis flagged that GPT-5.5 has an 86% hallucination rate on its AA-Omniscience benchmark, compared to 36% for Claude Opus 4.7. That metric measures how likely a model is to answer confidently when it doesn't actually know, which is a meaningful weakness for any business relying on AI for accuracy-critical tasks. For general knowledge work, the model is measurably stronger. For high-stakes factual queries, verification workflows remain essential.

Dan Shipper, founder and CEO of Every, described GPT-5.5 as "the first coding model I've used that has serious conceptual clarity" after testing whether the model could independently diagnose and rewrite a system-level bug that had taken his best engineer days to resolve. Michael Truell, co-founder and CEO of Cursor, noted that GPT-5.5 is "noticeably smarter and more persistent than GPT-5.4, with stronger coding performance and more reliable tool use."

Workspace Agents: The Launch That Matters More

As The New Stack argued, workspace agents may be more significant than GPT-5.5 itself. They represent OpenAI's shift from selling a smarter chatbot to selling operational infrastructure — and they're explicitly positioned as the successor to custom GPTs, which OpenAI will eventually deprecate for organisations.

Workspace agents are powered by Codex and run in the cloud, which means they can keep working when you're not. Teams can set them on a schedule, deploy them into Slack channels, or trigger them from ChatGPT. They connect to Salesforce, Google Drive, Microsoft apps, Notion, Atlassian, and dozens of other enterprise tools.

OpenAI's own teams are already using them. The comms team built an agent that auto-triages speaking requests via Slack. The finance team used Codex to review 24,771 K-1 tax forms totalling 71,637 pages, accelerating the task by two weeks. A go-to-market employee automated weekly business reports, saving 5-10 hours per week.

The early external signal is equally telling. Ankur Bhatt, who leads AI Engineering at Rippling, said a sales consultant built a deal-briefing agent without an engineering team: "It researches accounts, summarises Gong calls, and posts deal briefs directly into the team's Slack room. What used to take reps 5-6 hours a week now runs automatically in the background on every deal."

Workspace agents are available in research preview for ChatGPT Business, Enterprise, Edu, and Teachers plans, and are free until May 6, after which credit-based pricing begins.

What This Means for Australian Businesses

If you're running a company of 10 to 50 people, here's the practical calculus.

Workspace agents are the first no-code agent builder from a major AI lab that genuinely plugs into the tools most teams already use: Slack, Google Drive, Salesforce. You don't need an engineering team to build one. The Rippling example illustrates the point: a sales consultant, not a developer, built the deal-briefing agent without engineering support.

The use cases that map most directly to Australian SMEs are the ones that eat time without creating value: weekly reporting, lead follow-up, software request triage, onboarding documentation. If your team spends hours each week pulling data into spreadsheets, formatting decks, or chasing approvals through Slack, these are exactly the workflows workspace agents target.

The governance model is also worth noting for organisations navigating Australia's evolving AI regulatory landscape. Workspace agents ship with role-based access controls, a compliance API that surfaces every agent's configuration and run history, and human-approval defaults for sensitive actions like sending emails or editing documents. This isn't frontier-grade security, but it's a meaningful step beyond the "paste your API key and hope" era of business AI adoption.
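To make the human-approval default concrete, here is a minimal sketch of how such a gate could behave. It is not OpenAI's API: the class, method, and action names are all hypothetical, and the only detail taken from the description above is that sensitive actions (sending emails, editing documents) wait for a person while every step lands in a run history.

```python
from dataclasses import dataclass, field

# Hypothetical action names. The article names only sending emails and
# editing documents as examples of sensitive actions.
SENSITIVE_ACTIONS = {"send_email", "edit_document"}


@dataclass
class AgentRun:
    """Toy model of one agent run with an auditable history (illustrative only)."""

    agent: str
    history: list = field(default_factory=list)

    def perform(self, action, approved_by=None):
        """Record an action, holding sensitive ones until a human approves."""
        if action in SENSITIVE_ACTIONS and approved_by is None:
            status = "pending_approval"
        else:
            status = "executed"
        self.history.append(
            {"action": action, "status": status, "approved_by": approved_by}
        )
        return status


run = AgentRun(agent="weekly-report")
print(run.perform("read_spreadsheet"))                   # not sensitive
print(run.perform("send_email"))                         # held for approval
print(run.perform("send_email", approved_by="ops-lead")) # approved by a person
print(len(run.history), "entries in the run history")
```

The design choice worth noticing is that approval is the default rather than an opt-in: an action in the sensitive set never executes unless a named person is attached to it, and the held attempt is still logged for later review.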

The competitive context matters too. This isn't happening in isolation — Salesforce has rebuilt around Agentforce, Microsoft has Copilot Studio, Google is pushing Agentspace, and Anthropic recently launched Claude Managed Agents. The agent marketplace we've been tracking is rapidly becoming the default way enterprise software gets delivered.

What to Watch

Three things will determine whether this launch ages well or joins the graveyard of AI product announcements.

First, pricing after the free trial. Credit-based pricing starting May 6 will reveal whether workspace agents are viable for teams spending $20/user/month on ChatGPT Business, or whether the economics only work at Enterprise scale.

Second, hallucination in production. GPT-5.5's elevated hallucination rate on factual benchmarks is a genuine concern for agents making autonomous decisions. The model's tendency to answer confidently when it doesn't know is the opposite of what you want in an agent filing IT tickets or qualifying leads.

Third, the agent interoperability question. The industry is converging on standards like the Model Context Protocol (MCP) for connecting agents to external tools and data sources. Whether OpenAI's workspace agents play nicely with non-OpenAI infrastructure, or become another walled garden, will matter enormously for businesses that don't want to bet their entire AI stack on one provider.

For now, the practical advice is straightforward: if your team is on ChatGPT Business or Enterprise, try building one workspace agent before May 6 while it's free. Pick your most annoying recurring workflow — the one nobody enjoys but everyone needs done. That's the best test of whether this technology is ready for your business, or still a demo.

Sources

Introducing GPT-5.5 — OpenAI
Introducing workspace agents in ChatGPT — OpenAI
OpenAI's GPT-5.5 in Microsoft Foundry — Microsoft Azure Blog
OpenAI's GPT-5.5 is the new leading AI model — Artificial Analysis
The real story from OpenAI's big week is Workspace Agents, not GPT-5.5 — The New Stack
OpenAI unveils Workspace Agents, a successor to custom GPTs — VentureBeat