
OpenAI's 'Spud' Finishes Pretraining — GPT-5.5 or GPT-6 Could Land Within Weeks

OpenAI completed pretraining on its next major model, internally codenamed 'Spud,' on March 24. CEO Sam Altman called it 'a very strong model that could really accelerate the economy,' while President Greg Brockman described it as 'two years of research' representing a qualitatively different kind of capability leap. With safety evaluation now underway, prediction markets put 78% odds on a public release before the end of April — and the final commercial name, GPT-5.5 or GPT-6, is still undecided.


The AI industry’s most anticipated model release of 2026 is weeks away. OpenAI has completed pretraining on its next major frontier model, internally codenamed “Spud” — and the company’s own leadership has broken from its usual silence to build anticipation in ways that suggest this one is different.

On March 24, The Information first reported that pretraining on Spud had wrapped. Within days, both Sam Altman and Greg Brockman confirmed the milestone in unusually candid terms. Altman told employees the new model is “a very strong model that could really accelerate the economy.” Brockman, speaking on the Big Technology podcast, went further: “This represents two years of research. It’s a big model feel — it’s not an incremental improvement, it’s a significant change in the way we think about model development.”

That language — “big model feel,” “significant change” — is not boilerplate. It signals that Spud represents the kind of capability step that OpenAI has been building toward since GPT-5’s launch in 2025, not merely a refinement of existing architecture.

The Race for the Benchmark Crown

The urgency behind Spud’s release is easy to read in the competitive landscape. Gemini 3.1 Pro, released by Google in February 2026, currently leads 12 of 18 major tracked benchmarks. On ARC-AGI-2 — the notoriously difficult abstract reasoning test that punishes models relying on memorized patterns — Gemini 3.1 Pro scored 77.1%, more than double the reasoning performance of its predecessor. On GPQA Diamond, the expert-level science reasoning benchmark, it hit 94.3%. At roughly one-third the API cost of competing frontier models, Gemini 3.1 Pro has given Google its most dominant benchmark position since GPT-4’s 2023 launch shook the market.

Meanwhile, Anthropic’s Claude Opus 4.6 holds the lead in software engineering benchmarks, scoring 80.8% on SWE-Bench Verified versus Gemini 3.1 Pro’s 80.6%. And xAI’s Grok 4.20 has shown strong reasoning and agentic performance with its four-agent parallel architecture. OpenAI’s current flagship, GPT-5.4 Thinking, leads on knowledge-work tasks and computer-use benchmarks — but the gap at the frontier is narrowing, and Spud is OpenAI’s answer.

GPT-5.5 or GPT-6? The Name Is the Signal

The commercial name OpenAI will attach to Spud is itself a story. The company has publicly said the decision depends on the magnitude of the performance improvement relative to GPT-5.4. If benchmarks show a generational leap — the kind that would justify a major version number — Spud ships as GPT-6. If it is a strong but still incremental advance in the same generation’s capabilities, it becomes GPT-5.5.

That framing matters because it turns the naming choice into a public signal about the scale of the capability jump. A GPT-6 designation would represent OpenAI’s loudest possible claim that this is not just another model refresh, but the arrival of the next major era of its technology — likely with the kinds of agentic, reasoning, and multimodal capabilities that could materially change what AI systems can accomplish autonomously.

Leaked benchmark testing, reported by AI analyst Adam Holter, suggests Spud’s performance may approach what Anthropic’s “Claude Mythos” — the company’s own unreleased frontier model — was showing in internal evaluations before Anthropic decided not to release it publicly, citing cybersecurity concerns. That comparison, if accurate, would place Spud squarely in GPT-6 territory.

A Timeline Measured in Weeks

The standard window between pretraining completion and public release for recent OpenAI models has been three to six weeks, covering safety evaluation, red-teaming, and infrastructure preparation. Applied to Spud’s March 24 pretraining completion date, that window runs from approximately April 14 to May 5, 2026.
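The window arithmetic is simple to verify. A minimal sketch, using only the dates reported above (the three-to-six-week range is the article’s stated historical pattern, not an OpenAI commitment):

```python
from datetime import date, timedelta

# Pretraining completion date reported for Spud
pretraining_done = date(2026, 3, 24)

# Recent OpenAI releases have reportedly followed pretraining by 3-6 weeks
window_start = pretraining_done + timedelta(weeks=3)
window_end = pretraining_done + timedelta(weeks=6)

print(window_start.isoformat())  # 2026-04-14
print(window_end.isoformat())    # 2026-05-05
```

The rumored April 16 date falls two days inside the early end of that window.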

Prediction market Polymarket currently assigns 78% probability to a Spud release before April 30, with confidence climbing above 95% for a release before June 30. An April 16 release date has been circulated in AI circles based on leaked internal communications, though OpenAI has not confirmed any specific date.

The safety evaluation phase is not merely procedural. Since GPT-5.4, OpenAI has substantially expanded its red-teaming program in response to growing regulatory scrutiny in the European Union and voluntary commitments made under US government frameworks. Anthropic’s decision to withhold Claude Mythos entirely, citing cybersecurity risks discovered during red-teaming, has also raised the bar for what constitutes responsible release practice. OpenAI will be acutely aware that Spud’s release process will be examined closely.

Compute Reallocation: Sora’s Quiet End

One data point that has drawn attention from technical observers: OpenAI is reportedly shutting down Sora, its video generation product, to free up compute capacity for Spud’s deployment. Sora launched to considerable fanfare in early 2025 but never achieved the commercial traction needed to justify its infrastructure costs alongside the demands of GPT-5.x inference at scale. The decision to reallocate that compute to Spud suggests the company is pulling every available resource into ensuring the new model launches at sufficient capacity to avoid the service disruptions that plagued earlier releases.

That reallocation also signals where OpenAI sees the value: not in multimodal generative features that delight but don’t retain users, but in the raw reasoning and agentic capabilities that enterprise customers are willing to pay premium API rates to access.

What to Expect From the Model Itself

While detailed architecture information has not been officially disclosed, several consistent threads emerge from leaks and analyst reporting. Spud is expected to feature substantially improved reasoning capabilities, particularly on the kinds of multi-step logical and mathematical problems where current models still make systematic errors. It is described as having stronger agentic performance — the ability to execute long-horizon tasks with minimal human oversight — and improved reliability on complex coding and software engineering benchmarks where Claude Opus 4.6 currently leads.

The model will also almost certainly feature expanded context and an updated computer-use capability, extending the GPT-5.4 gains in autonomous screen interaction. If Greg Brockman’s “big model feel” language points to anything specific, it is likely the emergence of more coherent long-range reasoning: the ability to maintain context, track goals, and recover from errors across much longer task sequences than today’s models can reliably handle.

A High-Stakes Launch for OpenAI’s IPO Narrative

The timing of Spud’s release carries significance beyond model benchmarks. OpenAI is in the early stages of a public listing process, reportedly targeting a late-2026 IPO. The company has surpassed $25 billion in annualized revenue but trails Anthropic — which recently crossed $30 billion ARR, overtaking OpenAI for the first time in both companies’ histories — in the metric that investment bankers will scrutinize most closely.

Launching a model that credibly reclaims benchmark leadership from Gemini 3.1 Pro and closes the gap with Claude Opus 4.6 on coding would materially strengthen OpenAI’s IPO narrative heading into roadshow season. The alternative — letting the narrative harden that Google and Anthropic have outpaced the company that started the modern AI era — is not one OpenAI’s leadership is likely to accept quietly.

Spud is not just a model release. It is OpenAI’s bid to re-center the frontier AI story on itself.

