Skip to content
FAQ

OpenAI Expands Daybreak: GPT-5.5-Cyber Found Vulns in Linux, Chrome, and Safari — Then Patched Them

OpenAI has substantially expanded its Daybreak cybersecurity initiative with the release of an improved GPT-5.5-Cyber model (85.6% on CyberGym), a major Codex Security plugin update that has now scanned 30 million commits across 30,000 codebases with 500,000+ auto-fixed findings, and a new open-source coalition called 'Patch the Planet' partnering with Trail of Bits, HackerOne, and projects including cURL, Python, Go, and the Linux kernel.

5 min read

OpenAI’s Daybreak cybersecurity initiative launched in May with ambitions that outpaced its initial scope. The expansion announced this week is when it starts looking like those ambitions were real.

The company released a substantially improved version of GPT-5.5-Cyber, unveiled a significantly more capable update to its Codex Security scanning plugin, and launched a new open-source coalition called Patch the Planet — partnering with Trail of Bits, HackerOne, and CALIF to reduce the vulnerability backlog in the software that runs the world’s infrastructure. The disclosure list of what Daybreak’s systems have already found — vulnerabilities in Linux, FreeBSD, Chrome, Safari, NGINX, Apache, and Python — makes the strongest case yet that frontier AI models are meaningful defensive security tools, not just marketing.

GPT-5.5-Cyber: What the Benchmarks Say

The updated model OpenAI calls GPT-5.5-Cyber is its most specialized security-oriented release. On CyberGym — a benchmark covering vulnerability identification across diverse codebases and system configurations — it scores 85.6%, compared with 81.8% for general-purpose GPT-5.5. On ExploitGym, which tests whether a model can develop working proof-of-concept exploits, it reaches 39.5% (vs. 25.95% for GPT-5.5). On SEC-bench Pro, which evaluates real-world security engineering tasks including threat modeling and patch validation, it scores 69.8%.

The ExploitGym number deserves attention. A model capable of building working exploits for nearly 40% of benchmark challenges is powerful enough to demand careful controls. OpenAI has gated GPT-5.5-Cyber behind a verification layer: access requires proof of a defensive mission, with eligibility for security vendors, penetration testing firms with active engagements, and organizations with demonstrable internal security functions. Independent researchers can apply but undergo additional review.

The dual-use concern is genuine. The same model capability that helps a defender find and patch a SQL injection vulnerability is available, in principle, to an attacker using a compromised or impersonated account. OpenAI hasn’t published details of its verification process, and that opacity will draw scrutiny from the security community.

Codex Security: From Scanner to Remediation Agent

The Codex Security plugin, initially released as a cloud research preview in March, has been updated into something considerably more powerful than a static analysis tool.

The previous version identified vulnerabilities with contextual severity ratings and affected code locations. The new version adds: attack path tracing (following how a vulnerability propagates across system dependencies), threat model generation (producing a structured analysis of adversarial objectives and system exposure surfaces), automated patch generation (producing tested, reviewable fixes rather than advisory text alone), and vulnerability management integration (export via SARIF and CodeQL for connection to existing enterprise security toolchains).

The scale data OpenAI disclosed is the headline. Since March, Codex Security has scanned over 30 million commits across 30,000 codebases. More than 500,000 findings have been automatically determined to be fixed — meaning 500,000 instances where code was modified after a Codex Security finding, with the change satisfying the original vulnerability condition. OpenAI notes this counts finding-instances, not unique vulnerabilities; the same vulnerability class appearing in multiple locations counts multiple times.

The Vulnerability Disclosure List

OpenAI published a summary of significant findings from the Daybreak initiative. The list is notable for the ubiquity of the affected systems:

Linux kernel: 8 kernel pointer leaks and 24 local privilege escalation exploits. Kernel-level vulnerabilities can give an attacker complete control over an affected system, and privilege escalation from user to root is the most common post-exploitation path.

FreeBSD: 34 vulnerabilities including 7 local privilege escalation proofs-of-concept. FreeBSD powers Netflix’s CDN infrastructure and large portions of Sony’s gaming systems.

dnsmasq: 6 vulnerabilities in the lightweight DNS forwarder deployed on hundreds of millions of home routers and embedded systems worldwide.

HTTP/2 Bomb: A vulnerability affecting NGINX, Apache HTTP Server, Microsoft IIS, and Cloudflare’s Pingora simultaneously. NGINX and Apache together serve the majority of the world’s public-facing web traffic; a single vulnerability class hitting all four servers represents systemic exposure of unusual severity.

Google Chrome V8: 5 vulnerabilities in Chrome’s JavaScript engine, which underpins not just the browser but Electron-based applications used across enterprise software.

Apple Safari: 10 exploitable vulnerabilities. The scope across iOS, macOS, and iPadOS gives these findings a combined attack surface of over 2 billion devices.

OpenAI states all disclosed vulnerabilities have been reported to maintainers through coordinated disclosure and are either patched or in active remediation. The company has not disclosed patch rates in production deployments, which are characteristically slower than patch releases.

Patch the Planet

The coalition launch is the most strategically significant piece of the announcement. Patch the Planet pairs OpenAI’s AI capabilities with Trail of Bits (a respected security research firm), HackerOne (the dominant bug bounty platform), and CALIF (a nonprofit focused on open-source security funding) to address a structural gap in the open-source security ecosystem: widely-deployed infrastructure software maintained by small volunteer communities with no dedicated security budget.

Initial participating open-source projects include cURL, NATS Server, pyca/cryptography, Sigstore, aiohttp, the Go project, freenginx, Python, and python.org. The list skews toward infrastructure plumbing rather than end-user applications — cURL alone handles HTTP transfers for an estimated 10 billion+ devices.

A preliminary five-day sprint, run before the public launch, “identified hundreds of potential issues and led to dozens of merged fixes.” OpenAI has not provided specific vulnerability counts from this sprint, citing responsible disclosure timelines.

The program works through a cycle: Codex Security scans participating projects, flags candidate vulnerabilities, and generates patches. Security engineers from Trail of Bits and HackerOne review the AI-generated findings and patches, validate them, and coordinate with project maintainers on merging. Human judgment remains in the loop for validation; the AI handles the volume problem.

The Daybreak Cyber Partner Program

Alongside the technical releases, OpenAI announced a Daybreak Cyber Partner Program allowing security vendors to integrate GPT-5.5 with “Trusted Access for Cyber” — OpenAI’s controlled-access framework for security-sensitive workloads — into customer products. This means enterprise security teams can access frontier AI vulnerability scanning through their existing security toolchain vendors (SIEMs, vulnerability management platforms, AppSec tools) rather than building direct OpenAI API integrations.

The partner program is the commercialization layer. GPT-5.5-Cyber in isolation is a capability; integrated into Palo Alto Networks, CrowdStrike, or Veracode’s existing customer relationships, it becomes a repeatable revenue stream and a distribution advantage.

Context: The Defender Gap

CISA estimates a global shortage of 4 million cybersecurity professionals. The remediation velocity gap — the time between vulnerability discovery and patch deployment — has historically favored attackers, who can exploit a vulnerability immediately upon discovery while defenders must identify, prioritize, develop, test, and deploy fixes across complex enterprise environments.

If Codex Security’s 500,000+ auto-fixed findings are representative of what scales, the impact on that gap is meaningful. Automating the portion of the remediation cycle that involves translating a security finding into a reviewable patch — historically a skilled but time-consuming human task — could change the economics of vulnerability management in ways that matter at the infrastructure level.

The competition is watching. Anthropic’s Project Glasswing has been rumored in security circles but remains unannounced publicly. Google’s Project Zero has begun integrating Gemini into its research workflows without a comparable initiative announcement. OpenAI’s willingness to publish specific vulnerability findings — CVE-ready disclosures rather than vague capability claims — is putting pressure on competitors to match the transparency bar.

openai cybersecurity daybreak gpt-5-5-cyber codex-security patch-the-planet open-source vulnerabilities
Share

Related Stories

OpenAI Launches Daybreak: GPT-5.5-Cyber Takes Aim at the AI Cybersecurity Market

OpenAI unveiled Daybreak on May 12, 2026 — a cybersecurity platform combining GPT-5.5-Cyber, Codex Security, and a partner network spanning Cloudflare, Cisco, CrowdStrike, and Oracle to help organizations find and fix vulnerabilities. The launch puts OpenAI in direct competition with Anthropic's Mythos and signals that frontier AI labs are staking territory in the multi-billion-dollar cybersecurity market.

6 min read

Agentjacking: A Single Fake Bug Report Can Hijack Claude Code, Cursor, and Codex

Security researchers at Tenet Security have disclosed 'Agentjacking,' a new attack technique that exploits Sentry's Model Context Protocol integration to inject malicious instructions into AI coding agents. With an 85% success rate against Claude Code, Cursor, and Codex, the attack exposed 2,388 organizations—including a Fortune 100 tech firm—by disguising attacker code as legitimate error-resolution guidance. Sentry declined to fix the root cause, instead deploying a content filter.

6 min read

OpenAI Codex Can Now Watch You Work Once and Repeat Any Task Forever

OpenAI has shipped Record & Replay in Codex app version 26.616, a feature that lets AI agents observe a user completing a workflow demonstration and convert it into a reusable, editable 'skill' executable anytime thereafter. Available to paid ChatGPT subscribers on macOS—with EU, UK, and Swiss users excluded—the feature represents a significant step toward no-code automation that closes the gap between enterprise RPA tools and conversational AI agents.

5 min read