Sail Research Raises $80M From Sequoia and Kleiner Perkins to Make Long-Running AI Agents 10x Cheaper
Sail Research, a San Francisco startup founded by ex-Apple and ex-Nvidia engineers, has raised $80 million in combined seed and Series A funding—led by Kleiner Perkins and Sequoia respectively—to build infrastructure designed to run long-horizon AI agents at one-tenth the cost of existing services. Its Linux-based 'Sailboxes' can pause agents during idle periods and link multiple agents into collaborative ensembles, achieving a 90.72% score on BrowseComp-Plus at 10x lower inference cost than competitors.
Every AI agent costs money to think. Every API call, every reasoning step, every hour an agent spends waiting for a tool call to return—all of it burns tokens, burns compute, burns cash. As AI agents move from novelty to enterprise infrastructure, the economics of running them at scale have become a genuine constraint on the industry’s ambitions.
Sail Research is betting that constraint is a business opportunity. The San Francisco startup announced this week that it has raised $80 million across a seed round led by Kleiner Perkins and a Series A led by Sequoia Capital, at a valuation Fortune pegged at $450 million. The company’s pitch is specific and technical: it has built infrastructure that can run long-horizon AI agents—tasks spanning hours, days, or even weeks of continuous operation—at one-tenth the cost of comparable services.
That cost reduction, if it holds under enterprise conditions, would represent a significant shift in what becomes economically viable to automate.
The Long-Horizon Problem
Most AI infrastructure is optimized for short interactions. A user asks a question; the model responds; the session ends. The compute costs are bounded, the memory requirements are manageable, and the failure modes are relatively simple. Enterprise applications don’t work that way.
Real-world agentic deployments—automated software engineering, extended research tasks, multi-step customer workflows, complex data analysis pipelines—require agents that can operate for hours or days, maintain context across thousands of steps, coordinate with multiple tools and external services, and recover gracefully when individual steps fail. That’s a fundamentally different infrastructure challenge from serving a chatbot, and existing cloud providers’ general-purpose GPU infrastructure was not designed for it.
Sail’s founders—Neil Movva and Samir Menon, both veterans of Apple and Nvidia engineering teams—concluded that the mismatch between the infrastructure requirements of long-horizon agents and the infrastructure that actually exists was large enough to build a company around. “Agents require scale, reliability, and sustainable cost,” CEO Neil Movva said in the company’s funding announcement.
How Sailboxes Work
The company’s core technical product is what it calls Sailboxes: Linux-based virtual machines customized specifically for AI agent workloads. Sailboxes solve several concrete problems that make existing cloud VMs expensive and unreliable for long-running agent tasks.
The most significant is idle cost. AI agents spend a substantial portion of their runtime waiting—waiting for tool calls to return, for external APIs to respond, for human-in-the-loop approvals, for long computation steps to complete. In a standard cloud deployment, an agent waiting for a response is still occupying GPU memory and paying for compute it isn’t using. Sail’s infrastructure can pause agents during waiting periods, releasing compute resources and resuming the agent seamlessly when the response arrives. For workloads with substantial wait times—which includes most real-world enterprise agent deployments—this pause-resume capability alone can represent a significant cost reduction.
Sailboxes can also be linked into ensembles: networks of multiple agents working collaboratively on a shared task, coordinated by Sail’s global controller infrastructure. This is architecturally important for complex enterprise workflows that are too large or too multi-dimensional for a single agent to execute sequentially—tasks that benefit from parallel workstreams, specialization, and mutual error-checking.
On the inference side, Sail builds on established open-source tooling including vLLM and its PagedAttention algorithm, which improves GPU memory efficiency by managing the key-value cache used in transformer inference more intelligently than naive implementations. The company has then layered its own optimizations on top of this foundation, tuned specifically for the memory access patterns of long-horizon agent workloads.
The BrowseComp-Plus Benchmark
Sail published benchmark results to accompany its funding announcement—a move that reflects both confidence in its technical performance and the growing importance of verifiable efficiency metrics in an increasingly crowded AI infrastructure market.
On BrowseComp-Plus, a benchmark designed to evaluate AI systems on complex online research tasks requiring multi-step information gathering and synthesis, Sail reported a score of 90.72%—described as a new high watermark for the benchmark—while incurring one-tenth the inference cost of rival services on the same tasks.
The combination of performance and cost efficiency is the key claim. A cheaper infrastructure that performed worse would be easy to discount; a more expensive one that performed better would represent a different kind of product. The assertion that Sail’s approach achieves both simultaneously on a standardized benchmark is what makes the funding announcement notable beyond its dollar value.
The Investor Lineup
The round’s investor composition reflects the scale of institutional confidence in the agentic AI infrastructure thesis. Sequoia and Kleiner Perkins are two of Silicon Valley’s most selective tier-one venture firms, and their joint participation—across consecutive funding stages—signals a high degree of alignment around both the thesis and the founding team.
The angel investor list adds another dimension of credibility. John Hennessy, the chair of Alphabet’s board and a computer architecture legend whose work on RISC processors shaped modern CPUs, is a participant. Lip-Bu Tan, the CEO of Intel, brings both semiconductor expertise and a strategic investor relationship with the company that manufactures much of the world’s AI chip advanced packaging. Tri Dao, chief scientist at Together AI and the co-author of the FlashAttention algorithm that dramatically improved transformer inference efficiency, is perhaps the most technically specific validation: one of the researchers most responsible for making AI inference faster is personally betting on Sail’s approach.
Additional institutional investors include Redpoint Ventures, Theory Ventures, Vine Ventures, CRV, A* Capital, and Abstract Ventures.
Market Context: The Infrastructure Race for Agentic AI
Sail enters a market that is rapidly attracting competition. As enterprises move from experimenting with AI chatbots to deploying AI agents in production workflows, the infrastructure layer that supports those deployments has emerged as a major investment theme. Several startups are pursuing related approaches—optimizing inference costs, building agent-specific compute, or providing managed environments for long-running AI workloads.
What distinguishes Sail’s positioning is its focus on the cost dimension of the long-horizon use case specifically. Many AI infrastructure companies emphasize performance—lower latency, higher throughput, better reliability. Sail’s primary claim is cost efficiency for workloads that the market increasingly considers strategically important but economically marginal at current price points.
The company’s timing is also notable. The announcement follows a period of significant investment in AI agent tooling—from OpenAI’s Codex platform to Anthropic’s Claude Code to GitHub Copilot’s agentic capabilities—that has established agentic AI as a mainstream enterprise category rather than a research curiosity. As the category matures, the economics of the underlying infrastructure become more critical to adoption velocity.
Whether Sail can maintain its claimed cost advantage as the market scales and larger cloud providers optimize their own agent infrastructure remains an open question. The history of cloud computing suggests that hyperscalers eventually close performance gaps with specialized providers—but also that specialized providers can maintain meaningful advantages in well-defined use cases for substantial periods of time.
For now, Sail is betting that the window in which specialized, cost-optimized agent infrastructure commands a premium is long enough to build a durable business. With $80 million, Sequoia, Kleiner Perkins, and the endorsement of some of the industry’s most respected technical figures, it has the runway to find out.