by Thomas Joshi and Lila Tretikov
Jun 04, 2026

Venture has traditionally focused on revenue scale and growth when evaluating large capital raises. However, in the past few months, the traditional AI venture model has been upended by multibillion-dollar rounds for companies with no revenue, no product, and in some cases no model. The question is why. The answer starts with the scale of what has already been funded: AI research labs now represent roughly $1.68 trillion of aggregate private valuation.[1]
And 93.6% of that $1.68 trillion sits in a single bucket: frontier generalists,[2] with OpenAI, Anthropic, xAI, and Safe Superintelligence absorbing almost all of it. Every other segment a venture investor might back, including biotech, robotics, scientific discovery, voice, edge, and more, combines to barely 6%. The Neo Lab market, in other words, is not a market; it’s one concentrated basket with four names.
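As a quick check on the concentration figures above, the arithmetic can be sketched in a few lines. The variable names are illustrative, and the only inputs are the two figures quoted in the text.

```python
# Figures quoted in the text; everything else is derived from them.
total_valuation_usd_tn = 1.68   # aggregate private valuation, in $ trillions
frontier_share = 0.936          # share held by frontier generalists

frontier_usd_tn = total_valuation_usd_tn * frontier_share
rest_usd_tn = total_valuation_usd_tn - frontier_usd_tn
rest_share = 1.0 - frontier_share

print(f"Frontier generalists: ${frontier_usd_tn:.2f}T")
print(f"Everything else: ${rest_usd_tn * 1000:.0f}B ({rest_share:.1%})")
```

Under those inputs, roughly $1.57 trillion sits with frontier generalists and about $108 billion (6.4%) with every other segment combined.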
Every dollar that flows into a frontier generalist at hundreds of billions in valuation assumes that their next model will deliver another step change in model performance to temporarily capture a market held by a competitor in an endless tug of war. VCs are not underwriting today's models. They are assuming that scale, plus a breakthrough no one has seen, will translate into market creation and capture.
We wonder where the next leg of AI venture returns will be made.

There are three key trends that point to a potential direction for the next generation of innovative AI companies:
Commoditization. Open-weight models are now closing the capability gap at a fraction of the compute that closed-source labs spent to open it. Training data per active parameter has grown 3.1x per year since 2022, dataset sizes double roughly every six months, and estimates suggest training runs longer than nine months become structurally inefficient sometime around 2027.[3] The marginal pre-training dollar is buying less capability than the dollar before it. Incremental performance gains will keep moving market share at the margins, but only a major innovation in a specific use case or modality has real potential to displace an incumbent at this stage. Such an innovation could look like a verifiable reasoning loop, a new modality, or a domain-native architecture. The most visible work today sits one layer above the weights at the behavior and alignment layer, where labs are fine-tuning specifically for productivity use cases. For example, sovereign AI provider Reflection AI will serve as the AI model provider to the U.S. National Labs, partnering on the Genesis Mission, a Department of Energy initiative to accelerate scientific research through AI.[4] What separates the top models is no longer what they know but how they act within specific environments.
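The two growth figures above are quoted in different units (a per-year multiple and a doubling time), which makes them hard to compare at a glance. A minimal sketch of the conversion, assuming smooth exponential growth; the function names are illustrative and not drawn from any cited source:

```python
import math

def annual_factor_from_doubling(months_to_double: float) -> float:
    """Growth factor per year implied by a given doubling time in months."""
    return 2.0 ** (12.0 / months_to_double)

def doubling_months_from_annual(annual_factor: float) -> float:
    """Doubling time in months implied by a given per-year growth factor."""
    return 12.0 * math.log(2.0) / math.log(annual_factor)

# Inputs are the two figures quoted in the text.
dataset_growth_per_year = annual_factor_from_doubling(6.0)
data_per_param_doubling = doubling_months_from_annual(3.1)

print(f"Dataset size: {dataset_growth_per_year:.1f}x per year")
print(f"Data per active parameter doubles every {data_per_param_doubling:.1f} months")
```

On that assumption, doubling every six months corresponds to about 4x per year, and 3.1x per year corresponds to a doubling time of a little over seven months.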

Shift in budget. As we have shifted to an RL-first (reinforcement learning-first) paradigm, where a system is trained primarily through environmental rewards and trial and error rather than relying heavily on human-labeled data or pre-existing templates, the question is not just who has the largest cluster but also who has the best environments, curricula, and verifiers. Compute is no longer the only moat; the entire loop is a moat. The foundation labs are not sufficiently focused on the product abstraction that makes RL deliver business outcomes, and have instead curated RL environments for diverse, disjointed tasks without focused taste.

Figure: Grok’s training shows a shift toward RL

The exit. Top-decile software companies now trade at ~15-18x next-12-months revenue against ~3-6x for the rest of the index, and strategic M&A volumes have climbed back to 2022 highs. The strategics are paying premium multiples for AI-native IP they cannot build internally fast enough, and they are buying specific capability, not general intelligence.
These trends lead us to one conclusion. The next generation of AI companies that deliver venture-like outcomes will likely be Neo Labs: research-first companies that own a domain-biased corpus, run an integrated RL and verification loop, perhaps inside a high-stakes domain, and could exit into public markets or a strategic M&A wave that is already underway. SAP's May 2026 acquisition of Prior Labs is the clearest recent example: a strategic with its own in-house AI effort committing more than €1B to bring in a specialist team because tabular prediction is not a capability that a general LLM handles well enough.
The next jump in performance requires a continuous training run longer than nine months, weighed against the opportunity cost of a frozen flagship model.
Some argue this is temporary: The next architecture will reset the curve and restore the frontier's lead. The history of foundation models suggests otherwise. Every architectural advance of the last three years, from mixture-of-experts to long-context attention to reasoning-trace distillation, has been replicated in open weights.
The durable wedges are domain-specific data that no one else can assemble, workflows embedded so deep in a customer's operations that ripping them out would cost more than keeping them, and distribution channels that the hyperscalers cannot easily replicate. This is why labs like OpenAI have spent billions starting partnerships such as the OpenAI Deployment Company.[5]
Neo Labs can be built around three specific layers, which also act as moats.
The first layer is the corpus.
Recent CMU work on mid-training shows that a model pre-trained on a domain-biased corpus, then fine-tuned with reinforcement learning, outperforms the same model trained with RL alone by margins that grow with task difficulty. If a base model is broad and opinionated about everything, post-training has to fight a vast prior built from Reddit, fiction, and the open web. If the base is already fluent in protein structures, tabular data, or German legal code, every subsequent RL run is cheaper, faster, and more stable. A domain-biased corpus is a one-time investment that pays back on every training cycle after it. This is why Xaira, Isomorphic Labs, and Chai Discovery are not "biotech startups using AI" but biotech-native foundation model labs, why Prior Labs built TabPFN as a foundation model purpose-built for tabular data rather than another general LLM with a spreadsheet wrapper, and why sovereign labs like Aleph Alpha and Sarvam matter. Their corpora cover regulated and linguistic domains that OpenAI cannot legally or practically assemble.

Figure: Research from CMU shows mid-training + RL beats RL alone

The second layer is the RL loop.
Whoever owns the environment owns the data flywheel, because every action the model takes inside that environment generates a new training signal that no one outside the environment can replicate. CuspAI builds materials-discovery environments where simulated chemistry is the training signal. Factory AI does the same thing one domain over, running continual learning at the coding interface so that every mission a customer runs captures reusable patterns into a skill library. Vertical-specific partnerships do not just require architectural change; they also require organizational change.
The third layer is the verifier.
Reinforcement learning only works when the reward signal is trustworthy, and in most domains it is not. Verifiers like "Did the molecule bind to the target with sub-nanomolar affinity?" or "Did the synthesized material exhibit the predicted band gap?" provide deterministic rewards, and they are only available in domains where ground truth exists and someone has built the apparatus to measure it. The verifier is the most underestimated asset in this stack because it looks like infrastructure when it is actually the source of truth.
The three layers reinforce each other. The corpus makes RL tractable. The RL loop generates the data the corpus cannot. The verifier ensures both are pointed at the correct target. A frontier generalist optimizes one layer at a time across every domain at once and ends up dominant in none. A Neo Lab integrates all three inside a single domain.
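How the three layers interact can be illustrated with a deliberately tiny toy problem. The sketch below is not the CMU setup or any lab's method: the ten-candidate action space, the `GROUND_TRUTH` value, the prior logits, and the step count are all hypothetical. It uses the exact expected policy-gradient update for a softmax policy rather than sampled rollouts, so the run is deterministic.

```python
import math

CANDIDATES = list(range(10))  # toy action space: ten candidate designs (hypothetical)
GROUND_TRUTH = 7              # hypothetical measured answer, known only to the verifier

def verifier(candidate: int) -> float:
    """Deterministic reward: 1.0 only if the candidate matches ground truth."""
    return 1.0 if candidate == GROUND_TRUTH else 0.0

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def run_rl(prior_logits, steps=15, lr=1.0):
    """Apply expected policy-gradient updates to a softmax policy scored by the verifier."""
    logits = list(prior_logits)
    rewards = [verifier(c) for c in CANDIDATES]
    for _ in range(steps):
        probs = softmax(logits)
        expected_reward = sum(p * r for p, r in zip(probs, rewards))
        # Exact gradient of expected reward w.r.t. each logit: p_j * (r_j - E[r]).
        logits = [z + lr * p * (r - expected_reward)
                  for z, p, r in zip(logits, probs, rewards)]
    return softmax(logits)[GROUND_TRUTH]

# "RL alone": a broad prior that is uniform over all candidates.
uniform_prior = [0.0] * len(CANDIDATES)
# "Domain-biased corpus": a prior that already favors the plausible region 6..8.
biased_prior = [2.0 if 6 <= c <= 8 else 0.0 for c in CANDIDATES]

p_uniform = run_rl(uniform_prior)
p_biased = run_rl(biased_prior)
print(f"P(correct) after RL, uniform prior: {p_uniform:.2f}")
print(f"P(correct) after RL, biased prior:  {p_biased:.2f}")
```

In this toy setting, the policy that starts from the biased prior ends with more probability on the verified answer after the same number of updates. It illustrates the mechanism only and says nothing about the size of the effect in real training runs.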
The Neo Lab opportunity, then, sits exactly where frontier generalists were never built to win: fields with proprietary corpora that are difficult to assemble, environments they do not natively operate inside, and verifiers only the practitioners can craft.

We have backed a select number of Neo Labs and have continued to invest in them. We believe these segments will produce a meaningful share of the next trillion dollars of Neo Lab value.
Another reason is scale. Neo Labs raise $50M to $500M before they have revenue, which means their investors have to be comfortable holding the position for five to seven years before a commercial signal arrives. Their partners must then have sufficient reserves to continue to back that company through every stage of its journey. Only a few firms in the world have the long-term orientation, culture, and track record to do that.
And finally, technical expertise. NEA has backed World Labs, Fei-Fei Li's lab building spatial intelligence models for the three-dimensional world; Sakana AI, the Tokyo-based lab applying nature-inspired methods to model architecture; and CuspAI, the materials-discovery lab where simulated chemistry is the training signal. Each is now a leader in its category. The lessons learned from these investments are carried into partnerships with the next round of labs.
If you are a founder creating a Neo Lab, please reach out to us: ltretikov@nea.com, tjoshi@nea.com, aschoen@nea.com, mfaulkner@nea.com.

| Lab | Description | Category |
|---|---|---|
| AI21 Labs | Enterprise LLM developer building summarization, rewriting, and natural-language comprehension tools. | Frontier Generalist |
| Anthropic | Safety-focused frontier AI lab building Claude and the alignment research that underpins it. | Frontier Generalist |
| DeepMind | Alphabet’s frontier AI research lab building Gemini and the underlying research across language, vision, and embodied AI. | Frontier Generalist |
| Humans& | Building AI systems that interact collaboratively with people. | Frontier Generalist |
| Imbue | Trains foundation models optimized for reasoning to power robust, custom AI agents. | Frontier Generalist |
| Inception | AI research and product company building diffusion-based language models for production. | Frontier Generalist |
| MBZUAI IFM | Creates open foundation models across Abu Dhabi, Paris, and Silicon Valley. | Frontier Generalist |
| NeoCognition | AI agent lab researching specialized intelligence for tech enterprises and SaaS companies. | Frontier Generalist |
| OpenAI | Frontier AI lab building general-purpose models, multimodal systems, and the API platform that underpins much of the ecosystem. | Frontier Generalist |
| Poetiq | Taking the fastest path to superintelligence via practical recursive self-improvement. | Frontier Generalist |
| Poolside | Frontier AI lab building code-specialized foundation models trained with reinforcement learning from code execution feedback, deployable inside customer infrastructure. | Frontier Generalist |
| Reka | Builds multimodal generative AI models for enterprise production across text, images, and tabular data. | Frontier Generalist |
| Safe Superintelligence | Builds foundational model architecture treating safety and capability as a singular technical challenge. | Frontier Generalist |
| Thinking Machines Lab | Mira Murati’s frontier AI lab studying human-AI collaboration and adaptable multimodal machine learning. | Frontier Generalist |
| TR x Imperial Frontier AI Lab | Thomson Reuters and Imperial College London joint lab pursuing frontier AI research in safety and capability. | Frontier Generalist |
| xAI | Frontier AI lab building Grok with native tool use and real-time search integration. | Frontier Generalist |
| Arcee AI | Builds open-weight foundation models that run on edge, on-prem, or in the cloud. | Enterprise & Open-Weight |
| Deep Cogito | Builds hybrid reasoning LLMs aimed at general superintelligence through a novel training strategy. | Enterprise & Open-Weight |
| DeepSeek | Chinese open-weight frontier AI lab whose R1 and V-series models match Western frontier capability at a fraction of the inference cost. | Enterprise & Open-Weight |
| Dolphin AI | Pushing the boundaries of AI model development and distributed inference. | Enterprise & Open-Weight |
| Moonshot AI | Beijing-based lab building open-weight Kimi-series models; reached $20B valuation in May 2026 on the strength of Kimi K2.6. | Enterprise & Open-Weight |
| Nous Research | Builds human-centric, open-source AI models with advanced reasoning and adaptability. | Enterprise & Open-Weight |
| Qwen | Alibaba’s open-source LLM team behind the Qwen series of foundation models, widely used as a base by other Chinese open-weight labs. | Enterprise & Open-Weight |
| Sentient | Decentralized AI platform for community-owned AGI development with on-chain attribution and collaborative governance. | Enterprise & Open-Weight |
| Stepfun | Chinese frontier lab developing unified multimodal models across language, image, video, and speech. | Enterprise & Open-Weight |
| Tencent | Operates the Hunyuan series of open-weight Chinese foundation models alongside its broader consumer and enterprise AI products. | Enterprise & Open-Weight |
| Zyphra | Builds next-generation agent architectures with state-space models and long-term memory for cloud, on-prem, and on-device deployment. | Enterprise & Open-Weight |
| Aleph Alpha | European sovereign AI lab building interpretable, customizable systems for regulated and government workflows. | Sovereign |
| Mistral AI | Builds open-source foundation models and enterprise-grade deployment tools with privacy-first defaults; the leading European frontier lab. | Sovereign |
| Reflection AI | Builds large-scale, open frontier intelligence models intended as a sovereign alternative to closed US labs. | Sovereign |
| Sakana AI | Tokyo-based lab applying nature-inspired and evolutionary methods to model architecture and generative AI research. | Sovereign |
| Sarvam | Building AI accessible to everyone in India, with models tuned for the country’s linguistic diversity. | Sovereign |
| Core Automation | Jerry Tworek’s lab building Ceres, a continual-learning model designed to update weights in production, aimed at replacing the pretrain-then-fine-tune paradigm. | Compute |
| Orbital Industries | AI-first industrial company building hardware from the atoms up, starting with high-density GPU cooling for AI data centers. | Compute |
| Ricursive Intelligence | Frontier AI lab applying self-improving systems to chip design, closing the loop between AI and hardware. | Compute |
| Unconventional AI | Naveen Rao’s lab building analog and neuromorphic compute substrates designed to deliver biology-scale energy efficiency for AI workloads. | Compute |
| CuspAI | Materials-discovery lab where simulated chemistry is the training signal for AI-designed advanced materials. | Science & Automated Research |
| Kyutai | Non-profit AI research lab focused on high-impact projects at the frontier of generative AI. | Science & Automated Research |
| Lila Sciences | Building scientific superintelligence to solve humanity’s greatest challenges. | Science & Automated Research |
| Ndea | Frontier AI lab building program-synthesis systems that unify intuitive pattern recognition and formal reasoning. | Science & Automated Research |
| Periodic Labs | Building an AI scientist. | Science & Automated Research |
| Unreasonable Labs | Builds autonomous knowledge-creation systems that synthesize multidisciplinary research at scale. | Science & Automated Research |
| Axiom | Math and reasoning lab building AI trained on formal proofs, with applications across quantitative finance, software verification, and pure mathematics. | Math & Security |
| Harmonic | Building the world’s most advanced mathematical reasoning engine. | Math & Security |
| Irregular | Frontier AI security lab setting safety standards for capable, sophisticated systems. | Math & Security |
| Math, Inc. | Pursuing verified superintelligence via autoformalization. | Math & Security |
| Cartesia | Real-time, on-device multimodal AI built on state-space model architectures, with voice as the leading modality. | Voice & Communication |
| Deepgram | Enterprise voice AI infrastructure for real-time speech recognition, transcription, and voice agents at scale. | Voice & Communication |
| Hark | Building the most advanced personal intelligence in the world. | Voice & Communication |
| Flapping Airplanes | Foundational AI research lab in stealth, focused on the data efficiency problem. | Data |
| Fundamental | Stealth AI research company building new approaches to data-efficient learning; raised $255M Series A in February 2026. | Data |
| Logical Intelligence | Piloting the world’s first energy-based model for critical systems. | Data |
| Prior Labs | Tabular foundation model lab whose TabPFN set state-of-the-art on structured-data prediction; announced acquisition by SAP in May 2026. | Data |
| Aaru | Generates AI agent populations that simulate human behavior, replacing surveys and focus groups for political polling and corporate strategy. | Prediction |
| Simile | Stanford spinout building behavioral-simulation AI trained on real human interviews, transactions, and behavioral science, used to predict customer and market response. | Prediction |
| Chai Discovery | AI-native molecular structure prediction platform for drug discovery across proteins, small molecules, DNA, and RNA. | Biotech |
| Edison Scientific | AI scientist platform for life-sciences R&D, synthesizing literature and proprietary data to identify novel therapeutic targets. | Biotech |
| Goodfire | Mechanistic interpretability lab building tools to understand and steer the internal computations of frontier models, with early applications in biology and healthcare. | Biotech |
| Grafton Sciences | Building systems of general physical ability to enable superintelligence in biotechnology. | Biotech |
| Isomorphic Labs | Finding solutions to the world’s most devastating diseases through AI-first drug discovery. | Biotech |
| Xaira Therapeutics | AI-first drug discovery lab developing computational methods to expand science’s power to cure disease. | Biotech |
| AMI Labs | Yann LeCun’s lab building AI systems that understand the real world through world models, with persistent memory, reasoning, planning, and control. | Robotics |
| Embo | Ex-Google DeepMind researchers building world models for robotics; raising $100M+ seed led by Andreessen Horowitz. | Robotics |
| General Intuition | Builds foundation models and general agents for environments requiring deep spatial and temporal reasoning. | Robotics |
| Generalist | Building “physical AGI” through hardware-agnostic foundation models for robots; GEN-1 demonstrated 99% reliability on tasks like folding clothes and packaging. | Robotics |
| OneWorld | Foundation model lab building general-purpose robotic intelligence. | Robotics |
| Physical Intelligence | Building foundation models and learning algorithms to power general-purpose robots. | Robotics |
| Skild AI | CMU spinout building Skild Brain, an omni-bodied robotics foundation model that controls any robot for any task; $14B valuation as of January 2026. | Robotics |
| Liquid AI | Builds general-purpose AI systems that run on edge devices and small computers. | Physical & Edge |
| PHI AI | Physical AI lab building on-device intelligence for embodied and edge deployments. | Physical & Edge |
| Physical Superintelligence | AI lab with the mission of discovering new physics through autonomous experimental and theoretical reasoning. | Physical & Edge |
| PrismML | Multiplying intelligence in models without increasing size or complexity, for on-device deployment. | Physical & Edge |
| Standard Intelligence | Aligned AGI lab building a general computer action model. | Physical & Edge |
| UniversalAGI | Builds production-grade autonomous AI agents for enterprises and government organizations working with sensitive data. | Physical & Edge |
| Antim Labs | Builds interactive RL environments where machines learn through play and exploration. | Visual & Simulation |
| Black Forest Labs | Builds production-grade image generation and editing models (FLUX) with multi-reference control and local or cloud deployment. | Visual & Simulation |
| Decart | Builds real-time generative experience models and the training infrastructure to keep them stable at cluster scale. | Visual & Simulation |
| Elorian | Building the foundation of visual reasoning. | Visual & Simulation |
| Moonlake AI | Builds AI that generates world simulations and games. | Visual & Simulation |
| World Labs | Building foundational world models that perceive, generate, reason, and interact with the 3D world. | Visual & Simulation |
| Adaption | Builds adaptive AI modules with malleable datasets and gradient-free continual learning. | Self-Improving Systems |
| Autoscience | Automates and accelerates the end-to-end AI research pipeline using frontier LLMs. | Self-Improving Systems |
| Ineffable Intelligence | AI lab self-discovering the foundations of knowledge; currently in stealth. | Self-Improving Systems |
| Isara Labs | Automating science to ensure the flourishing of humanity. | Self-Improving Systems |
| Mirendil | Frontier lab building systems that excel at AI R&D. | Self-Improving Systems |
| Recursive | London- and SF-based lab building AI systems that automate model architecture, training, evaluation, and research direction without human oversight. | Self-Improving Systems |

Sources
[1] PitchBook post-money valuations, April 30, 2026; neo labs defined as primarily research-focused AI companies.
[2] PitchBook post-money valuations, April 30, 2026.
[4] https://www.axios.com/2026/05/22/reflection-ai-genesis-mission-energy-partnership
[5] https://openai.com/index/openai-launches-the-deployment-company/