AI Science Companies: Evidence AGI Is Near
People routinely ask, “If AI labs believed AGI was imminent, why are they doing X?” Sometimes this skepticism is valid. But consider OpenAI launching Sora 2 as a TikTok-style feed of AI-generated videos[1]. On the surface, it seems like a waste of developer time, and critics argue it makes the world a worse place to live[13]. Yet the underlying video generation requires modeling physics, space, and causality, capabilities essential for AGI.
This dynamic is even clearer in the realm of “AI for Science”, the focus of this article. Several AI science companies have been in the news lately (Periodic Labs[2], Lila[3], Medra[4], etc.), and a common reaction I’ve seen is skepticism: “These are long, hard projects. This isn’t what you’d do if you believed artificial general intelligence (AGI) was imminent.” I believe this view is fundamentally mistaken. These ventures aren’t a hedge against imminent AGI; they represent the most strategic and direct preparation for its arrival.
The key is understanding three points: First, science is so broad that any AI capable of tackling it would essentially be AGI. Second, mastering real-world science is one of the final capabilities needed for AGI. Third, once AGI exists, real-world labs configured for its use will become immensely valuable.
Science AI Is Essentially AGI
To understand why current science AI represents a path to AGI, contrast it with earlier “narrow” science AI like AlphaFold[8]. AlphaFold mastered a single, well-defined task: predicting protein structures from amino acid sequences. It’s unbeatable within its domain but can’t formulate new research questions, design experiments, or work across disciplines. The new generation of science AI companies (Periodic Labs[2], Lila[3], Medra[4]) is building something fundamentally different: systems that autonomously generate hypotheses, design experiments to test them, physically execute those experiments using robotic labs, analyze results, and iterate. These systems work across multiple domains simultaneously: chemistry, materials science, and drug discovery. This isn’t narrow AI that excels at one task; it’s a system that must master the full scientific method across diverse fields.
Consider the capabilities an AI would need to perform real scientific research:
- Reading comprehension to understand literature across multiple domains
- Technical writing to document findings
- Mathematical reasoning for theory and analysis
- Programming skills for simulations and data processing
- Experimental design
- The ability to physically perform experiments in the lab
- Critical thinking to form and test hypotheses
- Visual literacy to interpret and create graphs and charts
- The ability to synthesize information across disciplines
To succeed, a science AI must command essentially the full breadth of human cognitive capabilities. This is an even broader target than “general office work”. Even office work, which seems narrow by comparison, requires AI to master complex skills, as companies like Mechanize[11][12] show: using diverse software tools, managing long-horizon projects, coordinating with others, and adapting to unforeseen obstacles. Science demands all of that, plus domain expertise, physical manipulation, and creative hypothesis generation. An AI that can do science isn’t narrow; it’s AGI with lab equipment.
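To make the hypothesize-experiment-analyze loop described above concrete, here is a minimal sketch of the cycle these companies are automating. Every class and method name below is a hypothetical illustration of the architecture, not any company’s actual API:

```python
# Minimal sketch of an autonomous research loop. Every class and method
# here is a hypothetical stand-in; real systems (Periodic, Lila, Medra)
# have their own interfaces, which are not public.
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    claim: str
    confidence: float  # current belief that the claim holds, 0..1

@dataclass
class Result:
    supports: bool     # did the data support the hypothesis?
    data: dict = field(default_factory=dict)

class ScienceAgent:
    """Stand-in for an LLM-driven researcher."""
    def propose(self, literature: list[str]) -> Hypothesis:
        # In reality: read papers, find gaps, propose a testable claim.
        return Hypothesis(claim="Compound X improves conductivity", confidence=0.3)

    def design_experiment(self, h: Hypothesis) -> dict:
        # In reality: pick reagents, conditions, controls, sample sizes.
        return {"protocol": "synthesize-and-measure", "target": h.claim}

    def analyze(self, h: Hypothesis, result: Result) -> Hypothesis:
        # Update belief from evidence; refine or discard the hypothesis.
        delta = 0.2 if result.supports else -0.2
        return Hypothesis(h.claim, min(max(h.confidence + delta, 0.0), 1.0))

class RobotLab:
    """Stand-in for the physical automation layer."""
    def run(self, protocol: dict) -> Result:
        # In reality: liquid handlers, synthesis robots, analytical equipment.
        return Result(supports=True, data={"conductivity": 1.7})

def research_loop(agent: ScienceAgent, lab: RobotLab, literature: list[str],
                  max_iters: int = 10, threshold: float = 0.9) -> Hypothesis:
    h = agent.propose(literature)
    for _ in range(max_iters):
        if h.confidence >= threshold:
            break  # confident enough to publish, patent, or scale up
        result = lab.run(agent.design_experiment(h))
        h = agent.analyze(h, result)
    return h
```

The point of the sketch is the shape of the loop: every stage except `RobotLab.run` is cognitive work that current models are already approaching, and the lab hardware is what closes the loop.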
Other Benchmarks Are Saturating
The timing of these science AI ventures becomes clearer when you consider what’s happening to traditional AI benchmarks. GPT-5 now scores 91.4% on MMLU (Massive Multitask Language Understanding), surpassing the 89.8% achieved by human experts[5]. Coding benchmarks show similar saturation—AI systems routinely match or exceed human performance on standard programming tasks. As these traditional measures of AI capability approach their ceilings, researchers need new frontiers to push toward AGI.
This saturation extends beyond academic benchmarks to real-world economic activity. OpenAI’s GDPVal benchmark measures AI performance on tasks across 44 occupations that contribute significantly to US GDP[6]. Their September 2025 report revealed that top models like Claude Opus 4.1 perform at or above human expert level on roughly 48% of these tasks (including drafting legal briefs, creating engineering blueprints, and developing financial analyses), at roughly 100x the speed and a fraction of the cost of human experts. With scores roughly doubling between Spring 2024 (GPT-4o) and Summer 2025 (GPT-5), the trend suggests AI will match human performance on the majority of economically valuable tasks by mid-2026.
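As a toy illustration of that extrapolation: the 15-month doubling time below is an assumption read off the GPT-4o (Spring 2024) to GPT-5 (Summer 2025) comparison, and real capability curves are not this clean.

```python
# Toy extrapolation of the GDPVal trend above. The 48% starting point is
# from the September 2025 report; the steady-doubling assumption and the
# 15-month cadence are illustrative, not OpenAI's methodology.
def project(rate_now: float, months_ahead: float,
            doubling_months: float = 15.0) -> float:
    """Win rate after `months_ahead`, assuming steady doubling, capped at 100%."""
    return min(1.0, rate_now * 2 ** (months_ahead / doubling_months))

# Starting from ~48% at-or-above-expert in September 2025:
for months in (0, 9, 15):
    print(f"+{months:2d} months: {project(0.48, months):.0%}")
# + 0 months: 48%   (Sept 2025)
# + 9 months: 73%   (mid-2026: a majority of tasks)
# +15 months: 96%
```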
If AGI is defined (as OpenAI does[16]) as systems that “outperform humans at most economically valuable work,” we are rapidly approaching that threshold in the digital realm. This is where science becomes critical. Unlike saturated benchmarks with known solutions, real scientific research presents genuinely novel challenges: formulating original hypotheses, designing experiments to test them, physically executing those experiments, interpreting ambiguous results, and iterating based on what you learn. When AI labs needed a harder benchmark after MMLU became too easy, they created FrontierMath—where AI systems initially solved only 2% of problems. But performance improved so rapidly that researchers had to add new, harder difficulty tiers to maintain meaningful evaluation[5]. Science represents a similar leap in difficulty, but with real-world impact rather than just benchmark scores. It’s one of the last remaining domains where AI capabilities are still far from human level, making it a natural next frontier for labs pursuing AGI.
Physical World Capabilities Are Critical
While we’ve established that physical manipulation is among the capabilities science AI requires, it’s worth emphasizing why this particular capability has such strategic importance. A purely digital AGI, no matter how intelligent, faces fundamental limitations. It can write code, analyze data, and generate insights, but transforming the world requires physical intervention. Curing diseases needs wet lab experiments. New materials require physical synthesis and testing. Clean energy technologies must be built and deployed. The most impactful applications of AGI (medicine, manufacturing, materials science, energy) all require interfacing with physical reality. Humans can be bribed, charmed, or threatened into acting in the physical world on an AI’s behalf, but that introduces a point of failure outside the AI’s control.
This is why the robotics developments we’re seeing, like Gemini Robotics 1.5[7], matter so much. Vision-language-action models enable AI to perceive the physical world, manipulate objects with precision, and adapt plans in real time. The science labs being built by Periodic[2], Lila[3], and Medra[4] are creating exactly this infrastructure: robotic systems that can handle reagents, run experiments, operate analytical equipment, and iterate based on results. When AGI arrives, a purely digital version will be impressive but limited. AGI with access to these autonomous labs will be transformative, capable of running thousands of parallel experiments, iterating 24/7, and directly discovering new drugs, materials, and technologies that reshape the world.
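In control terms, what a vision-language-action stack adds is a perceive-plan-act loop that can replan mid-task. Here is a minimal sketch of that pattern; all interfaces are hypothetical stand-ins, not the Gemini Robotics API:

```python
# Skeleton of a perceive-plan-act loop with reactive replanning, the
# control pattern vision-language-action models enable. All interfaces
# here are hypothetical stand-ins.
class Camera:
    def capture(self) -> dict:
        return {"frame": None}  # stand-in for an RGB-D observation

class VLAPolicy:
    """Stand-in for a vision-language-action model."""
    def plan(self, goal: str, observation: dict) -> list[str]:
        # In reality: the model decomposes the goal into motor-level steps.
        return ["pick up vial", "move to balance", "dispense 5 mL"]

    def act(self, step: str, observation: dict) -> bool:
        # Execute one step; return False if the world diverged from the
        # plan (a grasp slipped, the vial moved, a door is closed...).
        return True

def run_task(policy: VLAPolicy, camera: Camera, goal: str,
             max_replans: int = 3) -> bool:
    for _ in range(max_replans):
        plan = policy.plan(goal, camera.capture())  # plan from current state
        for step in plan:
            if not policy.act(step, camera.capture()):
                break          # drop the stale plan and replan from scratch
        else:
            return True        # every step succeeded
    return False

run_task(VLAPolicy(), Camera(), "prepare a 5 mL sample")
```

The design choice worth noticing is the inner `break`: instead of executing a fixed script open-loop, the system re-observes before every step and throws away the plan the moment reality diverges, which is what “adapt plans in real time” means operationally.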
Science AI Accelerates AGI (Recursive Improvement)
Here’s where the strategic picture becomes even clearer: an AI capable of doing scientific research can do AI research. This leads directly to recursive self-improvement: an AGI building better AGI. While AI already assists with AI research today, this would represent a qualitative shift to autonomous systems driving the entire research loop. Leopold Aschenbrenner’s “Situational Awareness: The Decade Ahead”[9] lays out this dynamic in detail: once you have AGI-level systems, you can deploy hundreds, thousands, or millions of them to automate AI research. What currently takes human researchers years (reading papers, designing experiments, implementing improvements) gets compressed into months or weeks. Aschenbrenner argues this could compress a decade of algorithmic progress into less than a year, leading to superintelligence by the end of the decade. An AI system that can read papers, formulate hypotheses, design experiments, and iterate on results can apply those same capabilities to improving AI systems themselves.
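Aschenbrenner’s compression argument reduces to simple arithmetic. A toy version, where every parameter is an illustrative assumption rather than his exact model:

```python
# Toy arithmetic behind the "decade compressed into a year" claim.
# Every number below is an illustrative assumption.
HUMAN_FIELD_SIZE = 30_000   # assumed head count of human AI researchers today

def compression_factor(n_agents: int, speed_multiplier: float,
                       parallel_efficiency: float) -> float:
    """Field-years of algorithmic progress per calendar year, where
    today's human field produces 1.0 by definition."""
    effective_researchers = n_agents * speed_multiplier * parallel_efficiency
    return effective_researchers / HUMAN_FIELD_SIZE

# A million AGI instances at 10x human speed, with steep diminishing
# returns on parallel research (3% marginal efficiency per instance):
print(compression_factor(1_000_000, 10.0, 0.03))  # -> 10.0
# At 10x the field's rate, a decade of progress takes about one year.
```

Even with parallelism discounted by more than an order of magnitude, the sheer number of deployable instances dominates, which is why the argument survives pessimistic assumptions about duplicated or wasted research effort.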
This represents an evolution in how AI safety thinkers model takeoff dynamics. Early models, often associated with Eliezer Yudkowsky, focused on a single AI system rapidly self-improving in isolation, potentially within hours or days. The modern picture, exemplified by Aschenbrenner, envisions massive parallelization of AGI systems working on AI research, with algorithmic progress compressed from decades to a year or less. Either way, science capability is the direct path to AGI, which immediately enables the sprint to superintelligence. Far from being a distraction from AGI, building science AI is building the direct path to it. These companies aren’t hedging against AGI; they’re constructing the exact capability that enables the final sprint.
Why This Isn’t Hedging
As one observer put it: “Today’s models are so strong that it now makes sense to raise capital to build automated labs so that robots could test these models’ findings at scale”[15]. This captures the key insight—science AI companies aren’t hedging against AGI; they’re making the most AGI-bullish move possible. Let’s break down why:
Science breadth drives takeoff: As established, science requires essentially the full breadth of general intelligence. Every advance in science AI capability directly narrows the gap to AGI. Unlike companies building narrow products or optimizing for specific enterprise use cases, science AI companies are pushing directly on the hardest remaining bottleneck to AGI.
Physical world integration is the last mile: Lab-grade robotic manipulation represents one of the critical remaining challenges. These companies are solving calibration, quality assurance, reliability under continuous operation, and integration of vision-language-action models with real-world equipment. When AGI emerges, the teams that have already solved these problems will be positioned to deploy it for maximum impact immediately.
Infrastructure readiness accelerates deployment: When AGI arrives, having teams experienced in scientific automation, datasets of experimental results, validated robotic systems, and operational lab facilities dramatically reduces time-to-deployment. Rather than AGI emerging into a world that must then build this infrastructure from scratch, these companies are pre-positioning all the pieces needed for immediate transformative impact.
Counterarguments & Rebuttals
“This is too slow—science takes years, AGI needs faster iteration.”
This objection has real merit: biology and chemistry have hard physical constraints. Chemical reactions have minimum timeframes. Cell cultures need days to grow. Drug candidates require years of clinical trials. Materials need time to crystallize or cure. Even perfect automation can’t eliminate these fundamental physical limits.
However, the skeptic underestimates how much current cycle times reflect human limitations rather than physical ones. What takes a human researcher weeks—literature review, experimental design, execution, analysis—can shrink to days or hours with AI assistance. Early results from automated labs show 10x faster iteration in specific domains[17], with fully autonomous experimental loops completing overnight rather than over months. While some physical processes have irreducible timeframes, the compounding effect of removing human bottlenecks is still transformative. The “slow” nature of science is largely a function of human limitations that AI lacks.
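The arithmetic of removing the human bottleneck is worth spelling out. In the sketch below, all durations are assumptions chosen for the example, not figures from the cited study:

```python
# Illustrative cycle-time arithmetic: how much of a weeks-long research
# cycle is irreducible physics vs. human-gated work. All durations are
# assumptions for the example.
HOURS_PER_YEAR = 24 * 365

def cycles_per_year(cognitive_hours: float, physical_hours: float,
                    human_gated: bool) -> float:
    """One cycle = design + analysis (cognitive) plus the experiment
    itself (physical). Human-gated labs also run only ~40 of the
    week's 168 hours."""
    cycle_length = cognitive_hours + physical_hours
    duty_cycle = 40 / 168 if human_gated else 1.0
    return HOURS_PER_YEAR * duty_cycle / cycle_length

# Human lab: two working weeks of design/analysis plus a 12-hour experiment.
human = cycles_per_year(cognitive_hours=80, physical_hours=12, human_gated=True)
# Autonomous lab: minutes of design/analysis, the same 12-hour
# experiment, running around the clock.
auto = cycles_per_year(cognitive_hours=0.5, physical_hours=12, human_gated=False)
print(f"{human:.0f} vs {auto:.0f} cycles per year")  # -> 23 vs 701
```

The 12 irreducible experiment hours cap the autonomous case, yet the gap is still roughly 30x, and it comes entirely from removing human-gated time, which is exactly the rebuttal’s point.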
“This is just hype—where are the actual results?”
This skepticism is earned. AI for drug discovery has a long history of overpromising and underdelivering, with many well-funded companies producing more press releases than approved drugs. Even successful projects like AlphaFold, while genuinely impressive, remain narrow tools rather than autonomous researchers. The newest companies claim “hundreds” or “thousands” of discoveries, but haven’t published validated breakthroughs at scale.
Yet dismissing the entire field as hype ignores tangible achievements. AlphaFold alone predicted over 200 million protein structures and earned the 2024 Nobel Prize in Chemistry for its creators[8]. It’s used by over 2 million researchers globally. That was one narrow project. The broader AI science field is already producing peer-reviewed results and industry partnerships, and these newer companies are working with paying customers and making novel discoveries in their labs. Lila reports thousands of discoveries across life sciences, chemistry, and materials[3]. These aren’t hypothetical futures—they’re producing tangible scientific outputs now, at a pace that would be newsworthy if it weren’t overshadowed by the breathless pace of AI progress elsewhere.
“Digital AGI is enough—why do we need labs?”
Humans already bridge the digital-to-physical gap effectively across all domains. A digital AGI could hire contractors for construction, commission manufacturers for products, direct technicians for maintenance, and yes, engage researchers for experiments. This human-mediated approach is flexible, scales with existing infrastructure, and avoids massive upfront capital costs. Why invest billions in robotic automation when you can just pay humans to execute?
The answer lies in speed, scale, and reliability. For transformative applications (whether curing diseases, building advanced materials, manufacturing breakthroughs, or deploying clean energy), waiting for human intermediaries reintroduces exactly the bottleneck AGI was meant to eliminate. Science labs represent a particularly clear case: these companies are building infrastructure where AGI designs experiments and robotic systems immediately execute them, iterating thousands of times faster than human-mediated workflows. But the principle extends beyond research to any domain where rapid physical iteration drives progress. Robotics turns AGI from an advisor into an autonomous actor.
Valuable Regardless of Timeline
Ultimately, these companies will be valuable regardless of the timeline. If AGI is ten years away, these companies will have time to become major players in their industries by curing diseases or patenting new chemicals. If AGI is one year away, they will have spent that year building the perfect launchpad. They will have the team, the proprietary data, and the physical infrastructure ready to work with it.
It’s also worth noting that even “long timelines” in modern AI discourse mean 5-10 years, not 20-30. The Overton window has shifted dramatically. What skeptics call a “conservative” timeline would still see these companies hitting AGI-readiness within their normal venture capital timeframe. Even the supposedly “slow” scenario isn’t that slow anymore.
That’s why no one should mistake science companies for a bearish signal on AGI. They’re the most bullish move possible: building the exact infrastructure and capabilities that will matter most when AGI arrives. When that moment comes, the world will need more than pure intelligence; it will need that intelligence connected to labs that can transform ideas into reality. These companies are building that connection now[10].
Sources
1. OpenAI launches Sora 2 - AI video generator with physically accurate world modeling (Tech Startups, Oct 2025)
2. Periodic Labs raises $300M seed round - Former OpenAI and DeepMind researchers building robotic labs where AI generates hypotheses and runs experiments (TechCrunch, Sept 2025)
3. Flagship Pioneering Unveils Lila Sciences - Building “AI Science Factories” with autonomous labs; claims thousands of discoveries in life sciences, chemistry, and materials science (March 2025)
4. Medra Labs Continuous Science Platform - Another AI science platform announcement (Sept 2025)
5. AI Index 2025: Technical Performance - GPT-5 reaches 91.4% on MMLU vs. 89.8% for human experts; harder benchmarks like FrontierMath (initial 2% AI solve rate) now needed (Stanford HAI, 2025)
6. GDPVal: Measuring Real-World Economic Impact of AI - OpenAI’s benchmark measuring AI performance on economically valuable tasks (Sept 2025)
7. Gemini Robotics 1.5 brings AI agents into the physical world - Vision-language-action models enabling complex physical manipulation and reactive replanning (Google DeepMind, 2025)
8. AlphaFold wins 2024 Nobel Prize in Chemistry - First Nobel Prize for an AI-enabled scientific breakthrough; predicted 200+ million protein structures, used by 2+ million researchers (Nature, Oct 2024)
9. Situational Awareness: The Decade Ahead - Leopold Aschenbrenner’s analysis of AGI timelines, recursive improvement, and the path to superintelligence (June 2024)
10. Inspired by this tweet from Aaron Defazio
11. Mechanize Work on X - AI agents for workplace tasks
12. You Better Mechanize - Zvi Mowshowitz on Mechanize’s approach to workplace automation
13. Sora and The Big Bright Screen Slop - Zvi Mowshowitz on how Sora’s AI-generated video social network could harm user experience and societal well-being
14. Potato.ai - AI co-scientist platform helping researchers design and execute experiments
15. Tweet from prinz - Arguing that building automated labs for AI science is evidence FOR short AGI timelines, not against them
16. OpenAI Charter - OpenAI’s mission statement defining AGI as “highly autonomous systems that outperform humans at most economically valuable work”
17. AI-powered lab discovers new materials 10x faster - NC State self-driving lab collects 10x more data than previous techniques through continuous real-time monitoring (Nature Chemical Engineering, July 2025)