AnalysisAi4 min readApr 28, 2026

Why AlphaGo's David Silver Thinks AI Is on the Wrong Path

DeepMind's David Silver reportedly raised $1.1 billion for a new AI company focused on learning without human data. Here is why that matters.

Omer YLD

Founder & Editor-in-Chief

Apr 28, 20264 min · 793 words

Photo · Amos K / Unsplash

Above → Robot arm playing chess against a human opponent, representing reinforcement learning research

Filed from · IstanbulPhoto · Amos K / Unsplash

David Silver, the DeepMind researcher best known for AlphaGo, has reportedly raised $1.1 billion for a new AI company built around a provocative thesis: modern AI is too dependent on human-generated data. TechCrunch and Wired both covered the raise and Silver's argument that today's dominant approach may be the wrong path for the next leap.

That does not mean large language models are useless. It means the next breakthrough may not come from scraping more text, images, code, and video from the internet. It may come from agents that learn by acting, experimenting, failing, and improving in environments where human data is not the ceiling.

The current AI recipe

Most mainstream AI progress since 2020 has followed a familiar pattern: collect huge amounts of human-generated data, train a model to predict and generate patterns from it, then refine behavior with human feedback and tool use. That recipe produced astonishing systems. It also created obvious limits.

Human data is messy, biased, copyrighted, repetitive, and finite. The best text on the internet does not automatically teach a robot to manipulate objects, a scientific agent to run better experiments, or a planning system to discover strategies humans never wrote down.

Silver's critique is that imitation can only take AI so far. At some point, a system has to learn from consequences.

Why AlphaGo is the reference point

AlphaGo mattered because it showed a different kind of learning. The system used human games, but its most important gains came from self-play and reinforcement learning. It improved by playing millions of games against itself, discovering strategies that even elite players found surprising.

That is the dream Silver is reviving at broader scale: systems that can generate their own learning signal. Instead of asking, "What would a human write next?" the system asks, "What action leads to a better outcome?"

The internet taught AI to talk like us. Reinforcement learning is the bet that AI can learn to discover things we never wrote down.

Technerdo

Where this could matter first

The most plausible early wins are domains with clear feedback loops:

Games and simulations. Agents can practice endlessly and measure success.
Robotics. Simulated environments can teach grasping, navigation, and manipulation before real-world transfer.
Scientific discovery. Models can propose experiments, simulate outcomes, and optimize toward measurable targets.
Software agents. Code either passes tests or fails them, giving a feedback signal beyond human-written examples.
Operations and logistics. Scheduling, routing, and resource allocation can be optimized with reward functions.

The challenge is that real life rarely provides clean rewards. A Go game has a winner. A personal assistant booking travel has preferences, risks, exceptions, and human taste.

Why investors care

The AI market is crowded with companies building variations on the same stack: more data, bigger clusters, larger models, enterprise wrappers. A credible team offering a different route attracts attention because it could change the cost curve.

If AI systems can improve through synthetic environments, self-play, and feedback loops, they may need less licensed human data and fewer brute-force scaling gains. That is a big if, but it is exactly the kind of if that gets funded when the current path is expensive.

What could go wrong

Reinforcement learning is powerful but brittle. Reward functions can be gamed. Simulations can fail to transfer to the real world. Agents can discover shortcuts that technically satisfy a metric while violating human intent. Anyone who has watched a game AI exploit physics glitches understands the problem.

There is also a safety question. Systems that learn by acting need boundaries. An AI that improves through experimentation should not be experimenting freely on users, markets, infrastructure, or public networks.

Note

The real benchmark

The question is not whether reinforcement learning can beat humans in controlled games. It already has. The question is whether it can produce reliable general-purpose systems in messy real-world domains.

Bottom line

Silver's new company is a bet against imitation as the final form of AI. It argues that future systems need to learn from the world, not just from our records of the world.

That is a credible bet. It is also a hard one. If it works, the next wave of AI may look less like a chatbot trained on the internet and more like a problem-solver trained through experience.

— ∎ —

Filed underDavid Silver Deepmind Alphago Reinforcement Learning Ai 2026

About the writer

Omer YLD

Founder & Editor-in-Chief

Omer YLD is the founder and editor-in-chief of Technerdo. A software engineer turned tech journalist, he has spent more than a decade building web platforms and dissecting the gadgets, AI tools, and developer workflows that shape modern work. At Technerdo he leads editorial direction, hands-on product testing, and long-form reviews — with a bias toward clear writing, honest verdicts, and tech that earns its place on your desk.

Product Reviews
AI Tools & Developer Workflows
Laptops & Workstations
Smart Home
Web Development
Consumer Tech Analysis

All posts →Website

Was this piece worth your five minutes?

Join the conversation — sign in to leave a comment and engage with other readers.

Loading comments...

Analysis worth reading, delivered every Monday.

One carefully written email a week. Features, deep dives, and the stories buried under press-release noise. No daily clutter.

One email a week · Unsubscribe any time · No affiliate-only promos

AnalysisAi4 min readApr 28, 2026

Why AlphaGo's David Silver Thinks AI Is on the Wrong Path

DeepMind's David Silver reportedly raised $1.1 billion for a new AI company focused on learning without human data. Here is why that matters.

Omer YLD

Founder & Editor-in-Chief

Apr 28, 20264 min · 793 words

Photo · Amos K / Unsplash

Above → Robot arm playing chess against a human opponent, representing reinforcement learning research

Filed from · IstanbulPhoto · Amos K / Unsplash

The current AI recipe

Silver's critique is that imitation can only take AI so far. At some point, a system has to learn from consequences.

Why AlphaGo is the reference point

The internet taught AI to talk like us. Reinforcement learning is the bet that AI can learn to discover things we never wrote down.

Technerdo

Where this could matter first

The most plausible early wins are domains with clear feedback loops:

Games and simulations. Agents can practice endlessly and measure success.
Robotics. Simulated environments can teach grasping, navigation, and manipulation before real-world transfer.
Scientific discovery. Models can propose experiments, simulate outcomes, and optimize toward measurable targets.
Software agents. Code either passes tests or fails them, giving a feedback signal beyond human-written examples.
Operations and logistics. Scheduling, routing, and resource allocation can be optimized with reward functions.

The challenge is that real life rarely provides clean rewards. A Go game has a winner. A personal assistant booking travel has preferences, risks, exceptions, and human taste.

Why investors care

What could go wrong

Note

The real benchmark

Bottom line

Silver's new company is a bet against imitation as the final form of AI. It argues that future systems need to learn from the world, not just from our records of the world.

That is a credible bet. It is also a hard one. If it works, the next wave of AI may look less like a chatbot trained on the internet and more like a problem-solver trained through experience.

— ∎ —

Filed underDavid Silver Deepmind Alphago Reinforcement Learning Ai 2026

About the writer

Omer YLD

Founder & Editor-in-Chief

Product Reviews
AI Tools & Developer Workflows
Laptops & Workstations
Smart Home
Web Development
Consumer Tech Analysis

All posts →Website

Was this piece worth your five minutes?

Join the conversation — sign in to leave a comment and engage with other readers.

Loading comments...

Analysis worth reading, delivered every Monday.

One carefully written email a week. Features, deep dives, and the stories buried under press-release noise. No daily clutter.

One email a week · Unsubscribe any time · No affiliate-only promos

Why AlphaGo's David Silver Thinks AI Is on the Wrong Path

The current AI recipe

Why AlphaGo is the reference point

Where this could matter first

Why investors care

What could go wrong

The real benchmark

Bottom line

Omer YLD

More from Ai

Best AI Coding Agents for Development Teams in 2026

Best AI Music Generators in 2026: We Tested the Top 5

The AI Data Center Energy Crisis: Can We Power the Future?

Analysis worth reading, delivered every Monday.

Why AlphaGo's David Silver Thinks AI Is on the Wrong Path

The current AI recipe

Why AlphaGo is the reference point

Where this could matter first

Why investors care

What could go wrong

The real benchmark

Bottom line

Omer YLD

More from Ai

Best AI Coding Agents for Development Teams in 2026

Best AI Music Generators in 2026: We Tested the Top 5

The AI Data Center Energy Crisis: Can We Power the Future?

Analysis worth reading, delivered every Monday.