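AlphaGo at Ten: A Turning Point for AI | Thore Graepel and Pushmeet Kohli | Google DeepMind: The Podcast, Chinese-English bilingual transcript

Episode summary

Seoul, March 2016. Two players lean over a 19×19 board covered in black and white stones, playing the ancient game of Go, a game so complex it was long thought impossible for a machine to master. On one side is Lee Sedol, the legendary 18-time Go world champion; on the other is AlphaGo, a neural-network-based AI system built on a powerful technique called reinforcement learning. In the blink of an eye, the world changed. Exactly ten years on, we look back at the match that ignited the modern AI revolution. From the discovery of algorithms to solving scientific grand challenges such as protein folding, the foundations were laid on that wooden board. Join Hannah Fry, Pushmeet Kohli (VP of Science) and Thore Graepel (AlphaGo team member and Distinguished Research Scientist) as they unpack AlphaGo's legacy. 🎥 AlphaGo https://youtu.be/WXuK6gekU1Y 🎥 The Thinking Game: https://youtu.be/d95J8yzvjbQ

Bilingual transcript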


Speaker 0

欢迎回到谷歌DeepMind播客，我是汉娜·弗莱教授。

Welcome back to Google DeepMind: The Podcast. I'm Professor Hannah Fry.

Speaker 0

想象一下这个场景。

Picture the scene.

Speaker 0

那是2016年3月。

It's March 2016.

Speaker 0

在韩国首尔的一家酒店套房内，两名选手正在对弈古老的围棋游戏。

Inside a hotel suite in Seoul, South Korea, two players are playing the ancient game of Go.

Speaker 0

这是一种难以想象的复杂游戏，长期以来被认为机器不可能掌握。

A game of unimaginable complexity, long thought impossible for a machine to master.

Speaker 0

一方是李世石，一位传奇的18次围棋世界冠军。

On one side is Lee Sedol, a legendary 18-time Go world champion.

Speaker 0

另一方是AlphaGo，一个基于强化学习这一强大技术的神经网络人工智能系统。

On the other, AlphaGo, a neural network based AI system built on a powerful technique called reinforcement learning.

Speaker 1

欢迎来到在韩国首尔现场直播的DeepMind挑战赛。

Welcome to the DeepMind Challenge live in Seoul, Korea.

Speaker 1

这步棋非常出人意料。

That's a very surprising move.

Speaker 2

没有一个人类玩家会选择第37手。

Not a single human player would have chosen move 37.

Speaker 0

经过连续七天、长达数小时的激烈对弈

After hours of intense gameplay spread over seven days

Speaker 1

是的。

Yeah.

Speaker 1

这是一步令人兴奋的棋。

That's an exciting move.

Speaker 0

李世石放下两颗棋子，表示认输。

Lee Sedol placed two stones on the board to signal his final resignation.

Speaker 0

击掌庆祝。

High five.

Speaker 0

转瞬之间，世界改变了。

And in the blink of an eye, the world changed.

Speaker 1

最终比分是四比一。

The final result of four one.

Speaker 1

祝贺AlphaGo和整个团队。

Congratulations to AlphaGo and to the entire team.

Speaker 0

那正是十年前，自那以后，人工智能领域发生了难以想象的变化。

That was exactly one decade ago, and the field of AI has changed unimaginably since then.

Speaker 0

我们见证了大型语言模型的兴起、AI代理的日益复杂化，以及像蛋白质折叠这样的科学重大挑战的解决。

We have seen the rise of large language models, the growing sophistication of AI agents, and the solving of scientific grand challenges like protein folding.

Speaker 0

但在许多方面，现代人工智能革命可以说正是从韩国那张木制棋盘上开始的。

But in many ways, the modern AI revolution arguably began right there, on that wooden board in South Korea.

Speaker 0

因此，在这一集中，我们希望回顾并展望，当初这个教机器下棋的大胆实验，如何成为今日AI突破的基石。

So in this episode, we wanted to look backwards and forwards to how a bold experiment in teaching machines to play games became the foundation stone for the AI breakthroughs of today.

Speaker 0

和我一起的，正是讲述这个故事的完美嘉宾。

And with me are the perfect guests to tell that story.

Speaker 0

托雷·格拉佩尔是谷歌DeepMind的杰出研究科学家，当时就在首尔，作为AlphaGo项目的关键架构师。

Thore Graepel is a distinguished research scientist at Google DeepMind who was right there in Seoul as a key architect of the AlphaGo project.

Speaker 0

还有普什米特·科利，他领导着谷歌DeepMind的科学工作，能告诉我们那些在围棋中首创的早期技术如何应对当今的关键问题。

And Pushmeet Kohli, who leads Google DeepMind's science work and is the person to tell us how those early techniques pioneered in Go can tackle crucial problems today.

Speaker 0

欢迎两位来到本播客。

Welcome to the podcast, both of you.

Speaker 0

托雷，我知道你本人也是一位出色的围棋选手。

Thore, I know you're an accomplished Go player yourself.

Speaker 0

请给我们解释一下，为什么围棋被视为对人工智能的一个良好挑战。

Just explain to us why Go was seen as a good challenge for AI.

Speaker 3

是的。

Yes.

Speaker 3

围棋之所以被视为人工智能的完美挑战，是因为它的规则非常简单，却能产生极其复杂的对弈，包含战术、策略和复杂的模式。

The game of Go seemed like the perfect challenge for AI because the game has such simple rules, yet it leads to such complex gameplay with tactics and strategies and complex patterns.

Speaker 3

一旦国际象棋被攻克——或者说，深蓝已经击败了世界冠军——围棋就成了一个未解的难题。

And once the game of chess had been solved as it were, or at least, you know, Deep Blue had won against the world champion, then Go was this open challenge.

Speaker 3

它的复杂程度远超国际象棋好几个数量级，当时没人预料到它能被迅速攻克。

It's much more complex than chess by many orders of magnitude, and nobody was expecting it to be solved anytime soon.

Speaker 3

然而，对于计算机科学家来说，它看起来如此优雅简洁，因此在当时是理想的攻关目标。

Yet it looks so elegant and simple to computer scientists, and so it was the perfect game to tackle at the time.

Speaker 0

我的意思是，没人认为它会很快被解决，这一点可谓一针见血。

I mean, the idea of nobody thinking it would be solved anytime soon, that sort of hits the nail on the head.

Speaker 0

对吧，普什米特？

Right, Pushmeet?

Speaker 0

我知道你当时在微软工作。

I mean, I know you were working at Microsoft at the time.

Speaker 0

但当时这个问题被认为有多复杂呢？

But just how complex was this problem considered to be?

Speaker 4

我认为它被认为极其复杂。

I think it was considered extremely complex.

Speaker 4

这不仅是因为搜索空间的广度，也就是你能走的步数，还因为深度——你需要推理多久，以及对局有多长。

And that is because not only because of the breadth of the search space, of the number of moves you can make, but also the depth, how long you have to reason and how long the games are.

Speaker 4

在国际象棋中，你可能只需要考虑大约60到70步的推理。

In a game of chess, you might be reasoning over about 60 to 70 moves.

Speaker 4

在围棋中，对局要长得多。

In a game of Go, it's much, much longer.

Speaker 4

这就带来了这个问题的挑战。

And that leads to the challenge of the problem.

Speaker 0

托雷，我知道你刚到DeepMind时是个围棋选手，你第一天就和AlphaGo对弈过吗？

Thore, I know when you first started at DeepMind, being a Go player, didn't you play against AlphaGo on your first day?

Speaker 3

是的。

Yeah.

Speaker 3

对。

Yeah.

Speaker 3

没错。

Exactly.

Speaker 3

所以想象一下，我第一天到DeepMind上班，认识了几个同事，包括大卫·西尔弗，他问我：托雷，你是围棋选手，对吧？

So imagine, I come in on my first day at work at DeepMind, I know a couple of people including David Silver, and he asks me, Thore, you're a Go player, right?

Speaker 3

你能不能帮个忙，测试一下我们当时那个还叫不上AlphaGo名字的初级版本？

Couldn't you do us a favor and test our baby version of something that wasn't even called AlphaGo at the time, of course, you know.

Speaker 3

这是一个实习项目，他们刚刚从互联网上收集了大约几千盘棋谱，训练了一个系统，也许是几十万盘棋。

It was an internship project, and they had just taken a few thousand games from the internet and trained a system, or a few hundred thousand games maybe.

Speaker 3

我有幸成为最早与它对弈的人之一。

And I had the opportunity to be one of the first people to play against it.

Speaker 3

但你可以想象，我既兴奋又紧张。

But you can imagine, I was excited but I was also nervous.

Speaker 3

那是我上班的第一天，却被带到了一个中央位置的桌子旁。

It was my first day at work, and there I was being dragged to a centrally located table.

Speaker 3

对面坐着的我想是黄士杰，他后来被称为AlphaGo的手，面无表情。

On the other side, I think it was Aja Huang, who would later be known as the hand of AlphaGo, with his poker face.

Speaker 3

我有机会与这个AlphaGo的初级版本对弈。

And I got to play against this baby version of AlphaGo.

Speaker 0

想必当时有很多人围观吧？

With people watching presumably?

Speaker 3

周围有很多人围观。

With a lot of people watching all around me.

Speaker 3

你知道，后来德米斯出现了，当然大卫一直都在。

You know, later Demis showed up, and of course David was there the whole time.

Speaker 3

是的。

Yeah.

Speaker 3

那么人们会怎么做呢？

And so what does one do?

Speaker 3

下得保守一点，对吧？

Play conservatively, right?

Speaker 3

所以我只是想，别犯错就行，这应该不会太难吧。

So I just thought, just don't make a mistake, surely this can't be so hard.

Speaker 3

但当然，这正是那个版本程序擅长的地方。

But of course that was exactly what that version of the program was good at.

Speaker 3

它是在人类职业棋手对局上训练的，所以它完全知道如何应对常规下法。

It was trained on human professional games, so it knew exactly what to do against conventional play.

Speaker 3

随着这场小规模的对局进行，我的局势越来越糟，最终以微弱劣势输掉了比赛。

And so as this little test match proceeded, my position became worse and worse and I ended up losing by a small margin.

Speaker 3

但我成了第一个正式输给AlphaGo的人。

But I took the crown of the first person who officially lost to AlphaGo.

Speaker 3

这真是一次独特的经历。

It was quite the experience.

Speaker 3

当然，之后每个人都认识我了。

And of course afterwards everyone knew me.

Speaker 3

这是一种很棒的自我介绍方式。

It was a wonderful way of introducing myself.

Speaker 0

一种令人谦卑的方式。

A humbling way.

Speaker 3

一种令人谦卑的方式。

A humbling way.

Speaker 3

没错。

Exactly.

Speaker 0

当然。

Absolutely.

Speaker 0

普什米特，提醒我们一下，好吧。

Pushmeet, just remind us I mean, okay.

Speaker 0

我知道，从那个还是实习阶段的早期开始，这个算法已经取得了巨大的进展。

So I know that the algorithm advanced quite substantially from that early point where it was an internship.

Speaker 0

但总的来说，能否为我们解释一下它是如何运作的，特别是关于破解组合空间这个概念？

But just broadly, explain to us how it works, and this idea about cracking these kinds of combinatorial spaces in particular.

Speaker 4

是的。

Yeah.

Speaker 4

如果你看一下围棋，任何时候你能下的步数是有限的。

So I think if you look at the game of Go, the number of moves that you can make, at any given time, there are a finite number of moves.

Speaker 4

但如果你观察并思考整个棋局状态，它的复杂度是指数级的。

But if you look and reason about the the overall game state, it's exponential.

Speaker 4

而你需要考虑的状态数量呈指数增长，这正是让围棋变得极其复杂的原因。

And that exponential growth in the number of states that you have to reason about is what makes the game extremely complicated.
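To make the exponential growth concrete, a rough back-of-the-envelope sketch can count move sequences as branching factor raised to the game length. The numbers below are illustrative assumptions (about 30 options per move over 70 moves for chess, about 250 options over 200 moves for Go), not figures measured by the speakers, and the function name is hypothetical.

```python
import math


def log10_tree_size(branching_factor: int, depth: int) -> float:
    """Return log10(branching_factor ** depth), a crude count of move sequences.

    Working in logarithms avoids printing numbers hundreds of digits long.
    """
    return depth * math.log10(branching_factor)


# Illustrative assumptions only: (options per move, moves per game).
chess = log10_tree_size(30, 70)
go = log10_tree_size(250, 200)

print(f"chess: roughly 10^{chess:.0f} move sequences")
print(f"go:    roughly 10^{go:.0f} move sequences")
print(f"gap:   roughly {go - chess:.0f} orders of magnitude")
```

Under these assumptions the two estimates differ by hundreds of orders of magnitude, which is why enumerating every continuation is not an option in either game, and far less so in Go.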

Speaker 0

那么他们是怎么破解这个问题的呢？

So how did they crack it then?

Speaker 0

再给我们提醒一下，他们发现的解决方案是什么。

Just remind us of the solution that they discovered.

Speaker 4

AlphaGo的精妙之处在于它结合了快速思考和慢速思考这两种元素。

The beauty of AlphaGo was there is this element of thinking fast and thinking slow.

Speaker 4

在某种意义上，AlphaGo完美地融合了快速思考与慢速思考的过程，以应对这个极其庞大的搜索空间。

And AlphaGo in some sense was the perfect combination of those thinking fast and thinking slow processes coming together to take on this extremely large search space.

Speaker 3

所以这与人类下棋的方式非常吻合，我认为。

So it matches quite well to how humans play the game, I think.

Speaker 3

嗯。

Mhmm.

Speaker 3

你可以想象，人类下国际象棋或围棋时，我们也有能力快速判断一个局面对黑方或白方是否有利。

You know, if you imagine how a human would play a game of chess or a game of Go, we also have the capacity to look at a position and pretty quickly appreciate if that's good for black or good for white.

Speaker 3

而且我们也能从一个局面中直接看出一些有潜力的走法。

And we can also look at a position and already see moves that seem promising.

Speaker 3

我们从不会考虑所有可能的走法，国际象棋中可能有二三十种，围棋中则有两三百种。

We never look at all the possible moves, which would be maybe 20 or 30 in chess or 200 or 300 in Go.

Speaker 3

我们会立刻被某些走法吸引，它们甚至带有美感，在直觉的引导下显得恰到好处。

We are immediately drawn to certain, maybe even aesthetically pleasing, moves that seem like just the right ones, guided by our intuition.

Speaker 3

而这种能力通过规划得到补充，在规划中我们会明确地推演各种可能性。

And that element is complemented by planning where we explicitly reason through the possibilities.

Speaker 3

如果我走这一步，对手可能会走那一步，然后我必须用这一步来应对。

If I make this move, my opponent might make that move and then I have to counter with this move.

Speaker 3

这两种不同的思维方式共同构成了人类下棋的方式，也同样体现在AlphaGo的下法中。

And these two different ways of thinking come together in how humans play these games, and they also come together in how AlphaGo plays.

Speaker 0

直觉与计算，可以这么说。

The intuition and the calculation, as it were.

Speaker 3

没错。

Exactly.

Speaker 0

那么，这就是灵感的来源吗？

So was that the inspiration then?

Speaker 0

你有没有思考过自己下棋的方式，以及其他围棋选手的下法，并从神经科学中直接汲取了灵感？

Did you sort of think about how you were playing the game, how other Go players were playing the game, and draw that direct inspiration from neuroscience effectively as it were?

Speaker 3

是的，我认为这确实是其中一个方向，因为很多团队成员本身就是棋手，能够进行自我反思，观察我们是如何应对这盘棋的。

Yeah, I think that is definitely one direction because a lot of team members were actually game players who were able to introspect and see how we tackled the game.

Speaker 3

当然，这还结合了深度学习——自2012年以来，深度学习作为一种方向迅速发展，并首次为我们提供了学习这些近似函数的工具，例如价值函数，它接收一个棋盘并判断当前局面对黑方或白方有多有利；或者策略网络，它接收一个棋盘并根据职业棋手可能采取这些走法的概率对可用的着法进行排序。

And then of course that comes together with deep learning, which at the time, you know, since 2012, had grown as a direction and now for the first time gave us the tools to learn these approximate functions: for example, the value function, which takes a board and tells us how good it is for either black or white, or the policy network, which takes a board and effectively ranks the available moves according to how likely it would be that a professional player would take them.

Speaker 3

因此，当时深度学习正处在成熟阶段，能够应对这个问题，并为我们实现快速思考提供了机会。

And so deep learning was just ripe at the time to tackle this problem and gave us the opportunity to implement the fast thinking.

Speaker 3

慢思考与深蓝的情况并无不同，你知道的，那就是对博弈树的搜索，这早已为人所知，我们现在或许称之为传统人工智能。

The slow thinking is not unlike what happened in Deep Blue, you know, it's the search of the game tree that was already known and that we might now call good old fashioned AI.
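The division of labour described here, a policy that proposes a few promising moves, a value function that judges positions at a glance, and a tree search that reasons through replies, can be sketched on a toy game. Everything below is a hypothetical illustration: the game is a simple take-away game (players alternately remove 1 to 3 stones, and whoever takes the last stone wins), `toy_policy` and `toy_value` are hand-written stand-ins for trained networks, and the search is a plain depth-limited negamax. The published AlphaGo system used Monte Carlo tree search rather than this simplified search.

```python
from typing import Dict, List

State = int  # number of stones left on the table


def legal_moves(state: State) -> List[int]:
    """A player may take 1, 2 or 3 stones, but never more than remain."""
    return [m for m in (1, 2, 3) if m <= state]


def toy_policy(state: State) -> Dict[int, float]:
    """Stand-in for a policy network: a probability for each legal move."""
    scores = {m: (3.0 if (state - m) % 4 == 0 else 1.0) for m in legal_moves(state)}
    total = sum(scores.values())
    return {m: s / total for m, s in scores.items()}


def toy_value(state: State) -> float:
    """Stand-in for a value network: rough score for the player to move."""
    return -0.5 if state % 4 == 0 else 0.5


def candidates(state: State, top_k: int) -> List[int]:
    """Fast thinking: keep only the moves the policy ranks highest."""
    prior = toy_policy(state)
    return sorted(prior, key=prior.get, reverse=True)[:top_k]


def search(state: State, depth: int, top_k: int = 2) -> float:
    """Slow thinking: negamax value of `state` for the player to move."""
    if state == 0:
        return -1.0  # the opponent just took the last stone and won
    if depth == 0:
        return toy_value(state)  # stop searching and trust the evaluation
    return max(-search(state - m, depth - 1, top_k) for m in candidates(state, top_k))


def best_move(state: State, depth: int = 4, top_k: int = 2) -> int:
    """Pick the candidate whose resulting position is worst for the opponent."""
    return max(candidates(state, top_k), key=lambda m: -search(state - m, depth - 1, top_k))


print(best_move(10))  # number of stones to take from a pile of 10
```

The policy narrows the search to `top_k` moves per position and the value function replaces searching to the end of the game, so the tree stays small even though the full game tree grows exponentially.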

Speaker 0

明白了。

Okay.

Speaker 0

我的意思是，你很早就输给了这个东西。

Well, I mean, you lost to this thing quite early.

Speaker 0

但当它经过团队中许多人的测试后，我知道你们曾请职业围棋手樊麾来办公室进行测试。

But once it had gone through a lot of the people on the team, let's say, I know that you tested it with a professional Go player because you had Fan Hui come into the office.

Speaker 0

是的。

Yeah.

Speaker 0

没错。

Exactly.

Speaker 0

那时你们有多大的把握认为它会赢过他？

How confident were you at that point that it was gonna beat him?

Speaker 3

是的。

Yeah.

Speaker 3

我们的信心程度各有不同，这非常有趣。

We had different levels of confidence, which was really interesting.

Speaker 3

我们非常幸运地找到了他，他是当时欧洲围棋冠军。

So we had been really lucky to find him, you know, he was the European Go champion at the time.

Speaker 3

他住在波尔多，然后过来了。

He lived in Bordeaux and came over.

Speaker 3

我们诱使他来和我们下这盘棋，安排是他会与当时版本的AlphaGo进行10场测试对局。

We lured him into playing this game with us, and the setup was that he would play 10 test games against the version of AlphaGo at that point.

Speaker 3

我个人认为，AlphaGo不可能已经强大到能击败欧洲冠军这样的职业棋手。

And I personally thought that AlphaGo cannot possibly be at the point already that it beats the European champion, a professional player.

Speaker 3

于是我和大卫·西尔弗打了个赌。

And so I had a bet with David Silver.

Speaker 3

大卫·西尔弗很有信心。

David Silver was confident.

Speaker 3

他说，我觉得AlphaGo会以10比0完胜。

He said, I think AlphaGo is going to nail it ten zero.

Speaker 3

我说，不可能。

And I said, no.

Speaker 3

我觉得AlphaGo至少会输一局。

I think AlphaGo will lose at least one game.

Speaker 3

赌注是，输的人必须打扮成一位古代日本围棋大师，到办公室里穿着这身装扮待上一整天。

And the bet was that whoever lost would have to show up at the office dressed as an ancient Japanese Go master and be in the office for one day with that.

Speaker 3

那么，是谁真的那样打扮出现了？

Well, who showed up like that?

Speaker 3

是我，因为是十比零。

It was me because 10 nil.

Speaker 3

事实上确实是十比零。

It was in fact 10 nil.

Speaker 3

是的。

Yeah.

Speaker 3

但这给了我们信心，也让他们相信，我们将来能够应对更强大的对手。

But it did give us confidence and gave them confidence that we would be able to tackle even harder opponents in the near future.

Speaker 0

当然，你们后来在2016年登上了飞往首尔的航班，去与李世石对弈。

Which you did, of course: in 2016 you got on a plane to Seoul, Korea, to play against Lee Sedol.

Speaker 0

我的意思是，跟我们说说他到底有多厉害。

I mean, just tell us give us a sense of how phenomenal a player he actually is.

Speaker 3

是的。

Yeah.

Speaker 3

当时，李世石确实是顶尖棋手之一，甚至可能是最强的，拥有辉煌的赛事夺冠纪录。

So Lee Sedol was really one of the best players, or maybe the best, at the time, with an incredible track record of winning tournaments.

Speaker 3

当时，人们将他与罗杰·费德勒相提并论，称赞他的成功与智慧才华。

He was compared to Roger Federer at the time for his success and intellectual brilliance.

Speaker 3

因此，他接受我们的挑战，愿意与我们对弈，这对我们来说是极大的荣耀。

And so for us, it was a tremendous honor that he accepted our challenge to play against him.

Speaker 3

这同样是一个巨大的挑战，因为我们必须确定一个具体日期，对吧？

And it was a tremendous challenge because we had to set a date, right?

Speaker 3

你不能只是说：‘我们准备好了再告诉你。’

You can't just say, you know, we'll tell you when we're ready.

Speaker 3

日期定下来后，我们必须朝着这个目标努力，真正把AlphaGo提升到足够强大的水平。

A date was set, and we had to work towards that date to actually make AlphaGo strong enough.

Speaker 3

而让这场对弈更添紧张与激动的是，李世石坚信自己会赢。

And what added tension and excitement to it was that Lee Sedol was convinced that he would win.

Speaker 3

当时他认为AlphaGo获胜的可能性极低。

He thought it highly unlikely at the time that AlphaGo would win.

Speaker 3

当然，他的评估是基于他所看到的与樊麾对弈的棋谱，他认为自己更强。

And of course, he was basing his assessment on the game records that he had seen against Fan Hui, and he assessed that he was better.

Speaker 3

但他不太了解的是，AlphaGo通过我们不断进行的训练和算法优化一直在持续进步。

But of course, what he wasn't so aware of is that AlphaGo was constantly improving through the training and the algorithmic refinements and so on that we made.

Speaker 3

于是整个团队前往韩国，你无法想象那里的人们有多兴奋。

And so the entire team basically went to South Korea, and you wouldn't believe the excitement of people there.

Speaker 3

你知道，在英国，围棋只是一项小众活动。

You know, the truth is, in England, Go is a bit of a niche activity.

Speaker 3

对吧？

Right?

Speaker 3

很少有人会下围棋，甚至根本不知道这项运动。

Very few people would be able to play it or even know about it.

Speaker 3

但在韩国，人们都非常兴奋。

But in South Korea, people were so excited.

Speaker 3

顶尖的围棋选手是名人，我们到那里时，有大批摄影师拍照，还有一支纪录片摄制组跟随我们，想象一下，一群典型的电脑极客突然成为全球瞩目的焦点，这场对决真是一场奇妙的经历。

The best Go players are celebrities, and, you know, we came there and there were hordes of photographers that took pictures, we had a documentary film crew with us and so imagine typical computer geeks as it were, suddenly in the limelight of the world for this match, that was quite the adventure.

Speaker 0

是的。

Yeah.

Speaker 0

我的意思是，你们对AlphaGo的表现感到紧张吗？

I mean, were you nervous about the performance of AlphaGo?

Speaker 3

是的。

Yes.

Speaker 3

我们确实很紧张。

We were definitely nervous.

Speaker 3

当然，我们有一个非常复杂的评估流程。

So of course we had a very sophisticated evaluation pipeline.

Speaker 3

你可以与能够接触到的棋手进行测试对局，比如樊麾，这非常有帮助。

You can test against players that you have access to, like Fan Hui; that was super helpful.

Speaker 3

你也可以与程序的先前版本对弈，并计算我们所谓的系统Elo等级分：它基本上是根据与其他版本（可能是程序的早期版本）对弈的所有结果，计算出新版本的等级分，而且这些数值可以校准得相当好。

You can also test against previous versions of the program, and you can calculate what we call the Elo score of the system, which basically takes the outcomes of all the games that you play against other versions, maybe earlier versions of your program, and calculates what the rating of the new version is, and you can calibrate these things quite well.
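The rating idea mentioned here can be sketched with the standard Elo formulas: an expected score derived from the rating gap, and an update proportional to how far the actual result departed from it. This is a minimal sketch, not the team's evaluation pipeline; the K-factor of 32, the starting ratings, and the function names are assumptions chosen for illustration.

```python
from typing import Iterable, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (1 = win, 0.5 = draw, 0 = loss)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating: float, opponent: float, score: float, k: float = 32.0) -> float:
    """Move the rating toward the observed result by a step of size k."""
    return rating + k * (score - expected_score(rating, opponent))


def rate_new_version(start: float, results: Iterable[Tuple[float, float]], k: float = 32.0) -> float:
    """Rate a new version from games against opponents with known ratings.

    `results` holds (opponent_rating, score) pairs, one per game.
    """
    rating = start
    for opponent, score in results:
        rating = update_rating(rating, opponent, score, k)
    return rating


# Hypothetical example: a new version beats two older versions and loses to one.
games = [(1500.0, 1.0), (1600.0, 1.0), (1700.0, 0.0)]
print(round(rate_new_version(1500.0, games), 1))
```

Because every version is placed on the same scale, a new version can be compared with all earlier ones. A human opponent who has never played any of them has no position on that scale, which is the uncertainty described in the next turn.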

Speaker 3

但当然，我们并不知道李世石在这个评分体系中处于什么位置。

But of course we didn't know where on that scale Lee Sedol would be.

Speaker 3

当然，我们也希望留一些余地，你知道的。

And of course we wanted a cushion as well, you know.

Speaker 3

如果能好得多，有更大的把握，那就更好了，毕竟这是世界舞台，对吧？

It would be nice to be quite a bit better, to have some certainty, because this is the world stage, right?

Speaker 3

如果你输了，这对声誉会是不小的打击。

If you lose this, that's a bit of a hit to the reputation.

Speaker 3

所以是的，我们很紧张，一直工作到最后一刻，我们还需要确保系统非常稳定，你知道的，你不想在最后一刻做改动来让它好一点点，却冒让它变得不稳定的风险。

And so yeah, we were nervous. We worked up to the last minute, and we also needed to make sure that the system was really stable. You know, you don't want to make last-minute changes to make it that little bit better but risk that it now becomes unstable.

Speaker 3

但最终我们对它相当满意，于是我们走进了那个如今颇为著名的酒店楼层，所有事情都发生在那里，媒体也都在那里等候，然后我们开始了比赛。

But in the end we were quite happy with it and so we entered that now kind of famous hotel floor where all the action happened, where all the press was waiting and so on, and embarked on the match.

Speaker 0

世界各地的人都在观看，包括普什米特，也包括你。

And people were watching from around the world, including Pushmeet, including you.

Speaker 0

是的。

Yeah.

Speaker 0

那么，当时你在哪里？

So, I mean, where were you at this point?

Speaker 0

你当时也在观看。

You were watching on.

Speaker 4

是的。

Yeah.

Speaker 4

我在西雅图。

I was in Seattle.

Speaker 4

我的意思是，我在第一局进行到一半时才真正投入进去。

I mean, I really started getting into it in the middle of the first game.

Speaker 4

很明显，AlphaGo已经达到了这一特定里程碑。

It became so clear that AlphaGo had reached that specific milestone.

Speaker 4

你甚至能从媒体、评论员和李世石本人的反应中看出来。

And you could even see the reaction from the press and the commentators and Lee Sedol himself.

Speaker 0

有趣的是，你提到是在那局的中段，但在那局的早期阶段，谁能占上风有明显迹象吗？

Well, it's interesting that you said in the middle of that game because in the early stages of that game, was it clear who had the upper hand?

Speaker 4

我认为，对于只是旁观的人来说，早期大家都觉得李世石肯定会赢。

I think, as a person who was just watching it, I felt that in the early stages everyone felt quite confident that Lee Sedol would win.

Speaker 4

事实上，直到对局逐渐接近终局，人们才意识到，清点实地之后，AlphaGo占据了优势。

In fact, it was only as the game progressed and came closer to the final outcome that they realized that, as you count the territory, AlphaGo had an advantage.

Speaker 4

事实上，这让所有人都感到惊讶。

In fact, it came as a surprise to people.

Speaker 4

你怎么看？

What did you think?

Speaker 3

是的。

Yeah.

Speaker 3

我当时在现场和一位美国职业围棋选手有过一次有趣的互动，他坐在我旁边一起观赛。当时棋盘一角出现了一连串变化，他靠近我说：‘我总是告诉我的学生不要下AlphaGo刚刚下的那种愚蠢的棋，这么说吧，这简直没救了。’

So I had this interesting interaction on-site with a professional Go player, an American professional Go player, who was sitting next to me while we were watching, and there was some sequence unfolding in a corner, and he kind of approached me and said, you know, I always tell my students not to play that stupid move that AlphaGo just played, so I mean, it's pretty hopeless.

Speaker 3

我当时想，我可不是那么专业的专家。

And I was like, I'm not as much of an expert.

Speaker 3

我的反应就是：咱们就等着看吧。

Let's just wait and see, was my reaction.

Speaker 3

那场第一局结束后，这位先生走过来对我说：这是我这辈子见过最非凡的事情。

And then after that first game, this gentleman came to me and said, this is the most phenomenal thing I've ever experienced.

Speaker 3

我非常感激能在这里见证一台机器能够达到如此高水平的围棋对弈，我们从中将学到非常多的东西，而他已经欣然接受了这一点。

I'm so grateful that I'm allowed to be here to witness that a machine can play Go at this level, and there's gonna be so much we can learn from it, and he was already embracing this.

Speaker 3

我的意思是，想象一下这些人将一生都奉献给了这项游戏的研究，他们往往从孩童时期就开始训练，直到如今只为精通这个游戏。

I mean, you have to imagine these people dedicate their lives to the study of this game, and they've often trained from being young children to their current age just to master this game.

Speaker 3

因此，当一台机器能够与人类围棋选手匹敌甚至超越时，他们当然会感到震惊。

And so, of course, it comes as a shock to them that a machine might match or even exceed a human Go player.

Speaker 0

因为如果第一局是AlphaGo获胜，那么在第二局中，AlphaGo下出了一步真正让所有人震惊的棋。

Because if that was the first game when AlphaGo won, in the second game, AlphaGo did something that, I mean, really surprised everybody.

Speaker 1

这是一步非常令人惊讶的棋。

That's a very surprising move.

Speaker 2

专业解说员几乎一致认为，没有任何一位人类玩家会选择第37手。

Professional commentators almost unanimously said that not a single human player would have chosen Move 37.

Speaker 2

AlphaGo表示，人类玩家下出第37手的概率仅为万分之一。

AlphaGo said there was a one in 10,000 chance that Move 37 would have been played by a human player.

Speaker 0

请给我们解释一下，现在著名的第37手究竟发生了什么。

Just explain to us what happened with the now famous Move 37.

Speaker 3

是的。

Yeah.

Speaker 3

所以这是一个非凡的场景，我当时坐在国际英语解说室里，我们的美国解说员迈克尔·雷德蒙在墙上放了一个巨大的演示棋盘，他会把所有棋子摆上去，向观众展示正在下的棋局，并分析不同的变化。于是，他拿起对应第37手的棋子放在棋盘上，然后退后一步说：‘不，这肯定错了’，并把棋子拿了下来。

So this was a remarkable scene. I was sitting in the international English-speaking commentating room, and Michael Redmond, our American commentator, had this big demo board on the wall, and he would put all the stones up there on the board to show people what was being played and comment on different variations. And so he put the stone corresponding to move 37 on the board, and then he stepped back and said, no, this must be wrong, and he took it back.

Speaker 3

但他又看了看屏幕，说：‘不，不，这确实是AlphaGo下的棋’，然后把棋子放了回去。

And then he looked at the screen again and said, no, no, that is actually what AlphaGo played, and he put it back.

Speaker 3

他感到困惑。

He was puzzled.

Speaker 3

你可以明显看出，对人类棋手来说，这一步完全违背直觉。

You could see it: that was such a counterintuitive move for a human player.

Speaker 3

这是一步五线的肩冲，而人类棋手通常会避免这种下法。

It was a shoulder hit on the fifth line, and this is typically something that human Go players avoid.

Speaker 3

在围棋中，常常是在边线上进行推挤，一方沿着棋盘边缘构筑实地，另一方则向中央发展势力。

So often in Go, there is some kind of pushing going on along the edges, and one of the players builds territory along the edge of the board, and the other side builds influence towards the center of the board.

Speaker 3

如果这种情况发生在第三和第四线，通常被认为是大致均衡的。

And if that happens on the third and fourth line, this is considered to be roughly equitable.

Speaker 3

你知道，双方都能从中获得一些东西。

You know, both sides get something out of it.

Speaker 3

但AlphaGo实际上暗示的是，即使你在第五线下子，让对方获得更多的地盘，这仍然是有利可图的。

But what AlphaGo was effectively suggesting is that it's still profitable if you do it on the fifth line and you give that much more territory to the other party.

Speaker 3

这正是让人们感到惊讶的地方——居然会存在这样的情况，这种下法竟然是正确的。

And that's what was so surprising to people: that there would be situations in which that would be correct.

Speaker 3

因此，这不仅是一步非常特殊的棋，而且在某种程度上，它代表了一种新的方式，来权衡即时地盘与向中央扩展势力之间的关系。

And so not only was it a very special move, but in a way it represented a new way of weighing these two factors, immediate territory versus influence towards the center of the board, against each other.

Speaker 0

这超出了人类棋手通常会做的范围，普什米特。

Something that went beyond what a human Go player would normally do, Pushmeet.

Speaker 4

是的，绝对如此。

Yeah, absolutely.

Speaker 4

我的意思是，正是在这样的时刻，你看到了人工智能系统的真正潜力——它在拓展人类的知识边界。在这个特定案例中，人们长期以来一直将围棋视为一个需要深入研究的领域。

I mean, there are moments like this where you see the true potential of an AI system expanding human knowledge, where people have regarded, in this particular case, the game of Go as a thing to be studied for many, many years.

Speaker 4

就在这一刻，这种知识得到了扩展。

And there comes this particular point where that knowledge is expanded.

Speaker 4

人们最初持怀疑态度，这盘棋中的情况也是如此。

And people are at first skeptical, which was the case in the game as well.

Speaker 4

当这步棋下出时，很长一段时间内都被认为是幻觉或错误，直到其意义逐渐明朗。

When the move was played, it was considered a hallucination or a mistake, right, for quite a bit of time before its implications became clear.

Speaker 0

后来在比赛中。

Later on in the game.

Speaker 0

没错。

Exactly.

Speaker 0

因为它被证明是第二次获胜的关键。

Because it proved to be pivotal to the second win.

Speaker 4

是的。

Yeah.

Speaker 4

这不仅是在那场比赛中的一个时刻，我认为，它也是整个AI历史中的一个时刻——这个特定的瞬间向我们表明，未来总会有一些时候，这些系统会产生我们甚至无法判断是否正确或是否为惊人突破的洞见，但它们却会深刻影响我们以全新视角看待整个研究领域的方式。

It was not just a moment in that game, but it was also a moment, I think, in the whole sort of history of AI where that particular moment showed us that there will be times when these systems will produce insights which we might not even be able to discern whether they are the right things or amazing breakthroughs, but yet they will have a lot of influence in how we look at whole areas of study in a completely new light.

Speaker 0

我也想谈谈第78手。

Well, I also want to talk about move 78.

Speaker 0

这是李世石下的一手棋，让AlphaGo感到困惑，最终导致它认输。

This is a move that was played by Lee Sedol that confused AlphaGo, causing it to resign the game.

Speaker 1

李世石在这里想做什么？

What is Lee Sedol up to here?

Speaker 1

他在这一步上已经花了七到八分钟了。

He just burned like seven or eight minutes just on this move already.

Speaker 1

看这一手棋。

Oh, look at that move.

Speaker 1

这是一步令人兴奋的棋。

That's an exciting move.

Speaker 1

哦。

Oh.

Speaker 1

说实话，我不太确定AlphaGo在这里想做什么。

You know, I'm not actually sure what AlphaGo is trying to do here.

Speaker 0

到这个时候，AlphaGo已经连续赢了三局，而李世石下了一步让系统困惑的棋。

So by this point, AlphaGo has won three games in a row, and now Lee Sedol does a move that confuses the system.

Speaker 0

这么说公平吗？

Is that fair to say?

Speaker 3

是的，这么说完全公平。

Yeah, that's absolutely fair to say.

Speaker 3

第78手是李世石下出的一步不同寻常的“挖”。

So move 78 was an unusual wedge move that Lee Sedol played.

Speaker 3

当时棋盘中央发生了一场非常有趣的较量，李世石找到了这步棋，这步棋也让人们感到惊讶，就像第37手一样。

There had been a very interesting battle, as it were, at the center of the board, and Lee Sedol found this move, and it was also surprising to people, similar to Move 37.

Speaker 3

从那时起，我们观察到AlphaGo对局势的把握不再清晰了。

And from then on we observed that AlphaGo didn't have a good grasp of the position anymore.

Speaker 3

我们发现它下的棋在我们看来很不合理——当然，第37手也不太合理，但这些新棋步甚至连我们这样的业余爱好者都觉得奇怪，可见它确实被这步棋打乱了。

We saw that the moves that it made didn't really make sense to us in a bad way, you know, 37 also didn't make sense to us maybe, but these moves even to amateurs like us seemed strange, and so it had been confused by the move.

Speaker 3

为了让你理解为什么这仍然如此重要，你可能会说，这是一场五局的比赛，AlphaGo已经赢了前三局。

And just to zoom out to give you a sense of why this still mattered so much, so you might say, okay, it's a match of five games, and AlphaGo has won the first three.

Speaker 3

还有什么需要证明的呢？

What more is there to prove?

Speaker 3

但那时我们在想，如果李世石赢下最后两局，你会得出什么结论？

But then we were thinking, well, if Lee Sedol were now to win the last two, what would you conclude?

Speaker 3

他已参透了。

He's got it figured out.

Speaker 0

他找到了系统的脆弱之处。

He's found the fragility.

Speaker 3

没错。

Exactly.

Speaker 3

因此，这将是人类的胜利。

So it would have been the human triumph.

Speaker 3

这就是为什么那场比赛和最后一场对我们来说依然非常激动人心。

And so that's why that game and the last one were still very exciting to us.

Speaker 3

但我们也并非完全感到失望。

But it wasn't entirely the case that we were disappointed.

Speaker 3

我们当然感到失望，但同时也对李世石充满敬佩：作为人类，他竟然能找到这样一步棋。

We were certainly disappointed, but also we had so much admiration for Lee Sedol to, you know, as a human, to be able to find this move.

Speaker 3

你必须想象一下，这位毕生致力于围棋的大师，在这场必须面对机器完美表现、自己却苦苦寻找应对之法的战斗中，内心该有多艰难，对吧？

You just have to imagine this master who has dedicated his life to playing this game in this battle that must have been so hard on him, right, to see this machine play so perfectly and him struggling to find a way.

Speaker 3

而在第四局中，他终于找到了突破口。

And then in game four, he finds a way.

Speaker 3

正如他在新闻发布会上后来所说，他感到无比高兴和自豪，因为这可能是他最后一次，代表人类找到战胜机器的方法。

And as he put it in the press conference later, I think, he said that he was so happy and proud that he was able, maybe for the last time, on behalf of humanity, to find a way to overcome the machine.

Speaker 0

因为有些人称它为‘神之一手’，不是吗？

Because some people called it the divine move, didn't they?

Speaker 3

是的。

Yeah.

Speaker 3

对。

Yeah.

Speaker 3

我认为，考虑到当时紧张的氛围，以及他那一刻超越自我的表现，最终下出这步棋，这个称呼确实非常贴切。

And I think, given the tension at that point in time and him really outgrowing himself at that moment and finding that move, it's a good name for it.

Speaker 0

最终比分是AlphaGo以四比一获胜。

Well, the final score was four one to AlphaGo in total.

Speaker 0

围棋界对此有何反应？

What was the reaction from the Go community?

Speaker 3

是的。

Yeah.

Speaker 3

围棋界密切关注了这场比赛，当然，结果极具戏剧性，对许多人来说也出乎意料。

So the Go community followed the match very closely, and, of course, the outcome was dramatic, and for many people, unexpected.

Speaker 3

因此，人们的反应各不相同。

And so people showed very different reactions, you know.

Speaker 3

有些人对结果感到无比惊讶和震惊，有些人简直不敢相信。

Some people were absolutely amazed and surprised about the outcome; some people couldn't believe it.

Speaker 3

另一些人则认为，一个时代已经终结，因为最强的围棋选手可能不再是人类，而是一台机器。

Others of course also thought that some era had come to an end because now maybe the strongest Go player was no longer a human but a machine.

Speaker 3

但总体而言，我们感到惊人的是，人们对围棋的兴趣反而增加了。

But overall what we found amazing is that there was an uptick in interest in the game of Go.

Speaker 3

我认为现在下围棋的人比以前更多了，围棋界也真正接受了从AlphaGo身上学到的东西。

I think more people play Go now than did before, and the Go community really embraced the learning from AlphaGo.

Speaker 3

现在有许多程序的工作方式与AlphaGo基本相同，人们用它们来进行教学。

So there are now many programs that work essentially the same way that AlphaGo does, and people use it for teaching purposes.

Speaker 3

他们通过这些程序分析自己的对局。

They analyze their games through it.

Speaker 3

总的来说，我认为这为整个围棋界带来了积极的推动。

And overall, I think it has provided a lift to the whole Go community.

Speaker 0

我想问问你，AI界对这场比赛的反应如何？

Let me ask you about the reaction from the AI world to this match.

Speaker 0

当时有什么热议？

What was the buzz?

Speaker 0

当时的讨论是怎样的？

What was the conversation like?

Speaker 4

李世石对战AlphaGo的比赛是一个关键转折点，许多人在机器学习领域一直将这些模型和技术视为数学和应用项目，但从此开始看到了证据，表明这些系统能够自我学习并超越人类的知识。

The AlphaGo vs. Lee Sedol match was a key pivot point where a lot of people, especially in the machine learning community, who had been working on these models and techniques as a mathematical and applied project, started to see evidence that these systems can self-learn and go beyond human knowledge.

Speaker 4

这是一个非常重要的观点，因为在机器学习中，你是用收集到的训练数据进行训练的。

And that is a very important sort of point because in machine learning, you train with training data which has been collected.

Speaker 4

你自然的预期是模型会与这些数据的分布保持一致。

And your natural sort of expectation is that the model is going to just be consistent with that distribution.

Speaker 4

但能够超越这种分布，并让这种洞见被世界所利用，我认为这是整个经历中得出的惊人洞察。

And to show that you can go beyond that distribution and that insight then can be utilized by the world, I think, is an amazing sort of insight that comes out of this whole experience.

Speaker 4

这真正揭示了人工智能的潜力——不仅在围棋领域，还在对世界的理解、化学、生物学、数学和计算机科学中。

And it really points to what is possible with artificial intelligence in not just the game of Go, but in the understanding of world, in chemistry, in biology, in mathematics, in computer science.

Speaker 4

这些系统将能够发现并向我们揭示哪些类似第37手的惊人类比？

What are these amazing analogs of Move 37 that these systems will be able to discover and reveal to us?

Speaker 0

你提到的超越人类智能这一点，真的非常引人入胜。

I think that point that you made there about going beyond human intelligence is just so fascinating.

Speaker 0

但让我觉得AlphaGo故事中最令人着迷的一点，即使在4:1获胜之后，是你后来开发了AlphaZero——你移除了所有人类数据，所有它曾用于训练的围棋对局，结果发现，一旦剔除人类智能，这个系统反而表现得更好，这让我感到震惊。

But one of the things that I find most intriguing about the AlphaGo story, even after the victory of four one, is that you then built AlphaZero where you took away all of the human data, all of the games of Go that it had been trained on, and discovered that once you take out the human intelligence, the thing actually improved, which is astonishing to me.

Speaker 3

是的。

Yeah.

Speaker 3

从科学的角度来看，有人会认为这甚至比最初的AlphaGo更进一步。

From a scientific perspective, one could argue that that is an even bigger step than the original AlphaGo.

Speaker 3

因为正如你所说，AlphaZero系统无法访问任何人类对局记录，不了解人类如何下棋，也没有利用关于围棋的先验知识，而仅仅依赖于游戏规则以及我们之前讨论过的策略网络和价值网络的表示与学习方法。

Because as you were saying, the AlphaZero system doesn't have access to any human game records, how humans play, didn't have access to prior knowledge about the game, how the game is played, but really only had access to the rules of the game and means of representing and learning these functions that we talked about, the policy net and the value net.

Speaker 3

因此，它最初完全是随机落子，因为它对什么是好棋或坏棋没有任何概念，但它通过对局积累经验，逐渐学会哪些走法更可能带来胜利，哪些更可能导致失败，哪些局面有潜力，哪些局面没有希望，最终，它下出的棋步变得越来越好。

So basically it starts playing entirely randomly at the beginning because it has no notion of what good or bad moves are, but it gathers experience from playing these games and it learns what are moves that are more likely to lead to a win, what are moves that are more likely to lead to a loss, what are positions that look promising, what are positions that are not promising, and eventually, it starts playing better and better moves.
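The learning loop described here (start from random play, record which moves tend to precede wins, then prefer those moves) can be sketched on a toy game. The following is a minimal illustration, not DeepMind's method: it assumes a simple take-away game (a Nim variant in which players remove one to three stones and whoever takes the last stone wins), and a plain win-rate table stands in for the policy and value networks. There is no tree search, and all names (`train`, `best_move`, `legal_moves`) are hypothetical.

```python
import random
from collections import defaultdict


def legal_moves(pile):
    """A player may take 1, 2, or 3 stones, but never more than remain."""
    return [take for take in (1, 2, 3) if take <= pile]


def train(episodes=20_000, start_pile=10, epsilon=0.2, seed=0):
    """Learn move values purely from self-play outcomes (no human games)."""
    rng = random.Random(seed)
    wins = defaultdict(float)   # (pile, take) -> games won after this move
    visits = defaultdict(int)   # (pile, take) -> times this move was tried

    def value(pile, take):
        n = visits[(pile, take)]
        return wins[(pile, take)] / n if n else 0.5  # unseen moves look neutral

    for _ in range(episodes):
        pile, player, history = start_pile, 0, []
        while pile > 0:
            moves = legal_moves(pile)
            if rng.random() < epsilon:
                take = rng.choice(moves)                         # explore
            else:
                take = max(moves, key=lambda t: value(pile, t))  # exploit
            history.append((player, pile, take))
            pile -= take
            player = 1 - player
        winner = 1 - player  # the player who took the last stone
        # Credit every move of the game with its final outcome.
        for who, p, t in history:
            visits[(p, t)] += 1
            wins[(p, t)] += 1.0 if who == winner else 0.0
    return value


def best_move(value, pile):
    """Greedy move under the learned win-rate estimates."""
    return max(legal_moves(pile), key=lambda t: value(pile, t))


value = train()
print({pile: best_move(value, pile) for pile in range(1, 11)})
```

Both sides share one table, so every improvement is immediately used against the learner itself. In this toy game the known best strategy is to leave the opponent a multiple of four stones, and the learned table tends to drift toward it, though this sketch makes no guarantee for every pile size.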

Speaker 3

当然，它不再受限于人类的知识。

And now of course it's not limited by human knowledge.

Speaker 3

它所发现的东西令人惊叹。

And what it discovered was amazing.

Speaker 3

首先，它重新发现了人类的下法。

So first of all it rediscovered ways of how humans play.

Speaker 3

这让人感到非常安心，你知道的。

That was totally reassuring, you know.

Speaker 3

在围棋的角落里有一些我们称之为定式的模式，在国际象棋中也有一些固定的开局走法。

There are certain patterns in the corner in Go that we call joseki, or in chess there are certain opening moves.

Speaker 3

这个系统现在更具通用性。

The system was now more general.

Speaker 3

它可以下国际象棋、围棋和将棋，如果我们这样训练它，它还能玩其他许多棋类游戏。

It could play chess, Go, and shogi and could have played any number of other board games if we trained it that way.

Speaker 3

所以一开始，它重新发现了人类的知识，让我们惊叹不已。

And so at first, it rediscovers human knowledge, and we think, wow.

Speaker 3

这太酷了。

This is so cool.

Speaker 3

它找到了相同的开局方式等等。

It finds the same openings and so on.

Speaker 3

然后我们观察一些这些开局，发现它不再使用它们了。

And then we look at some of these openings and it stops playing them.

Speaker 3

我们想，这是怎么回事？

We think, what's going on?

Speaker 3

它找到了一种反驳。

It has found a refutation.

Speaker 3

所以它重新发现了人类的知识，然后又抛弃了它，因为它已经超越了这些知识，找到了更好的下法。

So it rediscovered human knowledge, and then it discards it because it has now gone beyond it and has found there are actually better ways of playing.

Speaker 3

我不会再继续以这种人类的方式下棋了。

I'm not going to continue playing in this human way.

Speaker 0

一些人类尚未有效发现的东西。

Stuff that humans haven't found yet effectively.

Speaker 3

没错。

Exactly.

Speaker 3

当AlphaZero下围棋时，它的下法在我看来非常陌生，这根本不是我从围棋老师那里学到的那种围棋，那种结构化的方式或许是为了让人类更容易理解。

For AlphaZero when it played Go, the way it played Go looked alien to me. So this wasn't the kind of Go that I had learned from my Go teacher, you know, which is structured maybe in a way that enables humans to understand it.

Speaker 3

这些落子在当时看起来非常自由，完全说不通。

These moves looked very free and didn't make much sense at the time.

Speaker 3

但三十步之后，一切就都清晰了。

But 30 moves later, everything would fall into place.

Speaker 3

你会恍然大悟：哦，原来如此，真厉害，这下说得通了，仿佛它具有远见——而它确实有。

You would see, oh, yeah, oh, wow, that makes sense now, and so on, as if it had the foresight, in a way, which it did.

Speaker 3

对吧？

Right?

Speaker 3

所以，从无到达到这种水平的棋艺，这一发现非常令人印象深刻。

So that discovery from nothing to that level of play was very impressive.

Speaker 0

好的。

Okay.

Speaker 0

所以，有一件事我想给你看，是你们在首尔时发生的事。

So there's something I want to show you, something that happened actually when you guys were in Seoul.

Speaker 0

因为正如你之前提到的，你们当时正在为这部关于AlphaGo的纪录片拍摄。

Because as you mentioned before, you were being filmed for this documentary, for AlphaGo.

Speaker 0

有一些镜头没有被用在电影里，但当时摄像机还在拍摄，而麦克风仍然开着。

And there's some footage that didn't make it into the film, but it was captured by the cameras as they were sort of packing up, but the microphones were still running.

Speaker 0

我不知道你有没有听过这段小片段。

I don't know if you've heard this little clip.

Speaker 0

我来放给你听。

Let me play it for you.

Speaker 0

等一下。

Hold on.

Speaker 0

这是德米斯和大卫在进行一次私下对话。

This is Demis and David having a sort of private conversation.

Speaker 2

看到一个被认为不可能解决的问题如此迅速地发生变化，真是太惊人了，是的。

It's just amazing seeing how quickly a problem that is seen as being impossible Yeah.

Speaker 2

可以转变为实际上是。

Can change to being And, technically,

Speaker 1

我们能解决蛋白质折叠。

we can solve protein folding.

Speaker 1

这简直太重要了。

That's like, I mean, it's just huge.

Speaker 1

我相信我们能做到。

I'm sure we can do that.

Speaker 1

我之前就以为我们能做到。

I thought we could do that before.

Speaker 4

是的。

Yeah.

Speaker 1

但现在我们肯定可以做到。

But now we definitely can do it.

Speaker 1

干得好。

Good job.

Speaker 1

谢谢。

Thank you.

Speaker 1

好的。

Okay.

Speaker 1

太棒了。

Beautiful.

Speaker 4

这难道不很棒吗？

Isn't that great?

Speaker 0

是的。

Yeah.

Speaker 0

索雷，你认为这捕捉到了当时的情绪吗？

Thore, do you think that captured the mood at the time?

Speaker 3

是的。

Yeah.

Speaker 3

那正是阿尔法狗当时打开的那扇门。

That was the kind of door that AlphaGo opened at the time.

Speaker 3

对吧？

Right?

Speaker 3

如果我们能做到这一点，那我们还能做什么？

If we can do this, then what else could we do?

Speaker 3

因为这是一款拥有10的170次方种不同局面的游戏。

Because this is a game with 10 to the power of 170 different positions.

Speaker 3

这极其复杂。

This is super complex.

Speaker 3

如果我们有系统的方法来应对这种组合搜索空间，那么我们似乎也能处理其他大规模的组合搜索空间。

And if we have principled ways of navigating that kind of combinatorial search space, then it seems plausible that we would also be able to handle other large combinatorial search spaces.
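The size of that search space can be sanity-checked with a little arithmetic. The sketch below computes a crude upper bound, under the assumption that every one of the 361 intersections is independently empty, black, or white; many of those configurations are illegal, which is why the figure quoted for legal positions (about 10 to the power of 170) comes out somewhat smaller than this bound.

```python
import math

BOARD_POINTS = 19 * 19  # 361 intersections on a standard board

# Each point is empty, black, or white, so 3**361 bounds the number of
# board configurations from above (legal positions are a subset).
upper_bound = 3 ** BOARD_POINTS
digits = len(str(upper_bound))  # number of decimal digits in the bound

print(f"3^{BOARD_POINTS} has {digits} digits")
print(f"log10(3^{BOARD_POINTS}) = {BOARD_POINTS * math.log10(3):.2f}")
```

Python's arbitrary-precision integers make the exact bound easy to compute; the point is only that exhaustive enumeration is out of the question at this scale, so any practical method has to decide where to look.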

Speaker 3

当时，蛋白质折叠是其中一个热门方向。

And at the time, one of the favorites was protein folding.

Speaker 3

当然。

Absolutely.

Speaker 0

而这也正是你加入DeepMind团队的时刻，普什米特，因为谈到AlphaFold，你绝对是这个故事中不可或缺的一部分。

And this is really the point where you come on board with the DeepMind team, Pushmeet, because when it came to AlphaFold, I mean, you're an integral part of that story.

Speaker 0

AlphaGo是否直接影响了你们后来的研究，还是说，那场胜利带来的信心让德米斯说了那些话？

Did AlphaGo directly influence what you guys went on to do, or was it sort of like the confidence of a victory that made Demis say things like that?

Speaker 4

没有。

No.

Speaker 4

我认为德米斯从很早开始就对AI的发展目标有着非常清晰的认识。

I think Demis, from very early on, has had a very strong notion of what AI is being developed for.

Speaker 4

他真正将AI视为一种帮助我们更好地理解世界的工具。

He really sees AI as a tool that will help us understand the world better.

Speaker 4

事实上，当AlphaGo比赛进行时，我正在微软从事AI在编程领域的研究。

In fact, at the time when the AlphaGo matches were happening, I was at Microsoft working on AI for programming.

Speaker 4

现在AI用于编程无处不在，但那时很少有人研究程序合成和AI编程。

Now AI for coding is everywhere, but at that time, not many people were working on program synthesis and AI for coding.

Speaker 4

德米斯希望我加入DeepMind。

And Demis wanted me to join DeepMind.

Speaker 4

我问他的是，我非常希望用AI系统和机器学习系统来解决世界上最具挑战性的问题，并理解正在发生的事情。

And my question to him was, I am really interested in having AI systems, machine learning systems for solving the most challenging problems in the world and to make sense of what's happening.

Speaker 4

我认为他的回应是：如果你想理解世界，想解决世界上最重要的问题，那你必须加入DeepMind，因为我们需要AI来真正深入地理解世界并应对这些挑战。

And I think his reaction was, if you want to understand the world and if you want to solve the most important problems in the world, then you have to join DeepMind because we will need AI to really understand the world deeply and to tackle these problems.

Speaker 4

所以，如果你对编程学习感兴趣，对网络安全感兴趣，对应对气候变化感兴趣，或对理解如何治疗那些无法治愈的疾病感兴趣，你就必须来，主导如何将AI应用于这些领域。

So if you are interested in sort of learning to program, if you are interested in cybersecurity, if you are interested in dealing with climate change, if you are interested in understanding how to deal with impossible to treat sort of diseases, you have to come and really lead the charge on how can AI be used for these applications.

Speaker 0

我想问问你，你们在AlphaGo中的一些创新是如何最终融入你们进行的科学项目的。

I wanna ask you about some of the innovations that you guys made in AlphaGo and how they ended up finding their way into the science projects that you guys were doing.

Speaker 0

AlphaGo做的一件大事，就是让那个巨大的搜索空间变得可处理。

One of the big things that AlphaGo did was to make that gigantic search space more tractable.

Speaker 0

那么，自那以后，搜索算法发生了哪些变化？

So how have search algorithms changed since then?

Speaker 0

它们在科学中是如何被应用的？

How are they being used in science?

Speaker 4

我的意思是，搜索在你遇到的许多现实世界问题中都是不可或缺的一部分。

I mean, search is such an integral part of many problems that you encounter in the real world.

Speaker 4

我们刚刚谈到了蛋白质折叠，这可以被视为对所有可能结构空间的搜索。

We just spoke about protein folding, which could be considered as the search over the space of all possible structures.

Speaker 4

但为了给出一个更简单的例子，你可以把搜索理解为寻找解决特定问题的算法。

But just to give a simpler example, you can also think of search as the search for algorithms that solve a particular problem.

Speaker 4

我们周围所有计算机所做的事，其底层都涉及某种形式的矩阵乘法。

So everything around us that computers do has some form of matrix multiplication underlying it.

Speaker 4

因此，即使我们今天拥有这些改变世界的机器学习系统和神经网络，这些神经网络也是基于矩阵乘法的，本质上是将大量的数字矩阵相乘。

So even the fact that we have these machine learning systems and neural networks that are changing the world today, these neural networks are based on matrix multiplication, essentially taking large matrices of numbers and multiplying them together.

Speaker 4

即使是矩阵乘法这样最简单的操作——只是将两个矩阵相乘——也是你在学校和大学里学到的最基本内容。

And even the very simplest operation of matrix multiplication, which is just taking two matrices and multiplying them, is the simplest thing that you sort of learn in school and college.

Speaker 4

然而，作为整个研究界，我们还不知道两个矩阵相乘的最快方法是什么。

And yet, we don't know as a whole research community what is the fastest way of multiplying two matrices.

Speaker 4

所以如果你思考这个问题，你可以把它看作一个搜索问题。

So if you think about that problem, you can reason about it as a search problem.

Speaker 4

你可以认为存在一个可能的算法空间，现在在这个算法空间中进行搜索，找出最优的算法。

You can say there's a space of possible algorithms and now search over that space of algorithms and try to find me the best algorithm.

Speaker 4

问题是，这个问题的搜索空间甚至比围棋的搜索空间还要大。

The issue is that the search space for that problem is even larger than the search space for Go.

Speaker 4

因此，我们首先需要做的一件事是开发了一个名为AlphaTensor的智能体，它将矩阵乘法视为一个搜索问题，一个游戏。

So one of the first things that we did was come up with this agent called AlphaTensor, which framed matrix multiplication as a search problem, as a game.

Speaker 0

所以你不是在说赢得或输掉围棋比赛，而是在说你是否快速地将这两个矩阵相乘了？

So instead of did you win or lose the game of Go, you're saying did you multiply these two matrices together quickly or not?

Speaker 4

是的。

Yeah.

Speaker 4

你是否用最少的步数完全准确地完成了这些矩阵的乘法？

Did you multiply these matrices completely accurately in the smallest number of moves?

Speaker 0

对。

Right.

Speaker 4

那就是这个游戏。

And that was the game.

Speaker 4

早在1969年，斯特拉森就提出了一种算法。

And there was an algorithm that Strassen had come up with in 1969.

Speaker 4

从那以后的五十年里，没有任何进展。

And since then, for fifty years, there was no progress.

Speaker 4

然后AlphaTensor找到了一种更优的矩阵乘法方式。

And then AlphaTensor found a better way of multiplying these two matrices.

Speaker 4

这成为了证明同类技术潜力的关键例证。

And that was a key sort of proof point of what is possible with the same sort of techniques.
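To make "the smallest number of moves" concrete: in this game a move roughly corresponds to a scalar multiplication, and the score is how few of them are needed. The sketch below shows the classic 1969 Strassen scheme for two 2x2 matrices, which uses 7 multiplications where the schoolbook method uses 8. This is Strassen's well-known construction, not one of AlphaTensor's discoveries, and the function names are illustrative.

```python
def naive_2x2(A, B):
    """Schoolbook product of two 2x2 matrices: 8 scalar multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    return [
        [a11 * b11 + a12 * b21, a11 * b12 + a12 * b22],
        [a21 * b11 + a22 * b21, a21 * b12 + a22 * b22],
    ]


def strassen_2x2(A, B):
    """Strassen's scheme: the same product with only 7 multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # The seven products; everything else is addition and subtraction.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(naive_2x2(A, B))
print(strassen_2x2(A, B))
```

Saving one multiplication out of eight looks minor, but the scheme applies recursively to matrix blocks, so the saving compounds as matrices grow. Searching for schemes of this kind with fewer products is the sort of problem the conversation describes.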

Speaker 0

如果有人在观看，可能对你们提到的内容不太熟悉，比如矩阵乘法。

In case there's anyone watching who's, I don't know, maybe not that familiar with the things you're talking about, matrix multiplication, for example.

Speaker 0

我们必须非常清楚地说明这项技术的潜力。

I mean, we need to, like, be really clear on the potential of this thing.

Speaker 0

事实上，世界上每一个大型语言模型本质上都是一个巨大的矩阵乘法问题。

I mean, every single large language model in the world is essentially at its heart just a massive matrix multiplication problem.

Speaker 0

对吧？

Right?

Speaker 0

是的。

Yes.

Speaker 0

所有关于不同芯片的热议，都是因为有些芯片比其他芯片能更快地进行矩阵乘法。

All of the fuss about different chips that are being made is because some of them can multiply matrices faster than others.

Speaker 0

对。

Yeah.

Speaker 0

你在这里描述的是，把这件事变成一个游戏，哪怕只是在加快运算速度上取得微小的提升。

And what you're describing here is, like, turning that into a game, and even small gains that you might make on how quickly you can do something.

Speaker 0

一旦将其扩展到全球所有人使用AI的规模，我们谈论的就是巨大的差异。

Once you scale it up to the size of how much everybody in the world is using AI, we're talking about gigantic differences.

Speaker 4

对。

Yeah.

Speaker 4

当然。

Absolutely.

Speaker 4

从那以后，我们所做的就是不再仅仅局限于矩阵乘法。

And since then, what we have done is we have said, let's not just tackle matrix multiplication.

Speaker 4

让我们去探索所有你能想到的算法。

Let's tackle all the possible algorithms that you can think of.

Speaker 4

因此，我们的新代理如AlphaEvolve，会在所有可能程序的空间中进行搜索，以找到能够解决这些重要问题的最佳算法，无论是如何在数据中心调度任务——这是一个极其重要且涉及能源、计算利用率等问题的任务。

So our new agents like AlphaEvolve, they search in the space of all possible programs trying to find the best algorithm that can solve these important problems, whether it's how do you schedule jobs in a data center, which is an extremely important problem and has implications in terms of energy, compute utilization and so on.

Speaker 4

或者如何解决这类物流问题，比如在网络中传输数据包。

Or how do you sort of tackle these logistics problems where you are trying to move packets around in a network.

Speaker 4

因此，这种解决搜索问题的基本方法，现在已经被扩展到了更广泛的应用领域。

So the same basic methodology of tackling these search problems now has been expanded in terms of what you can do with it.

Speaker 0

好的。

Okay.

Speaker 0

但我在这里想到的是策略网络，正如你所描述的直觉，比如一个围棋选手看到棋盘后会说，我觉得这个方向值得深入探索。

But I'm thinking here about the policy network, the intuition as you described it, where, you know, a Go player might look at the board and say, I think this is a fruitful direction in which to search.

Speaker 0

如果你面对的不是棋盘，不是围棋，而是世界上乃至更广阔范围内所有可能的算法呢。

If you instead of a board, instead of a game of Go, you've got all possible algorithms of everything in the entire world and beyond, let's say.

Speaker 0

在这种情况下，你究竟如何建立直觉？

How on earth do you create intuition in that sort of a situation?

Speaker 0

你如何知道如何缩小搜索空间？

How do you know how to narrow down the search space?

Speaker 4

是的。

Yeah.

Speaker 4

我认为这是一个我们现在开始思考的非常有趣的研究课题。

So I think this is a very interesting sort of research topic that we are now starting to think about.

Speaker 4

当我们应用像AlphaEvolve这样的代理来发现这些新算法时，有时这些算法对我们来说并不直观。

When we apply agents like AlphaEvolve to discover these new algorithms, sometimes those algorithms are not very intuitive to us.

Speaker 4

事实上，它们可能是反直觉的。

In fact, they could be counterintuitive.

Speaker 4

所以有时候你能看到一些模式。

So sometimes you can see the patterns.

Speaker 4

你能看到问题中存在一些我们之前没有理解的对称性。

You can see that there are certain symmetries in the problem that we did not understand.

Speaker 4

数学家们并不理解。

Mathematicians did not understand.

Speaker 4

计算机科学家们也不理解。

Computer scientists did not understand.

Speaker 4

但不知怎的，那些对称性确实存在。

But somehow, there were those symmetries.

Speaker 4

该智能体不知怎的发现了这些对称性，然后利用并运用这些对称性，使解决方案变得高效得多。

The agent somehow discovered those symmetries and then it exploited and utilized those symmetries to make the solution much more efficient.

Speaker 4

在某些情况下，我们根本不清楚它是如何让事情变得更快的，但事实就是它更快了。

In some cases, we just don't understand how it made things faster, but they are faster.

Speaker 4

于是我们的挑战在于，当考虑人类与这些AI智能体协作时，我们如何确保所生成的系统和算法能被人类计算机科学家和工程师所理解？

And then our challenge is that when you think about collaboration where humans and these AI agents are working together, then how do we make sure that the systems that are produced and the algorithms that are produced are interpretable by the human computer scientists and engineers?

Speaker 3

这让我想起AlphaGo中的一个情况，人们在终局阶段观察AlphaGo时发现，它的下法并不完全最优。

It reminds me a little bit of this situation in AlphaGo where people in the endgame were observing AlphaGo and found that it didn't quite play optimally.

Speaker 3

他们非常惊讶地说：看，这个走法比AlphaGo下的更好。

And they were really surprised to say, look, this is a better move than what AlphaGo played.

Speaker 3

你知道，它是不是没发挥好？

You know, is it not playing well?

Speaker 3

它是不是犯了错误？

Is it making mistakes?

Speaker 3

解决方案是，AlphaGo正在优化我们赋予它的目标，即最大化赢得比赛的概率。

And the solution was that AlphaGo was optimizing the objective we had given it, which is to maximize the probability of winning the game.

Speaker 3

人类往往使用一种启发式方法，即希望比对手多占一些地盘，并认为差距越大越好，这通常是对的。

Humans tend to use a heuristic, which is they want to have more territory than the opponent by some margin, and they think the larger the margin is, the better it is for them, which is often true.

Speaker 3

但AlphaGo并不在意赢多少目。

But AlphaGo doesn't care about the margin.

Speaker 3

对AlphaGo来说，只要赢半目就够了，因此在终局阶段，它常常像是在戏弄对手，故意放弃一些点数，直到确信自己能以半目获胜为止。

For AlphaGo, it was enough to win by half a point, and so often in the endgame, it almost seemed to be toying with the opponent and giving up points just up until the point where it was sure it could win by half a point.

Speaker 3

你有时会遇到这些反直觉的行为，但如果你深入探究，就能理解它们为何会出现。

Sometimes you get these counterintuitive behaviors, but if you then drill deeper, you can see why they come about.
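The difference between the two objectives can be shown with a toy calculation. In the sketch below, the two candidate moves, their possible final margins, and the probabilities are all made-up numbers chosen for illustration; nothing here comes from an actual AlphaGo game.

```python
# Hypothetical endgame candidates: each maps to possible final margins
# (in points, positive means we win) paired with their probabilities.
candidates = {
    "greedy_invasion": [(+15.5, 0.80), (-4.5, 0.20)],
    "solid_defence": [(+0.5, 0.99), (-0.5, 0.01)],
}


def win_probability(outcomes):
    """Total probability of finishing ahead, however narrowly."""
    return sum(p for margin, p in outcomes if margin > 0)


def expected_margin(outcomes):
    """Average final margin: the 'more territory is better' heuristic."""
    return sum(margin * p for margin, p in outcomes)


best_by_win_prob = max(candidates, key=lambda m: win_probability(candidates[m]))
best_by_margin = max(candidates, key=lambda m: expected_margin(candidates[m]))

print("maximize win probability ->", best_by_win_prob)
print("maximize expected margin ->", best_by_margin)
```

An agent scored only on winning prefers the quiet move that wins by half a point 99 percent of the time, while the margin heuristic prefers the riskier move with the larger average lead. Seen from the margin point of view, the first choice looks like giving points away.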

Speaker 0

因为算法和人类最终优化的是略有不同的目标。

Because the algorithm and humans are ultimately optimizing for slightly different things.

Speaker 3

没错。

Exactly.

Speaker 3

对。

Yeah.

Speaker 0

是的。

Yeah.

Speaker 0

好。

Okay.

Speaker 0

但这让我有些疑惑。

But then that does make me wonder.

Speaker 0

比如第37手，它展现出了人类无法做到的水平。

So take Move 37 as an example of where, you know, it went beyond what humans are able to do.

Speaker 0

同时，当第37手刚出来时，人们以为那是失误。

At the same time, when Move 37 first came through, people thought it was a mistake.

Speaker 0

对吧？

Right?

Speaker 0

那么你怎么区分呢？

So how can you tell the difference?

Speaker 0

我的意思是，如果算法提出了一个原创的东西，你能确定它不是幻觉吗？

I mean, if the algorithm comes up with something that is original, can you be sure it's not a hallucination?

Speaker 4

是的。

Yeah.

Speaker 4

我认为这是一个非常重要的观点。

And I think this is a very important point.

Speaker 4

对吧？

Right?

Speaker 4

就像大型语言模型一样，尤其是在它们最初开发的早期版本中，它们会产生幻觉。

Like with the large language models, especially when they were being developed initially in the first versions of them, they would hallucinate.

Speaker 4

它们会提出不正确的解决方案，或者给出完全无效的回应。

They would come up with solutions which were not correct or come up with responses which were completely invalid.

Speaker 4

而这就是智能代理框架的重要性所在，它将大型语言模型与验证器结合，能够筛选出哪些是幻觉，哪些可能是值得关注的非凡发现。

And this is where the importance of the agent harness comes into play, where you couple the large language model with a verifier, which is able to sort of prune out what is being hallucinated from what is actually something that might be remarkable that we need to investigate further.

Speaker 0

但如果这些大型语言模型基于人类数据，会不会限制我们只能停留在人类已发现的范围内？

But then if those large language models are based on human data, is there a danger of you limiting yourselves to what humans have already discovered?

Speaker 0

我在想那些已经写进教科书里的东西。

I'm thinking of what's already in the textbook as it were.

Speaker 4

当我们构建这些智能体时，我们会刻意增加它们需要探索的内容。

When we build these agents, we deliberately increase the amount of things that they have to explore.

Speaker 4

所以我们告诉模型：你们必须超越你们训练时所依赖的数据分布，应该自由地进行更多探索。

So we tell the models that you have to go beyond the distribution that you were trained on and you should feel free to explore more.

Speaker 4

事实上，你们可能会产生一些不恰当或不正确的全新想法，但我们有验证器和评估函数来筛选掉这些不靠谱的见解。

And in fact, you might sort of produce new things which might not be appropriate or not be correct, but we have that verifier and evaluation function to prune out those insights.
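The generate-then-verify pattern described here can be sketched in a few lines. This is a deliberately tiny stand-in, assuming a random guesser in place of a large language model and a handful of input/output tests in place of a real evaluation function; `propose`, `verify`, and `search` are hypothetical names, and the task (recover the coefficients of a linear function) is chosen only because it is easy to check.

```python
import random

# (input, expected output) pairs that any accepted candidate must satisfy.
TESTS = [(0, -2), (1, 1), (4, 10)]


def propose(rng):
    """Generator: an unconstrained guess at coefficients (a, b) of a*x + b."""
    return rng.randint(-5, 5), rng.randint(-5, 5)


def verify(candidate):
    """Verifier: a hard pass/fail check against every test case."""
    a, b = candidate
    return all(a * x + b == y for x, y in TESTS)


def search(budget=10_000, seed=0):
    """Keep proposing until a candidate survives verification."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(budget):
        candidate = propose(rng)
        if verify(candidate):
            return candidate, rejected
        rejected += 1  # a wrong guess costs nothing except time
    return None, rejected


solution, rejected = search()
print("accepted:", solution, "after rejecting", rejected, "candidates")
```

Because the verifier is strict, the generator is free to be wrong most of the time: incorrect proposals are simply discarded. The pattern depends on having such a check, which is why it suits domains like code and formal proofs.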

Speaker 3

我认为这正是卡尔·波普尔会用来描述整个科学过程的方式。

I think this is really how Karl Popper would also characterize the whole scientific process.

Speaker 3

‘猜想与反驳’是那篇著名的论文，你知道，猜想或许就是幻觉，是产生合理假设的生成能力，而反驳则是过滤掉错误、无效内容的步骤。

Conjecture and refutation is the famous essay, and, you know, conjecture is maybe hallucination, it's this capability of producing plausible hypotheses, and then refutation is the step by which you filter out the things that are wrong, that don't work.

Speaker 3

我认为这也清楚地解释了为什么当前的AI能力格局呈现出现在的样子。

And I think it also makes clear why the current AI capability landscape looks like it does.

Speaker 3

也就是说，它在可验证的领域表现非常出色。

Namely, it is very good in verifiable domains.

Speaker 3

代码就是一个可验证的领域。

Code is a verifiable domain.

Speaker 3

你可以定义目标，并为代码编写测试。

You define the objective, you can write down tests for the code.

Speaker 3

第一个测试是它能否编译，你知道，这已经不错了。

The first test is that it compiles, you know, it's already good

Speaker 0

对吧？

Right?

Speaker 3

然后你用这些测试来检验它，但你有明确的标准来排除失败，这对这类任务至关重要。

Then you test it on those tests, but you have hard criteria to reject failure, which is super important for these kinds of tasks.

Speaker 3

如果没有这个标准，事情就会变得复杂得多。

If you don't have it, things become much trickier.

Speaker 3

例如，如果你研究开放性的科学问题，可能就没有一个验证者能告诉你这是对的还是错的。

For example if you work on open scientific problems you may not have a verifier who can tell you that this is right or this is wrong.

Speaker 3

最终，通常需要通过物理实验来进行验证。

Ultimately often experiment, physical experiment will be the verification that you need.

Speaker 0

对。

Right.

Speaker 0

但那可是相当遥远的未来了，不是吗？

But that's quite a long long way down the road, isn't it?

Speaker 0

我想，指的是实验部分。

I guess, the experimental part of it.

Speaker 0

因为我只是在思考，回到你之前提到的观点，关于可解释性的问题。

Because I'm just wondering here about interpretability coming back to the point that you made earlier.

Speaker 0

考虑到这里的风险远高于围棋棋盘上的情况，最终得到的结果难以解释，这重要吗？

Does it matter that you might end up with results that are not easily interpretable here given that the stakes are so much higher than they are on a board of a Go game?

Speaker 4

是的。

Yeah.

Speaker 4

我认为这确实很重要。

I think it does matter.

Speaker 4

科学也关乎沟通。

Science is also about communication.

Speaker 4

对吧？

Right?

Speaker 4

如果你能提出这个新发现，但如果你无法与人沟通，别人也无法在此基础上继续发展，那么所能产生的影响就会受到限制，对吧？

If you can come up with something new, but you are not able to communicate it and people are not able to build on top of it, then there are limits to the impact that will be achieved, right?

Speaker 4

因此，可解释性扮演着非常重要的角色。

So interpretability plays a very important role.

Speaker 4

但这并非唯一重要的事情。

But it's not the only thing.

Speaker 4

以AlphaFold为例。

Take the example of AlphaFold.

Speaker 4

AlphaFold能够解决蛋白质结构预测这个惊人的问题。

AlphaFold is able to solve this amazing problem of protein structure prediction.

Speaker 4

我们是否完全理解它所进行的那种概念性操作呢？

Do we understand completely the conceptual sort of operations that it does?

Speaker 4

在机制层面上，是的。

Like at the mechanistic level, yes.

Speaker 4

但我们并不完全了解能够重现人类水平推理过程以做出相同预测的底层理论。

But we don't know completely the underlying theory that can be used to recreate a human level reasoning process to make the same predictions.

Speaker 4

我们必须以某种方式将其转化为人类能够理解的易懂形式，因为人类的认知能力是有限的。

And we will somehow need to convert them to a human digestible form that the bounded rational human mind will be able to comprehend.

Speaker 3

我认为这是一个非常有趣的观点，即解释不仅需要说明所要解释的现象，还需要考虑解释对象的智力水平。

I think that's a really interesting point there which is that an explanation not only needs to account for the phenomenon that you're explaining, it also needs to account for the intellectual level of the recipient of the explanation.

Speaker 3

所以有时在YouTube上，你可以看到这些内容，比如用六岁、八岁、十岁、十二岁孩子的理解水平来解释生活。

So sometimes on YouTube, you can see these things, life explained at the level of a six year old, an eight year old, a 10 year old, a 12 year old.

Speaker 3

我得说，我很喜欢为十二岁孩子做的解释。

I quite like the explanations for 12 year olds, I have to say.

Speaker 3

这反映了这一事实。

And that reflects this fact.

Speaker 3

对吧？

Right?

Speaker 3

你知道，有些解释实际上是现象与我们理解能力之间的桥梁。

You know, an explanation really is a bridge between the phenomenon and our capacity to understand it.

Speaker 3

因此，未来的人工智能系统可能会提出一些对他们来说看似简单，但对我们来说刚刚好、能跟上AI系统步伐的解释。

So it may very well be the case that future AI systems come up with explanations that might seem simplistic to them, but that are just about right for us to keep up with the AI system.

Speaker 3

对吧？

Right?

Speaker 4

是的。

Yeah.

Speaker 4

没错。

Exactly.

Speaker 4

我的意思是，看看我们的代理系统如AlphaProof，它们能够做到的是：你给它们一些开放的数学问题，它们会给出一个证明，而这个证明是可以验证的。

I mean, you look at our agents like AlphaProof, what they are able to do is you give them open maths problems and they will give you a proof, and that proof is verifiable.

Speaker 0

你可以判断它是否正确。

You can tell whether it's correct or not.

Speaker 0

没错。

Exactly.

Speaker 0

即使你并不理解它。

Even if you don't understand it.

Speaker 0

是的。

Yeah.

Speaker 4

你可能不理解它，但你知道它是正确的，对吧？

You might not understand it but you know it's correct, right?

Speaker 4

关于原始定理是否正确的问题现在已经解决了。

The uncertainty about whether the original theorem was correct or not is now resolved.

Speaker 4

但我们完全理解它了吗？

But do we completely understand it?

Speaker 4

事实上，到目前为止，我们得到的结果都投入了大量精力，将其转化为数学家能够理解并认可的形式，他们说：是的，这说得通。

Like in fact, till now the results that we have had, we have spent the effort and then converted those results into a form that mathematicians have been able to see and say, yes, it makes sense.

Speaker 4

我实际上可以把它翻译成英文，而且完全行得通。

I can actually translate it in English and it all works.

Speaker 4

但由此产生了两个关键现象。

But there are two key phenomena that come out of it.

Speaker 4

其中之一是，如今问题表述的重要性提升了。

One is that the importance of framing the problem now rises.

Speaker 4

因为如果你不这么做，当我们试图解决这些非常困难的数学问题、给智能体提供这些难题时，一个挑战就在于准确地定义问题，以便智能体能够理解它需要优化的奖励函数。

Because one of the challenges when we are trying to solve these very hard maths problems, when we are giving the agent these hard problems, is to specify the problem accurately so that the agent can now understand what is the reward function that it needs to optimize for.

Speaker 4

一旦它找到了解决方案，接下来的挑战就是将解决方案转换回人类可读的形式。

And then once it finds the solution, then there's the challenge of actually converting the solution back to a human readable form.

Speaker 0

不过，如果我们真的达到了算法能够自行提出证明的阶段，数学家在这种情况下扮演什么角色呢？

If we do get to a point, though, where an algorithm could just come up with its own proof, where's the role for mathematicians in all of this?

Speaker 0

从自私的角度来说。

Speaking selfishly.

Speaker 4

不，我认为数学家今天反而更加重要，因为这些智能体能够解决那些不可思议的难题。

No, I think mathematicians are even more important today because what these agents are able to do is they're able to solve these incredible problems.

Speaker 4

但究竟哪些问题需要被解决呢？

But what are the problems that need solving?

Speaker 4

你该如何定义这个问题？

How do you specify that problem?

Speaker 4

这就是数学家和科学家发挥作用的地方。

That's where mathematicians and scientists come in.

Speaker 0

不过我喜欢这样一个想法：有一天，像黎曼猜想这样的问题可能会返回说，是的，存在一个证明。

I do like the idea though that one day there might be, I don't know, Riemann hypothesis and it comes back and says, yes, there's a proof.

Speaker 0

不幸的是，这个证明超出了任何人类的理解能力。

Unfortunately, it's beyond any human's ability to understand it.

Speaker 0

所以，抱歉啊。

So, you know, sorry about that.

Speaker 0

但说实话，我刚才有点开玩笑，但如果我们讨论的是推动科学知识和理解超越人类已有的成就，你认为在科学领域已经出现过类似‘第37步’的例子吗？

But actually, I mean, I'm joking slightly, but if we are talking here about advancing scientific knowledge and understanding beyond what humans have done, do you think you've seen examples of Move 37 in science already?

Speaker 4

是的，我绝对这么认为。

Yeah, I think absolutely.

Speaker 4

我认为矩阵乘法算法就是一个例子，人们已经研究了许多年。

I think just the example of the matrix multiplication algorithm, it is something that people had studied for many, many years.

Speaker 4

但我们还是能够提出一种新的算法。

And yet we have been able to come up with a new algorithm.

Speaker 4

因此，这确实是算法发现中的一个‘第37步’时刻。

So that is genuinely a Move 37 moment in algorithmic discovery.

Speaker 4

我认为我们现在在科学的许多其他领域、数学、材料科学中也看到了同样的现象，正在发现一些我们认为稳定的全新结构。

And I think we are now seeing the same thing in many other areas of science, in mathematics, in material science, coming up with new structures that we think now are stable.

Speaker 4

因此，这类例子有很多，但最初的‘第37步’时刻仍然非常相关，因为它在某种意义上是第一个，并且催生了超越人类理解这一概念。

So there are a number of these things, but the original Move 37 moment is still very relevant because it was, in some sense, the first, and it brought about that concept of going beyond human understanding.

Speaker 0

我在这里想到的是AlphaZero，它真正摆脱了人类数据，展现了这些深刻的结果。

I am thinking here about AlphaZero again and how that really moved away from human data and showed these profound results.

Speaker 0

另一方面，大型语言模型却几乎成了一种基于大量人类数据的智能捷径。

Large language models on the other hand ended up being almost a shortcut to intelligence, I guess, that was based very much on human data.

Speaker 0

这对你来说是一种出乎意料的转折吗？

Was that a sort of surprising turn of events for you?

Speaker 3

是的。

Yes.

Speaker 3

我认为这是我们观察到的一个有趣现象。

I think that is an interesting thing that we observed.

Speaker 3

DeepMind 基于这样一个理念：我们用游戏作为现实世界的微观模型，DeepMind 的哲学是将智能体置于这些环境中，让它们学会如何掌握这些环境，从而提升自身的智能。

DeepMind was based on this idea that we use games as a microcosm of the real world, and the philosophy of DeepMind had been to place agents within these environments and let them learn how to master them and thereby grow their intelligence.

Speaker 3

而大型语言模型的出现，实际上是发现了一条捷径。

And then what happened with large language models was really this discovery that there's a shortcut.

Speaker 3

某种程度上，互联网上存储着海量的结晶化智能——首先是文本数据，然后是图像、视频等等，而这条捷径就是先挖掘所有这些数据，并基于这些数据训练系统，这基本上就是第一代和第二代大型语言模型的基础。

That somehow there's this huge amount of crystallized intelligence, if you like, stored in the form of data on the Internet, first text data, maybe images, maybe videos, and so on, And that the shortcut is really to first mine all of that data and train systems based on that, and that's basically the first and second generation of large language models that are based on that.

Speaker 3

但当然，你会逐渐意识到，首先，这种方法并不能带来新颖性。

But then, of course, you come to the point where, first of all, that doesn't lead you to novelty.

Speaker 3

你现在只是在现有人类知识的语料库中打转，我们知道这些模型在这些范围内的能力有多强。

You're now within this corpus of existing human knowledge, and we know how competent these models are within that.

Speaker 3

但要摆脱这个局限现在变得非常困难。

But it's very difficult to get out of that now.

Speaker 3

我们该如何超越已知的一切？

How do we go beyond what we already know?

Speaker 3

我认为，过去几年里，整个领域又重新开始探索 DeepMind 早期开创的方法——以及其他人的方法，即在环境中进行强化学习。

And that's, I think, where the community has for the past few years been exploring again the methods that DeepMind, and others of course, pioneered early on: reinforcement learning in environments.

Speaker 3

现在的后训练阶段通常包括强化学习，使用人类生成的数据，或在编码环境等任务环境中进行。

Part of post-training now routinely involves forms of reinforcement learning, either on human-generated data or on problems in environments like coding environments and so on.

Speaker 3

因此，我们现在正再次超越人类的知识边界。

And so now we're in a period where we're going again beyond human knowledge.

Speaker 0

普什米特，你认为如果没有阿尔法狗，我们今天会处于人工智能革命的这个时刻吗？

Pushmeet, do you think that we would be here at this moment in the AI revolution if it hadn't been for AlphaGo?

Speaker 4

我认为阿尔法狗是一个转折点，它清晰地表明，超越人类在特定领域智力水平的时刻并非科幻，也不是几十年后的事，而是正在发生。

I think AlphaGo was that transition point where it became very, very clear that the moment where we go beyond human-level intelligence in particular areas is not science fiction or many decades away; it is happening now.

Speaker 4

如果在围棋中能做到，那么在蛋白质结构预测、核聚变、材料科学等领域也没有理由做不到。

And if it could happen in a game of Go, there was no reason why it couldn't happen in protein structure prediction, in fusion, in material science.

Speaker 4

那场对局以及第37手所带来的影响和体验，正是我们现在所生活的时代的基础。

And the legacy of that match and Move 37 and that experience is what we are all living in now.

Speaker 0

说实话，我认为这是结束本集的一个绝佳观点。

I think that's a great point to end the episode, actually, to be honest with you.

Speaker 0

托雷、普什米特，非常感谢你们参与这次对话。

Thore, Pushmeet, thank you so much for joining me.

Speaker 0

太棒了。

Amazing.

Speaker 3

是的。

Yep.

Speaker 3

很高兴。

Pleasure.

Speaker 0

在人类与机器的故事中，这种重大的范式转变之前也发生过。

These big paradigm shifting moments in the story of humans and machines have happened before.

Speaker 0

但关于国际象棋，它始终只是一个计算问题。

But the thing about chess is that it was always just a question of calculation.

Speaker 0

机器能否通过蛮力赢得胜利？

Can a machine brute force its way to a victory?

Speaker 0

AlphaGo是不同的。

AlphaGo was different.

Speaker 0

这是机器首次展现出更深层次的东西——一种将直觉与计算结合的真正智能，将我们带入了超越人类能力的领域。

It was the first time that a machine had demonstrated something deeper, a genuine intelligence that combined intuition with calculation, and took us beyond human capability.

Speaker 0

如今，距离AlphaGo对战已经过去了十年，这个领域的发展速度令人惊叹。

Now ten years on from the AlphaGo match, the field has moved at an incredible pace.

Speaker 0

但当年困扰研究人员的许多问题，如今比以往任何时候都更加相关。

But many of the questions that preoccupied researchers then are more relevant now than ever.

Speaker 0

你如何创建超越人类知识、能够产生新见解的AI系统？

How do you create AI systems that go beyond human knowledge and are capable of new insights?

Speaker 0

你又如何区分真正的全新见解与幻觉？

And how do you separate the genuinely new insights from hallucinations?