Wherever you turn in this building, there are people playing. Of course, there's the usual pool tables and ping pong tables, but there are also go boards, chess boards, vintage video game cabinets, and more recent strategy games like Settlers of Catan. Because here you see, games are good. I'm Hannah Fry. I'm a mathematician who specialises in studying patterns in human behaviour.
And I'm also an author, a broadcaster, and, like rather a lot of us, someone who is fascinated by the potential for artificial intelligence and the places it can take us. Over the last twelve months, I've been working with DeepMind, the London based lab that has been called the Apollo project of artificial intelligence. Ready to go?
Five, four, three, two, one.
Welcome to the DeepMind Podcast. Hi. Open another door, and what do you know? There's a chess grandmaster pondering his latest move. In his early twenties, Matthew Sadler was one of Britain's all time greatest chess players.
He was there at the heart of it all as a professional player when AI started moving in on human chess champions. And if you know anything about chess at all, you'll remember this.
A computer called Deep Blue has made chess history by defeating the world's champion, Garry Kasparov.
This is the moment in 1997 when man was defeated by machine. It was the final game of six, and his opponent, one of the most powerful supercomputers in the world, seemed to have the upper hand. The game of chess, supposedly a true test of human intellect, will never be the same again. The reigning chess champion, Garry Kasparov, beaten in devastating fashion by a chess-playing computer built by IBM known as Deep Blue. One of those watching was Matthew Sadler.
The big thing about Deep Blue was that it wasn't actually stronger than Kasparov when it won. So that was very annoying because everyone, you know, was saying, oh, a computer's beaten Kasparov. And you want to say, yes, but it wasn't better. Kasparov got himself a bit psyched out and Deep Blue played well and all of that. And then at some stage computers came along that were a lot stronger.
And then you could start using them and they started showing you stuff that you hadn't seen before, you know, incredibly complicated tactical sequences where you'd say, that never works, and yet it does. And then at that moment, you sort of give it up. You sort of say, okay, they're better than me, and you start appreciating what they bring to the game.
And so as the story of Deep Blue's victory over Kasparov faded into AI folklore, people started to wonder what game would be the next frontier for AI research. The most ambitious had their eye on the ancient board game Go. This is not a game that responds to brute force calculation. It requires intuition and an instinctive appreciation of positions and beauty. Unlike chess, where by 2016 even a mobile phone could play a credible game against a grandmaster, there was nothing that came close to playing at the top level of Go.
But that didn't put off one man from the challenge: David Silver, lead researcher at DeepMind.
I've always been an ambitious person. So I think at the beginning of my PhD, I set out on this goal for myself to be able to beat the world's strongest players during my PhD, which turned out to be a little bit ambitious. Everyone was trying to dissuade me from this course. The head of department took me aside and said, look, you're just wasting your time working on this project. It's too hard.
No one will be able to do this. He'd seen too many people bang their heads against this problem and fail, and he didn't want to see someone else, in his mind, you know, just bang their heads against a problem that was too hard.
He wanted you to be employable at the end of your...
Yes. That's right.
Go is an ancient Chinese board game. And although it's not played much in the West, it's arguably the most popular board game in the world. It's considered one of the four ancient scholarly skills of China and is taught in school alongside sports or maths. It's played on a pale wooden board, ingrained with an elegant grid of 19 by 19 squares. The objective is very simple.
Both players are aiming to capture territory on the board by enclosing it with their pieces, black or white, known as stones. But the game itself is mind-bogglingly sophisticated, much, much more than chess. Even so, David Silver was single-minded in his belief that AI could master the game of Go. It was simply a question of how.
The right approach, it always seemed to me, was to allow machines to learn for themselves this kind of intuition, to learn for themselves to be able to look at a position and establish whether black or white is ahead. And this meant machine learning, in particular a method within machine learning called reinforcement learning, which is supposed to be how humans and animals learn for themselves through trial and error experience.
But reinforcement learning on its own wouldn't be enough. It was only when David began to work with the team at DeepMind that he spotted the missing piece of the puzzle.
We'd seen a huge revolution with something called deep learning. This is the ability for machines to build up very rich, deep, layered representations of knowledge for themselves. And that breakthrough seemed to me to be the missing element, that if we could combine that process of being able to build these very rich representations of knowledge with the kind of work which I'd been doing before on reinforcement learning, the ability for machines to learn for themselves by trial and error. If we put those two pieces together, it seemed to me that this was a recipe that might have the legs to take us all the way.
And by all the way, David means building an AI good enough to challenge the very best Go players in the world.
This is a huge moment for both the world of artificial intelligence and, I think, the world of Go. So far, AlphaGo has beaten every challenge we've given it, but we won't know its true strength until we play somebody who is at the top of the world, like Lee Sedol.
Seoul, March 2016. A decade after first toying with the idea of designing a Go-getting machine, David's moment had come. DeepMind's contender, the AI program AlphaGo, would battle the 18-time world champion Lee Sedol head to head in a televised match of five games as the world's press watched on with bated breath.
I think the honest truth is that when I flew out to Seoul in 2016, I was living this very protected life as a researcher working on this problem behind closed doors, just thinking about the complexities of the problem and how to make the system work. And it was only when I stepped off the plane and walked into this hotel room that was absolutely jam-packed with reporters and everything going on that suddenly the penny dropped that this was a really big deal, that actually, you know, the consequences of this were far greater reaching than I'd ever imagined. There were something like 100 million people watching the match as it proceeded. There were something like 30,000 articles written about the match.
When you arrived in Korea, did you believe that your algorithm would win?
When we arrived in Korea, we actually got the team together, and I asked the team to hold out a hand. We were playing five games, and I asked everyone to hold out a hand to say how many games of the five we thought we would win. And, you know, many people made many different predictions. I actually predicted four-one.
David's prediction of four-one to AlphaGo was a clear expression of confidence. But once the match started, the doubt started creeping in.
I feel I made the mistake of underestimating the quality of a real human world champion. When we were actually playing the match, I realized just how immensely versatile Lee Sedol was as a player in his ability to push AlphaGo to its limits, not just in one game, but then coming along again the next game and trying a very different strategy, and then the next game trying a different strategy, like pushing and probing for weaknesses all the way through. And we were pushing AlphaGo into regimes that we'd never tested, actually.
I don't know if you've ever watched a televised game of Go, but as a single stone is placed calmly onto the board, the commentators and the audience watching on react with the same ferocity of emotion and excitement as they would a football game. Even so, there was one moment during the first game where the response from the crowd really stood out. A moment where even Lee Sedol's expression fixed into a look of shock. His mouth fell open and his hand came up to his face.
Everyone's expectation had been that Lee Sedol would eventually emerge triumphant. It was just a matter of time until AlphaGo made a mistake. The game, apparently to the human commentators, was still balanced. And at that point in time, AlphaGo made an extremely bold move.
Nice to be here at the heart of the operation.
David is backstage with Demis Hassabis, one of the founders of DeepMind, watching AlphaGo's every move. And their reaction is caught on camera.
He's done it. He's gone in. We're all looking for...
Look at his face. That is not a confident face. He's pretty horrified by that.
In Go terms, it invaded something which appeared to be Lee Sedol's territory. AlphaGo jumped right inside the region which seemed like it belonged to Lee Sedol and said, okay, you know, come and get me. It was an audacious move, and you could judge by Lee Sedol's reaction that he wasn't expecting it. He was expecting perhaps a more timid, more computer-like response, when the reality was that AlphaGo was using its intuition to judge that if it jumped in here, it couldn't compute all of these possible outcomes all the way to the end, but it had a sense that this would work out well.
Victory for AlphaGo. Roll on day two, round two, and AlphaGo had another surprise up its sleeve.
In the second game, the human commentators were actually, I mean, the only word I can think of is gobsmacked in their reaction.
That's a very surprising move.
I thought it was a mistake.
This was the now famous move 37. AlphaGo had placed a stone on the fifth line, a move that no human player would even consider.
There are these deeply built-in beliefs about the game. One of them is that in the game of Go you can think of all these different lines upon which stones could be played: the first line is closest to the edge, then the second line, the third line, the fourth line. And there's a rule in the game of Go, which is that when you approach one of these stones diagonally, it's called a shoulder hit, you never, ever do your shoulder hit above the fourth line. This has just been so ingrained in Go knowledge because most of the time it's true. Most of the time this is a very useful common-sense piece of knowledge which helps Go players to exclude a vast range of very bad moves. But in this particular position, what AlphaGo realized was that playing on the fifth line and playing the shoulder hit on the fifth line actually just worked beautifully in the context of this position with all of its other stones, in such a way that the outcome was really favorable.
Because at the end of that game, that stone ended up being instrumental, right? It kind of joined up to everything else.
That's right. Yeah. That stone became so influential in the game, and it just worked, forming this big net that surrounded a vast swathe of territory in the center.
AlphaGo just wasn't playing in a mechanical way. It was breaking the norms and conventions of this ancient game. It was creating something, a pattern of playing that went way beyond the approaches that humans had ever considered. And it was doing so successfully. Move 37 would eventually seal victory for the machine.
Looking back on the match, Lee Sedol spoke about how this very move shifted his entire view of the match.
I thought AlphaGo was based on probability calculation, and it was merely a machine. But when I saw this move, I changed my mind. Surely AlphaGo was creative. This move was really creative and beautiful.
Do you think that was the AI illustrating real creativity?
I think we need to challenge ourselves to ask, you know, what is creativity? I think creativity should be defined as anything which takes us out of our expected patterns of behavior. And I think in that sense, it truly was creative.
AlphaGo won the first three games, but the match wasn't over for Lee Sedol just yet. In the fourth game, he managed to come back fighting against his opponent.
Lee Sedol was a true gentleman, and we couldn't have chosen anyone better to represent humankind for this match. He not only strove his utmost to the very end to play and devise all kinds of amazing counter-strategies to AlphaGo, but he dealt with the immense pressure of having all of these people watch him in really a profoundly human way. He found it very hard. He was humble. I think it hurt his pride to lose to the computer, but he came back and he found new strength in that and was able to ultimately emerge with immense pride at having beaten AlphaGo in one game and being part of this pivotal moment for AI.
At the end of the six days, the final score was AlphaGo four, Lee Sedol one.
So AlphaGo became the first computer champion at the game of Go, and it was a major result for artificial intelligence.
And you won the sweepstake?
And I won the sweepstake.
This is the DeepMind Podcast, an introduction to AI research from the people behind AlphaGo. News of the match rippled around the world. Matt Botvinick, now DeepMind's director of neuroscience research, was one of the millions watching.
The first reaction that people had in the Go community was, oh, gee, it feels a little sad that now there's a computer program that can beat our hero. But then it didn't take long before people started to realize, wait a minute, this is actually really exciting. We're not stuck with our own limitations in terms of seeing the possibilities of how to play this game. Now there are new horizons opened up to us. We can find new forms of beauty in this game.
I think that's sort of, in microcosm, what I think we can hope for from AI more generally.
And for David, this victory was always part of a bigger picture.
It's not really the case that I've ever had to stop and question and say, well, what next? Because the what next is clear. We want to take this further. We want to build machines which can achieve the same level of performance but across all kinds of challenging domains. Why stop with Go?
You once said that you think that the game of Go is the holy grail of artificial intelligence. Do you still think that that's the case?
I think the history of AI has been a number of pivotal moments where, for a period of time, a particular domain has been the centerpiece of everyone's attention. So for a while, the centerpiece of attention was the game of chess. When Deep Blue defeated Garry Kasparov, that marked the end of an era; chess was no longer the domain that people cared about, and the world moved on.
But before the world moved on from Go, David was curious about just how far he could push the AI.
Really, the open question for me was, how can a system learn for itself entirely with no human input? If there was no human supervisor there to say, here's the inputs, here's the guidance, here's the examples of how humans play, what if we started really tabula rasa, which means start from a blank slate? The system just has to learn everything for itself, starting from completely random play. Is it able to learn for itself to play Go to the highest caliber of play that's possible?
Since their triumph in 2016, David and his team have been busy working on that new algorithm, AlphaZero. The original Go beater, AlphaGo, learned to play by studying millions of games played by human experts. AlphaZero, on the other hand, learns completely from scratch, from zero human knowledge. Instead, it picks up the game by playing against itself millions of times. Initially, its gameplay is weak and erratic.
But over time, the system learns to identify the best moves and strategies.
It tries something, and if a particular pattern is successful and ends up winning the game against itself, it uses that pattern more. And if another pattern ends up causing it to lose the game, it will play that pattern less. And over time, it builds up this very rich, deep representation of knowledge, one of these so called neural networks, and it's able to then go out and beat the world's strongest programs.
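[Editorial sketch, not part of the broadcast] The loop described here (play against yourself, use winning patterns more, losing patterns less) can be illustrated on a toy game. The following is a minimal sketch under stated assumptions and is not AlphaZero: it has no neural network and no tree search, it uses a simple lookup table of move weights, and the game is a small stone-taking game (take 1 to 3 stones, whoever takes the last stone wins) chosen only for illustration. All function names and parameters are hypothetical.

```python
import random

MAX_TAKE = 3  # assumption: a player may remove 1, 2 or 3 stones per turn


def legal_moves(stones):
    """Moves available when `stones` are left on the table."""
    return list(range(1, min(MAX_TAKE, stones) + 1))


def new_preferences(start_stones):
    """One weight per (stones left, move); every move starts equally likely."""
    return {s: {m: 1.0 for m in legal_moves(s)}
            for s in range(1, start_stones + 1)}


def play_game(prefs, start_stones, rng):
    """Play one game of self-play and return (winner, history)."""
    stones, player, history = start_stones, 0, []
    while True:
        moves = list(prefs[stones])
        weights = [prefs[stones][m] for m in moves]
        # Sample a move in proportion to its current weight.
        move = rng.choices(moves, weights=weights)[0]
        history.append((player, stones, move))
        stones -= move
        if stones == 0:
            return player, history  # whoever takes the last stone wins
        player = 1 - player


def reinforce(prefs, winner, history, step=0.1):
    """Play the winner's moves more often and the loser's moves less often."""
    for player, stones, move in history:
        factor = 1 + step if player == winner else 1 - step
        prefs[stones][move] *= factor


def train(start_stones=10, games=2000, seed=0):
    """Start from uniform random play and learn only from self-play results."""
    rng = random.Random(seed)
    prefs = new_preferences(start_stones)
    for _ in range(games):
        winner, history = play_game(prefs, start_stones, rng)
        reinforce(prefs, winner, history)
    return prefs
```

Calling `train()` returns the learned weight table; comparing the weights for a given number of stones shows which move the self-play process has come to prefer. With two stones left, for example, taking both always wins and taking one always loses, so the weight for taking two can only grow while the other can only shrink.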
Which is better, AlphaGo or AlphaZero, at Go?
Amazingly, we discovered that the system which had learned completely for itself without a single piece of human knowledge ended up being far stronger in the long run and defeated the original version of AlphaGo by 100 games to zero.
I didn't know that. Oh my god. So hang on. You weakened it by giving it human knowledge.
It turns out that we have a tendency as human designers to believe that we know how to make the system stronger. But quite often it turns out that by putting our own predispositions and preferences into our programs, we actually make them weaker.
Without needing any human input, AlphaZero doesn't particularly care what game it's playing. As long as you can give it the rules, it's by no means limited to Go. By now, AlphaZero has taught itself from scratch how to master the Japanese game of shogi. It is currently the world's best shogi-playing machine, despite there only being one human in the DeepMind building who knows how to play. And to come full circle, after only four hours of playing itself, AlphaZero mastered the game of chess to a superhuman level.
So far, just a small handful of chess greats have been able to test their skills against the machine, including chess grandmaster Matthew Sadler, who we met earlier, and women's international master Natasha Regan. They've co-authored a book about AlphaZero called Game Changer. And outside of the research team at DeepMind, they've probably spent more time with AlphaZero than anyone. Here's Natasha.
I played AlphaZero once and it wasn't a very long game.
How many moves did you get to?
Oh, I think it would have been less than 20 anyway. And I'd have to say it was very direct. I played something sacrificial, I thought I'd try an opening thing, it might not know it. It exploited it very quickly, got its pieces out on attacking squares straight away and it won quite quickly.
I imagine I wouldn't get to twenty moves if I played it. I think it'd take me down much quicker than that.
Yeah. It does have a very smooth human style against me. There was no need for it to do anything complicated. It just played better moves over a long period of time and just pushed me back gradually. I wouldn't have been able to guess that I was playing against a computer.
If I had to guess a human, it would have been someone like Carlsen or Karpov, these very smooth positional players who just beat you by playing good moves.
And that, like AlphaGo before it, is a key thing about the playing style of the AI. They're not like IBM's Deep Blue, which beat Garry Kasparov back in 1997, or any of its descendants. Those play with a very mechanical, computer-like style. They're very defensive. They only ever take calculated risks.
But AlphaZero, on the other hand
It conducts its attack in quite a structured way. So it takes account of the whole board and it tends not to get attacked itself. It gets to a position where its own position is quite stable and safe, and then it can bring all its pieces in a concerted way into doing an attack.
It's doing what humans do, but only so much better. That's the thing.
Putting it really bluntly then, Matt, is it actually doing original stuff?
Yes, it is. We've been playing chess for, you know, four hundred years. So actually, probably every single move on the board has been played once by someone somewhere. You can actually see, my goodness, something's played 44 million games against itself.
It's actually repeated our whole chess history for itself. And in that time, it's just identified what the most important things are of all that we've discovered. I mean, that's what makes style.
But AlphaZero has substance as well as style. Right now, in 2019, it simultaneously holds the titles of being the best player in the world at Go, Shogi and Chess. And it's not just being greedy. The whole point of this project was to build an intelligent system flexible enough to respond to a range of problems. And while AlphaZero might not quite be able to tackle cancer diagnosis or energy efficiency, there is a good reason why DeepMind are playing with building these all purpose machines in the world of games.
Our goal, of course, is not just to play chess or Go or whatever. Our goal is to have impact on some of the world's most challenging problems which are facing society. But in order to do so, we need to gain understanding. We need to really deeply understand these systems for ourselves first. And games provide the perfect test bed for doing so.
Games are like the ultimate mini universe. You know all the rules. There's a clear winner at the end. You can look back at the end of the game and decide what went wrong. And if you lose, well, it doesn't matter.
You can just start another round. The big problems though, the ones we eventually want to tackle, they are quite a lot more complicated.
The real world's a really messy place and you end up with these amazingly complex things like human beings and how they interact with each other and societies and companies and all these amazing things which we've built up in our world. We need to be able to understand them to be able to have a hope to apply something like AlphaGo to make progress. In order to make progress then, we need to be able to apply systems that can operate even when the rules are unknown. And that is a big remaining challenge which has not yet been addressed by AlphaGo or AlphaZero.
If you want to know more about how games have been and continue to be used as a test bed in AI research, then head over to the show notes, where you can also explore the world of AI research beyond DeepMind. We'd welcome your feedback or your questions on any aspects of artificial intelligence that we're covering in this series. So if you want to join in the discussion or point us to stories or resources that you think other listeners would find helpful, then please let us know. You can message us on Twitter, or you can email us at podcast@deepmind.com.