本集简介
双语字幕
仅展示文本字幕,不包含中文音频;想边听边看,请使用 Bayt 播客 App。
我记得大约七岁时写下了人生第一行代码。那可能很简单,比如在屏幕上打印我的名字,或者通过精心输入的字母和数字序列来扭曲拉伸一个二维图形。那种感觉非同寻常,让我第一次体会到计算机逻辑定义能力所能带来的可能性。但直到很久以后,我才开始接触到人工智能或AI这个术语。哇,那将开启一个怎样的世界啊。
I think I was about seven years old when I wrote my first line of code. It was probably something simple, printing my name to the screen or a two dimensional shape that could be twisted and stretched by my sequence of carefully typed letters and digits. It was an extraordinary feeling, a first sense of where the logic defining power of the computer could take us. But it wasn't until much later that I started to come across the term artificial intelligence or AI. And wow, what a world that
AI蕴含着巨大的未来前景,我认为能生活在这个时代并从事相关领域工作实在令人振奋。我们想要逐步理解和掌握日益复杂的系统。AI必须以负责任和安全的方式构建,并为社会全体成员谋福利。我们必须确保这些益处能惠及每个人。
would open up. AI holds enormous promise for the future and I think these are incredibly exciting times to sort of be alive and working in these fields. We want to kind of understand and master increasingly complex systems. AI must be built responsibly and safely and used for the benefit of everyone in society. And we have to ensure the benefits accrue to everyone.
要知道,我认为AI可能会成为我们发明过最具变革性的激动人心技术之一。
You know, I think AI can be one of the most exciting transformative technologies we'll ever invent.
这是总部位于伦敦的人工智能公司DeepMind的首席执行官德米斯·哈萨比斯的声音。在德米斯看来,AI将让我们创造出能自主学习解决复杂问题的计算机系统。用他的话来说,社会可以用智能解决其他所有问题——癌症、气候变化、语言、能源。简而言之,推动科学发现。
That is the voice of Demis Hassabis, the CEO of DeepMind, the London based artificial intelligence company. For Demis, AI will allow us to create computer systems that can learn to solve complex problems by themselves. In his words, society could use intelligence to solve everything else. Cancer, climate change, language, energy. In short, to advance scientific discovery.
但这些目标究竟有多远大?研究人员真能破解智能吗?这又将产生多大的实际影响?我是汉娜·弗莱,这里是DeepMind播客。过去一年里,我深入伦敦DeepMind总部,带您一窥引人入胜的AI研究世界及其发展方向。我们将为您讲述人工智能领域最重大挑战的快速发展历程。
But just how far-fetched are these goals? Can researchers really crack intelligence? And just how much of an impact would that really have? I'm Hannah Fry, and this is DeepMind: The Podcast. For the past year, I've been at DeepMind HQ in London for an inside look at the fascinating world of AI research and where it's going. We will be telling you the fast-moving story of the biggest challenges in artificial intelligence, or AI.
因此,无论您是想了解技术发展趋势,还是希望在自己的AI探索中获得启发,这里都是理想之选。我们将聚焦科学家、研究人员和工程师实际开展的项目,探讨他们研究AI科学的方法,以及整个领域当前正在应对的一些棘手决策。在此过程中,我们探访过摆满计算机屏幕的实验室——科学家们在那里进行着无止境的实验;也走进过会议室——人们正在白板上书写复杂的方程式。
So whether you just want to know more about where the technology is headed or want to be inspired on your own AI journey, then you've come to the right place. We will focus on the projects that scientists, researchers, and engineers are actually working on, how they're approaching the science of AI, and some of the tricky decisions the whole field is wrestling with at the moment. And whilst we're here, we've explored the rooms full of computer screens where scientists run their endless experiments. The meeting rooms where people write intricate equations on whiteboards.
这里到处都挤满了机器人。
Packed
在实验室里,成排的机械臂反复摆弄着成堆的塑料积木。我们采访了大量人士,试图理解是什么在推动这一新领域。本期播客中您将听到来自人工智能与机器学习前沿领域的声音,其中不少人都是首次公开谈论他们的工作。但如果我们想破解智能之谜,让我们先从人工智能的一个基本问题开始。
to the rafters with robots. And the laboratories where banks of repetitive robot arms grapple with piles of plastic bricks. And we've talked to a huge number of people to try to understand what is driving this new frontier. The voices that you'll hear in this podcast are from the people at the cutting edge of AI and machine learning, and quite a few of them are talking about their work publicly for the very first time. But if we want to solve intelligence, let's start with a fundamental question of AI.
我们所说的智能究竟指什么?如果我们试图让机器具备智能,我们真正追求的目标是什么?
What exactly do we mean by intelligence? If we're trying to make machines intelligent, what are we actually aiming for?
这在AI领域是个经常被争论的话题——我们是希望AI代理完全模仿人类行为方式?它们应该完全像人类一样智能,还是只需具备广义上的智能?
This is something that's debated a lot in the AI world: do we want to have our AI agents act exactly the same way that people do? Should they be exactly human-like intelligent, or should they just be intelligent in general?
这位是DeepMind的研究科学家杰斯·哈姆里克,她的专业领域是想象与心理模拟。
This is Jess Hamrick, a research scientist at DeepMind. Her specialism is imagination and mental simulation.
我认为存在两派观点:一派主张我们应该构建具有广义智能的系统,能够解决人类目前无法解决的各类世界性难题,其智能水平超越人类。比如或许人工智能能帮助我们攻克所有疾病的治疗难题——这是人类社会迄今未能实现的成就。但另一派则认为构建至少在某种程度上类人的AI至关重要。
There's, I guess, one group of people who like to say that we want to build something that's just generally intelligent, that's really able to solve a lot of different problems in the world that humans aren't necessarily able to solve, that has an intelligence that's higher than humans. So this might be able to solve problems like, how do we cure all diseases? Maybe an artificial intelligence might be able to help us solve this problem, and that's something human society and human civilization hasn't yet been able to accomplish. But then there's also another group of people who say that it's really important for us to build AI that is similar to human intelligence, at least in some ways.
我倾向于后者。为何需要类人?因为人类需要与AI交互协作,理解其预测或建议。如果我们构建的AI虽然具备广义智能,但其行为方式对人类而言过于陌生,导致我们无法真正理解其行为——我认为这将是个糟糕的局面:要么人们不信任它而拒绝采纳建议,
I would consider myself to be in the latter group. Why does it need to be similar to humans? The reason is that as we build AI, we as humans need to be able to interact with it and collaborate with it, and to understand the predictions or recommendations that it's making. Maybe we are able to build AI that's generally intelligent, but it acts in a way that's so alien to humans that we just can't really understand what it's doing. I think that would actually be a really bad scenario to be in, because it means that people don't trust it and are very unwilling to use the recommendations of this AI.
比如当它说'采取某个措施能治愈疾病'时,人们不理解其建议依据,我们可能就会错失许多造福世界的机会。
Maybe it says, oh, do this one thing and this will, like, cure this disease, but people don't understand why it's making that recommendation. Maybe we'd miss out on a lot of opportunities to really do a lot of good in the world.
我们需要AI以与我们相同的方式理解世界。它必须能够向我们解释自己,这样我们才能确信可以信任它。举个例子,有个AI通过皮肤科医生拍摄的皮肤病变照片来诊断皮肤癌。该算法在正确标记图像方面做得不错,但研究人员很快发现AI根本不是通过观察癌症来做出判断的。它只是学会了靠近尺子拍摄的病变更有可能是恶性的。
We need our AI to understand the world in the same way that we do. It needs to be able to explain itself to us so we can be sure that we can trust it. Take for instance the story of an AI that was trained to diagnose skin cancer by looking at photographs of skin lesions taken by dermatologists. The algorithm did a good job of correctly labeling the images, but the researchers soon discovered that the AI wasn't looking at the cancer at all to make its decision. It had simply learned that lesions photographed next to a ruler are more likely to be malignant.
这显然不够可靠。关键在于人工智能必须能够把握人类思维的微妙之处。我们希望它做我们真正意图的事,而不仅仅是我们表面表达的意思。但这并不必然意味着它需要完全像人类一样思考。过于模仿人类或动物大脑可能会带来弊端。
Not exactly trustworthy. It's crucially important that artificial intelligence is able to grasp the subtleties of human thought. We want it to do what we mean it to do, not just what we say we mean. But that doesn't necessarily imply it needs to think in exactly the same way as people do. There can be drawbacks to trying to imitate human or animal brains too closely.
我们会讨论这种策略在哪些方面可能限制你的发展。
We get into discussions about where the strategy can limit you.
这位是马特·博特维尼克。马特是DeepMind神经科学研究总监,他运用自己在认知神经科学和实验心理学领域的经验。马特认为人脑是灵感来源,但AI研究需要以自己的方式更进一步。
This is Matt Botvinick. Matt is the director of neuroscience research at DeepMind, where he draws on his experience in cognitive neuroscience and experimental psychology. Matt believes the human mind is the inspiration, but AI research has to take things further in its own way.
就像莱特兄弟解决飞行问题时,人们喜欢说他们是在停止模仿鸟类翅膀后才成功的——从技术角度看或许没错。但如果没有花费大量时间观察鸟翼,没有注意到翼型模式,没有思考这种形状周围的气流动力学,他们就不会取得那样的突破。所以我们确实相信可以从人脑和人类思维中获取灵感,但也讨论过何时需要跳脱这种模式,直接构建能实现我们目标的东西。
You know, the Wright brothers, when they solved the problem of flight, people like to say, oh, they solved it when they stopped trying to copy birds' wings, which in some technical way might be true. But they wouldn't have gotten to where they were if they hadn't spent an awful lot of time, and if other people hadn't spent an awful lot of time, looking at birds' wings and noticing the airfoil pattern and thinking about the dynamics of the air that flows around an object with this shape. So, yes, we do believe that we can look to the human brain and the human mind for inspiration, but we also talk about when the moment comes where we need to step away from that and just build something that does what we want it to do.
那么神经科学领域的对应物是什么?我们大脑中的'鸟类翅膀'是什么?在构建AI时可以借鉴的人类智能特征?其中一个极具前景的领域是记忆,特别是被称为'回放'的现象。
So what is the neuroscience equivalent? What are the birds' wings of our brains, the aspects of our own intelligence that we can use as inspiration as we build AI? Well, one area that seems to hold a lot of promise is memory, and in particular, a phenomenon known as replay.
回放现象是在哺乳动物大脑内侧颞叶(包括海马体)中发现的一种现象。神经活动表明过去的经历正在被重放,尤其在导航过程中。比如老鼠穿过某个环境时会产生特定活动模式,之后若在海马体植入电极,就能观察到相同的活动模式序列重现,表明该经历的记忆正在回放。这个概念如今已在AI领域确立了重要地位。
Replay is a phenomenon that was discovered in a part of the mammalian brain, the medial temporal lobe, including the hippocampus, where you see neural activity that suggests that past experiences are being replayed, especially in navigation. For example, a rat will go through some environment and a particular pattern of activity will arise as it goes through the environment. And then later, if you have electrodes in the hippocampus, you can see that the same pattern of activity, the same sequence, is occurring, suggesting that a memory of that experience is being replayed. And that idea now has a firm place in AI.
如果你丢了车钥匙,可以在脑海中回顾去过的地方来推测可能遗落的位置。比如我先去了厨房,在门厅脱了外套,把包放在一旁,哦对了——它们在我后兜里。这种事后重放经历并从记忆中学习的能力,正是研究人员希望AI掌握的核心技能。接下来由马特详细说明。
If you lose your car keys, you can run your mind through where you've been to work out where you might have left them. Well, I first went into the kitchen, I took my coat off in the hallway, put my bag down on the side, and, oh yeah, they're in my back pocket. That ability to replay your experiences and learn from that memory after the fact is a key part of what researchers want AI to be able to do. Here's more from Matt.
DeepMind智能体实现这一功能的方式与大脑机制并不完全相同。研究人员并非机械复制生物机制,但受神经科学启发的'重放'概念确实发挥了关键作用。
The way that that's implemented in DeepMind's agents is it's not exactly what you find in the brain. It wasn't as if people were trying to slavishly recreate the biological mechanisms. But the idea of Replay, which was inspired by neuroscience, came in handy.
2015年,重放机制在DeepMind的著名突破中起到关键作用。团队成功开发出能超人水平玩经典街机游戏的AI系统,如《太空侵略者》《乒乓》《打砖块》。AI采用深度强化学习技术,但在后台会持续记录游戏过程中的操作及其对最终得分的影响。
In 2015, Replay played a pivotal role in a famous DeepMind breakthrough. The team managed to build an AI system that could play arcade classics to a superhuman level. The old Atari games like Space Invaders, Pong, and Breakout. The AI used something called deep reinforcement learning. But behind the scenes, it kept a memory of moves it made as it played and how those moves had impacted on the final score.
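The memory of past moves described above is usually implemented in reinforcement learning as a replay buffer. The sketch below is a minimal illustration of that idea, not DeepMind's actual DQN code; the class and method names are invented for the example.

```python
import random
from collections import deque

class ReplayBuffer:
    """A fixed-size store of past (state, action, reward, next_state) steps."""

    def __init__(self, capacity):
        # A deque with maxlen evicts the oldest transition once full.
        self.transitions = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.transitions.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between
        # consecutive frames, which stabilizes learning.
        return random.sample(self.transitions, batch_size)

    def __len__(self):
        return len(self.transitions)

# Record ten toy transitions from an imaginary game, then replay four of them.
buffer = ReplayBuffer(capacity=100)
for step in range(10):
    buffer.add(state=step, action=step % 2, reward=1.0, next_state=step + 1)
batch = buffer.sample(4)
```

During training, the learner repeatedly samples such batches and updates its value estimates from old experience rather than only from the current frame.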
通过重放这些记忆,AI能从经验中学习:分析哪些操作序列有效、哪些是失误,并发现原本不明显的策略。但人类记忆远不止是事实数据库——你既能记住法国首都名称,也能回忆六岁生日在充气城堡跳跃的情景,或是毕业日搞的恶作剧。这种被称为情景记忆的现象,对AI发展具有重大意义。
By replaying those memories, the AI could learn from its experiences. It could work out what sequences of moves worked well, which were mistakes, and find strategies that otherwise wouldn't have been obvious. But there's more to our human memories than just a giant database of facts. Of course, you can remember the name of the capital of France, but you might also be able to remember jumping on the bouncy castle at your sixth birthday party or the pranks you played on your last day at school. This is a phenomenon called episodic memory, and it's something that holds a great deal of promise for AI.
我们经常讨论情景记忆,这种认知能力让你能检索亲身经历。比如录制前我们开玩笑问'早餐吃了什么',你能回溯用餐时刻并提取信息的能力,就是心理学家所称的情景记忆功能。这个分类源自心理学家数十年来将记忆细分为不同领域的研究,属于较高层次的概念。
We talk a lot about something called episodic memory, which is simply the cognitive ability to retrieve a memory of something that happened to you. Before we started recording, we were joking about what you had for breakfast. Your ability to cast your mind back to that moment when you were eating breakfast and retrieve that information is a function that psychologists and neuroscientists refer to as episodic memory. We have this category because psychologists worked hard over decades to fractionate memory into particular domains or kinds. But this is a pretty high-level idea.
这与重放机制不同,核心在于认识到情景记忆对人类智能至关重要。那么AI是否也该具备情景记忆?这对人工智能体意味着什么?
It's not like replay. It's just, hey, there's such a thing as episodic memory, which is very important for human intelligence. Maybe our agents should have episodic memory. What would that mean? What would it mean for an artificial agent to have memory?
这个设想令人着迷:AI不仅能回溯时间回忆完整事件,更能建立记忆间的关联。这种人类独有的神奇能力,若能被研究者破译并复现于AI系统,将极大提升其解决新问题的能力。让我们深入探讨其运作原理。
This is an intriguing possibility. An AI that can transport itself back in time and recall entire events and experiences rather than just facts. When you stop and think about it, this ability to link one memory with another is an amazing human skill. And if researchers can get a better understanding of how our brains actually do this, it could be replicated in AI systems, giving them a much greater capacity for solving novel problems. Let's think about how that works for a moment.
想象一下,每天早晨你都会看到一位三十多岁的男子遛着一只活泼的柯利牧羊犬。直到某天,一位与男子相貌相似的白发女士牵着同一只狗出现在街道上。当这些事件作为记忆片段存储在你脑海中时,你可能会立即进行一系列推断:这对男女可能来自同一家庭,这位女士或许是男子的母亲或其他近亲。
Imagine that every morning you see the same man in his thirties walking a boisterous collie. Then one day, a white haired lady who looks like the man comes down the street with the same dog. With those events stored as episodes in your mind, you might immediately make a series of deductions. The man and the woman might come from the same household. The lady may be the man's mother or another close relative.
也许她接替了他的职责,因为他生病或忙碌。我们为这些陌生人编织出复杂的故事,从记忆中提取素材,优先选择某些信息片段使其连贯——这正是这里神经科学家们近期研究的焦点。2018年9月的一项研究揭示了海马体(大脑中部形似海马的记忆中枢)在整合独立记忆以产生新见解中的关键作用。杰斯·哈姆里克还在探索另一种让人工智能更灵活应对新情境的方法。
Perhaps she's taken over his role because he's ill or busy. We weave an intricate story of these strangers, pulling material from our memories together, prioritizing some pieces of information over others to make it coherent. It's something that's been the focus of recent research by the neuroscientists here. A study in September 2018 demonstrated the critical role of the hippocampus, that seahorse-shaped seat of memory in the middle of the brain, in weaving together individual memories to produce new insight. Jess Hamrick is also looking at another way that AIs can be made to respond more flexibly to new situations.
她的灵感来源于人类的另一种能力——心理模拟,也就是你我所说的想象力。
She takes her inspiration from a different human ability, mental simulation, what you and I might call imagination.
想象你正身处海滩。你的脑海中会突然浮现出这样的画面——至少对我来说可能是这样的:金色沙滩、蔚蓝海洋,或许还有几棵棕榈树。
Imagine that you're on a beach. You'll have this mental picture kind of spring to mind. Mine, at least, is maybe a sandy beach with a bright blue ocean, maybe some palm trees.
我也在那里。真美妙。
I'm there. It's lovely.
这就是我们所说的心理模拟实例。就像我们在大脑中模拟海滩的画面,然后你可以对这个模拟场景进行各种操作:比如想象加入其他人,想象如果你扔出一个球会发生什么——假设你在打排球之类的活动。
And so this is an example of what we would call mental simulation. It's like we're mentally simulating this picture of the beach. And then you can do things with that simulation. You can imagine adding other people to it. You can imagine what would happen if you threw a ball, if you're playing volleyball or something like that.
这类心理模拟具有极强的互动性和丰富性。我认为它们构成了人类理解世界和预测世界能力的深层基础。
So these mental simulations are really interactive and really rich. And I think that they underlie a lot of our human ability to understand the world and make predictions about it.
我应该在此稍作停顿,解释一下杰西和马特所说的‘智能体’是什么意思。这个词在DeepMind被频繁使用。请记住,当人们谈论人工智能时,他们实际上只是在讨论能够自主决策的计算机代码。而‘智能体’就是用来描述这段代码中具有自主行为能力的部分。杰西希望构建出能灵活适应各种环境的智能体。
I should pause for a moment here to explain what Jess and Matt mean by an agent here. It's a word that's used a lot at DeepMind. Remember, when people are talking about artificial intelligence, they're really just talking about computer code with the freedom to make its own decisions. And an agent is just the noun that they use to describe the part of that code that has agency. Jess is hoping to build agents that are flexible enough to adapt to all manner of environments.
这是个非常宏大的目标,但确实具有现实潜力。为了理解这一点,让我们回到街机厅和那款《太空侵略者》游戏——通过深度强化学习训练出的名为DQN(深度Q网络)的智能体已经精通了这款游戏。
It's a very grand ambition, but one with real potential. To see why, let's go back to the arcade and that game of Space Invaders, mastered using deep reinforcement learning to create an agent called Deep Q Network or DQN.
DQN堪称一项惊人的技术壮举,因为它能通过像素感知直接训练来玩多种不同的雅达利游戏。这是前所未有的突破。但DQN的工作原理是直接从输入映射到输出——它接收游戏画面后立即输出能最大化游戏得分的动作指令,比如向左移动或按下射击键。
DQN was really an amazing technological feat, because it was able to be trained to play many, many different Atari games directly from perception, from pixels. This is something that hadn't been done before. But the way that DQN works is that it really just goes directly from inputs to outputs. It takes in the image of the video game and immediately outputs what action should be taken to maximize the score in that game. So maybe it's move left, maybe it's push the trigger to shoot.
所有这些动作都只是为了最大化得分。智能体并不理解动作的合理性,它只知道这个动作能带来更高分数。因此除了这个核心功能,智能体无法执行其他任务。比如你无法要求它‘躲在柱子后面直到柱子被摧毁’。
All of these actions are being taken just to maximize that score. And the agent doesn't know why that action is good; it only knows that this action will give it a higher score. So the agent isn't able to really do anything else besides that. You can't ask the agent to, say, hide behind one of the pillars until that pillar is destroyed.
或是‘只消灭同一排的外星入侵者而放过其他敌人’。这些对人类而言可能有点奇怪但可理解的任务,正是因为人类具备心理模拟能力——能预想不同行动带来的后果。通过赋予智能体想象力和多任务规划能力,它们就能更灵活地应对各种新情境。
Or destroy all of the incoming space invaders in one line and none of the other space invaders. These are all different kinds of tasks that you could give a human, and they may be a little bit weird, but humans would understand what it means to do them. And that's because humans have this ability for mental simulation, to imagine what will happen if they take different actions. So by giving our agents the ability to imagine things, and also to plan according to the different tasks they might be given, they're able to act more flexibly and deal with these sorts of novel situations.
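The contrast can be made concrete: a model-based agent can take on a new task by simulating each candidate action against an internal model of the world and scoring the imagined outcome. The sketch below is an invented minimal example, not DeepMind's planner; note that swapping in a different score function corresponds to handing the same agent a different task.

```python
def plan_by_simulation(state, actions, model, score):
    """Choose the action whose imagined outcome best satisfies the goal.

    model(state, action) -> the imagined next state (the mental simulation)
    score(state)         -> how well a state satisfies the current task
    """
    best_action, best_score = None, float("-inf")
    for action in actions:
        imagined = model(state, action)  # simulate the action, don't take it
        value = score(imagined)
        if value > best_score:
            best_action, best_score = action, value
    return best_action

# Toy one-dimensional world: the agent is at position 0, the goal is position 3.
model = lambda pos, move: pos + move     # imagined physics of the world
score = lambda pos: -abs(3 - pos)        # closer to the goal scores higher

chosen = plan_by_simulation(0, actions=[-1, 0, 1], model=model, score=score)
```

A purely reactive policy like DQN bakes one task into its input-to-output mapping; here the task lives in `score`, so the same simulator serves arbitrary new goals.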
但人类并非我们唯一能借鉴的智能形式。我们还可以向动物界的近亲学习。下面有请研究员格雷格·韦恩。在神经科学领域,格雷格专攻记忆与认知架构研究。
But humans aren't the only form of intelligence we can draw inspiration from. We can also learn from our cousins in the animal kingdom. Let's bring in researcher Greg Wayne. Within neuroscience, Greg's thing is memory and cognitive architecture.
一个显而易见的事实是,动物具备处理超长时间跨度的非凡能力——它们能将相隔久远的经验联系起来,这远超现有智能体的水平。我认为西丛鸦就是绝佳例证:它们会埋藏食物为冬季做准备,把搜集的粮食分散储藏并互相偷窃。
One of the things that is quite clear is that animals have a remarkable ability to deal with, for example, very long time scales: experiences can be linked across periods of time way beyond our current sets of agents. The great example, I think, is the scrub jay, the western scrub jay. They bury things. They prepare for the winter by scrounging up a lot of food and depositing it in different places, hiding it from each other. And they love to steal each other's food too.
它们是食腐动物,能记住成千上万个埋藏食物的地点。而且是一下子全记住。它们甚至能记住关于这些地点的详细细节。
They're scavengers. And they can remember thousands of sites where they've buried their food, all at once. And they can even know detailed facts about them.
它们知道东西是多久前埋的,知道埋藏时是否被监视,还清楚具体埋了什么。对这些自己制造的事件,它们有着惊人的记忆力。
They know how long ago they buried things. They know if they were being watched while they're burying things. They know what thing they buried there. They have an incredible memory for these events that they have produced themselves.
你怎么能确定它们记得埋了什么?
How can you tell that they know what they buried?
因为它们有偏好。你会发现比起花生,它们更爱吃蛆虫,会优先找回那些蛆虫。就像拥有一个庞大的行为记忆库,能随时调用来指导后续目标行为——比如‘我饿了,现在特别想吃蛆虫,该去哪儿找?’这正是我们想复现的能力。
Because they have a preference. You'll see that they like maggots more than peanuts; they'll go back to those maggots first. It's like having a large database of things that you've done and seen, which you can access and use to guide your goal-directed behavior later. You know, I'm hungry.
‘我现在超想吃蛆虫,该去哪儿找呢?’这类行为模式正是我们想要复现的。
I would love to have some maggots right now. Where should I go find those? That's the kind of thing we would like to replicate.
动物还教会我们另一课:训练狗狗坐下时,你不会写指令清单‘收缩这块肌肉,腿弯曲45度’之类的。而是通过不断重复动作,配合奖惩机制来训练。
And there's another big lesson we can learn from animals. If you want to teach a dog to sit, you don't write a list of instructions. Move this muscle, bend your leg 45 degrees, anything like that. Instead, you repeat the same task over and over again, offering punishments and rewards as you go.
做对了就给点食物奖励——这就是现代训犬方式。我有个朋友用强化学习训练狗狗操作iPad。AI领域也已开始将强化学习深度融入决策系统,这正是我们训练AI的方式。
And if it's good, you give it a little bit of food. That's how we train dogs now. I have a friend who trains dogs to do things on iPads using reinforcement learning. So we've already started on the path in AI of merging reinforcement learning very closely with how our AIs make decisions and so on, and that's how we train them.
所以你本质上是在训练人工智能,就像训练狗狗一样,对良好行为给予奖励,忽视不良行为。这很棒。但是,如何对待AI呢?奖励一个对狗饼干不感兴趣的东西意味着什么?下面有请德米斯·哈萨比斯。
So you're essentially training an artificial intelligence, an AI, in the same way that you might train a dog, rewarding them for good behavior, ignoring bad behavior. Very nice. But, okay, how do you treat an AI? What does it mean to reward something that isn't interested in doggy biscuits? Here's Demis Hassabis.
对于人工系统来说,它们真正关心的只有0和1。因此你可以为几乎所有事物构建人工奖励机制。我们已不再直接编程系统解决方案,而是让系统自主学习,现在更是提升到了元层次。我们真正编程或设计的是奖励系统本身。有趣的是,这反而成了现在的难点。
Well, with artificial systems, all they really care about is ones and zeros, so you can construct artificial reward mechanisms for almost anything. We've now moved away from programming the system solution, so it now learns for itself, to going up a meta level. What we're really programming or designing now is reward systems. So it's kind of interesting that that is now becoming the difficult part.
这就像如何设计课程体系?如何设计面包屑路径或奖励机制,最终让这些系统学会正确的东西?还有无监督学习的概念,即在没有任何奖励的情况下如何学习。实际上,这正是奖励学习的问题所在。在现实世界中,无论是人类还是孩童,真正的奖励都非常稀少。
It's like, how do you design curricula? How do you design breadcrumb trails or rewards so that eventually these systems learn the right things? There's also the idea of unsupervised learning, which is how you learn things in the absence of any reward. And actually, that's the issue with reward learning: in the real world, as humans or even as children, there aren't very many rewards.
奖励确实很稀疏,即使对狗来说也是如此。狗狗只是偶尔得到一块饼干,但它必须时刻决定该做什么。我认为其中一个解决方案是我们称之为内在动机的东西——那些通过进化存在于动物体内的内在驱动力,我们也可以培养或构建这种驱动力。这些驱动力非常强大,即使没有外部奖励,也能引导动物或系统。
The rewards are quite sparse, even for a dog. Right? The dog gets a doggy biscuit every now and again, but it has to decide at every moment what to do. And actually, I think one of the answers to that is what we call intrinsic motivation, which is internal drives that have come through in animals from evolution, but that we could also evolve or build in. Those drives are very strong, and they guide the animal or the system even in the absence of external rewards.
当然,这些内在动机可能是快乐、恐惧,甚至是饥饿感。这些都是原始的内在驱动力,即使没有任何外部奖励,也会驱动你的行为。
So, of course, that might be things like joy or fear or even things like hunger. These are all primal kind of internal motivations that drive your behavior even in the absence of any external reward.
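In reinforcement learning research, intrinsic motivation is often implemented as a bonus added on top of the sparse external reward; one common family is count-based novelty bonuses, where rarely visited states are rewarded in their own right. The snippet below is a toy sketch of that general idea, not any specific DeepMind system.

```python
from collections import Counter
from math import sqrt

visit_counts = Counter()

def intrinsic_bonus(state):
    """Count-based novelty bonus: rarely visited states feel 'interesting'."""
    visit_counts[state] += 1
    return 1.0 / sqrt(visit_counts[state])

def total_reward(state, extrinsic):
    # Even when the external reward is zero, the novelty bonus gives the
    # agent a learning signal, a crude stand-in for drives like curiosity.
    return extrinsic + intrinsic_bonus(state)

first_visit = total_reward("new room", extrinsic=0.0)   # novel, big bonus
second_visit = total_reward("new room", extrinsic=0.0)  # familiar, smaller
```

The bonus decays as a state becomes familiar, so the agent keeps exploring between the rare doggy biscuits of real reward.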
您正在收听的是DeepMind播客,一扇通往AI研究的窗口。虽然奖励可能是激励AI学习的关键部分,但机器学习的主要目标之一是让AI能够自我教育,发现任务间的模式和捷径,使自己成为更高效的学习者。在理想情况下,工程师希望AI能像人类一样学习,在几分钟内掌握新任务的核心要领。现在回到马特·博特维尼克。
You're listening to DeepMind: The Podcast, a window on AI research. While rewards might be a key part of how to encourage AI to learn, one of the main aims of machine learning is for AI to be able to teach itself, to notice patterns and shortcuts between tasks and make itself a more efficient learner. In an ideal world, engineers would like to reach a point where AI can learn in a similar way to humans, picking up the essentials of a new task in a matter of minutes. Back to Matt Botvinick.
举个实例:我最近去南美度假,想复习西班牙语。我清楚知道该怎么做,了解有哪些可用资源。更重要的是,当我开始复习时,我有一整套概念体系指引着我。比如,我知道动词变位是什么意思。
An example would be, I went on holiday recently to South America and I wanted to brush up my Spanish. And I knew exactly how to do that. I knew what resources were out there, to begin with. But more importantly, when I sat down to brush up my Spanish, I had a whole repertoire of concepts that really guided me. Like, I know what it means to conjugate a verb.
对吧?我知道在某些语言中存在阳性和阴性形式。这种背景知识帮助我比直接一头扎进一门新语言、却不理解学习语言意味着什么时,学得更快。我们希望系统,我们希望人工智能系统能自带这些概念。这不仅仅是语言,电子游戏也是如此。
Right? I know that in certain languages there are masculine and feminine forms. So this background knowledge helped me learn much more rapidly than if I had just been dumped into the middle of a new language without understanding what it means to learn a language. And we want artificial systems that come armed with these concepts. It's not just about language; it could be video games.
我们可以坐下来玩一款你从未接触过的新游戏。但如果你以前玩过电子游戏,你大概知道游戏机制,这能帮助你快速上手。
We could sit down in front of a new video game that you've never played. But if you've played video games in the past, you kind of know how video games work, and that helps you to learn rapidly.
这种AI被称为窄人工智能,可能专注于诊断癌症或玩电子游戏。但终极目标是创造更强大的东西——人工通用智能,它恰恰具备这种适应不同情境的能力,能将在一个环境中习得的高级概念应用到另一个环境中。
This kind of AI is what's known as narrow in its focus. That focus might be diagnosing cancer or playing video games. But the ultimate goal is to create something much more powerful, something called artificial general intelligence, with precisely this ability to adapt to different situations: to take the high-level concepts it's learned in one environment and apply them in another.
我们需要的不是只擅长一件事的系统,而是能胜任多种任务的系统。确切地说,我们需要的是能接手从未执行过的新任务的系统。我们想要的智能体是:你可以对它说'虽然你从未解决过这类问题,但现在让我告诉你需要思考什么'。比如你可以教它有机化学,它就能处理相关任务——人类就具备这种能力。
We don't want just a system that's really good at one thing; we want a system that's really good at lots of things. But really what we mean is we want a system that can pick up new tasks it's never performed before. We want an intelligence where you can say, okay, you've never solved this kind of problem before, but let me tell you what I want you to think about now. You could introduce it to organic chemistry or something, and it would be able to work with that. Humans can do this.
但让机器做到这点极其困难。这还不是人类能做而AI觉得困难的唯一事例。格雷格花大量时间研究人类看似简单任务背后的详细思维过程。
But getting machines to do this is really, really tricky. And it's not the only thing that we humans can do that AI finds hard. Greg spends a lot of his time trying to understand the detailed mental processes behind apparently simple human tasks.
你吃早餐时喝了橙汁,发现喝完了,于是心想'下班得买些橙汁'。一整天工作期间你完全没想起这事,但就在离开办公室时突然记起来。当你去买橙汁时,这对当下的你毫无价值,唯一受益的是第二天早餐时的你。这实际上需要极强的预期能力——为未来自我的需求做规划。
You have breakfast and you drink your orange juice and you run out, and then you think to yourself, God, when I leave work, I'm going to have to pick up some orange juice. You go through your workday, you don't even think about orange juice once, and then it springs to mind immediately as you're leaving the office that you need to go pick up some orange juice. When you are going to buy the orange juice, it is actually of no value to your present self. The only self that will benefit from buying the orange juice is yourself at breakfast the next day. So you actually have to do something that is incredibly prospective, thinking forward about the context of your future self.
所以这个问题——人们真的会围坐讨论并试图找出:大脑究竟靠什么机制在恰当时刻提醒你买橙汁?
So this is something that people here really sit around and talk about, trying to work out: what is it about your brain that reminds you to buy orange juice at the right moment?
是的,因为你可以轻松构建虚拟环境,为那些通常具备此类特性的智能体设计任务——比如提前思考数分钟或数小时,或是记住几小时前的事情——而我们普通的智能体在这些方面完全束手无策。它们做不到。为什么呢?这些看似简单。买橙汁这种事看起来多容易。
Yes, because you can easily construct virtual environments with tasks that have properties like this, like thinking minutes or hours ahead, or remembering something from hours ago, that our normal agents completely stumble on. They cannot do it. Why is that? These things seem easy. It seems easy to buy orange juice.
这里浮现出一个主题。早在二十世纪八十年代,汉斯·莫拉维克和他的同事就指出,在人工智能领域,一切都有点本末倒置。人类觉得困难的事情,比如数学、象棋和数据运算,只需要很少的计算量;而我们人类不假思索就能完成的事情,对机器来说却异常困难。这种现象后来被称为莫拉维克悖论。
There's a theme emerging here. Back in the nineteen eighties, Hans Moravec and his colleagues pointed out that when it comes to artificial intelligence, everything is a little bit upside down. While the things that humans find tough, like maths and chess and data crunching, require very little computation, the things that we humans manage without even thinking turn out to be monumentally difficult for machines. It's a phenomenon that has become known as Moravec's Paradox.
和在场的其他神经科学家、心理学家一样,我发现自己总在思考那些看似非常简单的事情。那些我和其他人不假思索就能完成、觉得没什么大不了的事情。但事实证明,其中某些事情要植入人工系统却非常困难。比如拿起物品、放下物品、规划穿过建筑物的路线——这些我们几乎不费脑力就能完成的事情,被证明相当难以工程化实现。刚刚就出现了这样一个例子。
Like other neuroscientists and psychologists here, I find myself thinking about stuff that seems really simple. Stuff that I do, and other people do, really without thinking about it, and it just doesn't seem that big a deal. But it turns out that some of those things are very difficult to engineer into artificial systems. Picking things up, putting things down, planning a route through a building: things that we can do without much mental effort prove to be quite difficult to engineer. An example of this just came up.
当我们走进这个房间时,大家都觉得这里很闷热,想要设法降温。于是我们围在恒温器旁,试图弄清楚如何让它按我们的意愿工作,但它似乎很抗拒。某一刻我突然想:等等,也许空调就是坏了。这看起来又是件超级简单的事。
As we walked into this room, we all realized that it was quite stuffy in here and that we wanted to try to cool it down. So we all huddled around the thermostat and tried to figure out how to get it to do what we wanted, and it seemed to be resistant. And at some moment, I thought, wait a minute. Maybe the air conditioning's just broken. And again, that seems like a super simple thing.
你会觉得,这有什么大不了的?但在AI研究中,我们其实对此有个专门术语——潜在状态推理。我们试图推断正在发生的某些潜在或隐藏的方面。而事实证明,要做这件看似简单的事,你需要一个非常丰富的世界模型。你需要了解空调、恒温器、'故障'意味着什么、故障的概率有多大等等。
Like, you know, what's such a big deal about that? But actually, in AI research we have a name for this, which is latent state inference. We're trying to infer some aspect of what's going on which is latent, or hidden. And it turns out that in order to do that seemingly simple thing, you need a very rich model of the world. You need to understand air conditioners and thermostats, and what it means to be broken, and what's the probability that it's broken, and so forth.
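Latent state inference has a standard probabilistic reading: treat "the air conditioner is broken" as a hidden variable and update a belief about it with Bayes' rule as the evidence accumulates. The numbers below are invented purely for illustration of the broken-AC example.

```python
def update_belief(prior_broken, p_stuffy_if_broken, p_stuffy_if_working):
    """One Bayesian update of the hidden state 'the AC is broken',
    after observing that the room is still stuffy."""
    evidence = (prior_broken * p_stuffy_if_broken
                + (1 - prior_broken) * p_stuffy_if_working)
    return prior_broken * p_stuffy_if_broken / evidence

# Start out assuming a 10% chance that the air conditioning is broken.
belief = 0.10
# A stuffy minute is much more likely if the AC is broken (0.9) than if it
# is working (0.3), so each observation pushes the belief upwards.
for _ in range(3):
    belief = update_belief(belief, p_stuffy_if_broken=0.9,
                           p_stuffy_if_working=0.3)
```

After three stuffy minutes the belief has climbed from 0.10 to 0.75, the numerical version of "wait a minute, maybe it's just broken."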
莫拉维克悖论常被当作某种深奥的谜题讨论。它被用作证据表明:在AI时代,分析师和律师的工作或许面临风险,而园丁、接待员、厨师等职业在未来几十年仍将稳固。但DeepMind创始人德米斯·哈萨比斯对此有截然不同的看法。
Moravec's paradox is often talked about as some kind of profound mystery. It's used as evidence that while the jobs of analysts and lawyers might be at risk in an age of AI, gardeners, receptionists, and cooks are secure in their careers for decades to come. But DeepMind's founder, Demis Hassabis, has quite a different take.
我认为这很明显。其实有个简单的解释。当莫拉维克研究AI时,主流范式是专家系统——直接手工打造AI问题的解决方案。你可以把它想象成构建庞大的规则数据库。
I think it's quite obvious. There's actually a simple explanation for it. When Moravec was doing AI, the dominant paradigm was expert systems. Handcrafting solutions directly to AI problems. Think of it as building big databases of rules.
当然,如果你打算那样做,这本身就是一项非常明确的任务编程。你必须确切知道要编写什么内容以及要纳入哪些规则。这意味着你只能处理那些你自己作为人类明确知道如何完成的任务,比如基于逻辑的事物,如数学和国际象棋。奇怪的是,那些我们凭直觉轻松完成的事情,比如行走、视觉以及各种感觉运动技能,对我们来说似乎毫不费力。
Of course, if you're gonna do that, that in itself is a very explicit task, programming that out. You know, you have to know exactly what you want to write and what rules you want to incorporate. And what that means is the only tasks you can do that with are the ones that you explicitly know how to do as humans yourselves. And those are the ones that are logic based, like maths and chess. So weirdly, the things that we do intuitively and effortlessly ourselves, like walking and seeing and, you know, all of these sorts of sensory motor skills, seem effortless to us.
原因在于,实际上大脑为此进行了大量处理,只是这些过程是潜意识的。它们发生在大脑中我们无法有意识触及的区域。当时我们对神经科学的了解可能较少,所以没有完全意识到例如视觉皮层中进行着多少处理工作。而现在我们对这两方面都有了更深的认知。
And the reason is because there's actually huge amounts of brain processing going into that. It's just that it's subconscious. It's areas of the brain we don't have conscious access to. We probably knew less about neuroscience at the time, so we didn't realize quite how much processing goes on in the visual cortex, for example. And so now we know both of those things.
我们更了解大脑的运作方式,并构建了像AlphaZero和AlphaGo这样的学习系统。事实证明,视觉其实并不比下围棋更困难——如果你用相同的方式处理,两者难度相当。
We know how the brain works better, and we've built learning systems like AlphaZero and AlphaGo. So it turns out, actually, vision is not any more difficult really than playing Go. It's similar if you approach it in the same way.
用传统手工编程的方法逆向工程我们的无意识技能几乎是不可能的。你必须对某事物的运作机制有全面而清晰的认知,才能让计算机复制它。但现在机器正开始模仿我们的潜意识过程,如视觉和模式识别。莫拉维克悖论未来未必会成为障碍,我必须坦白地说。
It's almost impossible to reverse engineer our unconscious skills using the old methods of handcrafted programming. You have to have a total and complete conscious understanding of how something works before you can ask a computer to replicate it. But now the machines are just beginning to mimic our subconscious processes, like vision and pattern recognition. There's no reason why Moravec's paradox needs to necessarily be a barrier in the future. I have to be honest with you.
这个观点——比我制作本系列时学到的任何其他理念都更深刻——让我真切认识到人工智能的力量与潜力。迄今为止,我们在机器创造领域取得的所有成就,都仅限于我们有意识知道如何指令它们执行的内容。我们才刚刚开始人工模仿我们的潜意识过程。这意味着前方还有一段极其激动人心的旅程。但神经科学与人工智能的这种并行研究合作,不仅能让人工智能变得更强大。
This single idea, more than any I've learned in making this series, is the one that hit home and underlined the power and potential of AI for me. Everything that we've managed so far with the machines we've created covers only the things that we consciously know how to order them to perform. We're only just at the very beginning of artificially mimicking our subconscious processes too. And that means that there is an extremely exciting journey ahead. But this partnership of studying neuroscience and artificial intelligence alongside one another doesn't just help make our AI better.
下面再次请马特·博特温尼克和杰斯·哈姆里克来讲解。
Here's Matt Botvinick and Jess Hamrick again to explain.
我们经常在这里谈论良性循环——与恶性循环相反,对吧?人工智能与神经科学之间存在着良性循环:神经科学推动人工智能发展,而人工智能也会回馈这个领域。
We often talk here about the virtuous cycle, the opposite of a vicious cycle, right? There's a virtuous cycle between AI and neuroscience where neuroscience helps AI along and then AI, like, returns the favor.
我们能在神经科学、认知科学与人工智能之间形成这种良性循环的原因之一,从根本上说是因为我们都在研究同一个主题——智能。因此,如果我们提出这类更抽象的问题,比如智能系统在此情境下应如何行动,我们可以对人类提出同样的问题:人在这种情况下会怎么做并尝试给出答案。我们也可以询问AI智能体在此情境下该如何行动并寻求解答。或者,如果我们已在某个领域获得答案,就能将该解决方案应用到其他领域中去。
One of the reasons why we can get this virtuous cycle between neuroscience and cognitive science and AI is because fundamentally, we're all trying to study the same thing, which is intelligence. And so if we ask sort of these more abstract questions about what an intelligent system should do in this situation, we can ask that about humans. What would a person do in this situation, and try to come up with an answer. We could ask what should our AI agent do in this situation and try to come up with an answer. Or if we have an answer already in one of those fields, we can take the solution and apply it to one of the other fields.
我认为这正是促成不同学科间知识迁移能力的关键所在。
And I think that's sort of really what enables this ability to transfer between the different fields.
这不仅是理论层面的思想流动。现实中已有人工智能领域的理念反哺神经科学的具体案例。
This isn't just a theoretical flow of ideas. There are real examples of ideas from artificial intelligence finding their way back into neuroscience.
大脑中有种传递信息的化学物质叫多巴胺。上世纪九十年代,科学家发现了追踪大脑多巴胺释放的方法,并识别出清晰的释放模式,但无人真正理解其含义:为何大脑在此情境而非彼时释放多巴胺?据我所知,当时一些研究计算强化学习的学者——比如彼得·戴恩和里德·蒙塔古——看到了这些神经科学论文中报道的多巴胺活动模式,他们立即意识到这些数据可以用强化学习的数学模型来解释。
So there's a neurotransmitter, a chemical that conveys messages in the brain, called dopamine. In the nineteen nineties, people were finding ways of tracking the release of dopamine in the brain, and very clear patterns were being identified, but nobody really understood what they meant. Why does the brain release dopamine in this situation and not that situation? And as I understand the history, some papers hit the desk of some people who were studying computational reinforcement learning, people like Peter Dayan and Read Montague. And they saw immediately that the patterns of activity being reported in these neuroscience papers, the dopamine data, could be explained by the math that's involved in reinforcement learning.
这引发了学习机制神经科学领域的真正革命。
That has led to a real revolution in the neuroscience of learning.
给猴子零食时,它们大脑会分泌少量多巴胺。人类大脑也是如此——每当好事发生时都会产生短暂的愉悦感。但90年代的研究者发现,多巴胺其实并非对奖赏的直接反应,而是在反馈猴子预期奖赏与实际获得之间的差异。就像你走在路上意外捡到20英镑,远比从朋友那里拿到应得的20英镑更令人兴奋。
If you give a monkey a treat, they get a little hit of dopamine in their brains. It's the same in our brains too, a little burst of pleasure whenever something good happens. But in the 1990s, researchers realised that dopamine wasn't actually the response to the reward. It was reporting back about the difference between what the monkey expected the reward to be and what it actually received. If you're walking down the road and you unexpectedly find a £20 note, it's much more exciting than if you're collecting a £20 note that's owed to you by a friend.
如果猴子期待得到葡萄却收到黄瓜片,其失望程度会远高于毫无预期时突然获得黄瓜。关键在于,AI研究者早已在算法中运用了行为机制高度相似的方法:让智能体预测后续发展并与实际情况对比。但请记住,所有这些做法的核心是从人脑运作方式中获取灵感,而非直接复制——毕竟人脑本身也并非完美无缺。至此我们已了解如何从人类大脑、动物乃至鸟类大脑中汲取灵感来创建AI系统。
And if a monkey is expecting you to give it a grape and you hand it a piece of cucumber, it's going to be a lot less happy than if you just surprised it with a bit of cucumber from nowhere. The thing is, AI researchers were already using something that acted in a very similar way in their algorithms. They'd get their agents to make a prediction about what was going to happen next and compare it to what actually occurred. But remember, in all of this, the idea is to just take inspiration from the way that our human brains work, not to make a straightforward artificial copy because our brains aren't exactly perfect. So we've heard how we can take inspiration from the human brain, from the animal and even the bird brain to create AI systems.
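The prediction-error idea the algorithms shared with the dopamine data can be written down in a few lines. This is a minimal sketch of the reward-prediction-error update, not the monkey experiments or any DeepMind code; the £20 figures are just the example from above, and the learning-rate value is an arbitrary assumption.

```python
# A minimal sketch of the reward-prediction-error idea: the "dopamine-like"
# signal is the gap between the reward received and the reward expected,
# and the expectation is nudged toward reality by that same signal.

def td_update(expected, reward, learning_rate=0.5):
    """Return (prediction_error, new_expectation)."""
    error = reward - expected                      # the surprise signal
    return error, expected + learning_rate * error

# An unexpected £20 (nothing expected) produces a large positive error...
surprise, _ = td_update(expected=0.0, reward=20.0)
# ...while collecting £20 a friend owes you (fully expected) produces none.
no_surprise, _ = td_update(expected=20.0, reward=20.0)
print(surprise, no_surprise)  # 20.0 0.0
```

The cucumber-instead-of-grape case is just the negative side of the same signal: `td_update(expected=high, reward=low)` yields a negative error, the disappointment dip the researchers saw in the dopamine recordings.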
但这已不仅仅是一个工作理论了。研究人员不再只是谈论他们想做什么,他们也在讨论实际取得的成果。让我通过DeepMind研究总监Koray Kavukcuoglu来给大家透露一些信息。
But this isn't just a working theory anymore. Researchers aren't just talking about what they want to do. They're also talking about what they've actually managed to do. Let me tease you with Koray Kavukcuoglu, Director of Research at DeepMind.
这是个简单的问题。当然你可以编写程序来解决它,但我们的想法是尝试使用深度强化学习,试图开发一个我们认为能推广到不同问题、更多问题的系统。当我们解决这个问题后,短短几周内就有10到15款雅达利游戏被攻克了。
It's a simple problem. Of course, you can write a program to solve that, but the idea was to try to do deep reinforcement learning, to try to come up with a system that we think can generalize to different problems, to more problems. And once we solved that, it was a matter of weeks before we had 10 or 15 Atari games being solved.
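For a flavour of what "a system that can generalize" means, here is a toy version of the underlying idea. This is emphatically not DeepMind's Atari agent: it is a minimal tabular Q-learner on an invented two-state world, where one generic learning rule, rather than handcrafted game rules, discovers the rewarding policy. All states, rewards, and hyperparameters are illustrative assumptions; deep RL swaps the lookup table for a neural network so the same recipe scales to pixels.

```python
# A minimal sketch of the reinforcement-learning loop: the same update
# rule is reused everywhere instead of handcrafting a per-task solution.
import random

def train(episodes=3000, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    random.seed(seed)
    # Toy world (an assumption for illustration): states 0 and 1,
    # actions 0 and 1. Action 1 moves the agent toward state 1, and
    # taking action 1 while already in state 1 pays a reward of 1.
    q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
    for _ in range(episodes):
        s = 0
        for _ in range(2):  # two steps per episode
            if random.random() < epsilon:                    # explore
                a = random.choice((0, 1))
            else:                                            # exploit
                a = max((0, 1), key=lambda x: q[(s, x)])
            s2 = 1 if a == 1 else 0
            r = 1.0 if (s == 1 and a == 1) else 0.0
            # One generic Q-learning update, reused for every state:
            target = r + gamma * max(q[(s2, 0)], q[(s2, 1)])
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

q = train()
# The learner comes to prefer action 1 in both states,
# without ever being told the rules of its little world.
print(q[(0, 1)] > q[(0, 0)], q[(1, 1)] > q[(1, 0)])
```

The point of the quote is that nothing in the update rule mentions this particular world: swap in a different environment and the same loop learns a different policy, which is the sense in which the approach generalizes.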
如果你想了解更多关于人工智能与大脑之间的联系,或者探索DeepMind之外的人工智能研究世界,你可以在每期节目的注释中找到大量有用链接。如果有你认为其他听众会感兴趣的故事或资源,请告诉我们。你可以通过Twitter给我们留言,或者发邮件至团队邮箱podcast@deepmind.com。你也可以用这个地址向我们发送对本系列的问题或反馈。让我们稍作休息。
If you would like to find out more about the link between AI and the brain, or explore the world of AI research beyond DeepMind, you'll find plenty of useful links in the show notes for each episode. And if there are stories or resources that you think other listeners would find helpful, then let us know. You can message us on Twitter or email the team at podcast@deepmind.com. You can also use that address to send us your questions or feedback on the series. Let's take a little breather.
稍后见。
See you shortly.