Google DeepMind: The Podcast - 游戏、山羊与通用智能——对话弗雷德里克·贝斯 封面

游戏、山羊与通用智能——对话弗雷德里克·贝斯

Gaming, Goats & General Intelligence with Frederic Besse

本集简介

游戏是训练智能体的绝佳试验场。仔细想想——这些精心设计、规则明确的环境,能让智能体自由探索、自主掌握规则并学习如何处理自主权。本期节目中,研究工程团队负责人弗雷德里克·贝斯与汉娜展开对谈,探讨SIMA(可扩展可指导多世界智能体)等重要研究,并展望未来智能体如何安全理解并执行线上与现实世界中的多样化任务。

延伸阅读:SIMA、RT-X 与 RT-2、交互式智能体

特别鸣谢以下制作人员(包括但不限于):
主持人:汉娜·弗莱教授
系列制片人:丹·哈杜恩
剪辑:拉米·察巴尔 / TellTale工作室
监制&制片人:艾玛·优素福
制作支持:莫·达乌德
音乐作曲:埃莱妮·肖
摄像指导与视频剪辑:汤米·布鲁斯
音频工程师:佩里·罗甘廷
视频工作室制作:尼古拉斯·杜克
视频剪辑:比拉尔·梅尔希
视频美术设计:詹姆斯·巴顿
视觉标识与设计:埃莉诺·汤姆林森
谷歌DeepMind出品

若喜欢本期节目,欢迎在Spotify或苹果播客留下评价。我们始终期待听众的反馈、新想法或嘉宾推荐!

由AdsWizz旗下Simplecast平台托管。个人信息收集及广告用途说明详见pcm.adswizz.com

双语字幕

仅展示文本字幕,不包含中文音频。

Speaker 0

欢迎回到Google DeepMind播客,我是主持人汉娜·弗莱教授。

Welcome back to Google DeepMind, the podcast, with me, your host, professor Hannah Fry.

Speaker 0

可以说,目前的生成式AI与我们科幻幻想中的技术形态还存在一定差距。

I think it's fair to say that generative AI, as we have it now, is a bit of a mismatch for our science fiction fantasies about what technology might become.

Speaker 0

但值得记住的是,这并非唯一可能。

But it is worth remembering that this isn't the only possibility.

Speaker 0

还存在其他形式的AI,其设计本质就是自主实体。

There are other forms of AI, forms which are, by their very design, autonomous entities.

Speaker 0

这类AI完全属于机器人管家或虚拟管家的范畴,能在你产生需求前预知每个突发奇想。

These AI are firmly in the realm of robot butlers or a virtual concierge that anticipates your every whim before you even have it.

Speaker 0

好吧,这两个具体例子可能还需要等待一段时间。

Then, okay, we might have to wait a little while for those two particular examples.

Speaker 0

但我这里要讨论的是具有自主能动性的人工智能。

But I'm talking here about artificial intelligence with agency.

Speaker 0

这种AI内置了需求目标,能够为追求自身目标而持续独立做出决策。

AI that has built in wants and objectives and the ability to independently make decision after decision in pursuit of its own goals.

Speaker 0

这类AI被称为智能体。

This kind of AI are known as agents.

Speaker 0

虽然智能体尚未像生成式AI那样引爆公众认知,但请相信,它们将产生重大影响。

And while agents haven't yet exploded into the public consciousness quite as much as generative AI has, trust us, they are going to be a very big deal.

Speaker 0

而目前,电子游戏世界正是训练智能体的绝佳试验场。

And in the meantime, a very good training ground for agents is the world of video games.

Speaker 0

想想看:

Think about it.

Speaker 0

那些精心设计、边界明确的游戏环境,正是智能体可以自由探索、自主制定规则并学习处理自主权的理想场所。

Perfectly packaged, neatly constrained environments where the agents can run wild, work out the rules for themselves, and learn how to handle autonomy.

Speaker 0

如果您从开始就关注我们,就会知道DeepMind在AI游戏领域有着引以为傲的传统。

If you've been following us since the beginning, then you will know that AI and games is something that DeepMind has a proud heritage in.

Speaker 0

今天为我们做讲解的是弗雷德里克·贝斯,他是谷歌DeepMind的高级研究工程师。

And so to explain today, we are joined by Frederic Besse, who is a senior staff research engineer at Google DeepMind.

Speaker 0

弗雷德里克拥有计算机视觉博士学位,早在2015年DeepMind团队仅有约100名研究人员时,他就是最早加入的成员之一。

Frederic has a PhD in computer vision and was one of the earliest people to join the DeepMind team back in 2015, when there were only about 100 researchers here.

Speaker 0

在他投身人工智能前沿领域之前,还曾在视觉效果行业工作过一段时间,研究如何将CGI技术整合到实拍素材中。

And before the decade he spent at the cutting edge of artificial intelligence, he also spent some time working in the visual effects industry on how to integrate CGI into real footage.

Speaker 0

弗雷德里克,非常感谢你参加我们的播客节目。

Frederic, thank you very much for joining us on the podcast.

Speaker 1

谢谢你,弗雷德里克。

Thank you, Frederic.

Speaker 0

看到这样的履历,我明白DeepMind为什么要聘用你了。

I I can see why DeepMind hired you with a CV like that.

Speaker 1

哦,谢谢夸奖。

Oh, I mean, thanks.

Speaker 0

好的。

Okay.

Speaker 0

我想我们最好先从一些定义开始。

Well, so I think it's probably best if we start off with some definitions.

Speaker 1

当然。

Sure.

Speaker 0

我之前简单提到过智能体的概念,但你是如何定义它们的?

So, I mean, I've talked briefly about what agents are there, but but how do you define them?

Speaker 1

在我看来,智能体是一个非常广义的概念,我将其定义为能在环境中行动的实体。

So in my mind, an agent is a very general concept, and I define it as an entity that can act in an environment.

Speaker 1

我认为行动力是智能体的核心特征。

I think action is what defines an agent.

Speaker 1

而环境则是能为智能体提供观察数据,并与之互动的场所。

An environment, on the other hand, is a place which can provide observations for the agent and which the agent can interact with.

Speaker 1

因此关键要理解的是,智能体通过作用于环境会改变环境,这对于开发能在环境中(如游戏、现实世界、互联网等)完成有用任务并理解其行为后果的AI系统至关重要。

And so what's important to understand is that the agent, by acting into the environment, will change the environment, which is crucial for developing AI systems that can do useful things and understand the consequences of their actions in environments, in games, in the world, on the Internet, etcetera.
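上面描述的"智能体观察环境、采取行动、行动又改变环境"的交互闭环,可以用一段极简的示意代码勾勒。这是一个假设性的玩具示例(状态、动作、奖励都是虚构的),并非任何实际系统的实现:

```python
# 玩具环境:状态是一个整数,智能体的动作会改变它
class Environment:
    def __init__(self):
        self.state = 0

    def observe(self):
        # 环境为智能体提供观察
        return self.state

    def step(self, action):
        # 智能体作用于环境,环境随之改变
        self.state += action
        reward = 1 if self.state == 3 else 0
        return self.state, reward


class Agent:
    def act(self, observation):
        # 极简策略:未到目标状态 3 就继续前进
        return 1 if observation < 3 else 0


env = Environment()
agent = Agent()
total_reward = 0
for _ in range(5):
    obs = env.observe()              # 观察
    action = agent.act(obs)          # 决策
    obs, reward = env.step(action)   # 行动,环境被改变
    total_reward += reward
```

无论是游戏、现实世界还是互联网,变的只是观察空间和动作空间,这个循环的结构是一样的。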

Speaker 0

不过这听起来像是超越了单纯AI范畴的概念。

This sounds like something exists beyond just AI though.

Speaker 0

比如飞机的自动驾驶系统,这能算作智能体吗?

Mean, like autopilot on planes, would that count as an agent?

Speaker 1

是的。

Yes.

Speaker 1

完全正确。

Absolutely.

Speaker 1

或许我们可以举几个智能体的例子,因为这是个非常宽泛的概念。

So maybe we can give a few examples of agents because it's a very general concept.

Speaker 1

人类就是智能体,因为我们能采取行动。

So humans are agents because we can act.

Speaker 1

我们用眼睛和耳朵观察世界。

We observe the world with our eyes and our ears.

Speaker 1

我们具有本体感觉。

We have proprioception.

Speaker 1

我们能感知自己肢体的位置。

We know where our limbs are.

Speaker 1

所以我们都是智能体。

So we are agents.

Speaker 1

我们能影响世界。

We can affect the world.

Speaker 1

机器人同样如此。

Similarly, robots as well.

Speaker 1

但若回溯到几十年前,像自动驾驶这类智能体,其逻辑都是我们手工编码的。

But if you go back to maybe decades ago where we have we had agents such as autopilot, we handcrafted the logic of those agents.

Speaker 1

我们当时编程让它们执行我们指定的任务。

We were programming them to do what we wanted them to do.

Speaker 1

因此它们与我们稍后要讨论的现代智能体不同,但这些确实是智能体的实例。

So they were different from the modern agents that we can talk about later on, but these were indeed examples of agents.

Speaker 0

那么好吧,如果那算是智能体,它和自主性或者说拥有代理权是同一回事吗?

So, alright, if that's an agent then, is that the same thing as autonomy or having agency?

Speaker 0

这些概念是否都存在微妙的差异?

Are these all slightly subtly different?

Speaker 1

我认为它们确实存在微妙差别。

So I think they are subtly different.

Speaker 1

我认为智能体能够执行其被编程设定的行为。

I think an agent can act and do things that that it has been programmed to do.

Speaker 1

这就是代理权——能够采取行动的能力。

And that's agency, being able to take actions.

Speaker 1

而自主性则更偏向于独立完成任务的能力或属性。

Now autonomy is more the skill or the property of acting by itself to accomplish a certain task.

Speaker 1

自主性存在不同等级。

And there are various levels of autonomy.

Speaker 1

比如自动驾驶汽车能自主做出最佳路线决策,我们就可以说它具备某种程度的自主性。

So if we have a self driving car which will take decision to best navigate the roads, you could say that there is a certain level of autonomy.

Speaker 1

有些智能体自主性很低,需要大量人工输入。

Some agents are much less autonomous and will require a lot more of your input.

Speaker 1

而光谱的另一端,可能存在能主动探索世界、学习新事物的智能体,这类几乎不需要我们过多指导。

And then on the other end of the spectrum, could have an agent that goes in the world and learn new things, explore, which we we don't really have to instruct much.

Speaker 1

所以说在智能体的自主性维度上存在一个光谱分布。

So they are a spectrum in the space of autonomy in agents.

Speaker 0

这是否部分类似于你所设定目标类型的某种层级关系?

Is part of that, like, almost a hierarchy of the type of objectives that you're setting?

Speaker 0

比如,我不确定,像拿起杯子又放下这样的自主动作,和决定午餐吃什么还是有些区别的。

Like, I don't know, the the autonomous action of, picking up a cup and putting it down is slightly different to deciding what's for lunch.

Speaker 1

是的。

Yes.

Speaker 1

我们正在训练智能体完成简单任务,因为这是起点,但拥有一个能为你烹饪完整午餐的智能体

We are training agents to do simple things because this is a starting point, but having an agent that can cook your whole lunch for you

Speaker 0

还能决定你想吃什么就更好了。

decide what you want as well would be nice.

Speaker 1

是的,确实如此。

Yes, as well.

Speaker 1

它可以是个推荐系统。

It could be a recommender system.

Speaker 1

我们首先需要许多子步骤和子任务来妥善解决这些任务。

We'll require a lot of sub steps and sub tasks to solve those tasks properly as well first.

Speaker 1

所以我们把这类任务称为高层次任务;与之相对的是短视界任务,后者非常基础,几乎是机械动作,比如抓取物体、把物品放到桌上,而"做午餐"则属于高层次任务。

So we call those high-level tasks, as opposed to short-horizon tasks, which are very basic, almost mechanical actions, like grab an object or put the object on the table, compared to cooking lunch.

Speaker 0

对。

Yeah.

Speaker 0

就像你在工厂里看到的机器人,与科幻片里的那种对比。

Like the robots you see in factories, for example, versus the type of ones that we have in science fiction.

Speaker 1

是的。

Yes.

Speaker 0

但这是目标吗?

But is that the sort of aim?

Speaker 0

我们讨论的是未来会进入家庭的机器人吗?

Are we talking about robots eventually that will be in our homes?

Speaker 1

我想是的。

I think so.

Speaker 1

我们正在努力开发通用智能体,目标是实现AGI(人工通用智能),届时我们将拥有真正通用的人工智能系统,其通用性将媲美人类。

So what we are trying to do is develop general agents, and we want to reach a point which we call AGI, artificial general intelligence, which is a point where we will have agents and AI systems that will be truly general, as general as humans.

Speaker 1

这些系统将能像人类一样适应新情境、进行推理,甚至在某些任务上超越人类能力。

They'll be able to adapt to new situations, reason like we do, and even surpass our capabilities at certain tasks.

Speaker 1

一旦AI系统达到这种能力水平,将释放人类潜能的无限可能。

Once we reach this level of competency in AI systems, this will unlock a lot of potential for what we can do as humans.

Speaker 1

这些AI系统能加速科学进步,或完成许多人类不擅长的任务。

Those AI systems will be able to accelerate our scientific progress, for example, or do a lot of tasks which maybe we are not very good at.

Speaker 1

比如驾驶汽车、家务协助或在线购物等场景。

For example, maybe driving or help you in your home, help you do shopping online.

Speaker 1

我认为有些应用领域我们甚至无法预想,因为这种技术飞跃需要我们逐步探索发现。

I think there are some applications which we can't even begin to conceive because this is such a technological leap that I think we will just discover it as we go.

Speaker 1

但我们坚信,构建能在环境中行动的通用智能体是实现AGI的关键。

But we do believe that building such general agents which can act into an environment is key to achieve AGI.

Speaker 0

你描述的方式很有趣,这意味着部分智能体本质上将是虚拟存在的。

That's interesting then the way that you're describing it that some of these agents are going to be virtual essentially.

Speaker 0

是的。

Yes.

Speaker 0

而另一些则会实际嵌入物理环境中。

And then some of them will be actually physically embedded in an environment.

Speaker 1

没错。

Yes.

Speaker 1

我认为这都归结于观察空间与行动空间的界定。

So I think it all goes back to what's your observation space and what's your action space.

Speaker 1

我们可以开发物理智能体,如机器人这类具身智能体,它们能在现实世界中执行任务,这将非常实用。

So we can have physical agents, embodied physical agents like robots, which will be able to do things in the real world, which will be very useful.

Speaker 1

我们已...

We already, as you

Speaker 0

我的机器人管家。

My robot butler.

Speaker 0

谢谢。

Thank you.

Speaker 1

没错。

Exactly.

Speaker 1

但还包括那些不在现实世界运作,而是在抽象虚拟空间工作的虚拟代理。

But also virtual agents which work not in the real world, but in an abstract virtual space.

Speaker 1

这可能仍是一个物理空间,例如我们在研究中使用的游戏,用来展示和理解这些代理能做什么。

This could be still a physical space, for example, games, which we use in our research to also demonstrate and understand what those agents can do.

Speaker 1

但你也可以拥有一个没有任何物理甚至虚拟行动空间的代理,它却能编写程序,比如生成Python代码、为你构建软件——这虽非物理空间,但它仍是代理,因为它能采取行动、编译函数、上网查询。

But also you could have an agent that doesn't have any physical, even virtual action space, but instead it can make programs, for example, write Python code, build software for you, which is not a physical space, but it's still an agent because it can take actions, it can compile the function, it can look on the internet.

Speaker 1

因此这仍然非常有价值。

And so this is still very valuable.

Speaker 0

这与我们从大语言模型看到的聊天机器人有何不同?

How is this different to the chatbots that we've seen from large language models?

Speaker 0

我是说,它们也能做很多类似的事情,对吧?

Because I mean, they can do a lot of that stuff too, right?

Speaker 0

它们能上网搜索,能理解你的指令并做出某种决策?

They can search the internet, they can understand your instructions and make decisions sort of?

Speaker 1

是的。

Yes.

Speaker 1

通常你从聊天机器人得到的是聊天输出。

So what you get out of chatbots generally is a chat output.

Speaker 1

它们生成自然语言。

They produce natural language.

Speaker 1

现在有些聊天机器人具备能替你上网搜索的功能。

Now there are some capabilities in some chatbots which could go in and search the Internet for you.

Speaker 1

但这个空间仍然相当受限,仅限于程序员添加到聊天机器人中的自然语言或特定功能。

But this space is still fairly constrained to just being either natural language or specific functions that the programmer adds to the chatbot to do things.

Speaker 1

现在,这些聊天机器人与你对话时,某种程度上,你就是聊天机器人的环境。

Now, those chatbots, when they speak with you, I mean, in a way, you are the environment of a chatbot.

Speaker 1

对吧?

Right?

Speaker 1

聊天机器人的话语会影响你,也会影响你接下来对它的提问。

What the chatbot says will affect you and will affect what you ask the chatbot next.

Speaker 1

但可能不像聊天机器人或智能体在某个环境中行动那样会产生重大后果——当智能体采取某些行动时,环境会发生剧烈变化。

But maybe it's not as there is not as much consequence as if the chatbot or the agent was acting in an environment which drastically changes when the agent takes some action.

Speaker 1

所以我们讨论的是被动型智能体。

So it's more of a passive type of agent.

Speaker 1

嗯。

Mhmm.

Speaker 1

我们可以称其为智能体,但它与现实世界中行动的智能体截然不同。

We could call it an agent, but it's very different from an agent that act in the real world.

Speaker 1

这些模型的训练方式也是如此——它们通过大量从互联网抓取的数据进行训练,而不是在一个环境中通过尝试、失败、理解原因并再次尝试来训练。

And the way those models are trained: they are trained with a lot of data scraped from the Internet, but they are not trained in an environment to try things, fail, understand why they failed, and try again.

Speaker 1

这是这些模型训练方式上的一个差异——它们是用人类数据训练的。

This is one difference in the way those models are trained—on human data.

Speaker 0

那么我们应该将它们视为独立的,还是可以结合两者?

So then should we be thinking of these as separate or can you combine the two?

Speaker 0

能否将大型语言模型与有目标的智能体结合起来?

Can you combine large language models and an agent that has an objective?

Speaker 1

可以的,实际上谷歌DeepMind有一些机器人论文,比如RT-2和RT-X,就在做这件事。

So you can, and actually there are some robotics papers from Google DeepMind, called RT-2 and RT-X, which are doing that.

Speaker 1

语言模型从其训练所用的互联网规模数据中获取了大量知识。

So a language model has a lot of knowledge from the internet-scale data that it's been trained on.

Speaker 1

所以如果你愿意的话,它具备很多常识,并且理解一系列概念。

So if you if you want, it has a lot of common sense and it understands a bunch of concepts.

Speaker 1

但它不具备激活机器人关节的能力。

But it doesn't have the ability to activate joints of a robot.

Speaker 1

但你可以拿这个模型继续训练,使其为你的机械臂生成关节激活指令,等等。

But you can take this model and train it to produce joint activations for your robotic arm, for example.

Speaker 1

这正是那篇论文中他们所实现的。

That's what they've done in that paper.

Speaker 0

我想,语言模型几乎为算法提供了某种——打个引号——对事物的"概念性理解"。

I guess almost the language model gives the algorithm some sort of, in inverted commas, conceptual understanding of things.

Speaker 1

是的。

Yes.

Speaker 1

我认为这些大型模型的伟大之处在于它们拥有丰富的知识,能理解概念、词汇以及如何与之对话。

I think that's what is great about those big models: they have such knowledge and understand concepts and words and what we chat to them about.

Speaker 1

它们能制定计划,但这一切都停留在语言层面,对吧?

They can make plans, but all of this is in language space, right?

Speaker 1

而我们需要弥合语言与现实世界或虚拟世界行动之间的鸿沟。

And we need to bridge the gap towards acting in the real world or in the virtual world.

Speaker 0

去实际执行行动吗?

To go off and actually in acting?

Speaker 1

对。

Yes.

Speaker 0

不过游戏有什么特别之处呢?

What is it about games though?

Speaker 0

为什么从游戏开始会很有用?

Why is it useful to start with games?

Speaker 1

我认为游戏能提供非常丰富的体验,而且游戏种类极为多样。

So I think games offer a very rich experience, and games are very varied.

Speaker 1

以电脑游戏为例,如果你看看有多少游戏存在,会发现有成千上万款游戏。

So if you look at how many games there are on on computers, for example, you have tens of thousands of games.

Speaker 1

每款游戏都提供不同的场景、环境和可进行的活动。

And each game offers you different scenarios, different environments, different things you could be doing in the game.

Speaker 1

因此我们认为这是训练AI系统达到人类水平的绝佳试验场。

So we think it's a great proving ground to train AI systems to match human performance.

Speaker 1

游戏还有其他优势。

There are other advantages in games.

Speaker 1

它们是虚拟的,所以易于扩展。

So they are virtual, so they are easy to scale.

Speaker 1

与受现实世界限制的物理系统相比,你可以并行实例化许多不同的游戏。

You could instantiate a lot of different games in parallel, compared to physical systems, which are constrained to the real world.

Speaker 1

它们也很安全。

They are also safe.

Speaker 1

在游戏中出错并不重要。

So if things go wrong in games, it doesn't really matter.

Speaker 1

毕竟只是个游戏。

It's just a game.

Speaker 1

游戏是人类与智能体互动的场所,比如在多人在线游戏中,而且游戏本身对人类来说也很有趣。

Games are a place where humans can interact with agents, for example, in multiplayer games, and games are, yeah, fun as well for humans to play.

Speaker 1

我认为这是我们能收集到的绝佳经验数据来源。

And I think it's a great source of experience data that we can collect.

Speaker 0

我想你们已经在非玩家角色中部署了智能体吧?

I guess you've already got agents already in sort of non player characters, right?

Speaker 0

那些NPC在场景中游荡。

NPCs floating around the place.

Speaker 1

是的,没错。

Yes, yes.

Speaker 1

所以我认为这是个非常有趣的观点。

So I think that's a very interesting point.

Speaker 1

从历史上看,实际上游戏开发者在游戏中创建智能体时,与我们制作的智能体类型截然不同。

Historically, and actually it's still the case when game developers create agents inside of games, it's a very different beast to the type of agents that we make.

Speaker 1

那些智能体可以访问游戏的内部代码,游戏开发者只需编写逻辑,效果很好,它们运行速度快,能完成相当惊人的任务。

Those agents have access to the internal code of the game and the game developers just, you know, write the logic, which works great and they are fast and they can do pretty impressive things.

Speaker 0

比如站在那里,如果有人靠近就攻击他们。

Things like stand around and then if someone approaches you attack them.

Speaker 0

我是说,这

I mean, it's

Speaker 1

算是吧。

Sort of, yeah.

Speaker 1

我是说,你可以为游戏中的NPC智能体编写相当复杂的逻辑,但用代码描述行为的复杂度存在上限。

I mean, there is some pretty, yes, some pretty complicated logic you could put in the agents in the game, in the NPCs, but there is like a limit in how complicated you can describe behavior in terms of code.

Speaker 1

这正是深度学习和现代智能体的用武之地,因为这些智能体内部的逻辑并非人工编写。

And that's where deep learning and modern agents come in because those are not the logic inside those agents is not handcrafted by humans.

Speaker 1

相反,我们让智能体接触大量经验数据,它会自行学习行为模式。

Instead, we subject the agent to a lot of experience data and the agent will learn the behavior for itself.

Speaker 1

而且是以更柔和、更细腻的方式,它能掌握非常复杂的行为,这些行为可能难以用编程语言编写出来。

And in a softer way, with more nuances; it can pick up on very complex behavior, which maybe wouldn't be feasible to write down in code, in a programming language.

Speaker 0

与其说只是遵循规则,不如说它是在自行摸索如何达成目标。

Rather than just following rules, it's sort of working out how to achieve its objective itself.

Speaker 1

没错。

Yeah.

Speaker 0

DeepMind在游戏领域有着相当长的研究历史。

Deep Mind's got quite a long history of playing around with games.

Speaker 0

考虑到他们通过这种方式取得的成就,这么说可能有点低估了。

Maybe that's a bit trivializes it slightly considering the things that they've achieved by doing so.

Speaker 1

是的。

Yes.

Speaker 1

2015年,神经网络智能体首次取得突破,出现了名为DQN(深度Q网络)的技术,该智能体被训练来玩Atari游戏。

So back in 2015, there was the first breakthrough in neural network agents called the DQN for Deep Q Network, in which the agent was trained to play Atari.

Speaker 1

这里的突破在于,该智能体仅使用像素作为观察空间。

And the breakthrough here is that the agent was only using pixels as the observation space.

Speaker 0

所以Atari游戏是指像《太空侵略者》、《打砖块》这类游戏吗?就是那种有挡板和弹球的八十年代风格游戏?

So Atari is like Space Invaders Breakout, the one with the the the bricks where you've got a paddle and a ball bouncing around, like the sort of games you had in the nineteen eighties?

Speaker 1

是的。

Yes.

Speaker 1

就是那些街机游戏。

So those arcade games.

Speaker 1

对。

Yeah.

Speaker 1

没错。

Exactly.

Speaker 0

它唯一的输入就是屏幕上的像素?

And the only thing it would take as an input were the pixels on the screen?

Speaker 1

屏幕像素,还有奖励机制概念——也就是你获得多少分数,这对智能体学习至关重要。因为这些智能体是通过强化学习训练的,该方法旨在最大化得分、最大化奖励。

So the pixels on the screen, and there was also the concept of reward, which is like how many points you get. That is very important for those agents to learn, because those agents are trained using a method called reinforcement learning, which aims at maximizing this score, maximizing the reward.
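这里说的"最大化奖励"可以用一个表格版 Q 学习的玩具示例来说明。需要强调这只是假设性示意:DQN 的突破在于用深度神经网络直接从屏幕像素学习价值函数,替代了下面这个手写的小 Q 表;环境、参数与数值均为虚构:

```python
import random

# 玩具环境:4 个状态排成一行,动作 1 向右、动作 0 向左,
# 走到最右端(状态 3)得 1 分
alpha, gamma, epsilon = 0.5, 0.9, 0.1
n_states, n_actions = 4, 2
Q = [[0.0] * n_actions for _ in range(n_states)]  # Q 表:每个(状态, 动作)的价值估计

def step(state, action):
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == n_states - 1 else 0.0
    return next_state, reward

random.seed(0)
for _ in range(500):                    # 训练若干回合
    s = random.randrange(n_states)      # 从随机状态开始,保证充分探索
    for _ in range(10):
        # epsilon-贪婪:偶尔随机探索,通常选当前估值最高的动作
        if random.random() < epsilon:
            a = random.randrange(n_actions)
        else:
            a = Q[s].index(max(Q[s]))
        s2, r = step(s, a)
        # Q 学习更新:向 "即时奖励 + 折扣后的未来最大价值" 靠拢
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
```

训练结束后,每个状态下"向右"的估值都应高于"向左"——智能体只靠奖励信号就学会了朝得分方向移动。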

Speaker 1

因此将深度学习与深度神经网络和强化学习相结合,诞生了DQN智能体,这是该领域的首创。

And so combining deep learning with deep neural network and reinforcement learning yielded the DQN agent, which was a first of its kind.

Speaker 1

这对DeepMind来说是一个非常重要的里程碑。

And that was a very big milestone for DeepMind.

Speaker 1

随后另一个重要智能体是AlphaGo,它是学习围棋的智能体,围棋是一种棋盘游戏。

Then another important agent that was built was AlphaGo, which was an agent that tries to learn to play the game of Go, which is a board game.

Speaker 0

众所周知非常难。

Notoriously difficult.

Speaker 1

众所周知非常困难。

Notoriously difficult.

Speaker 1

我认为当时人们不确定AI是否能在围棋上战胜人类,但后来这个观点被证明是错误的。

I think it was uncertain whether AI could ever beat humans at the game of Go, but then that was proven wrong.

Speaker 0

是指AlphaGo击败李世石的时候吗?

When AlphaGo beat Lee Sedol?

Speaker 1

是的,正是如此。

Yes, exactly.

Speaker 1

所以那是个重大突破。

So that was a huge result.

Speaker 1

沿着历史继续往下说,这并非详尽清单,因为在所有这些重要里程碑之间还发生了大量研究。

Then if you continue along the history—this is a non-exhaustive list, because there is a lot of research that has also been happening in between all these important milestones.

Speaker 1

后来出现了AlphaZero,它将下围棋的智能体提升到新高度,还能下国际象棋和将棋。

There was AlphaZero, which took the agents that play Go—and could also play chess and shogi—to the next level.

Speaker 1

这个智能体完全没有使用人类数据进行训练。

And this agent was trained without any human data.

Speaker 1

AlphaGo是结合人类数据、自我对弈和强化学习训练的,而AlphaZero则完全没有使用人类数据。

So while AlphaGo was trained using a mix of human data and self play and reinforcement learning, AlphaZero was trained without any human data.

Speaker 1

它完全从零开始通过自我对弈,就发现了最佳棋步。

And just starting from scratch with self play, it discovered the best chess moves.

Speaker 1

这个现象非常有趣。

And that was very interesting to see.

Speaker 0

令人惊叹的是,去除人类输入反而让它更强。

Extraordinary that taking the human input out made it better.

Speaker 1

嗯,是的。

Well, yes.

Speaker 1

这确实很有意思,因为你实际上可以观察到国际象棋的最优走法,非常迷人。

So, yeah, it's interesting because you can actually look at what the most optimal moves for chess are, which is fascinating.

Speaker 1

嗯。

Mhmm.

Speaker 1

我发现。

I find.

Speaker 0

但后来它又转向了电脑游戏领域,对吧?

But then it moved into computer games again, didn't it?

Speaker 1

是的。

Yes.

Speaker 1

所以我认为下一篇非常重要的论文是关于Alpha Star的,这是一个玩《星际争霸2》的智能体,这是一款竞技游戏,与围棋或国际象棋的环境截然不同,对吧?

So I think the next very important paper was the AlphaStar paper, which is an agent that plays StarCraft II, a competitive game—a very different environment from, say, Go or chess, right?

Speaker 1

这里的挑战之一是,在《星际争霸2》中,你只能从环境中获取部分信息。

So one of the challenges here is that in StarCraft II, you only see partial information from your environment.

Speaker 1

有一种叫做战争迷雾的机制,让你无法看到敌人在做什么。

There is this thing called fog of war where you cannot see what your enemy is doing.

Speaker 1

因此你必须在制定策略时充分考虑这一点。

And so you really have to take this into account during your kind of strategy.

Speaker 1

而在围棋和国际象棋中,你可以看到整个棋盘。

While in Go, in chess, you see the whole board in front of you.

Speaker 1

所以你能掌握游戏的完整状态。

So you have access to the whole state of the game.

Speaker 1

此外,智能体必须实时操作,因为你知道在《星际争霸》中,当你与人类对战时,必须尽可能快地行动,没有时间花几分钟来思考。

Also, the agent has to play in real time, because in StarCraft, you know, once you play against a human, you really have to play as fast as you can—you cannot take minutes to think.

Speaker 1

而这个《星际争霸2》的智能体能够击败非常非常优秀的人类玩家。

And the StarCraft II agent was able to beat very, very good human players at the game.

Speaker 0

所以现在这又向前迈进了一步。

So now this is one step on even further.

Speaker 0

就像在每一步中,你都去掉了让事情变简单的因素。

So it's it's like at every step, you sort of you remove something that made it easy.

Speaker 0

在Atari游戏中,你的每个动作都能得分。

So Atari, you get points for every action.

Speaker 0

国际象棋则不然。

Chess, you don't.

Speaker 0

国际象棋中,你能看到整个棋盘。

In chess, you get to see the entire board.

Speaker 0

《星际争霸》则不行。

In Starcraft, you don't.

Speaker 0

《星际争霸》是竞技性的。

In Starcraft, it's competitive.

Speaker 0

而你现在做的事不具备这种特性。

In what you're doing now, it isn't.

Speaker 1

是的。

Yes.

Speaker 1

那些智能体的目标是最大化得分并赢得游戏。

So those agents were trying to maximize the score and to win the game.

Speaker 1

而我们在SIMA中的工作截然不同——我们训练智能体遵循指令。

While what we do in SIMA is very different, we train agents to follow instructions.

Speaker 0

这个项目叫什么名字?

What's this project called?

Speaker 1

我们的项目名为SIMA。

Our project is called SIMA.

Speaker 1

全称是「可扩展可指导多世界智能体」。

That stands for Scalable Instructable MultiWorld Agent.

Speaker 1

我们使用各种电子游戏和研究环境来训练智能体执行我们下达的指令。

And we used various different video games and also research environments to train an agent to follow instructions that we ask it to do.

Speaker 1

这些指令通常很简单,五到十秒内就能完成,比如「捡起苹果」「转身」「爬上梯子」等等。

Those instructions are fairly simple that you can typically carry out in five to ten seconds, such as pick up the apple or turn around, climb up the ladder, etcetera.

Speaker 1

因此没有计分;我们实际上把这些游戏当作容器、当作世界,在其中可以执行大量指令。

And so there is no score, and we use the games really as containers, as worlds in which we can carry out a whole bunch of instructions.

Speaker 1

由于我们使用自然语言来制定这些指令,可设计的指令数量非常庞大。

And the number of instructions that you can come up with is very large, due to the fact that we use natural language to formulate those instructions.

Speaker 1

这就带来了其他挑战。

So this poses other challenges.

Speaker 1

我们真正希望使我们的智能体具备通用性。

We really want to make our agents to be general.

Speaker 1

虽然围棋智能体只学习如何击败围棋游戏,但我们感兴趣的是制造一个能真正理解你说的话、理解上下文、理解环境然后执行这个指令的智能体。

So while a Go agent only learns to beat the game of Go, we are interested in making an agent that can really understand what you're saying, understand the context, understand the environment and then carry out this instruction.

Speaker 1

因此我们追求的是通用性。

And so we are aiming for generality.

Speaker 1

我认为通用性是我们在追求AGI时希望智能体具备的关键特性。

And I think generality is a key property that we want in our agents to pursue AGI.

Speaker 1

这就是我们使用沙盒游戏的原因。

And that's why we use those sandbox games.

Speaker 0

你说的沙盒游戏是什么意思?

What do you mean by sandbox games?

Speaker 1

沙盒游戏是指玩家基本上可以在游戏提供的空间内做任何他们想做的事情。

So a sandbox game is a game where the player can basically do whatever they want within the space of what the game offers.

Speaker 1

如果这是一款有自然景观的游戏,你可以制作、建造、烹饪、探索。

So if it's a game which has a natural landscape, you can craft, you can build, you can cook, you can explore.

Speaker 0

比如《我的世界》。

Like Minecraft.

Speaker 1

没错。

Exactly.

Speaker 1

就像《我的世界》。

Like Minecraft.

Speaker 1

这就是我们所说的沙盒游戏,因为游戏本身并没有明确的目标。

And that's what we call a sandbox game because there is not really like a goal to the game.

Speaker 1

有些沙盒游戏是带有目标的。

Some sandbox games do come with a goal.

Speaker 1

所以它有点混合性质。

So it's a bit of a mix.

Speaker 1

不过,没错,你只需要享受乐趣并做各种事情。

But, yeah, you just have fun and do a bunch of things.

Speaker 0

就像在真正的沙盒里一样。

Like in an actual sandbox.

Speaker 1

正是如此。

Exactly.

Speaker 1

是的。

Yes.

Speaker 0

关于你们训练所用的游戏,有《无人深空》、《拆迁》、《英灵神殿》、《模拟山羊3》等等。

And in terms of the games that you trained on, so there's No Man's Sky, Teardown, Valheim, Goat Simulator three, as well as a bunch of others.

Speaker 0

我们会确保把这些游戏的清单放在节目说明里。

We'll we'll make sure we put a list of those in in the show notes.

Speaker 0

你自己是游戏玩家吗?

Are you a gamer yourself?

Speaker 1

是的。

Yes.

Speaker 1

非常热衷。

Very much so.

Speaker 1

我六岁生日时收到了一个Game Boy游戏机。

I I was given a Game Boy for my sixth birthday.

Speaker 0

六岁生日。

Sixth birthday.

Speaker 1

太棒了。

Amazing.

Speaker 1

真走运。

Lucky thing.

Speaker 1

是的,那太神奇了。

Yeah, it was amazing.

Speaker 1

我记得我玩的第一款游戏是俄罗斯方块,当时能在车里玩游戏让我感到非常惊奇。

I think my first game was Tetris, and, yeah, I was amazed by being able to play those games in the car.

Speaker 1

从此公路旅行再也不一样了。

Car trips were never the same.

Speaker 0

确实如此。

It's true.

Speaker 0

那成年后你还玩俄罗斯方块以外的游戏吗?

And as an adult, did you go beyond Tetris?

Speaker 1

当然玩过。

Yes, of course.

Speaker 1

我记得家里装了DSL宽带后,基本上就能随时上网了。

I remember when we at home got DSL, which would give you basically constant Internet access.

Speaker 1

我特别开心,因为终于可以不用担心计时器,尽情玩网络游戏了。

I was very happy because I could finally play online games without being worried about the counter—you know, how much time do I spend on the Internet.

Speaker 1

所以,没错,网游、竞技游戏、沙盒游戏,各种游戏。

So, yeah, online games, competitive games, sandbox games.

Speaker 1

所有这些,我都很喜欢

All of these, yeah, I do quite like

Speaker 0

it.

Speaker 0

听起来你基本上拥有了完美的游戏体验

I mean, so you've got basically the perfect experience then


Speaker 0

身处这个空间。

be in this space.

Speaker 1

看起来是这样。

Looks like it.

Speaker 0

你现在正在开发的新项目,这些沙盒游戏,确实是个很大的转变。

The new projects that you're working on now, these sandbox games, I mean, they are quite a big departure.

Speaker 0

在一个没有明确定义成功标准的环境中,让智能体完成任务有多困难?

How hard is it to get an agent to achieve things in an environment where you don't have that really clearly defined metric of success?

Speaker 1

是的。

Yeah.

Speaker 1

所以这与之前构建智能体训练的方式截然不同。

So it is it is a very different style of building an agent training from before.

Speaker 1

我们深受先前工作的启发,特别是交互式智能体项目。

There is some prior work which we were very inspired by, which was the interactive agents project.

Speaker 1

DeepMind博客上有篇文章,他们用了一个玩具屋环境,里面有很多物品,目标是让智能体在屋里执行指令。

There is a blog post on the DeepMind blog where they used a playhouse environment with a bunch of objects, and the goal was to get the agent to carry out instructions in that house.

Speaker 0

这就像个模拟环境。

It's like a simulated environment.

Speaker 1

没错,是个模拟环境。

It's a simulated environment, yes.

Speaker 1

就像个儿童玩具屋,里面有彩色物品,目标是把茶壶放到床上,而且也没有奖励机制。

It's like a children's playhouse with colorful objects, and the goal was to take the teapot and put it on the bed—and there was no reward as well.

Speaker 1

所以我们想把这个推向更高层次。

So we wanted to take this to the next level.

Speaker 1

而且我们还想使用非自研的游戏,那些所有人都能玩的游戏。

And we also wanted to use games that are not games that we make ourselves, that are available for everyone to play.

Speaker 0

所以基本上这就是你们玩的游戏。

So basically it's the games that you guys play.

Speaker 0

是这样吗?

Is that what it is?

Speaker 1

其中一些确实是。

Some of them are indeed.

Speaker 0

我是说,我只想花更多时间玩这个我一直在玩的游戏。

I mean, I just want more time to spend on this game that I've been doing.

Speaker 0

所以我打算开发一个代理来做这件事。

So I'm gonna build an agent to do it.

Speaker 0

当这种情况发生时

Is this interesting when

Speaker 1

有趣吗?

it's happening?

Speaker 1

是的。

Yeah.

Speaker 1

你可以这么做。

You could.

Speaker 1

对。

Yeah.

Speaker 1

于是我们与游戏开发商和工作室建立了合作关系,以便能在研究中使用他们的游戏,我们选择了特定类型的游戏。

And so, yeah, we got into partnership with game developers and game studios to be able to use their games in our research, and we picked games which are of a certain genre.

Speaker 1

我们想要3D游戏,没错。

So three we wanted three d games Yeah.

Speaker 1

因为我们对这种具身化的3D环境很感兴趣。

Because we were interested in this embodied three d setup.

Speaker 1

我们希望能够用语言定义自己的目标。

We wanted to be able to define our own objective with language.

Speaker 1

以《英灵神殿》为例,这是一款维京生存模拟游戏,你可以为智能体设计大量任务。

So if you go into Valheim, for example, which is a Viking survival simulation, you can make up a lot of tasks for the agents.

Speaker 1

你可以下达指令,比如去收集木材、建造房屋、制作夹克等等。

You can say, go get some wood, build a house, craft a jacket, etcetera.

Speaker 0

那你到底要怎么训练一个智能体,当你

Then how on earth do you train an agent when you

Speaker 1

没有奖励信号时,该怎么训练?我们的做法是收集人类玩家的游戏数据。

don't have a reward? So what we did is that we collected data from humans playing the game.

Speaker 1

然后运用模仿学习技术,教导智能体模拟并预测人类的下一步行为。

And then we used a technique called imitation learning, which teaches the agent to just mimic and predict what a human would do next.

Speaker 1

基于所有观察数据和你此前的操作,人类会怎么做?

Given all of these observations, given what you've done before, what would a human do?

Speaker 1

这就是我们训练智能体行为模式的方法。

And that's how we train the behavior of the agent.

Speaker 1

所以这里不存在奖励机制。

So there is no reward.

Speaker 1

我们只是模仿人类行为。

We just imitate humans.

Speaker 0

那么这是概率驱动的吗?

Is it probabilistic then?

Speaker 0

是不是类似这种情况:人类在这种情境下会这样行动,所以我们会以某种概率采取相应行动?

Is it sort of like in this sort of situation, humans acted in this way, so we'll act in that way with this probability?

Speaker 1

没错,正是如此。

Yes, exactly.

Speaker 1

因此智能体的所有行为都是概率性的。

So all the actions of the agent are probabilistic.

Speaker 1

在智能体行动前,需要从概率分布中采样并选择动作。

So before the agent acts, there is a probability distribution from which it needs to sample and pick an action.

Speaker 1

通常这是最可能采取的行动,但有时你可能会做出不太可能的举动,从而导致意想不到的行为。

Often it's the most likely action, but sometimes you can take an unlikely action which can lead to unforeseen behavior.
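The sampling step described here can be sketched in a few lines of Python. This is only an illustration: the action names and their probabilities are invented, not taken from the agent's real keyboard and mouse action space.

```python
import random

def sample_action(action_probs):
    """Sample one action from a policy's probability distribution.

    action_probs: dict mapping action name -> probability (sums to 1).
    Most draws pick the most likely action, but unlikely actions are
    occasionally sampled too, which is what produces the unexpected
    behaviors described in the episode.
    """
    actions = list(action_probs)
    weights = [action_probs[a] for a in actions]
    return random.choices(actions, weights=weights, k=1)[0]

# Hypothetical distribution over actions for a single step as a goat.
probs = {"walk_forward": 0.70, "jump": 0.25, "ragdoll": 0.05}

random.seed(0)
counts = {a: 0 for a in probs}
for _ in range(10_000):
    counts[sample_action(probs)] += 1
# "walk_forward" dominates, but "ragdoll" still shows up roughly 5%
# of the time: the occasional unlikely, and sometimes funny, action.
```

Sampling rather than always taking the single most likely action is what makes the agent's behavior probabilistic in the first place; a greedy policy would never flop to the floor.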

Speaker 1

有时这是种好行为。

Sometimes it's a good behavior.

Speaker 1

有时

Sometimes

Speaker 0

并不是。

it's not.

Speaker 0

继续,给我举个例子。

Go on, give me an example.

Speaker 1

嗯,比如说我们有《模拟山羊》这样的游戏。

Well, so for example, we have games such as Goat Simulator.

Speaker 0

告诉我们《模拟山羊》的游戏目标是什么。

Tell us the objective of Goat Simulator.

Speaker 1

《模拟山羊》的游戏目标基本上就是作为一只山羊在世界上制造混乱。

So the objective of Goat Simulator is to basically cause mayhem as a goat in the world.

Speaker 1

你可以开车,可以跳到物体上,作为一只山羊。

And you can drive cars, you can jump on objects As a goat.

Speaker 1

作为一只山羊。

As a goat.

Speaker 1

对。

Yeah.

Speaker 1

你是一只山羊。

You are a goat.

Speaker 1

你还能拥有翅膀飞翔。

You can have wings and you can fly as well.

Speaker 1

但这是个完美的沙盒游戏,因为你可以做许多不同的事情。

But it's a perfect sandbox game because you can do so many different things.

Speaker 1

在《模拟山羊》中,有一个按键会让山羊变得软趴趴的,像布娃娃一样瘫在地上。

And so in Goat Simulator, there is a key which causes the goats to become all floppy and ragdoll on the floor.

Speaker 1

我不太确定作为玩家什么时候会用到这个功能,可能是用来装死吧。

I'm not too sure when you would actually, as a player, use this. Maybe to pretend you're out.

Speaker 0

装死。

Play dead.

Speaker 1

没错。

Exactly.

Speaker 1

但有时我们测试AI时,它们就会直接瘫成布娃娃,这确实给团队测试时带来了不少欢乐时刻。

But sometimes when we playtest the agents, one just ragdolls on the floor, which, yeah, brings some funny moments in the team when we do the playtest.

Speaker 0

本来期待它冲进人群,结果直接瘫成布娃娃了。

Just expecting it to, like, barge into a crowd of people and it's just ragdolls.

Speaker 1

对。

Yes.

Speaker 1

是的。

Yes.

Speaker 1

这种情况确实会发生。

That that can happen.

Speaker 0

我能看看效果吗?

Can I see what it looks like?

Speaker 1

我这就给你演示。

I'll show you.

Speaker 1

来吧。

So Go on.

Speaker 1

我们先从《模拟山羊3》开始。

Here, let's start by Goat Simulator three.

Speaker 0

我的最爱。

My fave.

Speaker 1

我们这里有一段智能体的视频,指令是前往一个绿色物体。

So we have a video of the agent here, where the instruction is to go to a green object.

Speaker 0

好的。

Okay.

Speaker 0

我们有一只山羊。

We've got a goat.

Speaker 0

我们有一个相当逼真的环境,山羊正小跑着前进。

We've got a quite realistic environment, and we've got the goat trotting forwards.

Speaker 0

周围有很多绿色植物,山羊经过时却直接跳进了池塘。

There's plenty of green, which the goat is walking past, but it's doing a big jump into a pond.

Speaker 0

池塘里有一条蛇形状的充气玩具,山羊成功抵达了那里。

And in the pond, there is an inflatable in the shape of a snake, which the goat has successfully reached.

Speaker 1

它是绿色的。

It's green.

Speaker 0

是的。

It's Yeah.

Speaker 0

绝对是绿色。

Definitely green.

Speaker 1

所以我们不能...

So we can't And

Speaker 0

它是一个物体。

it is an object.

Speaker 1

这是个物体。

It's an object.

Speaker 1

没错。

Yes.

Speaker 0

它是跳过去才到达那里的吗?

And it did a jump there to get there?

Speaker 1

是的。

Yeah.

Speaker 1

所以有时候智能体会捕捉到一些人类特有的行为模式。

So sometimes the agent picks up on some human peculiarities.

Speaker 1

比如说,人类在玩游戏时,常常懒得绕远路。

For example, humans, when they play games, often can't be bothered to take the long way.

Speaker 1

比如爬楼梯这种情况。

For example, climbing up the stairs.

Speaker 1

如果你有喷气背包,肯定会直接飞过楼梯。

If you have a jetpack, you're just gonna jetpack over the stairs.

Speaker 1

我们最早训练的一个智能体就是如此,当要求它登上平台时,它没有绕远走楼梯,而是直接用喷气背包飞过去。

So one of the first agents we asked to get up to a platform, instead of taking the stairs the long way, just jetpacked over the stairs.

Speaker 1

所以它确实在学习人类行为。

So it does learn about human behavior.

Speaker 0

那些奇怪的捷径。

The strange shortcuts.

Speaker 0

我还注意到它不是走进池塘,而是跳进去,这正是人类会做的动作。

I also noticed that rather than just walking into the pond, it it jumped, which is exactly what I would do as a human.

Speaker 1

没错。

Yeah.

Speaker 1

对。

Yeah.

Speaker 0

你是想溅起水花吗?

You wanna make a splash?

Speaker 1

是的。

Yes.

Speaker 0

好的。

Okay.

Speaker 0

好的。

Alright.

Speaker 0

嗯,我是说,它成功了。

Well, I mean, it succeeded.

Speaker 0

它成功了。

It succeeded.

Speaker 1

那是成功了。

That's succeeded.

Speaker 1

那么另一个例子就是这个。

So another one would be this one.

Speaker 1

这是游戏《英灵神殿》

So this is the game Valheim

Speaker 0

嗯。

Mhmm.

Speaker 1

这是一款维京风格的生存沙盒游戏。

Which is a Viking survival sandbox game.

Speaker 1

我们要求智能体执行的任务是采集蘑菇。

And the task here that we ask the agent to carry out is to pick up mushrooms.

Speaker 0

好的。

Okay.

Speaker 0

那么,又是一个制作精美的环境。

So, again, another very beautifully crafted environment.

Speaker 0

镜头在移动,我猜是智能体在操控镜头转动。

The camera is moving around, which I guess is the agent doing that, moving the camera around.

Speaker 0

它很快就找到了一些蘑菇,把它们从地里射了出来。

It has found quite quickly some mushrooms, shot them out of the ground.

Speaker 0

不知道维京人是否真有这种特殊能力。

I don't know whether the Vikings had that that particular ability.

Speaker 0

继续在森林中漫步。

And just carrying on wandering through the forest.

Speaker 1

是啊。

Yeah.

Speaker 0

好的。

Okay.

Speaker 0

所以确实,在那个3D环境中的镜头控制和移动已经掌握得很好了。

So so definitely, like, camera control and movement in that three d environment, it's mastered.

Speaker 1

是的。

Yes.

Speaker 1

我想是因为我们用的所有游戏都是3D游戏。

So I think it's because all the games that we use are three d games.

Speaker 1

这些游戏的镜头控制都非常相似。

The camera control is very similar across all those games.

Speaker 1

我认为在镜头控制方面有很多经验可以借鉴。

I think there is a lot of experience to learn from in terms of controlling the camera.

Speaker 1

你可以看到镜头的移动很人性化,会有犹豫。

You can see that the movement of the camera is human like, so it hesitates.

Speaker 1

它不会很机械地直接瞄准蘑菇就前进。

It's not very robotic, like, just aim on the mushroom and go forward.

Speaker 1

这是因为它是基于人类数据训练的。

And that's because it's trained from human data.

Speaker 1

如果你用强化学习训练一个智能体执行相同任务,你可能会发现

If you train an agent with reinforcement learning to do the same task, you could detect

Speaker 0

与模仿学习不同。

As opposed to imitation learning.

Speaker 1

与模仿学习不同,智能体的镜头运动会非常直线且机械,直奔蘑菇而去,因为它需要优化奖励。

As opposed to imitation learning, the agent would have very linear and robotic camera motion to just go to the mushroom because it needs to optimize its rewards.

Speaker 1

它不会花任何时间犹豫。

It will not spend any time hesitating.

Speaker 0

但在这种情况下,当你指导智能体时,你不需要解释蘑菇是什么或它在游戏中的样子。

But in this, when you're instructing the agent, you're not explaining what a mushroom is or what a mushroom might look like in the game.

Speaker 1

是的。

Yes.

Speaker 1

所以它从训练数据中获取这些信息。

So it gets that from the training data.

Speaker 1

在这份训练数据中,有许多人类与蘑菇互动、采摘蘑菇的序列,因此它只是从数据中学习。

So in this training data, there is a bunch of sequences of humans interacting with mushrooms, picking up mushrooms, and so it just learns from the data.

Speaker 1

我们可以给智能体布置一个有趣的任务:如果它从未见过蘑菇,或从未见过红蘑菇,但可能见过蓝蘑菇,然后你要求它'捡起红蘑菇',智能体能否泛化理解颜色和物体类型?

So an interesting task we could ask the agent here is, you know, if you hold out, if the agent has never seen mushrooms or maybe never seen red mushrooms, but maybe it has seen blue mushrooms, and you ask the agent, pick up the red mushrooms, is the agent able to generalize, you know, color and object type?

Speaker 0

没错。

Yeah.

Speaker 1

这类问题正是我们感兴趣的。

So those type of questions are the questions we are interested in.

Speaker 0

这无疑展示了它对所处环境的概念性理解。

Which is definitely demonstrating a sort of conceptual understanding of the environment that it's in.

Speaker 1

是的,我们寻求概念间的通用性和迁移能力,尝试将新概念组合起来完成任务。

Yes, we're looking for generality and transfer between concepts, trying to put new concepts together to carry out a task.

Speaker 1

这正表明它已经理解什么是蘑菇以及什么是红色。

And, yeah, this would show an understanding of what a mushroom is and what the color red is.

Speaker 0

某种程度上,你描述的模仿学习部分,听起来像是在做计算机游戏版的预测性文本。

In a way, the way that you describe the imitation part of it, the imitation learning, it sounds a little bit like you're doing a sort of computer game version of predictive text almost.

Speaker 0

比如:这个环境里之前发生过什么?

Like, what has been done in this environment before?

Speaker 0

你如何复现它?

How do you repeat it?

Speaker 1

是的。

Yes.

Speaker 1

正是如此。

It is exactly that.

Speaker 1

是吗?

Is it?

Speaker 1

所以,是的。

So, yes.

Speaker 1

当你训练语言模型时,至少初始模型的训练目标是预测下一个词元是什么。

So when you train language models, the objective on which you train, at least for the initial model, is to predict the next token.

Speaker 1

那么我的意思是,如果假设所有文本都是人类生成的,人类接下来会说什么?

So, I mean, assuming that it's all human generated text, what would a human say next?

Speaker 1

人类接下来会说哪个词?

What word would a human say next?

Speaker 1

这与键盘和鼠标操作中人类下一步会做什么非常相似。

Which is very similar to this, which is what would the human do next in terms of the keyboard and mouse action.

Speaker 1

所以你是对的。

So you're right.

Speaker 1

这几乎是相同的技术。

It is pretty much the same technique.
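The analogy can be made concrete with a toy sketch: a counting model that, given the current context, predicts the action a human most often took next, exactly the shape of next-token prediction. The contexts and actions below are invented placeholders, not the project's real observations or inputs.

```python
from collections import Counter, defaultdict

# Invented (context, action) pairs standing in for recorded human play.
human_trajectories = [
    [("see_mushroom", "walk_forward"), ("near_mushroom", "pick_up")],
    [("see_mushroom", "walk_forward"), ("near_mushroom", "pick_up")],
    [("see_mushroom", "jump"), ("near_mushroom", "pick_up")],
]

# "Training": count how often each action follows each context,
# the same idea as counting which token follows which prefix.
model = defaultdict(Counter)
for trajectory in human_trajectories:
    for context, action in trajectory:
        model[context][action] += 1

def predict(context):
    """Return the action humans most often took in this context."""
    return model[context].most_common(1)[0][0]

print(predict("see_mushroom"))   # walk_forward
print(predict("near_mushroom"))  # pick_up
```

A real agent replaces the lookup table with a neural network and the discrete contexts with pixels and language instructions, but the training objective, predict what a human would do next, is the same.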

Speaker 0

相同的理念。

The same ideas.

Speaker 0

但关键区别在于你在指导它。

But the key difference here is that you're instructing it.

Speaker 0

你在告诉它这是它的目标,并以人类会采取的方式去实现。

You're saying this is the objective for you and go and do that in a way that a human would do.

Speaker 1

是的。

Yes.

Speaker 1

既然给出了这个指令,人类接下来会怎么做?

So it's given this instruction, what would a human do next?

Speaker 1

但语言模型在某种程度上也会被先前的文本所提示。

But the language models are also, in a way, prompted with some prior text.

Speaker 1

所以它们会说,既然这段文本在你的提示中,那么接下来最可能应该输出的词是什么?

So they will say, given that this text is in your prompt, what is the next most probable word that you should output?

Speaker 1

所以本质上没有太大区别。

So it's not very different.

Speaker 1

也许智能体本身的基础设施和架构略有不同,因为我们需要产生一种非常不同的模态——键盘和鼠标空间中的操作。

Maybe the infrastructure and architecture of the agent itself is a bit different, because here we need to produce a very different modality, which is actions in keyboard and mouse space.

Speaker 0

听起来它像是在探索人们过去采取的行动网络。

It sounds like it's kind of exploring the network of actions that people have taken in the past.

Speaker 0

是的。

Yeah.

Speaker 0

那它能做出原创性行为吗?

Does it do original things then?

Speaker 1

它本质上不会做出人类未曾做过的事,但这个智能体能够泛化到未见过的环境。

So it will not do anything that humans haven't done per se, but the agent is able to generalize to unseen environments.

Speaker 1

所以在某种程度上这算是创新。

So in a way that's kind of novel.

Speaker 1

就像,我不知道,你是个网球运动员却突然被要求去打羽毛球比赛,对吧?

So it's like, I don't know, you're a tennis player and suddenly you're asked to do a badminton match, right, as a human.

Speaker 1

因为你学过网球,你掌握了一些反射动作和技巧。

Because you've learned tennis, you have some reflexes and and mechanics that you know.

Speaker 1

你知道怎么握拍,知道怎么击球。

You know how to hold a racket, you know how to strike a ball.

Speaker 1

所以你在羽毛球比赛中也会表现得相当不错。

And so you're you're gonna do fairly well at badminton.

Speaker 1

虽非最佳,但也不会只是新手水平。

Not the best, but you're not gonna be at beginner level either.

Speaker 1

我们的智能体之间,我们看到了相同的模式。

And in our agents, we see the same.

Speaker 1

我们的做法是:在所有环境中训练智能体,只留一个环境不训练,然后在该保留环境中测试智能体。

So what we've done is that we train the agent on all the environments except one, and then we test the agent on that one environment.

Speaker 1

可以看到智能体在那个环境中能以合理水平完成一些任务。

And we can see that the agent is able to do a few things at a reasonable level in that environment.
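The evaluation described here, train on every environment except one and then test on the held-out one, is a leave-one-out loop. A minimal sketch, with placeholder environment names and stub train/evaluate functions standing in for the real training pipeline:

```python
# Placeholder environment names for illustration.
ENVIRONMENTS = ["goat_simulator_3", "valheim", "no_mans_sky"]

def leave_one_out(environments, train, evaluate):
    """Train on all environments but one; score on the held-out one."""
    scores = {}
    for held_out in environments:
        training_set = [env for env in environments if env != held_out]
        agent = train(training_set)
        scores[held_out] = evaluate(agent, held_out)
    return scores

# Stub train/evaluate functions that only show the control flow:
# each agent records what it was trained on, and "evaluation" here
# just checks that the held-out environment was truly unseen.
scores = leave_one_out(
    ENVIRONMENTS,
    train=lambda envs: {"trained_on": set(envs)},
    evaluate=lambda agent, env: env not in agent["trained_on"],
)
assert all(scores.values())  # every test environment was unseen in training
```

Comparing these held-out scores against an expert trained only on that one game is what gives the generalization result discussed later in the episode.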

Speaker 1

某种程度上,我不确定这能否称为创新,但它确实在泛化能力和适应未知环境方面展现出了潜力。

So in a way, I don't know if you can say those are novel, but it definitely shows some promise with respect to generalization and generalizing to unseen environments.

Speaker 0

那么,好吧,告诉我它的整体表现如何?

So, okay, tell me how well does it perform overall then?

Speaker 1

好的,也许我可以分享两个有趣的结果。

Yes, so maybe I can tell you about two interesting results.

Speaker 0

好的,请讲。

Yes, please.

Speaker 1

首先我们观察到,在所有环境中训练的智能体,其表现优于仅在单个独立环境中训练的智能体。

So first of all, we observed that when we train an agent on all of the environments, it performs better than an agent that is trained on only a single individual environment.

Speaker 0

所以你们在《无人深空》《模拟山羊3》《英灵神殿》这些游戏里训练了一个智能体

So you train an agent on No Man's Sky, Goat Simulator three, Valheim

Speaker 1

是的。

Yes.

Speaker 0

然后它在《模拟山羊3》中的表现就比仅在该游戏中训练的智能体更好?

And then it does better in, for example, Goat Simulator three than an agent that is trained only in Goat Simulator three.

Speaker 1

正是如此。

Exactly.

Speaker 1

没错。

Yes.

Speaker 1

啊。

Ah.

Speaker 1

这对我们来说是个非常好的结果,因为它证明了通过多种游戏训练获得的经验能使你在每个单独游戏中的技能都更强。

So that was a very good result for us, because it demonstrates that the experience that you get from training on a variety of games makes your skills in each individual game stronger.

Speaker 1

所以我喜欢这样想:以人类为例,或许我得回到网球这个例子,如果我同时训练网球和羽毛球,可能在这两项上都会比只训练网球表现得更好。

So what I like to think of is, let's say, as a human, maybe I have to go back to the tennis example, maybe I train on tennis and I train on badminton and maybe I'll be better at both of those compared to if I had only trained on, say, tennis.

Speaker 1

因为接触更广泛的技能和经验后,我可能在需要执行的动作类型上获得了额外的技巧和微妙理解。

Because maybe I gained some extra skills and subtlety in the type of actions I need to do from being exposed to a wider range of skills and experience.

Speaker 0

是否也部分因为他们接触到了更多数据?

Is it also partly they're just exposed to more data?

Speaker 0

比如更多人类操控三维环境的方式,更多人类可能做出的不同行为?

So more of how humans manipulate three d environments, more of, you know, different kind of things that humans might do?

Speaker 1

是的。

Yes.

Speaker 1

不同的游戏会提供不同的情境。

So different games offer different situations.

Speaker 1

因此智能体可以从这些情境中提取知识,并可能将其与其他游戏中的情境联系起来。

So the agent can extract knowledge from those situations and maybe relate them to other situations from other games.

Speaker 1

这样它就能不断积累并巩固行为模式,最终形成比未接触其他经验时更优的表现——某种程度上,我们启动项目时的假设与语言模型相同。

And so it just accumulates and consolidates the behavior into something that is better than not having seen this other experience, which in a way our hypothesis when we started the project was the same hypothesis as the language models.

Speaker 1

大量数据训练会催生一些有趣特性,比如泛化能力和迁移能力。

Training a lot of data will make some interesting properties emerge such as being able to generalize and transfer.

Speaker 1

没错,这也是我们在此观察到的现象。

And so, yeah, that's what we also observed here.

Speaker 0

但我认为这个结果并非显而易见,因为完全有可能它从《无人深空》学到的东西反而会干扰它回到《模拟山羊3》时的表现。

But then I think also that that result isn't immediately obvious because it could well have been the case that actually all the things that it learns from No Man's Sky, like, confuse it when it goes back to Goat Simulator three.

Speaker 1

是的。

Yes.

Speaker 1

所以我们当时很担心这一点,因为我们觉得游戏之间可能会出现破坏性干扰。

So we were worried about that because we thought maybe there will be a destructive interference between games.

Speaker 1

确实如此。

That's that's true.

Speaker 1

我们选择这些游戏都是3D的,因为我们认为游戏之间可能会有经验迁移。

We picked those games to all be three d because we thought there was a chance that some of the experience would transfer between games.

Speaker 1

如果我们训练的是俄罗斯方块和无人深空,就不会期待有任何正向干扰了。

If we had trained on Tetris and No Man's Sky, we wouldn't expect any positive interference here.

Speaker 0

环境有点不一样,对吧?

Slightly different environment, isn't it?

Speaker 1

是的。

Yes.

Speaker 1

但由于这些游戏都有一些相似的概念,相同的导航方式和镜头移动方式,我们认为多游戏训练可能会产生一些有趣且积极的效果。

But because all of those games have some similar concepts, same way of navigating around, moving the camera, we thought maybe there would be some interesting and positive outcome to training on multiple games.

Speaker 0

你之前告诉我有两个重大发现

You told me that there were two there were two big results

Speaker 1

是的。

Yes.

Speaker 1

我看到了。

I saw.

Speaker 1

另一个结果(抱歉,我之前提到过)是:当你在除一个环境外的所有环境中训练智能体时,它在保留环境中的表现与仅在该环境中训练的智能体大致相当。

So the other result, sorry, which I mentioned before, was that when you train an agent on all environments but one, it performs roughly as well on this held out environment as the agent that has only been trained on that environment.

Speaker 0

好的。

Okay.

Speaker 0

等等。

Wait.

Speaker 0

等一下。

Wait.

Speaker 0

没错。

So right.

Speaker 0

你有八个游戏。

You've got you've got eight games.

Speaker 0

对。

Yep.

Speaker 0

你训练一个智能体玩这些游戏,但排除《模拟山羊3》。

You train an agent on them and leave off Goat Simulator three.

Speaker 1

是的。

Yes.

Speaker 0

它之前从未接触过《模拟山羊3》。

It's never seen Goat Simulator three before.

Speaker 0

然后你把它和另一个只玩过《模拟山羊3》的智能体做对比。

And then you sort of compare it against an agent that's seen nothing else but Goat Simulator three.

Speaker 1

完全正确。

Exactly.

Speaker 0

接着你在《模拟山羊3》里测试它们。

And then you test them in Goat Simulator three.

Speaker 1

是的。

Yes.

Speaker 1

这里的SIMA智能体表现几乎和我们所说的"环境专家"一样好,因为它是该游戏的专家,可以说是山羊专家。

And the SIMA agent here performs nearly as well as what we call the environment expert, because it's an expert in that environment. It's a goat expert.

Speaker 1

所以平均来看它表现稍逊,但整体表现还是相当不错的。

So yes, it doesn't perform as well on average, but it does perform reasonably well.

Speaker 1

这说明它已经学会泛化到未见过的场景,比如其他游戏都有不同的角色形象。

So it has learned to generalize to unseen, you know, all the other games have different avatars, for example.

Speaker 1

这个智能体还没有

The agent has not

Speaker 0

其他地方都没有山羊。

There's no goats anywhere else.

Speaker 0

太空里也没有山羊。

There's no goats in space.

Speaker 0

那里

There

Speaker 1

太空里没有山羊。

is no goats in space.

Speaker 1

但智能体能移动山羊并执行一些任务。

But the agent is able to move the goat around and perform a few tasks.

Speaker 1

当然,《模拟山羊》中有一些动作是特定且独有的,我们不指望

Of course, there are some actions in Goat Simulator which are specific and unique to Goat Simulator and that we don't expect

Speaker 0

把宇航员当布娃娃甩。

Ragdolling astronauts.

Speaker 1

不。

No.

Speaker 1

我们也不指望智能体能做到这点,因为它从未见过或体验过这些。

And we don't expect the agent to be able to get that, because it has never seen it or experienced it.

Speaker 1

但其他方面比如导航,或许对物体和颜色的概念,这些是它能掌握的。

But other things like navigation, maybe concepts about objects and colors, that is something that it can achieve.

Speaker 0

好的,那么如何将这种能力扩展到只持续十秒、十五秒的任务之外,比如跳跃、捡起蘑菇?

Okay then, so how do you expand that beyond these sort of tasks that last for ten or fifteen seconds, jumping in, picking up a mushroom?

Speaker 1

我们目前正在研究实现方法。

We are working currently on trying to find ways of doing that.

Speaker 1

首先还有很大提升空间,因为即使在这些短期任务上,我们的智能体也尚未达到人类水平。

First of all, there is a lot of room to improve, because our agent is not at human level even for those short horizon tasks.

Speaker 1

对于更长期的任务,我们正在探索可行的解决方案。

For longer horizon tasks, we are looking into methods which would allow us to do that.

Speaker 1

你之前提到的大型语言模型,或许能够在更长的时间跨度上进行推理。

You mentioned large language models before, which are able to maybe reason on a higher time horizon, longer time horizon.

Speaker 1

所以将这些模型与我们的智能体结合可能是个方法。

So maybe combining those models with our agents could be a way.

Speaker 1

但这属于未来的工作。

But that's future work.

Speaker 0

我能和其中一个搭档玩游戏来提升体验吗?

Can I play alongside one of these things just to make my experience of the game better?

Speaker 0

说白了就是我能作弊吗?

Can I cheat basically?

Speaker 1

确实有些智能体会找到利用游戏漏洞的方法,做出开发者未预期的行为。

So some agents do find, you know, ways of exploiting the games to do things that were not originally intended.

Speaker 1

某种程度上你可以称之为作弊。

So in a way you could call that cheating, I guess.

Speaker 1

不过当我们拥有全能AI系统后,和它一起玩游戏可能正是你想做的。

But, you know, once we have very competent AI systems which could do all sorts of things, playing games with you could also be one of the things you want to do with those systems.

Speaker 1

那会相当有趣。

That would be quite fun.

Speaker 1

特别是在沙盒游戏里,比如你可以对智能体说'现在我想在模拟山羊里玩捉迷藏'。

Think especially in these sandbox games, if you had an agent which you could say, okay, now I want to play hide and seek in Goat Simulator.

Speaker 1

假设这是多人模式,你有两只山羊,想怎么玩都行。

And let's say it's multiplayer, you have two goats and you can play whatever you want.

Speaker 1

我觉得能按自己心意与智能体互动会非常有趣。

I think it would be quite fun to interact with an agent in exactly the way you want.

Speaker 1

我们现在使用大语言模型的方式,是通过指令让它们做各种事:写诗、角色扮演、编写代码等。

So the way we currently sometimes use large language models is we prompt them to do all sorts of things, write a poem, do some role play, write code.

Speaker 1

如果我们能同样指挥游戏里具现化的智能体,对玩家来说会充满乐趣。

If we had the same ability, but also with an agent that has an avatar in game, which you could prompt, I think that could be lots of fun for gamers.

Speaker 0

是的。

Yeah.

Speaker 0

就像NPC的升级、升级、再升级,不是吗

It's like level, level, level up on an NPC, isn't

Speaker 1

it?

Speaker 1

对。

Yes.

Speaker 0

好的。

Okay.

Speaker 0

那么这里的主要目标是什么?

So what is the big objective here then?

Speaker 1

我认为SIMA团队感兴趣的是构建真正通用的智能体。

So I think what we are interested in in the SIMA team is to build truly general agents.

Speaker 1

我们以游戏为平台推动这一领域的创新突破,从而更接近AGI,理解如何构建真正通用的智能体。

So we use games as a platform to drive innovation and breakthrough into this particular space so that we can get a step closer to AGI, understand how to build those truly general agents.

Speaker 1

所以我们真正致力于AGI的研究。

So we are really in for the research towards AGI.

Speaker 0

DeepMind所追求的AGI定义是什么?

What's the definition of AGI that DeepMind work to?

Speaker 1

我认为AGI的定义是像人类一样通用且能干的智能体。

I think definition of AGI is an agent that is as general and capable as a human.

Speaker 1

我认为通用性才是关键。

And I think the general is is key here.

Speaker 1

可以比人类更强,但至少要达到人类水平。

Could be better than a human, but at least at human level.

Speaker 0

所以你可以把它放到不同环境中,它能表现得和人类一样好?

So you can pick it up and drop it into different environments and it can do as well as a human can?

Speaker 1

是的。

Yes.

Speaker 1

所以我认为,如果你想想游戏,作为人类,当有新游戏发布时,我会安装它。

So in in I think if you think about gaming, right, as a human, maybe there is a new game coming out, I install it.

Speaker 1

很快我就能玩这个游戏了。

And very quickly, I will be able to play this game.

Speaker 1

也许我需要先过一遍教程,因为游戏有些特定规则需要学习。

Maybe I need to go through the tutorial, because there are some specificities to the game that I need to learn.

Speaker 1

但总的来说,我认为我们非常擅长适应这些新情境、新环境,并能胜任。

But in general, I think we are very good at adapting to those new situations, to those new environments, and be competent at it.

Speaker 1

我们同样希望我们的智能体也能做到这一点。

And we really want the same for our agents.

Speaker 1

我们希望将智能体投入新环境或展示新游戏时,能观察到它们确实也能适应并在新情境中变得熟练。

We want to drop the agents in a new environment or show it a new game and observe that they can indeed also adapt and become competent at those new situations.

Speaker 0

某种程度上你们已经做到了,不是吗?

Well, you sort of done that already, haven't you?

Speaker 1

还有很长的路要走。

There is still a long way to go.

Speaker 1

我们的智能体目前仅被训练执行简单动作,实际上还有更多技能和场景可以用来训练它们。

So our agents, you know, have been trained to only perform simple actions. Really, there are so many more skills and scenarios which we could use to train our agents.

Speaker 1

如何打造能应对上千种不同游戏的智能体?

How do we make an agent that could handle thousand different games?

Speaker 1

这需要什么条件?

What does it take?

Speaker 1

还需要一些关键突破。

There are some breakthroughs that need to happen.

Speaker 0

那么假设你们实现了这个目标——我相信你们一定能做到。

So then let's say that you manage that, which I'm I'm sure you will.

Speaker 0

那接下来会发生什么?

What what happens then?

Speaker 0

这个理念是否意味着,你可以将所学应用到我们生活环境中的那些通用型智能体上?

Is the idea that that you can take that learning and then apply it to really generalist agents that exist in in the environment that we all live in?

Speaker 1

是的,确实存在方法。

Yes, there are methods.

Speaker 1

我们是如何构建这个智能体的?

How did we build the agent?

Speaker 1

我认为理解我们采用何种方法取得这一成果至关重要,这样我们才能将相同方法应用到其他你可能感兴趣的领域。

I think understanding what method we used to achieve this result is very important, so that we can then apply the same method to other domains which we might be interested in.

Speaker 1

理解构建这类智能体的技术细节,包括如何收集数据、哪些数据有用、采用什么算法,以及如何训练智能体——

Understanding the technical details of how to build those agents: how to collect the data, which data is useful, the algorithms, how we train the agent.

Speaker 1

我认为这些同样非常重要。

I think that's very important as well.

Speaker 1

因此,任何能帮助我们理解如何推进AGI发展的创新和进步,都是我们所追求的。

So, yeah, any kind of innovation and progress we can make towards understanding how to make progress in AGI is what we are after.

Speaker 0

但这些经验也能应用到其他领域吗?

But these are lessons that can apply elsewhere?

Speaker 1

它们可以适用于其他领域。

They could apply to other domains.

Speaker 0

好的。

Okay.

Speaker 0

那么给我举个例子吧。

So give me an example then.

Speaker 0

我应该期待什么样的智能体?

What kind of agents should I be hoping for?

Speaker 1

你是指我们在讨论遥远的未来吗?

I mean, so are we talking about the far future?

Speaker 0

好的,继续。

Yeah, go on.

Speaker 1

好的。

Okay.

Speaker 1

所以我认为智能体的例子可以是安全可靠的自动驾驶汽车。

So I think I could say example of agents would be self driving cars that are safe and reliable.

Speaker 1

我认为这些智能体会采取具有后果的行动。

I think those are agents which take actions that have consequences.

Speaker 1

因此,拥有能够适应未知情况的强大可靠的自动驾驶汽车将非常重要。

And so having very robust and reliable self driving car which can generalize to unseen circumstances would be very important.

Speaker 1

我认为这里的普适性非常重要,因为当你开车时,永远不知道会发生什么。

And I think the generality here is very important, because when you drive, you never know what's going to happen.

Speaker 1

如果现实世界中有智能体,我们显然希望它们能为我们执行指令。

If we had agents in the real world, one obvious things we would like would be for agents to carry out instructions for us.

Speaker 1

例如,你家中的机器人可以听你说'把这个重物搬到桌子上'。

For example, it would be a robot in your house where you say, maybe carry this heavy object to the table.

Speaker 1

你看,这是一个用语言定义的任务,会非常有用。

You know, that's a task which is defined in language, which would be very useful.

Speaker 1

也许能够编程和编写软件的虚拟智能体会对我们非常有用。

Maybe virtual agents, which can code and write software would be very useful for us.

Speaker 1

也许智能体可以帮助你完成日常的在线任务,比如购物。

Maybe agents that can do, that can help you in everyday tasks which you do online, for example, shopping.

Speaker 1

我的意思是,我们花了很多时间研究该买什么。

I mean, we spend a lot of time researching what to buy.

Speaker 1

如果我想买一副新耳机,可能会花几个小时来了解哪款适合我。

If I want a new pair of headphones, I might spend hours to understand which one is good for me.

Speaker 1

但如果能有个智能体替我完成这项调研,那会非常有用。

But if I could have an agent that can do this research for me, that would be very useful, I think.

Speaker 0

我正想着,实际上我要回到你关于AGI的话题上,如果可以的话,就以此作为结束。

I'm just thinking about, I'm just going back to your AGI stuff actually, if I may, just to finish on.

Speaker 0

你认为AGI的火花会在你的实验室里迸发吗?

Do you think that the sparks of AGI will happen in your lab?

Speaker 1

嗯,这很难说,但我们确实希望如此,我们开发的方法将有助于推动AGI的进展。

Well, that's very difficult to say, but we do hope, yes, that the methods we develop will help to make progress towards AGI.

Speaker 1

我认为如果我们能开发出一个可以玩任何游戏、执行任何游戏指令的智能体

I think if we had an agent that could play any games and carry out any instruction in any game

Speaker 0

并且展现出超人类的能力。

And perform a superhuman ability.

Speaker 1

是的,我认为这将是一个重要的里程碑,因为这至少能在游戏中展现出人类水平的智能。

Yes, I think that would be a good milestone to achieve because that would show human level intelligence, at least in games.

Speaker 1

但游戏非常丰富,有时甚至有点类似现实世界。

But games are very rich and sometimes are a bit like the real world.

Speaker 1

你需要理解周围环境,需要理解物体,需要理解行为的后果,特别是当你把语言因素加进来时。

You need to understand your surroundings, you need to understand objects, you need to understand the consequence of your actions, especially if you add language into the mix.

Speaker 1

因为如果把语言因素加入任务中,你实际上可以让智能体做任何事情。

Because if you add language into the mix for tasks, then really you could ask agent to do anything.

Speaker 1

所以如果这样的智能体能在任何游戏中执行任何指令,我想我们就离AGI更近了一步。

So if such agent is able to carry out any instruction in any game, I think we would be, you know, one step closer to AGI.

Speaker 0

弗雷德,非常感谢你参加我们的节目。

Fred, thank you very much for joining me.

Speaker 1

非常感谢你的邀请。

Well, thank you so much for having me.

Speaker 0

当然。

Of course.

Speaker 0

在做这个播客的过程中,有好几次我们都处在某个项目的早期阶段。

There have been a few moments now while making this podcast where we have been here at the early stages of a project.

Speaker 0

那是在成果变得宏大或登上头条之前,当时还只有一丝微弱的火花可看。

Before the results were grand or headline grabbing, when there was only the tiniest little spark of something to see.

Speaker 0

还记得我们第一次参观机器人实验室,看它们原地转圈的样子吗?

Like, remember when we first visited the robot lab to watch them flailing around in circles?

Speaker 0

或是我们初次见到AlphaFold时,它才刚刚开始解开蛋白质的奥秘?

Or the first time that we saw AlphaFold when it was only just starting to unravel the mysteries of proteins?

Speaker 0

嗯,我觉得我们现在在SIMA上就处于这个阶段。

Well, I think it feels a lot like this is the stage we're at with SIMA.

Speaker 0

当然,捡起一个蘑菇并不是我们期待中的AI未来图景,但这些智能体前进的方向,才真正蕴含着未来可能性的秘密承诺。

Because, sure, picking up a mushroom isn't quite the vision of the AI future that we've been waiting for, but it's the direction that those agents are moving in that really holds the secret promise to what might be possible.

Speaker 0

如果我们想要能自主决策、实现目标、独立于人类指令运作的AI,那么这就是必经之路。

Because if we want AI that can make its own decisions, achieve its own objectives, operate independently of our instructions, then this is a vital step along the way.

Speaker 0

所以今天,它可能只是个蘑菇。

So today, it might be a mushroom.

Speaker 0

问题是,下次我们再来时,它会变成什么?

Question is, what will it be next time we come back?

Speaker 0

您正在收听的是由我——汉娜·弗莱教授主持的《谷歌DeepMind》播客。

You've been listening to Google DeepMind the podcast with me, Professor Hannah Fry.

Speaker 0

如果您喜欢本期内容,请在YouTube订阅或通过您喜爱的播客平台关注我们。

If you like what you just heard, hit subscribe on YouTube or follow us on your favorite podcast platform.

Speaker 0

我们即将推出一系列精彩对话,包括对谷歌DeepMind科研负责人普什米特·科利的专访。

We've got an exciting lineup of conversations to come, including an interview with Pushmeet Kohli, who oversees science research at Google DeepMind.

Speaker 0

如果您有特别想在本播客听到的嘉宾,欢迎留言告诉我们。

If you have someone in mind that you'd love to hear from on this podcast, do leave us a comment.

Speaker 0

下次见。

See you next time.
