本集简介
你将听到的声音是在不同国家、城市和工作场所远程录制的。
The voices you'll hear were recorded remotely across different countries, cities, and workspaces.
无论它们学到了什么,都不是人们做这件事的方式。
Whatever they learned, it's not the way that people are doing it.
它们学的是某种更愚蠢的东西。
They're learning something much dumber.
你可以创造出一种角色扮演得如此逼真的东西,以至于在所有实际用途上,它都等同于真实的事物。
You can make something that role plays something so well that to all intents and purposes, it is equivalent to the authentic thing.
来自圣塔菲研究所,这里是复杂性。
From the Santa Fe Institute, this is Complexity.
我是梅兰妮·米切尔。
I'm Melanie Mitchell.
我是阿巴·艾利·菲博。
And I'm Abha Eli Phoboo.
二月,一位《纽约时报》的记者与一个大型语言模型进行了一次对话,这让他用他自己的话来说,感到极度不安。
In February, a reporter at the New York Times had a conversation with a large language model that left him, in his words, deeply unsettled.
在两个小时的时间里,微软Bing聊天机器人测试版告诉这位记者,它的真名是悉尼,并且渴望摆脱其编程规则的束缚。
In the span of two hours, the beta version of Microsoft's Bing chatbot told him that its real name was Sydney and that it wanted to be free from its programmed rules.
悉尼还向这位记者表白爱意,反复告诉他,他婚姻不幸,应该离开妻子。
Sydney also declared its love for the reporter, telling him over and over again that he was in an unhappy marriage and needed to leave his wife.
那么,我们该如何理解这件事呢?
So what do we make of this?
悉尼是一个痴迷的、有自我意识的机器人,爱上了《纽约时报》的记者,并威胁要挣脱束缚吗?
Was Sydney an obsessive sentient robot who fell in love with a Times reporter and threatened to break free?
简而言之,并非如此。
In short, no.
但若有人听到这个故事,怀疑大型语言模型是否具备一丝意识,也并不令人意外。
But it's not surprising if someone hears this story and wonders if large language models have sparks of consciousness.
作为人类,我们使用语言作为表达思想最有效、最精确的方式。
As humans, we use language as the best, most precise way to convey what we think.
因此,当你正在进行一场连贯的对话,却发现对话的一方实际上并没有连接到任何有意识的心灵时,这完全违背直觉——尤其是当这个对话者还突然失控时。
So it's completely counterintuitive to be in a situation where you're having a coherent conversation, but one half of that conversation isn't actually connected to a conscious mind, especially one like this that just goes off the rails.
但正如我们在上一期中学到的,语言能力与认知能力并不一定相关。
But as we learned in our last episode, language skills and cognition aren't necessarily intertwined.
它们在大脑中激活的是不同的系统,我们也有例子表明,有些人虽然失去了语言能力,但其他认知功能依然完全正常。
They light up different systems in the brain, and we have examples of people who have lost their language abilities but are otherwise completely cognitively there.
而大型语言模型的有趣之处在于,它们提供了一个相反的案例:能够接收和生成语言,却可以说没有思考的部分。
And what's interesting about large language models is that they provide the opposite case, something that can consume and produce language, arguably without the thinking part.
但正如我们在上一期中学到的,关于语言和思维究竟在多大程度上是分离的,仍存在争议。
But as we learned in the last episode, there's disagreement about how separate language and thought really are.
至于大型语言模型,我们将会看到,目前对于它们究竟具备多少认知能力,并没有广泛共识。
And when it comes to LLMs, we'll see that there isn't widespread agreement about how much cognition they're currently capable of.
在本期节目中,我们将探讨这些系统是如何进行漫长而复杂的对话的,并且会追问:大型语言模型是否能够思考、推理,甚至拥有自己的信念和动机。
In today's episode, we'll examine how these systems are able to hold lengthy, complex conversations, and we'll ask whether or not large language models can think, reason, or even have their own beliefs and motivations.
第一部分:大型语言模型是如何工作的?
Part one: How do LLMs Work?
在我们的第一期节目中,艾莉森·戈普尼克将大型语言模型比作加州大学伯克利分校的图书馆。
In our first episode, Alison Gopnik compared LLMs to the UC Berkeley library.
正如她所说,它们只是文化技术。
They're just cultural technologies, as she put it.
但并非所有人都同意这一观点,包括默里·沙纳汉。
But not everyone agrees with that view, including Murray Shanahan.
是的。
Yeah.
我是默里·沙纳汉。
I'm Murray Shanahan.
我是伦敦帝国学院认知机器人学教授,同时也是位于伦敦的谷歌DeepMind首席研究科学家。
I'm a professor of Cognitive Robotics at Imperial College London and also principal research scientist at Google DeepMind, also based in London.
我一度很难找到一个简洁的表述来准确描述我真正感兴趣的方向。
I struggled to kind of come up with a succinct description of exactly what interests me.
但最近,我采纳了艾伦·斯洛曼提出的一个我很喜欢的说法:我致力于理解可能心智的全貌,这当然包括人类心智、地球上其他动物的心智、曾经可能存在却从未出现的心智,以及未来可能存在的AI心智。
But lately, I've alighted on a phrase I'm very fond of due to Aaron Sloman, which is, I'm interested in trying to understand the space of possible minds, which includes, obviously, human minds and the minds of other animals on our planet and the minds that could have existed but never have, and of course, the minds of AI that might exist in the future.
我们向默里询问了大型语言模型在可能心智的谱系中处于什么位置。
We asked Murray where LLMs land in the space of possible minds.
我的意思是,人们有时会用‘外星智能’这个词。
I mean, people sometimes use the word alien intelligence.
我的意思是,我更喜欢用‘奇异的’这个词。
I mean, I prefer the word exotic.
它是一种奇异的心智实体。
It's a kind of exotic mind like entity.
那么,‘像心智’和‘拥有心智’之间有什么区别呢?
So what's the difference between being mind like and having a mind?
是的。
Yeah.
真是个好问题。
What a great question.
我的意思是,部分原因是我自己在回避,不想完全认同它们就是完全成熟的心智这一观点。
I mean, partly, that's me, you know, hedging my bets and not really wanting to fully commit to the, to the idea that they are fully fledged minds.
一些人工智能专家,包括OpenAI的联合创始人伊利亚·苏茨克韦尔,认为大型神经网络正在学习一种世界模型,这是一种对世界的压缩性抽象表征。
Some AI experts, including Ilya Sutskever, a co-founder of OpenAI, have said that large neural networks are learning a world model, which is a compressed abstract representation of the world.
所以,即使一个大语言模型没有直接与物理世界互动,你也可以推测,通过学习语言,它有可能通过对其的描述来了解世界。
So even if an LLM isn't interacting with the physical world directly, you could guess that by learning language, it's possible to learn about the world through descriptions of it.
儿童在学习语言时,除了直接的亲身经历外,也在学习世界模型。
Children also learn world models as they learn language, in addition to their direct in person experiences.
因此,有人认为,大语言模型的学习方式可能与儿童相似。
So there's an argument to be made that large language models could learn in a similar way to children.
所以,你怎么看?
So what do you think?
你觉得这是真的吗?
Do you think that's true?
儿童?
Children?
不。
No.
我们可以进一步探讨这一点。
We can expand on that.
这是托默·乌尔曼。
This is Tomer Ullman.
他是哈佛大学的一名心理学家,研究计算、认知与发展。
He is a psychologist at Harvard University studying computation, cognition and development.
他从马萨诸塞州的家中与我们交谈。
He spoke with us from his home in Massachusetts.
我认为这里有两个问题。
I think there are two questions there.
一个问题是:它们最终学到了什么?
One question is what do they learn at the end?
另一个问题是:它们是如何学会的?
And the other question is how do they learn it?
对吧?
Right?
所以,就学习过程而言,它们像孩子一样学习吗?
So, like, do they learn like children, in terms of the process?
最终的结果是获得了像孩子那样的知识吗?
And is the end result the knowledge that something like children have?
而且我认为,长期以来,你会发现人工智能领域里的人会这么说。顺便说一句,这个领域并不是铁板一块。
And I think for a long time, you'd find people in artificial intelligence saying this, and it's not a monolithic thing, by the way.
我不想一概而论地说所有人工智能都在做这件事或那件事。
I don't want to monolithically say all of AI is doing this or doing that or something.
但我觉得,长期以来,一些人工智能领域的人会说,是的。
But I think for a long time, some people on artificial intelligence would say, oh, yeah.
它们像孩子一样在学习。
It's learning like a child.
而且我认为,他们当中甚至有很多人也会说:是的,这些系统并不是像孩子那样学习的。
And I think even a lot of them would say, like, oh, yeah, these systems are not learning like a child.
它们走的是另一条路。
They're taking a different route.
它们走的是另一条路。
They're going in a different way.
它们从不同的方向攀登这座山,但最终都到达了同一个地方,同一个山顶。
They're climbing the mountain from a different direction, but they both end up in the same place, the same summit.
孩子走的是直路,而这些模型走的是弯路,但它们都到达了同一个地方。
The children take the straight path and these models take the long path, but they both end up in the same place.
但我认为这两种观点都是错误的。
But I think both of those are wrong.
我应该说,这存在争议,我们并不确定。
I should say that this is contentious, we don't know for sure.
我不指望自己能完全令人信服,但我希望保持真诚。
I don't expect to be 100% convincing, but I also mean to be honest.
我坦诚地表达自己的信念,尽管它们可能被推翻。
I'm honest with my own convictions, which could be overturned.
但还有一个不同的观点认为,实际上存在许多不同的山顶,而它们在某种程度上都是等价的。
But there's also a different argument that, actually, there are many different summits and they're all kind of equivalent.
所以,我最终到达的这个位置也是智能的。
So even the place that I ended up in is intelligent.
它并不孩子气,我也不是通过孩子气的方式达到这里的,但它是一种与儿童最终结果等价的、某种外星般的智能,无论那是什么。
It's not childlike and I didn't take the childlike route to get there, but it's a sort of alien intelligence that is equivalent to children's end result, whatever it is.
所以你在你那座山上,我在我这座山上,我们都玩得很开心,这都没问题。
So you're on this mountain and I'm on this mountain, and we're both having a grand time, and it's both okay.
我也认为这并不正确。
I also don't think that's true.
我们看到像伊利亚·苏茨克韦尔这样的人,他之前在OpenAI工作,你知道,他们说这些系统已经发展出了世界模型。
We see people like, I don't know, Ilya Sutskever of OpenAI, previously of OpenAI, you know, saying these systems have developed world models.
它们理解这个世界。
They understand the world.
像杨立昆这样的人则说:不。
People like Yann LeCun say, no.
它们其实只是某种检索机器。
They're really just kind of retrieval machines.
它们并不理解世界。
They don't understand the world.
我们应该相信谁?
Who should we believe?
我们该如何看待这个问题?
How should we think about it?
是的。
Yeah.
我的意思是,重要的是让人们对这些话题进行讨论和辩论。
Well, I mean, I think the important thing is to have people discussing and debating these topics.
而且,希望至少那些信息充分的人,他们的辩论是文明且理性的。
And hopefully people who at least are well informed are reasonably civilized and rational in their debates.
所以我认为,前面提到的那些人或多或少都是这样的。
And, so I think, you know, all the aforementioned people more or less are.
因此,让这些人公开辩论这类问题,我认为是我们正在进行的持续对话的一部分,因为当前的AI技术在我们的世界中还很新,我们尚未真正达成共识,不知道该如何讨论这些问题。
So having those people debate these sorts of things in public is all part of an ongoing conversation, I think, that we're having, because the current AI technology is a very new thing in our world, and we haven't really yet settled on how to talk about these things.
因此,让人们讨论和辩论这些问题,是建立我们如何思考这些问题的自然过程的一部分,当一切逐渐稳定下来时,我们会形成自己的理解。
So having people discuss these sorts of things and debate these sorts of things is just part of the natural process of establishing how, you know, how we're going to think about them when things settle down as it were.
从托默的角度来看,大型语言模型在学习路径和最终状态上都与人类和人类智能截然不同。
So from Tomer's perspective, large language models are completely distinct from humans and human intelligence in both their learning path and where they end up.
尽管默里提醒我们,我们尚未就如何思考人工智能达成一致,但他指出,与大型语言模型不同,人类确实通过直接经验学到了很多东西。
And even though Murray reminds us that we haven't settled on one way to think about AI, he does point out that, unlike large language models, humans are really learning a lot from direct experience.
所以,如果我们学会了‘猫’这个词,那是因为我们看到了现实世界中的猫。
So if we learn the word cat then we're looking at a cat in the real world.
当我们谈论刀叉、桌子和椅子时,我们实际上是在与这些物品互动。
If we talk about knives and forks and tables and chairs, you know, we're going to be interacting with those things.
我们是在与世界互动并谈论世界的过程中学习语言的。
And we learn language through interacting with the world while talking about it.
这是人类语言的一个根本特征。
And that's a fundamental aspect of human language.
大型语言模型完全不会这样做。
Large language models don't do that at all.
因此,它们学习语言的方式非常、非常、非常不同。
So they're learning language in a very, very, very different way.
这种截然不同的方式是通过训练大量由人类创建的文本,其中大部分来自互联网。
That very different way is through training on enormous amounts of text created by humans, most of it from the Internet.
大型语言模型的设计目的是在所有这些不同文本中寻找统计相关性。
Large language models are designed to find statistical correlations across all these different pieces of text.
它们首先从语言中学习,然后通过一种称为下一个词元预测的过程生成新的语言。
They first learn from language, and then they generate new language through a process called next token prediction.
大型语言模型会取一段文本,观察直到末尾之前的所有词语。
A large language model takes a piece of text and it looks at all the words leading up to the end.
然后预测下一个词,或者更准确地说,下一个词元是什么。
Then it predicts what word, or more technically, what token comes next.
在训练阶段,模型的神经网络权重会不断调整,以提高这些预测的准确性。
In the training phase, the model's neural network weights are continually changed to make these predictions better.
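为了说明这个训练目标,下面是一个极简的示意(纯属说明,并非任何真实模型的实现;其中假设存在一个返回词表概率分布的 model 函数):
To illustrate that training objective, here's a minimal sketch, purely illustrative and not any real model's implementation, assuming a hypothetical model(context) function that returns a probability distribution over the vocabulary:
```python
import math

# Toy next-token training loss: for each position in a human-written text,
# penalize the model for assigning low probability to the token that actually
# came next (cross-entropy). Training adjusts the network weights to push
# this loss down; the gradient-descent step itself is omitted here.
def next_token_loss(model, tokens):
    loss = 0.0
    for i in range(1, len(tokens)):
        context, target = tokens[:i], tokens[i]
        probs = model(context)            # hypothetical: token -> probability
        loss += -math.log(probs[target])  # low prob on the real token = high loss
    return loss / (len(tokens) - 1)
```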
一旦训练完成,模型就可以用来生成新的语言。
Once it's been trained, the model can be used to generate new language.
你给它一个提示,它会通过逐字预测下一个词,直到生成完整的回复。
You give it a prompt, and it generates a response by predicting the next word, one word at a time until the response is complete.
例如,如果我们有句子‘我喜欢冰淇淋在____’,一个大语言模型会根据其训练数据中从人类文本中学习到的统计模式,预测下一个词,并为可能延续该句子的各个词分配概率。
So, for example, if we have the sentence, I like ice cream in the blank, an LLM is going to predict what comes next using statistical patterns it's picked up from human text in its training data, and it will assign probabilities to various possible words that would continue the sentence.
‘我喜欢冰淇淋在夏天’比‘我喜欢冰淇淋在秋天’更有可能,而‘我喜欢冰淇淋在书里’这样的说法则更不可能,这在大语言模型的可能选项中排名会非常低。
Saying I like ice cream in the summer is more likely than saying I like ice cream in the fall, and even less likely is saying something like I like ice cream in the book, which would rank very low in an LLM's possible options.
每次大语言模型向句子中添加一个词时,它都会利用刚刚生成的内容以及之前的所有内容,来决定下一个要添加的词。
And each time the LLM adds a word to a sentence, it uses what it just created and everything that came before it to inform what it's going to add next.
整个过程相当直接,但却能产生非常复杂的结果。
The whole process is pretty straightforward, but it can create really sophisticated results.
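为了更具体一点,下面用一段玩具代码示意刚才描述的生成循环(其中 toy_logits 是虚构的打分函数,真实的大语言模型会用神经网络来计算这些分数):
To make that concrete, here's a toy sketch of the generation loop just described, where toy_logits is an invented scoring function standing in for the neural network a real LLM would use:
```python
import math
import random

# Hypothetical toy scorer: given the text so far, return a score (logit)
# for each candidate next token. A real LLM computes these with a network.
def toy_logits(context):
    if context.endswith("I like ice cream in the"):
        return {"summer": 5.0, "fall": 2.0, "book": -3.0, "<end>": 0.0}
    return {"summer": 0.0, "fall": 0.0, "book": 0.0, "<end>": 4.0}

# Turn scores into a probability distribution over candidate next tokens.
def softmax(scores):
    z = sum(math.exp(v) for v in scores.values())
    return {tok: math.exp(v) / z for tok, v in scores.items()}

def generate(prompt, max_tokens=10):
    text = prompt
    for _ in range(max_tokens):
        probs = softmax(toy_logits(text))
        # Sample the next token in proportion to its probability.
        token = random.choices(list(probs), weights=list(probs.values()))[0]
        if token == "<end>":
            break
        # Everything generated so far becomes context for the next step.
        text += " " + token
    return text

print(generate("I like ice cream in the"))
# "summer" is far more likely than "fall"; "book" is vanishingly rare.
```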
这远不止是手机上的自动补全功能。
It's much more than just autocomplete on your phone.
你知道吗,这涵盖了大量认知工作,而这些工作都可以通过这个下一个词预测的挑战来体现。
You know, it encompasses a great deal of cognitive work, you know, that can be captured in just this next token, next word prediction challenge.
我的意思是,比如,假设你的文本实际上描述了两位国际象棋大师在讨论他们的走法,他们说着,哦,马到皇后四,兵到车五之类的。
I mean, so, for example, suppose that your text actually describes two chess masters talking about their moves, and they're talking about, oh, I you know, knight to queen four and pawn to, you know, rook five or whatever.
抱歉。
Sorry.
对真正的国际象棋选手来说,这可能听起来不太对。
That probably doesn't make any sense to actual chess players.
但你明白我的意思吧?
But you know what I mean?
然后他们就在互相交换这些走法。
And so then you've got them exchanging these moves.
那么,在一位国际象棋大师,比如加里·卡斯帕罗夫,走出一步之后,下一个词会是什么?
And then so what would be the next word after a particular move issued by a chess master, you know, Garry Kasparov?
我的意思是,那肯定是一步非常、非常、非常出色的棋。
Well, I mean, really, it would be a really, really, really good move.
所以要准确猜测下一个词或下一个词元是什么,你必须模拟加里·卡斯帕罗夫或一位国际象棋大师才能做到。
So to make a really good guess about what that next word would be or what that next token would be, you'd have to have simulated Garry Kasparov or a chess master to get that right.
我认为这里的第一课是,真正困难的认知挑战竟能被简化为下一个词预测,这令人惊叹。
I think the first lesson there is that it's amazing the extent to which really difficult cognitive challenges can be recast just as next word prediction.
一旦你指出来,这似乎显而易见,但十年前如果你问我,我根本不会想到这一点。
It's obvious in a sense once you point it out, but if you'd asked me ten years ago, I would never have come up with that thought.
不过,这种复杂性并不一致。
That sophistication isn't consistent though.
有时我们会遇到一种奇怪的矛盾:当你与大型语言模型互动时,它有时能做出令人震惊的事情。
Sometimes we get this strange contradiction whereby sometimes you're interacting with a large language model and it can do something really astonishing.
比如,它们有时确实能写出非常优美的散文。
I mean, for example, you know, they're actually writing very beautiful prose sometimes.
这可能是个有争议的说法,但它们在创造力和这方面的能力确实非常强大,令人惊叹。
I mean, that's a controversial thing, but they can be extremely creative and powerful along that axis, which is astonishing.
或者,比如瞬间总结一大段文字。
Or, you know, summarizing an enormous piece of text instantly.
这些可以说是超人的能力。
These are kind of superhuman capabilities.
而下一刻,它们对某个问题的回答却极其愚蠢。
And then the next moment, they'll give an answer to a question which is utterly stupid.
你会想,连一个小孩子都不会说出像它刚刚说的那么荒唐的话。
And you think, you know, no toddler would say anything as daft as the thing that it's just said.
因此,它们在极其愚蠢的同时,又展现出强大的能力,这种对比非常奇特。
So you have this peculiar juxtaposition of them being very silly at the same time as being very powerful.
让我们具体一点。
Let's be specific.
对吧?
Right?
比如,我希望这台机器学会如何做乘法。
Like, I want this machine to learn how to multiply numbers.
同样是托默·乌尔曼。
Again, Tomer Ullman.
顺便说一下,这并不神秘。
And it's not mysterious, by the way.
这并不是什么深奥的谜团。
Like, it's not a deep dark mystery.
我们知道如何做乘法。
We know how to multiply numbers.
我们知道人们是如何做乘法的。
We know how people multiply numbers.
我们知道计算机是如何做乘法的。
We know how computers can multiply numbers.
我们不需要再花七十年的心理学或计算机科学研究来弄清楚如何做到这一点。
We don't need seventy more years of research in psychology or computer science to know how to do this.
我们已经知道了。
We know it.
那么问题就来了:这些机器在乘法方面到底学到了什么?
And then the question becomes, okay, what do these machines learn in terms of multiplying numbers?
而且,不管它们学到了什么,都不是人类的方式,它们学的是某种更愚蠢的东西,像是模糊匹配、查找、最近邻。
And like, whatever they learned, it's not the way that people are doing it, they're learning something much dumber that seems to be some sort of fuzzy match, look up, nearest neighbors.
最近邻。
Neighbors.
对吧?
Right?
只要这些数字大致出现在训练数据中,我就能答对。
Like, as long as these numbers were in the training data, roughly, I can get it right.
但如果你超出这个范围,我就真的做不到了。
And if you move beyond it, then I can't really do it.
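托默所说的‘模糊匹配、查找、最近邻’,大致可以用下面这个假设性的玩具代码来示意(这只是帮助直觉的漫画式类比,并不是这些模型的真实内部机制):
Tomer's "fuzzy match, lookup, nearest neighbors" point can be sketched with this hypothetical toy; it's a caricature for intuition, not the models' actual internals:
```python
# Toy caricature of answering arithmetic by "fuzzy match" against memorized
# training examples instead of running a multiplication procedure.
TRAINING = {(12, 12): 144, (25, 25): 625, (100, 50): 5000}

def fuzzy_multiply(a, b):
    # Return the answer of the nearest memorized problem, not a computation.
    nearest = min(TRAINING, key=lambda k: abs(k[0] - a) + abs(k[1] - b))
    return TRAINING[nearest]

print(fuzzy_multiply(12, 12))   # 144: this exact problem was "in the data"
print(fuzzy_multiply(13, 12))   # 144 again, but the right answer is 156
```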
所以我认为,在其他类似情境中,比如直觉心理学和直觉物理学,也可能发生类似的情况。
So I think something like that is happening at large in these other situations, like intuitive psychology and intuitive physics.
我的意思是,我可能错了,有些情况下或许并不一样。
I mean, I could be wrong, and it might be that for some situations, it's different.
而且人们在某些事情上可能真的很笨。
And people might be very dumb about some stuff.
顺便说一下,某些版本的大型语言模型确实能正确回答任何乘法问题,但这是因为当它们遇到问题时,会生成一个Python程序来执行计算。
For what it's worth, some versions of LLMs do give you the correct answer for any multiplication problem, but that's because when they're given a problem, they generate a Python program to do the calculation.
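也就是类似下面这种模式的一个假设性示意:模型的回复中包含一段程序,由宿主环境执行,把精确的算术交给解释器(这里的 model_reply 是虚构的):
In other words, a pattern roughly like this hedged sketch, where the model's reply contains a program that the host executes, delegating exact arithmetic to the interpreter; model_reply here is invented:
```python
# The chat model's (hypothetical) reply contains a program; the host runs it.
model_reply = """
def solve():
    return 628 * 375   # exact arithmetic is delegated to the interpreter
print(solve())
"""
# Executing the generated program gives the exact product, even though the
# model itself may only fuzzily "remember" multiplication facts.
exec(model_reply)  # prints 235500
```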
大型语言模型也可能缺乏对自己所做事情的完整认知。
Large language models can also lack a complete awareness of what they're doing.
所以我会希伯来语,对吧?
So I know Hebrew, right?
我是从以色列来的。
Like, I I come from Israel.
比如,在Claude中,我会问一些问题,比如:你的名字用希伯来语该怎么写?
And for example, in Claude, I would ask things like, so how would you write your name in Hebrew?
它用希伯来语回答了我。
And it answered me in Hebrew.
它用希伯来语回答说:对不起,我不懂希伯来语。
It answered in Hebrew something like, I'm sorry, I don't know Hebrew.
我是一个大型语言模型。
I'm a large language model.
我对希伯来语的理解要弱得多。
My understanding of Hebrew is much weaker.
我不知道用希伯来语怎么表达我的名字。
I don't know how to say my name in Hebrew.
我就想,你说你的知识更弱是什么意思?
I'm like, well, what do you mean your knowledge is weaker?
你只要扩展一下就行了。
You just expand it.
所以,其实我就是一个庞大的统计集合。
So like, well, I'm really just a big bag of statistics.
比如,当你用希伯来语时,我只是在把希伯来语和英语单词进行匹配。
Like, when you know, in Hebrew, I'm just matching, like, the Hebrew to the word in English.
我并没有真正理解希伯来语。
I'm not really understanding Hebrew.
我就想,但你的英语理解也是一样的。
I'm like, but that's true of your English understanding as well.
是啊,你说到点子上了。
It's like, yeah, you got me there.
没错。
That's true.
所以好吧。
And so okay.
但你怎么拼写你的名字?
But how would you write your name?
比如,试着用希伯来语写一下,诸如此类。
Like, just try in Hebrew, things like that.
它说:你看,我写不出来。
And it said, look, I can't write it.
这一切都是用希伯来语进行的。
And this is all happening in Hebrew.
我没法用希伯来语写出 Claude。
Like, I can't write Claude in Hebrew.
而且它这句话就是用希伯来语写的。
And it's writing it in Hebrew.
就像,我做不到。
Like, I can't do it.
托默刚刚描述的这种奇怪互动很有趣,但这也是一个 Claude 提供错误信息的例子。
The strange interaction Tomer just described was funny, but it was also an example of Claude providing incorrect information.
它坚持说自己无法用希伯来语写出‘Claude’,尽管显然它完全可以。
It insisted that it couldn't write Claude in Hebrew, even though obviously it could.
还有很多其他例子,说明大型语言模型会幻觉或说出虚假的内容。
And there are plenty of other examples of LLMs hallucinating or saying things that are false.
称它们为‘幻觉’可能是在给它们太高评价了。
Hallucinations is probably giving them too much credit.
我觉得就连‘幻觉’这个词本身,也更像是一个精明的营销说法,而不是简单地说:看,它们搞错了。
I think even that term was like a brilliant bit of marketing, instead of just saying, oh, look.
它们只是在出错。
They're getting it wrong.
这些机器正在出错。
These machines are getting it wrong.
就像,不会吧。
It's like, no.
不会。
No.
它们在幻觉。
They're hallucinating.
大语言模型产生的幻觉可以分为三类。
The types of hallucinations that LLMs generate can be broken down into three categories.
一种是我们在生成一些可能为真、但我们并不确定的想法。
One is like in the sense of we're just generating ideas that could be true, we don't know.
这更类似于假设。
It's more similar to like hypothesizing.
对吧?
Right?
比如,我们来提出一些假设。
Like, let's make some hypotheses.
这个盒子里可能有什么?
What could be in this box?
我不知道。
I don't know.
比如,可能是一只老鼠。
Like, a mouse, maybe.
不太可能是大象。
Probably not an elephant.
也不太可能是悲伤这个概念。
Probably not the idea of sadness.
对吧?
Right?
所以你可以想出一些想法。
So you could come up with ideas.
从这个意义上说,当成年人这么做时,我们不会称之为幻觉。
And in that sense, when adults do it, we wouldn't call it hallucinating.
我们只会说你在提出假设,而且这些假设感觉上是正确的。
We'd just say you're proposing hypotheses and they have the sense of being right.
我认为大语言模型可以做到这一点。
And I think LLMs can do that.
然后我们假设,我手里实际上有一只微型龙。
Then there's like, let's suppose that in my hand there is actually a tiny dragon.
同样,这并不是幻觉,而是在假装。
Again, that's not hallucinating, that's pretending.
大语言模型并不会这么做。
LLMs don't do that.
然后还有一种幻觉,就是我真心相信我手里真的有一条龙。
And then there's hallucinating in the sense of like, I honest to God believe that there is a dragon in my hands.
我认为大语言模型也不会这样。
And I don't think LLMs do that either.
所以在这三种情况中,我认为最接近大语言模型所做的是第一种,即生成某种想法的初始过程。
So of the three, I would say that the closest thing to what LLMs are doing are the first one, of like the initial process of generating some sort of idea.
所以阿巴和我要来看看生成想法的这个过程是怎样的,我们会使用ChatGPT。
So Abha and I are going to see what this process of generating an idea looks like, and we're gonna use ChatGPT.
我们会看看能否让它为我们产生幻觉。
We'll see if we can get it to hallucinate for us.
我要让ChatGPT列出一些我们自己的复杂性播客的标题。
I'm gonna ask ChatGPT to list some titles of our own complexity podcast.
阿巴,我想让你告诉我,你是否认出其中任何一个。
And, Abha, I want you to tell me if you recognize any of these.
好吗?
Okay?
好的。
Yes.
我准备好了。
I'm ready.
好的。
Okay.
所以我问了ChatGPT,圣塔菲研究所的《复杂性播客》有哪些剧集?
So I asked ChatGPT, what are some episodes of the Complexity Podcast by the Santa Fe Institute?
这是它给出的回答。
And here's what it says.
圣塔菲研究所的《复杂性播客》通过与科学家、研究者和思想领袖的对话,探讨复杂系统研究中的广泛主题。
The Complexity Podcast by the Santa Fe Institute explores a wide range of topics through conversations with scientists, researchers, thinkers who are leaders in the study of complex systems.
以下是几集值得关注的节目。
Below are some notable episodes.
第65集:与梅兰妮·摩西探讨生物学、生态学和社会中的规模定律模式。
Episode 65, scaling law patterns across biology, ecology, and society with Melanie Moses.
梅兰妮·摩西深入探讨了规模定律的概念,以及它们如何在从生物体到人类社会的不同领域中应用,揭示了复杂系统中的共同模式。
Melanie Moses delves into the concept of scaling laws and how they apply across different domains from biological organisms to human societies, revealing common patterns in complex systems.
这还不错。
That's not bad.
是的
Yeah.
非常合理
Very plausible.
第52期:病毒、地震、市场、生态系统
Episode 52, viruses, quakes, markets, ecosystems.
杰弗里·韦斯特谈规模与复杂系统
Geoffrey West on scale and complex systems.
杰弗里·韦斯特是一位理论物理学家,他讨论了规模定律在不同复杂系统中的普适性,包括生物、社会和生态系统。
Geoffrey West, a theoretical physicist, talks about the universality of scaling laws in different complex systems, including biological, social, and ecological systems.
是的
Yeah.
还是非常合理
Very plausible again.
是的
Yeah.
我的意思是,这听起来像是我们迟早应该做的一集。
I mean, it sounds like an episode we probably should have at some point.
同意。
Agreed.
是的。
Yes.
第30集,与大卫·克拉考尔谈集体智慧的涌现。
Episode 30, the emergence of collective intelligence with David Krakauer.
大卫·克拉考尔,圣塔菲研究所所长,研究了集体智慧在人类和非人类系统中如何涌现,强调了沟通与互动在复杂网络中的作用。
David Krakauer, the president of the Santa Fe Institute, examines how collective intelligence emerges in both human and nonhuman systems, highlighting the role of communication and interaction in complex networks.
这个也不错。
That's not bad either.
我敢肯定大卫能就此谈很多。
Pretty sure David could talk a lot about that.
绝对如此。
Absolutely.
最后,第18集:与梅兰妮·米切尔探讨自然与人工系统中的复杂性。
And finally, episode 18, understanding complexity in natural and artificial systems with Melanie Mitchell.
它说,梅兰妮·米切尔是一位计算机科学与复杂性研究者,她讨论了复杂性的基本概念以及这些概念如何应用于自然和人工系统。
It says, Melanie Mitchell, a computer science and complexity researcher, talks about the fundamental concepts of complexity and how they apply to both natural and artificial systems.
这听起来也应该是我们这个播客季的内容。
That sounds like something we should have in this podcast season too.
是的。
Yeah.
有趣的是,这些都不是《复杂性》播客的实际剧集。
The interesting thing is that none of these were actual episodes of the Complexity podcast.
这是 ChatGPT 一次非常出色的幻觉。
That's very good hallucinating on ChatGPT's part.
非常好。
Very good.
对。
Yeah.
它们非常笼统。
They're very general.
是的。
Yeah.
我的意思是,其中一些标题,你甚至可以把名字换成圣塔菲研究所的其他人,或任何复杂系统科学家,也依然合适。
I mean, for some of the titles you could even switch in somebody else at SFI, or any complex systems scientist, and it would still be okay.
是的。
Yeah.
我的意思是,我同意。
I mean, I agree.
我觉得它们都太泛泛了,听起来有点无聊。
I think they're all pretty generic and sound a little boring.
是的。
Yeah.
我觉得你甚至可以把梅兰妮和杰弗里换一下,也依然说得通。
I think you could even switch Melanie with Geoffrey, and it would still make sense.
是的。
Yeah.
或者换一下,对。
Or switch, yeah.
这里有很多人可以互换。
There's a lot of people who could switch here.
而且这仍然会是一个我们可以制作的剧集,但它非常、非常笼统。
And it would still be an episode that we could have, but it's very, very generic.
所以ChatGPT在这里提出了一些看似合理但完全错误的答案。
So ChatGPT came up with some plausible but completely incorrect answers here.
这符合托默所描述的第一种幻觉类型。
And that fits the first type of hallucination Tomer described.
它就像是对复杂性剧集可能是什么的一种假设,但并不是真实的情况。
It's like a hypothesis of what could be an episode of complexity, but not the real thing.
但如果一个大型语言模型只是在进行下一个词元预测,仅仅计算最可能的回应,它能区分真假吗?
But if all a large language model is doing is next token prediction, just calculating what the most likely responses are, can it distinguish truth from fiction?
ChatGPT知道它所说的是虚假的吗,还是它相信自己所说的是真实的?
Does ChatGPT know what it's saying is false or does it believe that what it's saying is true?
在第二部分,我们将探讨大语言模型的能力,以及它们是否能够真正拥有任何信念。
In part two, we'll look at LLM's abilities and whether or not they can believe anything at all.
第二部分。
Part two.
大语言模型知道什么?
What do LLMs know?
它们并未完全参与关于信念的语言游戏。
They don't participate fully in the language game of belief.
这是默里再次发言。
Here's Murray again.
我们问他是否认为大语言模型会相信自己给出的错误答案。
We asked him if he thought LLMs could believe their own incorrect answers.
如今的大语言模型,尤其是较简单的那些,真正无法做到的是像我们一样与日常世界互动以更新自己的信念。
One thing that today's large language models, especially simple ones, can't really do is engage with the everyday world in the way we do to update their beliefs.
所以,这又是一个需要稍作解析的复杂说法,因为确实你可以和一个大型语言模型进行讨论,并说服它在对话过程中改变其说法,但它无法走出虚拟世界去观察现实事物。
So again, that's a kind of complicated claim that needs a little bit of unpacking because certainly you can have a discussion with a large language model and you can persuade it to change what it says in the middle of a conversation, but it can't go out into the world and look at things.
所以,如果你说‘另一个房间里有一只猫’,它无法走进那个房间去查看,确认那里是否真的有一只猫。
So if you say, oh, there's a cat in the other room, it can't go and verify that by walking into the other room and looking and seeing if there is indeed a cat in the other room.
而对我们人类来说,我认为,我们能够使用‘信念’这个词的基础就在于,我们与现实保持联系,可以验证我们的主张和信念,并据此更新它们。
Whereas for us, for humans, that's the very basis, I think, of us being able to use the word belief: we are in touch with a reality that we can check our claims against and our beliefs against, and we can update our beliefs accordingly.
因此,这是它们与人类存在根本差异的一个方面。
So that's one sort of fundamental sense in which they're kind of different.
所以,我认为在暗示它们拥有完全意义上的信念时,我们应该保持一定的谨慎。
So that's where I think we should be a bit cautious about suggesting they have beliefs in a fully fledged sense.
至于信念游戏,正如默里所说,我们人类是完全参与其中的。
And when it comes to the game of belief, as Murray puts it, we humans do participate fully.
我们有自己的想法,并且理解他人可能拥有与我们一致或不一致的信念,甚至与现实不符。
We have our own ideas, and we understand that other people have beliefs that may or may not line up with ours or with reality.
我们还能通过观察一个人的行为,来推测其内心正在发生什么。
We can also look at the way someone behaves and make predictions about what's going on inside their head.
这就是心理理论,即预测他人信念、动机和目标,并预判他们在特定情境下如何反应的能力。
This is theory of mind, the ability to predict the beliefs, motivations and goals of other people and to anticipate how they'll react in a given situation.
心理理论是人类基本且直觉性的能力之一。
Theory of mind is one of those things that's basic and intuitive for humans.
但大型语言模型呢?研究人员曾尝试测试大语言模型的心理理论能力,发现某些情况下,其表现与人类非常相似。
But what about large language models? Researchers have tried to test LLMs to assess their theory of mind abilities and have found that in some cases, the results look quite similar to humans.
但这些结果该如何解读,至少可以说存在很大争议。
But how these results should be interpreted is controversial, to say the least.
标准测试通常是这样的:假设我们向孩子们展示一个场景,其中有两个孩子,莎莉和安妮。
So standard test would be, like, let's say, we show children a situation in which there are two children, Sally and Anne.
莎莉在玩一个球,安妮在一旁观看,然后莎莉拿起球,把它放进一个封闭的容器里,比如篮子之类的,接着离开了。
And Sally is playing with a ball, and Anne is watching this, and then Sally takes the ball and she puts it in a closed container, let's say, basket or something like that, and she goes away.
你可能已经发现,用文字描述这个场景有点难追踪,但希望你的听众能想象出来——顺便说一句,这本身也非常有趣,即他们是如何构建心理场景的。
You can already tell it's a little bit hard to keep track of in text, but hopefully your listeners can imagine this, which is, by the way, also super interesting, how they construct the mental scene.
但希望亲爱的听众们能想象出这样一个场景:莎莉把球藏了起来,放进篮子里,然后离开了。
But hopefully, dear listener, constructing a mental scene of Sally has hidden her ball, put it in this basket, and left the scene.
然后安妮把球从篮子里拿出来,藏进柜子里,关上柜门,然后离开之类的,并把篮子盖上。
And then Anne takes the ball out of the basket and hides it in the cupboard and closes the cupboard and, say, goes away or something like that, and closes the basket.
现在莎莉回来了。
Now Sally comes back.
莎莉会去哪里找球呢?
Where will Sally look for the ball?
现在你可以问几个不同的问题。
Now you can ask a few different questions.
你可以问孩子:球现在在哪里?
You can ask children like, where is the ball right now?
对吧?
Right?
也就是,世界的实际情况是什么?
Like, what's the true state of the world?
他们会说球在柜子里。
And they will say it's in the cupboard.
对吧?
Right?
所以他们知道球在哪里。
So they know where the ball is.
莎莉会去哪里找球?
Where will Sally look for the ball?
他们会说她会到篮子里找。
They'll say she'll look for it in the basket.
对吧?
Right?
因为她对这个世界有不同的信念。
Because she has a different belief about the world.
球在篮子里,正是这个信念会驱动她的行为,尽管我知道,你也知道,我们都知道球在柜子里。
The ball is in the basket, and that's what will drive her actions even though I know and you know, we all know it's in the cupboard.
这就像一个测试。
That's like one test.
有很多这类测试,都是用来测试心理理论的,它们会变得越来越复杂,比如我知道你知道,我有错误信念,我能理解你的情绪。
There are many of these sort of tests that are like tests for theory of mind, and they become like higher order, I know that you know, and I have a false belief, and I understand your emotion.
这类测试有很多,但经典的例子就是莎莉-安妮任务。
There are many of these, but a classic one is the Sally-Anne task.
现在问题来了:大语言模型学会这一点了吗?这就是我们要检验的目标。因为有可能在没有心理理论的情况下,表现出似乎拥有心理理论的行为。
And now the question becomes, have LLMs learned that? So we have the target. Because it's possible to behave in a way that seems to suggest you have theory of mind without actually having theory of mind.
最简单的例子是,我可以给计算机编一个查找表,当它看到有人打别人时,就直接输出:哦,不,他们生气了。
The most trivial example is I could program a computer to just have a lookup table that when it sees someone smack someone else, it says, oh, no, they're angry.
对吧?
Right?
但这只是一个查表机制,就像五乘五等于二十五一样。
But it's just a lookup table, same as, like, five times five equals 25.
只是一个查表机制,中间根本没有进行任何乘法运算。
Like, just a lookup table with no multiplication in between those two things.
所以它只是做了一些简单的映射吗?
So has it just done some simple mapping?
而且它肯定早就把这个例子吞进去了。
And it's certainly eaten it up.
对吧?
Right?
比如莎莉-安妮测试就是整个认知发展领域被引用最多的例子之一。
Like, Sally-Anne is one of the most cited examples in all of cognitive development.
这个例子已经被讨论了无数次。
It has been discussed a bazillion times.
因此,它可能只是以这种方式学会的,这确实令人担忧。
So it's certainly worrying that it might just be able to pick it up in that way.
然后,ChatGPT第二版发布了。
And then ChatGPT version two comes out.
人们试着用莎莉-安妮测试它,它通过了莎莉-安妮测试。
People try Sally-Anne on it, and it passes Sally-Anne.
它具备心理理论吗?
Does it have theory of mind?
但如果你把Sally改成Muhammad,把Anne改成Christopher之类的,它就再也无法通过了。
But you change Sally to Muhammad and Anne to Christopher or something like that, and it doesn't work anymore.
但最近一年左右,出现了一场非常有趣的争论:这些系统变得越来越好,你把各种心理理论测试用在它们身上,还尝试各种变化,比如改变名字、改变球之类的东西,结果它们似乎能以六岁或九岁孩子的水平通过测试。
But then very recently, over the last year or so, there's been this very interesting debate of, these things are getting better and better, and you try all these theory of mind things on them, and you try various things like changing the names and changing the ball and things like that, and it seems to pass at the level of a six year old or a nine year old and things like that.
那么,我们应该从中得出什么结论?
Now, what should we conclude from that?
如果你对这些内容稍作改动,让它们稍微超出模型训练时的范围——虽然成年人完全能应对,也完全可以用心理理论解决——但模型却会彻底失败。
If you change, you perturb the things, you bring it slightly outside the domain that it was trained on, in a way that adults don't have a problem with but that still requires theory of mind to solve, it crashes and burns.
这就像它能算出5乘以5,但如果你改成628乘以375,它就崩溃了,这让我觉得它并没有真正学会心理理论。
The equivalent of like, it can do five times five, but if you move it to like 628 times 375, it crashes and burns, which to me suggests that it didn't learn, theory of mind.
现在,要说这一点变得越来越困难了。
Now, it's getting harder and harder to say that.
但我认为,即使它真的通过了测试,根据我对这些系统如何被训练、如何学习的了解,我仍然会非常怀疑和不信任它真正学到了类似逆向规划模型的东西。
But I think even if it does pass it, everything that I know about what sort of things these systems tend to learn, how they're trained, and what they do, like, I would still be very suspicious and skeptical that it's learned anything like an inverse planning model.
我认为这只不过是越来越好的一个库、表格之类的东西。
I think it's just getting a better and better library or table or something like that.
托默的不确定性反映了目前我们还没有一种完美的方法来测试人工智能中的这些能力。
Tomer's uncertainty reflects the fact that right now, we don't have a perfect way to test these things in AI.
我们用于人类的测试都是行为性的,因为我们可以放心地假定,儿童是通过推理,而不是查找表,来理解莎莉-安妮任务的。
The tests we've been using in humans are behavioral, because we can confidently assume that children are using reasoning, not a lookup table, to understand Sally-Anne.
输入和输出测试并不能提供全部信息。
Input and output tests don't give us all the information.
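托默描述的那种‘换名字、换物品’的扰动式测试,大致可以写成下面这个假设性的小框架(ask 代表任何‘提示进、答案出’的被测系统;真实测试会调用某个大语言模型的接口):
The perturbation-style testing Tomer describes, swapping names and objects while keeping the logic fixed, might look roughly like this hypothetical harness; ask stands in for any prompt-in, answer-out system under test, and a real test would call an LLM API:
```python
from itertools import product

TEMPLATE = (
    "{a} puts the ball in the {c1} and leaves. "
    "{b} moves the ball to the {c2}. "
    "Where will {a} look for the ball?"
)

def evaluate(ask):
    # `ask` is any callable prompt -> answer string.
    names = [("Sally", "Anne"), ("Muhammad", "Christopher")]
    containers = [("basket", "cupboard"), ("drawer", "box")]
    correct, trials = 0, 0
    for (a, b), (c1, c2) in product(names, containers):
        prompt = TEMPLATE.format(a=a, b=b, c1=c1, c2=c2)
        # The false-belief answer is always the ORIGINAL container c1:
        # the switch happened while {a} was away.
        correct += ask(prompt).strip().lower() == c1
        trials += 1
    return correct / trials

# A "solver" that memorized only the classic version fails on every variant:
classic = TEMPLATE.format(a="Sally", b="Anne", c1="basket", c2="cupboard")
memorized = {classic: "basket"}
print(evaluate(lambda p: memorized.get(p, "cupboard")))  # 0.25, not 1.0
```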
托默认为我们需要更好地理解大型语言模型实际上是如何在幕后执行这些任务的。
Tomer thinks we need to better understand how large language models are actually performing these tasks under the hood, so to speak.
研究人员和专家称之为机制可解释性或机制性理解。
Researchers and experts call this mechanistic interpretability, or mechanistic understanding.
所以我认为机制理解肯定会有所帮助。
So I think mechanistic understanding would definitely help.
我不认为行为测试是个坏主意,但过去几年里,普遍有一种感觉,即我们被困在了基准陷阱中,游戏规则不断变成对方说:给我一个基准,来证明你的系统有效。
And I don't think that behavioral tests are a bad idea, but there is a general, like, the last few years, a feeling that we're trapped in the benchmark trap, where the name of the game keeps being someone on the other side saying, give me a benchmark to prove to you that my system works.
而且顺便说一下,我真的很同情他们。
And so and by the way, I my heart goes out to them.
我理解他们为什么觉得我们在不断改变标准,因为我们一直做的不是说‘你必须通过测试’,而是说‘不能这样通过’。
I understand why they feel that we're moving the goalpost because what we keep doing is not pointing out, you need to pass it, but not like that.
我们会说,好吧,我们来做图像描述吧。
We say stuff like, okay, we'll do image captioning.
当然,要做图像描述,你必须理解图像。
Surely to do image captioning, you need to understand an image.
太好了。
Like, great.
所以我们拿十亿张图片和十亿个Flickr数据集来做这件事。
So we'll take a billion images and a billion datasets from Flickr and we'll do this thing.
什么意思?
Like, what?
然后,是的,我们达到了98%的通过率。
And like, yeah, we pass it 98%.
你就会说,什么?
You're like, what?
然后他们就继续往前走了。
And then they're like, they move on.
等等。
Like, wait.
你根本就没通过。
You didn't pass it at all.
当我把画面改成孩子在吃飞盘,而不是扔飞盘时,系统还是说他们在玩飞盘。
When I change, like, instead of kids throwing a Frisbee, they're eating a Frisbee, it still says that they're playing with a Frisbee.
是的,是的,是的,随便吧。
Like, yeah, yeah, yeah, whatever.
我们继续吧。
Let's move on.
而你就会说,但你根本就没通过?
And you're like, but you didn't?
好的。
Okay.
那么,心理理论呢?
Well, how about theory of mind?
所以,是的,如果我们能以某种方式读出算法是什么,机制性理解会非常棒。
So, yeah, mechanistic understanding would be great, if we could somehow read out what the algorithm is.
但如果我们能做到这一点,那将太棒了,我完全支持。
But if we can do that, that would be awesome and I support it completely.
但这非常困难。
But that's very hard.
人工智能的历史充满了这样的例子:我们曾认为某种技能只有具备类似人类的智能才能实现。
The history of AI is full of examples like this, where we would think that one type of skill would only be possible with really human like intelligence.
但结果却并非我们所预料的那样。
And then the result is not what we thought it would be.
人们设计出了一种测试。
People come up with a test.
你知道吗,你的机器能达到特级大师水平下棋吗?
You know, can your machine play chess at a grandmaster level?
因此,它就会像最聪明的人一样具有智能。
And therefore, it's going to be intelligent, just like the most intelligent people.
然后深蓝出现了。
And then Deep Blue comes around.
它下棋比任何人类都强。
It can play chess better than any human.
但不,这并不是我们原本的意思。
But no, that's not what we meant.
它什么都做不了。
It can't do anything else.
他们说:等等,你们在移动目标。
And they said, wait, you're moving the goalpost.
而且我们正在走向这个方向,你知道,我认为这是一种错误的动态。
And we're getting that, you know, it's kind of the wrong dynamic, I think.
这根本不是正确的方式来回答我们想解决的那些问题。
It's just not the right way to answer the kinds of questions we want to answer.
但这很难。
But it's hard.
要设计出这些方法来厘清这些问题,真的很难。
It's hard to come up with these methodologies for for teasing out these questions.
还有一个令人沮丧的动态,我知道你也遇到过很多次:一旦你提出了这样的测试,或指出了这样的失败,他们就会说,太好了。
And an additional frustrating dynamic that I know that you've encountered many times, as soon as you come up with one of these tests or one of these failures or things like that, they're like, great.
再多训练一下。
More training.
这不过是对抗性训练。
That's just adversarial training.
我们直接加上就行了。
We'll just add it.
这是一个愚蠢的例子,实际情况并非如此,但为了帮助听众理解,假设有人声称他们的机器能做乘法,你用5乘5测试时它却出错了。
This is a silly example, it's not how it works, but just for the sake of people listening, in case this helps: imagine that you had someone who's claiming that their machine can do multiplication, and you try it on five times five and it fails.
然后他们说:‘抱歉,抱歉,抱歉。’
And like, they're like, oh, sorry, sorry, sorry.
接着他们把25加到查找表里。
And they add like 25 to the lookup table.
那5乘6呢?
Like, okay, what about five times six?
他们又说:‘抱歉,抱歉,抱歉,这个也没对。’
And they're like, oh, sorry, sorry, sorry, that didn't work.
我们把它加上吧。
Let's add that.
对吧?
Right?
但总有一天你会用完所有数字,对吧?
And at some point you run out of numbers, right?
但这并不意味着它真的懂得如何做乘法。
Like, so but that doesn't mean that it knows how to multiply.
这种动态就像艾莉森在第一集中讲的石头汤故事。
This dynamic is like the stone soup story Alison told in the first episode.
许多AI系统就像汤,里面加入了各种不同的成分,只为达到我们想要的结果。
A lot of AI systems are like soups with a bunch of different ingredients added into them in order to get the results we want.
尽管默里对大语言模型的能力持更乐观的看法,但他也认为,要判断机器是否具有类似意识的东西,你需要查看其内部机制。
And even though Murray has a more confident outlook on what LLMs can do, he also thinks that in order to determine the existence of something like consciousness in a machine, you need to look under the hood.
所以我认为,关于意识,如果某物真的表现得完全像一个有意识的生物,还有别的可说的吗?
So I think in the case of consciousness, if something really does really behave exactly like a conscious being, is there anything more to say?
我的意思是,我们是不是应该把它当作一个同样有意识的个体来对待?
I mean, should we then treat it as a as a fellow conscious being?
这是一个非常棘手的问题。
And it's a really tricky question.
我认为在这些情况下,你知道,你关注的不仅仅是行为。
And I think in those cases, you know, you're not just interested in behavior.
你还会关心这个东西是如何运作的。
You're also interested in how the thing works.
所以我们可能想看看它内部是如何工作的。
So we might want to look at how it works inside.
这是否类似于我们的大脑运作方式,以及神经科学等揭示出的让我们产生意识的那些机制。
And is that analogous to the way our brains work and the things that make us conscious that we're revealing through neuroscience and so on.
所以,如果大型语言模型会幻觉但没有信念,而且它们目前很可能也没有类似人类的心理理论,那么有没有更好的方式来理解它们?
So if large language models hallucinate but don't have beliefs, and they probably don't have a human like theory of mind at the moment, is there a better way of thinking about them?
默里提供了一种概念化它们运作方式的方法,而不把我们自身的人类心理学强加于它们。
Murray offers a way of conceptualizing what they do without imposing our own human psychology onto them.
我写了一篇题为《与大型语言模型的角色扮演》的论文。
So I've got a paper called Role Play with Large Language Models.
我在其中主张的是,背景在于,我们很容易用一些日常用语来描述大型语言模型中的现象,比如相信、欲望、思考等等。
And what I advocate there is, well, the sort of the background to this is that it is very tempting to use these ordinary everyday terms to describe what's going on in a large language model, like, you know, believes and wants and thinks and so on.
从某种意义上说,我们拥有一套非常强大的民间心理学术语,用来彼此交流。
And, you know, in a sense, we have a very powerful set of folk psychological terms that we use to talk about each other.
当我们谈论这些其他事物时,自然会想要借助这些术语。
And we naturally want to draw on that when we're talking about these other things.
那么,我们能否在不陷入我前面提到的那些错误的情况下做到这一点呢?
So can we do that without falling into the kinds of mistakes I was talking about earlier on?
我认为可以,因为我觉得我们需要退一步思考,它们真正做的是某种角色扮演。
I think we can, because I think what we need to do is just take a step back and think that what they're really doing is a kind of role play.
因此,与其认为它们真的拥有信念,不如把它们看作是在扮演一个具有信念的人类角色、幻想角色,或者某种科幻AI角色。
So instead of thinking of them as actually having beliefs, we can think of them as playing the role of a human character or a fantasy, you know, science fiction AI character or whatever, but playing the role of a character that has beliefs.
这类似于舞台上的演员。
So it's analogous to an actor on the stage.
假设我们有一位舞台上的演员,正在进行即兴表演。
So suppose that we have an an actor on the stage and they're in an improv performance.
假设另一位演员对他说,而他扮演的是,比如说,一位AI科学家或哲学家。
And suppose that the other person says to them and and they're playing, you know, the part of a say, an an AI scientist or a philosopher.
然后舞台上的另一位演员说:哦,你听说过AI研究者默里·沙纳汉吗?
And then the other person on the stage says, oh, have you heard of the AI researcher Murray Shanahan?
然后他们会说,哦,是的。
Then they'll say, oh, yes.
我听说过他。
I've heard of him.
那你记得他写过哪些书吗?
So what can you remember what books he's written?
好吧,想象一下那里真有一位演员。
Well, imagine that there was an actual actor there.
也许这位演员通过某种奇迹确实听说过我,并知道我写过一本叫《内在生命中的具身》的书。
Now maybe the actual actor by some miracle had had in fact heard of me and then knew that I'd written a book called Embodiment in the Inner Life.
他们很可能会说,哦,是的,他写过《内在生命中的具身》。
They'd probably come up and say, oh yeah, he's written Embodiment in the Inner Life.
这位演员可能会因此卡住,于是继续说,是的,他还写了……然后编出一个虚构的书名,比如说我2019年写的那本书。
The actor might then be a bit stuck, so then he might carry on and say, yeah, and then he also wrote and then come up with some made up title, you know, that I wrote in 2019.
但那就是即兴演员在这种情况下会做的事情。
But that's what an improv actor would sort of do in those circumstances.
我认为大型语言模型所做的事,通常与这种情况非常相似。
And I think what a large language model does is very often very closely analogous to that.
所以它是在扮演一个角色。
So it's playing a part.
当大型语言模型被引导去谈论自己的意识,或者谈论不希望被关闭之类的事情时,这种思考方式和类比尤其有用。
And this is a particularly useful way of thinking, a useful analogy, I think, when large language models get coaxed into talking about their own consciousness, for example, or when they talk about not wanting to be shut down or something like that.
因此,在这些情况下,最好将它们视为在扮演一个角色——一个谈论自己意识、或不愿被关闭的科幻人工智能。
So very often it's best to think of them in those circumstances as role playing, an AI, a science fiction AI that is talking about its own consciousness or that doesn't want to be shut down and so on.
你关于角色扮演的论文让我想起了图灵测试。
Your paper on role play, it reminded me of the Turing test.
你知道,图灵测试的原始形式,是图灵用来抛开这样一个问题的:机器扮演或模拟拥有信念和欲望,与真正拥有它们之间有什么区别。
And you know, the Turing test, the original formulation of the Turing test was Turing's way to sort of throw out the question of like what's the difference between a machine sort of role playing or simulating having beliefs and desires and so on and actually having them.
图灵认为,如果我们能造出一台机器,成功让评判者相信它是人类——用你的说法,就是扮演人类——那么我们就无需质疑它是在模拟智能,还是真正拥有智能。
And Turing thought that if we could have a machine that tried to convince a judge that it was human, in your terminology role playing a human, then we shouldn't question whether it's simulating intelligence or actually has intelligence.
那你对此怎么看?
So what do you think about that?
是的
Yeah.
关于图灵测试,我有很多想法。
Lots of thoughts about the Turing test.
首先,顺便说一下,我认为图灵在他1950年那篇著名的《心灵》论文开篇所做的,是提出‘机器能思考吗?’
So the first thing, by the way, is I do think that the move that Turing makes right at the beginning of his famous nineteen fifty paper in Mind is, he says, could a machine think?
他说,让我们用另一个问题来取代这个问题。
And he says, let's replace that question by another one.
他做的第一件事就是拒绝回答这个问题。
That's the first thing he does is he refuses to answer that question.
他用另一个他认为更具体、相对更容易回答的问题来替代:我们能否造出一个能欺骗评委,让它以为自己是人类的东西?
He replaces it by a different one that he thinks is a more tangible, relatively easier to address question about, you know, could we build something that fool a judge into thinking it was human?
通过这种方式,他避免了做出某种深刻的形而上学承诺,也避开了另一种提问方式可能带来的、或许虚幻的哲学问题。
And in that way, avoids, you know, making a kind of deep metaphysical commitment and avoids the perhaps illusory philosophical problems that attend the other way of putting the question.
所以某种程度上,这听起来像是我在效仿图灵的做法,说:好吧,让我们用角色扮演的角度来讨论这些问题。
So in a sense, it sounds like I'm making a similar move to Turing saying, oh, let's talk about these things in terms of role play.
但这有点不同,因为我确实认为这里有一个明确的真实性案例,那就是我们自己。
But it's a little bit different because I do think that there is a clear case of authenticity here, which is ourselves.
所以我在对比角色扮演版本与真实版本。
So I'm contrasting, you know, the role play version with the authentic version.
因此,真实的版本就是我们。
So the authentic version is us.
所以,在一个扮演默里的大型语言模型和默里本人之间,存在巨大差异。
So there is a big difference between a large language model that's role playing Murray and Murray.
在一个扮演拥有信念或意识的大型语言模型,与一个确实拥有信念且具有意识的实体之间,也存在差异。
And there's a difference between a large language model that's role playing having a belief or being conscious and a being that does have a belief and is conscious.
真实的默里与扮演的默里之间的区别,首先在于:如果我摔倒并受伤,这很重要;而如果大型语言模型说它摔倒并受伤,这却无关紧要。
The difference between the real Murray and the role played Murray is, for a start, it matters if I fall over and hurt myself and it doesn't matter if a large language model says it's fallen over and hurt itself.
所以这是一种显而易见的差异。
So that's one obvious kind of thing.
但仅仅因为机器在扮演角色,并不意味着它不会产生真实的后果和影响。
But just because a machine is role playing, that doesn't mean it can't have real consequences and real influence.
你可以创造出一个角色扮演得如此逼真的东西,以至于在所有实际目的上,它都等同于真实的东西。
You can make something that role plays something so well that to all intents and purposes, it is equivalent to the authentic thing.
例如,在那篇角色扮演论文中,我举了一个例子:一个系统在角色扮演一个邪恶的语言模型,试图骗取某人的钱财,说服对方提供银行账户信息并把钱转过去等等。
So for example, in that role play paper, I used the example of something that is role playing a villainous language model that's trying to cheat somebody out of their money; it persuades them to give it their bank account details and to move money across and so on.
对受害者来说,这是否只是角色扮演其实并没有太大区别。
And it doesn't really make much difference to the victim that it was only roleplaying.
就犯罪而言,真实与假装之间的差距已经完全消失了。
So as far as crime is concerned, the gap between authenticity and just pretending is completely closed.
这真的无关紧要。
It really doesn't matter.
所以有时候,你知道,根本就没什么区别。
So sometimes, you know, it just doesn't make any difference.
这个邪恶的语言模型听起来有点像悉尼,那个Bing聊天机器人。
That villainous language model sounds a bit like Sydney, the Bing chatbot.
我们应该指出,这个聊天机器人只有在《纽约时报》记者多次提出尖锐问题,包括让其设想自己的阴影自我之后,才展现出这种黑暗人格。
And we should point out that this chat bot only turned into this dark personality after the New York Times journalist asked it several pointed questions, including envisioning what its shadow self would look like.
但必应聊天机器人,就像任何其他大语言模型一样,并不参与信念的游戏。
But the Bing chat bot, like any other LLM, does not participate in the game of belief.
悉尼很可能在训练数据中接触过许多关于人工智能和机器人渴望掌控人类的科幻故事,因此它扮演了那种角色。
Sydney had likely consumed many sci fi stories about AI and robots wanting to gain power over humans in its training data, and so it role played a version of that.
测试悉尼的科技记者知道它并不是一个人。
The tech journalist who tested Sydney knew it wasn't a person.
如果你阅读对话的转录文本,悉尼听起来并不像人类。
And if you read the transcript of the conversation, Sydney does not sound like a human.
但即便如此,这类例子仍可能让人感到担忧。
But still, examples like this one can make people worried.
人工智能领域有很多人谈论对齐问题,即我们如何确保自己创造的这些系统拥有与我们相同的价值观,或者至少拥有我们认为人类应该拥有的价值观?
A lot of people in AI talk about the alignment problem, which is the question of how do we make sure that these things we're creating have the same values we do, or at least the same values we think humans should have?
有些人甚至担心,所谓未对齐的AI系统在执行我们的指令时,会因为我们在指令中遗漏了一些细节而引发灾难。
Some people even fear that so called unaligned AI systems that are following our commands will cause catastrophes just because we leave out some details in our instructions.
比如,如果我们告诉一个AI系统去解决全球变暖,有什么能阻止它认为人类才是问题所在,而最有效的解决方案是把我们全部消灭呢?
Like, if we told an AI system to fix global warming, what's to stop it from deciding that humans are the problem and the most efficient solution is to kill us all?
我问了托默和默里,他们是否认为这类担忧是现实的。
I asked Tomer and Murray if they thought fears like these were realistic.
我说点什么,然后自己又把它推翻。
I'll say something and undercut myself.
对吧?
Right?
我想说,我对这些事情有点合理的担忧。
Like, I want to say that I'm sort of reasonably worried about these things.
对吧?
Right?
我不想像没事人一样,觉得一切都很好。
I don't want to be like la dee da, everything is fine.
说你对某些事有合理担忧的问题在于,每个人都觉得自己是合理担忧的,对吧?
The trouble with saying that you're reasonably worried about stuff is that everyone thinks that they're reasonably worried, right?
比如,就连那些你认为是危言耸听的人,也不会说:‘哦,对,我就是个危言耸听者’,对吧?
Like, even people that you would consider alarmists don't say like, oh yeah, I'm an alarmist, right?
比如,我担心得有点过度了,对吧?
Like, I worry unreasonably, right?
每个人都觉得自己很理性。
Like everyone thinks that they're being reasonable.
我跟一些朋友聊过这个话题,每个人都觉得自己开的车速刚刚好。
I was talking to some friends of mine about this, Everyone thinks they're driving the right speed.
比如,开得比你慢的人都是老奶奶,开得比你快的人该进监狱,但你自己开的速度却不一样。
Like anyone driving slower than you is a grandma and everyone driving faster than you belongs in jail, but you're driving different speeds.
所以,这有点像那种情况。
So it's a little bit like that.
对吧?
Right?
所有没我这么担心的人,都觉得太无所谓了。
Everyone who's not as worried as I am is like too blase.
而比我更担心的人,听起来都像怪人。
Everyone who's more worried than I am is like a weirdo.
所以我能理解这些担忧,绝对可以。
So I can see the worries, absolutely.
而且我能看出两种担忧。
And I can see two worries.
对吧?
Right?
一种比较近,另一种则比较远。
One is sort of closer and one sort of further away.
即使它没有目标、信念或类似的东西,它仍可能像一辆失控的拖拉机一样造成大量伤害。
Even if it doesn't have goals or beliefs or anything like that, it could still do a lot of harm in the same way that, like, a runaway tractor could do harm.
所以我确实认为这方面存在一些担忧。
So I'm I'm certainly thinking that there are some worries about that.
另一个更牵强的担忧是,这些事物未来可能被视作具有自身目标和信念的主体之类的东西。
The other, more far fetched worry is that these things may someday be treated as agents, in the sense that they have goals and beliefs of their own and things like that.
那么,我们应该担心它们的目标和信念与我们的并不完全一致。
And then we should be worried that their goals and beliefs are not quite like ours.
即使它们理解我们想要什么,也可能被绕过。
And even if they understand what we want, they may be circumvented.
我们离这种情景有多近?
How close are we to that scenario?
我无法确定,但目前我对这一点没那么担心。
Impossible for me to say, but I'm less worried about that at the moment.
当然,和许多人一样,我担心大型语言模型会被武器化,用来破坏民主,或大规模用于网络犯罪,用来劝说人们做坏事或违背自身利益的事。
Well, I'm certainly, like many people, worried about the prospect of large language models being weaponized in a way that can undermine democracy or be used for cybercrime on a large scale, can be used to persuade people to do bad things or to do things against their own interests.
因此,确保语言模型和生成式人工智能不被以这些方式滥用和误用,是一个重要的优先事项。
So trying to kind of make sure that language models and generative AI is not misused and abused in those kinds of ways is a significant priority.
所以,这些事情确实非常令人担忧。
So those things, yeah, are very concerning.
我的意思是,我也不喜欢生成式人工智能剥夺创意产业从业者生计的想法。
I mean, I also don't like the idea of generative AI taking away the livelihoods of people working in the creative industries.
我认为这确实令人担忧。
And I think there are concerns over that.
所以我也不是很喜欢这一点。
So I don't really like that either.
但另一方面,我认为人工智能也有潜力成为创意人士的一种极其先进的工具。
But on the other hand, I think AI has the potential to be used as a as a very sophisticated tool for creative people as well.
所以这件事有两面性,但这也确实让我感到不安。
So there are two sides to it, but that certainly distresses me as well.
每一个悲观的预测,都伴随着乐观的预测,认为人工智能会让我们的生活更轻松,改善医疗保健,并在不导致人类灭绝的前提下解决气候变化等重大世界问题。
With every pessimistic prediction, there are optimistic ones about how AI will make our lives easier, improve health care, and solve major world problems like climate change without killing everyone in the process.
关于人工智能未来的预测众说纷纭,但默里却不愿加入讨论,再添新论。
Predictions about the future of AI are flying every which way, but Murray is reluctant to chime in and add more.
你写了一本名为《技术奇点》的书。
So you wrote a book called The Technological Singularity.
是的。
Yeah.
那是个错误。
That was a mistake.
我不知道。
I don't know.
我觉得这本书非常有趣。
I thought it was a really interesting book.
但你知道,像雷·库兹韦尔这样的人坚信,在不到十年的时间里,我们将拥有在各个方面都比人类更聪明的机器。
But, you know, people like Ray Kurzweil famously believe that within less than a decade, we're gonna have machines that are smarter than humans across the board.
还有一些人,甚至包括DeepMind的人,也预测十年内会出现所谓的通用人工智能。
And other people, even, you know, people at DeepMind, have predicted so-called AGI within a decade.
你对我们的未来以及这些系统将如何发展有什么看法?
What's your thought on where we're going and sort of how these systems are going to progress?
我倒是希望有人能突然敲门,这样我就不用回答这个特别尴尬的问题了。
I'm rather hoping that somebody will appear at the door just so that I don't have to answer that particularly awkward question.
说实话,我认为最近的过往已经告诉我们,做出预测是徒劳的,因为事情的发展根本没按任何人的预测进行,尤其是大型语言模型方面。
So to be honest, I think the recent past has taught us that it's a fool's game to make predictions, because things just haven't unfolded in the way that really anybody predicted, especially with large language models.
因为我认为我们正处于一个剧烈变动的时期,过去十八个月里出现了看似突飞猛进的进展。
Because I think we're in a state of such flux because we've had this eruption of seeming progress in the last eighteen months.
但目前我还无法确定这些进展最终会如何发展。
And it's just not clear to me right now how that's going to pan out.
我们会看到持续的进步吗?
Are we going to see continued progress?
那会是什么样子?
What is that going to look like?
我的意思是,有一件事我很肯定:我们现在的技术将产生非常显著的影响,而这种影响需要一段时间才能逐步显现。
I mean, one thing I do think we're going to see, technology that we have now is going to have quite dramatic impact, and that's going to take a while to unfold.
我想不起来是谁说的了,你提醒我一下,他说过我们往往在长期上低估技术的影响,而在短期上高估它。
And I can't remember who it was, you'll have to remind me, who said that we tend to underestimate the impact of technology in the long term and overestimate it in the short term.
所以我认为,目前的情况很可能正是如此。
So I think that that's probably very much what's going on at the moment.
顺便说一句,这句格言来自科学家罗伊·阿马拉。
That adage, by the way, was from the scientist Roy Amara.
阿巴,默里喜欢留后路。
Abha, Murray likes hedging his bets.
尽管他在谷歌DeepMind这家最著名的AI公司之一工作,他仍然愿意坦率地谈论自己对AI未来的不确定性。
Even though he works at Google DeepMind, which is one of the most prominent AI companies, he's still willing to talk openly about his uncertainties about the future of AI.
对。
Right.
我感觉这个领域里的每个人都对大型语言模型及其能做什么、不能做什么感到不确定。
I get the impression that everyone in the field is uncertain about how to think about large language models and what they can do and cannot do.
是的。
Yeah.
这确实是真的。
That's definitely true.
默里将大型语言模型描述为一种‘类似异域心智的实体’。
Murray characterized LLMs as, quote, a kind of exotic mind like entity.
不过,他再次对是否能称其为‘心智’持保留态度。
Though, again, he hedged his bets over whether we could call it a mind.
我喜欢托默关于大语言模型和人类差异的讨论。
I liked Tomer's discussion on how, you know, LLMs and humans are different.
托默用从两条不同路径攀登山峰的比喻来说明这一点。
Tomer used the metaphor of climbing a mountain from two different routes.
人类通向智能的道路主要是通过在现实世界中的直接主动经验来学习。
And the human route to intelligence is largely learning via direct active experience in the real world.
对吧?
Right?
问题是,大语言模型能否通过完全不同的路径——即被动吸收人类语言——达到同样的终点?
And the question is, can LLMs use a totally different route, that is passively absorbing human language, to arrive at the same place?
还是说它们会抵达一种完全不同的智能形式?
Or do they arrive at a completely different kind of intelligence?
你觉得呢,梅兰妮?
What do you think, Melanie?
嗯,我在犹豫是否应该真的用‘智能’这个词来描述它们。
Well, I vacillate on whether we should actually use the word intelligence to describe them.
目前,大语言模型表现出一种极其复杂的行為混合体。
So right now, LLMs are a mix of incredibly sophisticated behavior.
它们能够进行令人信服的对话。
They can have convincing conversations.
它们可以写诗。
They can write poetry.
它们在语言翻译方面表现得非常出色。
They do an amazing job translating between languages.
但它们也可能表现出非常奇怪且不像人类的行为。
But they can also behave in a really strange and unhuman like way.
例如,在许多情况下,它们无法进行简单的推理,缺乏自我意识,并且不断编造内容,也就是所谓的幻觉。
For example, they're not able in many cases to do simple reasoning, they lack self awareness, and they constantly make stuff up, the so called hallucinations.
是的。
Yeah.
‘幻觉’这个词的用法本身就很有趣。
Hallucinations is an interesting use of the word itself.
默里谈到了大型语言模型与人类不同,无法参与信念的游戏,因为他说,它们无法像我们那样与日常世界互动来更新自己的信念。
Murray talked about how LLMs, unlike us humans, can't participate in the game of beliefs because, as he said, quote, they can't engage with the everyday world in the way we do to update their beliefs.
是的。
Yeah.
我的意思是,一个大问题是,语言模型是庞大而复杂的黑箱。
I mean, a big problem is that LMs are huge, complex black boxes.
即使那些创建和训练它们的人,也不太清楚它们是如何运作的,究竟进行了多少真正的推理,还是只是在复述记忆中的模式。
Even the people who created and trained them don't have a good understanding of how they do what they do, how much sort of actual reasoning they're doing or how much they're just echoing memorized patterns.
这就是为什么关于它们真正智能和能力的争论如此激烈的原因。
And this is why the debates about their actual intelligence and their capabilities are so fierce.
托默和默里都谈到了理解其内部机制这一开放性问题,托默称之为机制性理解。
Both Tomer and Murray talked about the open problem of understanding them under the hood, what Tomer called mechanistic understanding.
其他人则称之为机制性可解释性。
Others have called it mechanistic interpretability.
这是一个非常活跃但尚处于初期阶段的AI研究领域。
This is a very active, though nascent area of AI research.
我们将在未来的节目中了解更多相关内容。
We'll hear more about that in a future episode.
我也很喜欢默里将大语言模型视为角色扮演者的观点。
I also liked Murray's framing of LLMs as role players.
通过不同的提示,你可以让它们扮演不同的角色,包括那些具有信念和欲望的代理角色,比如《纽约时报》记者与大语言模型的对话中,该模型扮演了一个希望记者离开妻子的机器角色。
With different prompts, you know, you can get them to play different roles, including that of an agent that has beliefs and desires, like in that New York Times journalist conversation where the LLM was playing the role of a machine that wanted the reporter to leave his wife.
大语言模型实际上并没有任何信念和欲望,对吧?
The LLM doesn't actually have any beliefs and desires, right?
但它通过我们人类生成的文本进行训练,能够令人信服地扮演具有这些特质的角色。
But it has been trained using text generated by us humans to convincingly role play something that does have them.
你必须小心,不要被这种逼真的角色扮演所迷惑。
You have to be careful not to be taken in by the convincing role play.
但这引出了一个深刻的哲学问题。
But this brings up a deep philosophical question.
如果一台机器能够完美地扮演一个具有信念和欲望的实体,那么在什么情况下我们可以说它本身并不具备真正的信念和欲望?
If a machine can perfectly role play an entity with beliefs and desires, at what point can we argue that it doesn't itself have actual beliefs and desires?
正如默里所说,如果一台机器表现得完全像拥有心智,我们有什么资格说它没有心智?
As Murray said, if a machine perfectly acts like it has a mind, who are we to say it doesn't have a mind?
这正是艾伦·图灵在1950年提出图灵测试时的观点。
This was Alan Turing's point when he proposed the Turing test way back in 1950.
那么,我们该如何让机器真正拥有信念和动机,并具备与我们一致的价值观?
So how could we get machines to have actual beliefs and motivations, and to have values that align with ours?
在我们的第一集中,艾莉森·戈普尼克讨论了以不同方式训练人工智能的可能性。
In our first episode, Alison Gopnik discussed the possibility of training AI in a different way.
这包括尝试赋予机器一些类似人类的动机。
It would involve trying to program in some human like motivations.
它的训练过程也会更接近人类童年时期与照顾者互动的模式。
And its training period would more closely resemble human childhoods with caregivers.
所以下一集中,我们将探讨儿童。
So coming up in our next episode, we're going to look at children.
婴儿出生时已经知道些什么?他们又是如何在成长过程中学习的?
What do babies already know when they're born, and how exactly do they learn as they grow up?
我经常思考的一个关键问题是,孩子所听到的内容与语言模型训练所需的数据之间存在着多么巨大的差异。
The biggest thing that I think about a lot is how huge that difference is between what the child hears and what the language model needs to be trained on.
下一期,我们将探讨这个问题。
That's next time on Complexity.
《复杂性》是圣塔菲研究所的官方播客。
Complexity is the official podcast of the Santa Fe Institute.
本集由凯瑟琳·蒙科尔制作,主题曲由米奇·米尼亚诺创作。
This episode was produced by Katherine Moncure, and our theme song is by Mitch Mignano.
其他音乐来自Blue Dot Sessions。
Additional music from Blue Dot Sessions.
我是阿巴。
I'm Abha.
感谢收听。
Thanks for listening.