大家好。
Hello, everyone.
欢迎收听Mindscape播客。
Welcome to the Mindscape Podcast.
我是主持人肖恩·卡罗尔。
I'm your host, Sean Carroll.
我一直认为,现代人工智能方法、大型语言模型以及其他联结主义方法的一个有趣之处在于,这些模型在自然状态下通常不擅长算术。
I've always thought that one of the interesting aspects of modern approaches to AI, large language models, other connectionist things, is that very often or at least in their natural state, an LLM is not good at arithmetic.
它们不擅长将数字相加。
It's not good at adding numbers together.
你可以通过增强程序来让它们变得非常擅长。
You can augment the program so that they're very good.
你可以让大型语言模型直接访问计算器。
You can give basically an LLM access to a calculator.
这和人类的情况完全一样。
It's exactly like human beings.
从某种意义上说,人类并不擅长算术,但如果你有计算器,他们就能做到。
They're not very good at arithmetic in some sense, but if you have a calculator, they can do it.
但在自然状态下,大型语言模型在处理将中等大小数字相加这类简单问题时会出错。
But in their natural state, LLMs make mistakes about simple problems adding medium sized numbers together.
其他基于数字的问题,它们也不太擅长。
Other kinds of number based problems, they're not very good at.
数单词 'strawberry' 中有多少个 'r'。
Counting the number of r's in the word strawberry.
随机数。
Random numbers.
如果你让一个大语言模型生成一百万个介于零到一百之间的随机整数,然后绘制频率图,结果不会是均匀分布的。
If you asked an LLM to generate a million random integers between zero and 100 and then made a plot of the frequencies, it would not look uniform.
当然,人类在这些方面本质上也不太擅长,但你内心还是会想:拜托。
Of course, these are all things that human beings are also intrinsically not very good at, but part of you thinks, come on.
这可是电脑。
It's a computer.
它应该能解决简单的算术问题。
It should be able to do simple arithmetic problems.
当然,答案其实并没有什么神秘之处。
And, of course, the answer is there's no real mystery here.
运行大语言模型的计算机本身完全能进行算术运算,但你并不是在和计算机对话,而是在和一个程序对话。
The computer on which the LLM is running has no problem doing arithmetic, but you're not talking to the computer, you're talking to a program.
而这个程序可能并没有被设计成能处理这类任务。
And the program might not be set up to do those kinds of things.
而且,这和人类的情况完全一样。
And again, it's exactly like humans.
这几乎就像你竭尽全力去打造一个听起来像人类的程序,结果在这个过程中,它反而失去了进行算术运算的能力——仔细想想,这其实挺有意思的。
It's almost like you tried really hard to make a program that sounded human, and in the course of doing that, it lost the ability to do arithmetic, which is kind of interesting when you think about it.
但这也提醒我们,当你提到‘思考’或‘思维’时,你其实并不是在指单一的东西。
But it's also a reminder that when you say thought or thinking, you're not really referring to a single thing.
能将数字相加和能进行对话,这是两种截然不同的能力。
The ability to add numbers together and the ability to carry on a conversation, those are two very different abilities.
在设计程序或通过自然选择演化生物时,你可能会优先优化其中一种能力,而牺牲另一种。
And you might optimize for one over the other in building a program or evolving an organism through natural selection.
尽管如此,我们确实还是倾向于设定一套关于‘正确思考’的通用标准。
Nevertheless, we do sort of aim at a sort of standard set of standards for thinking correctly.
对吧?
Right?
我们在加数字时,希望得到正确的答案。
We want to get the right answer when we add numbers together.
我们希望用逻辑来解决给我们的谜题。
We want to logic our way through puzzles that we are given.
我们希望得出理性、合理的结论。
We want to reach rational, reasonable conclusions.
那么,你如何将这些整合起来呢?
So how do you sort of fit these together?
一方面,是我们这些理性生物所追求的纯粹逻辑与推理规则。
On the one hand, the pristine rules of logic and reasoning to which we aspire as thinking reasonable creatures.
另一方面,是我们心智、大脑和具身智能的现实,它首先在生物演化过程中被选择出多种不同的能力。
And on the other hand, the reality of our minds and our brains and our embodied intelligence, which has, number one, a whole bunch of different things that it was selected for over the course of biological time.
其次,还受到能量、燃料、时间等种种限制。
And number two, all sorts of constraints in terms of energy and fuel and time and things like that.
如果你拥有一种能够进行任意精准算术运算的人类大脑,这可能会削弱它在其他对生存更重要的能力上的表现。
If you had a brain, a human kind of brain that was able to do arbitrarily good arithmetic, that might make it worse at other things that were more important for survival.
因此,这个想法——对认知科学家来说是一个极其困难的问题——就是找出思维的法则,即一个在特定约束下运作的理想理性生物所遵循的规则。
So the idea, this is a very hard problem for cognitive scientists, the idea of coming up with the laws of thought, the laws that an ideally rational creature working under certain constraints would follow.
这些约束不仅包括资源有限,还包括你可能对事实并不确定。
Those constraints include not only the fact that there's finite resources, but also things like you might not be certain about the facts.
对吧?
Right?
你可能必须成为一个类似贝叶斯或近似贝叶斯的推理者,对某些事情为真的可能性赋予概率,然后学会更新这些概率等等。
You might have to be a kind of Bayesian or nearly Bayesian reasoner who has probabilities for certain things being true and then learn to update them and so forth.
因此,今天我们与认知科学家汤姆·格里菲斯讨论的,正是这种寻找思维法则的追求。
So the quest for these laws of thought is what we're talking about today with Tom Griffiths, who's a cognitive scientist.
他即将出版一本新书,猜猜叫什么?《思维的法则:探索心智的数学理论》。
He has a new book coming out called, guess what, The Laws of Thought: The Quest for a Mathematical Theory of the Mind.
正如我在对话中了解到的,他目前还有一本新书即将出版,某种程度上可以说是这本书的技术版,或者至少是它的配套读物。
As I learned in the conversation, he has another book coming out also right now, which is sort of the technical version of this in some sense, or at least a companion to it.
所以,思维法则的目标受众是所有人。
So the laws of thought is meant for everybody.
另一本书由合作者福尔克·莱德和弗雷德里克·卡洛韦共同撰写,书名为《认知资源的理性使用:一种新的非理性行为建模方法,用于模拟人类认知。》
The other book is with coauthors Falk Lieder and Frederick Callaway, and it's called The Rational Use of Cognitive Resources, a new approach to irrational behavior modeling human cognition.
通过思考这些问题,你学到的一部分是:人类某些看似非理性行为,如果与你所追求的完美逻辑法则相比,实际上都有其合理的原因。
Part of what you learn by thinking about these things is that certain ways in which human beings act irrationally, if you compare them to the perfect laws of logic that you might aspire to, actually have good reasons for them.
对吧?
Right?
我们之所以有某些倾向和偏见,背后都有其原因。
There are reasons why we have certain inclinations, certain biases, and so forth.
这些发现对人工智能领域意味着什么?
What does all that mean for the world of programming artificial intelligence?
我们应该让人工智能像人类一样思考吗?
Should we try to make the AIs think just like human beings?
有没有捷径?
Are there shortcuts?
我们是不是可以做些更好的事情?
Are there better things that we can do?
毕竟,人工智能也有其约束,但这些约束与人类的约束并不相同。
After all, the AIs also have constraints, but they're not the same constraints that human beings have.
因此,弄清楚认知为何如此运作,以及如何在不同类型的系统中更好地实现它,将成为近期的一个增长领域。
So figuring out why cognition is the way it is, and therefore how to implement it better in different kinds of systems, is gonna be a growth area in the near term.
我们就这么说吧。
Let's just put it that way.
那么,我们开始吧。
So let's go.
汤姆·格里菲斯,欢迎来到《心灵之声》播客。
Tom Griffiths, welcome to the Mindscape Podcast.
是的。
Yeah.
很高兴来到这里。
It's great to be here.
你知道,我是个物理学家,我常常开玩笑说,物理学适合注意力短暂的人,因为他们没法同时在脑海中保持太多复杂性。
You know, I'm a physicist, and I often sort of teasingly say that physics is good for people with short attention spans, no real ability to keep a lot of complexity in their minds all at once.
而做完物理学研究后,你就会得出物理学的定律。
And at the end of doing physics, you come up with the laws of physics.
作为一名认知科学家,研究着宇宙中我们所知的最复杂的事物,我们是否还有希望谈论所谓‘思维定律’这样的概念,就像你新书的标题那样?
As a cognitive scientist, as someone studying literally the most complex thing that we know about in the universe, is there any hope that we should even talk about something like laws of thought, the title of your new book?
是的。
Yeah.
我的意思是,我认为是有可能的。
I mean, I think there is.
一个有趣的事实是,那些最初投身于数学物理事业的人,原本也想用同样的方法来理解思维是如何运作的。
So, one interesting fact is that the people who set out to start that enterprise of mathematical physics had in mind doing the same thing for understanding how thought works.
对吧?
Right?
所以,你知道,最早期的哲学家,以及所谓的科学家,在科学诞生之初,就真的把数学视为理解外部世界和内部世界的工具。
So, you know, the very earliest philosophers and, you know, scientists, at the inception of science itself, were really thinking about mathematics as a tool for understanding both the external world and the internal world.
我们在笛卡尔、莱布尼茨以及许多其他人的论述中都能看到这种观点。
And we see that in Descartes and Leibniz and many of these people sort of talking about that idea.
因此,我认为只要我们仔细思考我们试图刻画的是什么,就有可能得出一套类似定律的结论。
And so I think it's possible for us to end up with something that looks like a set of laws as long as we think carefully about what it is that we're trying to characterize.
我们希望理解智能是如何运作的,而对此可以有多种不同的思考方式。
So, you know, we want to understand how intelligence works, and there are many different ways you could think about that.
你可以从抽象层面思考,比如支配智能的一般原则是什么;也可以从具体层面思考,比如大脑这样的系统是如何产生智能的。
You could think about it abstractly in terms of what are the sort of general principles that would govern intelligence and concretely in terms of, you know, how is it that things like brains work in order to produce intelligence.
在这些不同层次上,我认为我们都能指出一些普遍性的规律,这些规律或许具备被称为‘定律’的恰当特征。
And at those different levels, I think there are things that we can point to that are generalizations that, you know, maybe have the right character to be things that we could call laws.
我甚至不确定这算不算一个问题,但让我感到非常有趣的是,如果追溯到亚里士多德等古代思想家,你会发现亚里士多德大量讨论过世界——自然世界,而当时我们今天所说的物理世界和生物世界虽然有区分,但被视为一个连续体。
I'm not even sure if this is a question, but it is fascinating to me that if you go all the way back to, like, Aristotle, etcetera, Aristotle talked a lot about the natural world, and in the discussion of what we would call the physical world and the biological world, there were distinctions drawn, but it was kind of a continuum.
对吧?
Right?
他基本上试图在两种情况下都使用相同的概念。
Like, he kind of tried to use the same concepts in both cases.
我不确定在哪个节点上,这种策略不再流行了。
I don't know at what point that ceased to be a popular strategy.
是的。
Yeah.
而且我要指出的是,
And he I mean, it's worth pointing out.
他在许多方面可以说是我们的第一位认知科学家,因为他试图提出描述规律——虽然不清楚他当时关注的是论证还是思维,但他确实进行了早期的逻辑研究,为现代数学逻辑奠定了基础。
He was also, in many ways, our first cognitive scientist in terms of trying to come up with the laws that characterized, in his case, it's not clear whether he was focused on argument or thought, but doing some of the first work in logic that really provided the foundations for, you know, modern approaches to mathematical logic.
这就引出了一个问题:当我们谈到逻辑时,如果只使用‘思维规律’这个说法,是指思维应该遵循的规范性方式,即正确的思维方式,还是仅仅描述思维实际是如何运作的?
Well, which brings up the question, when we talk about logic, if we just use the phrase laws of thought, is this meant to refer to the normative ways that you should be thinking, the right way of thinking, or is it just a descriptive phrase of like, this is how thinking actually gets done?
对。
Yeah.
所以我认为,这让我们回到了关于我们在哪个层面上理解思维的问题。
So I think that gets us back to this idea of what levels we're trying to understand thought at.
对吧?
Right?
在认知科学中,根据大卫·马尔的研究,我们谈论可以应用于信息处理系统的不同分析层次。
So in cognitive science, following the work of David Marr, we talk about there being different levels of analysis that we can apply to information processing systems.
其中最抽象的是马尔所说的计算层次,即试图理解一个系统在解决什么问题,以及该问题的理想解决方案是什么。
And so the most abstract of those is what Marr called the computational level, which is trying to understand what it is that a system is doing in terms of what problem it's solving, and then what the ideal solution to that problem looks like.
这实际上是在探讨系统的功能和目标,以及实现这一特定功能的理想解决方案是什么。
And so that's really trying to ask a question about the function of the system and the goals of the system, and then what's an ideal solution for, you know, executing that particular function.
在此之下,是他所说的表征与算法层次,或称算法层次,更关注的是为了产生接近理想解决方案的结果,你可能实际参与的认知过程是什么。
And then below that, we have what he called the level of representation and algorithm or the algorithmic level, which is more about what are the actual cognitive processes that you could engage in in order to produce something which is maybe an approximation to that ideal solution.
再往下是实现层次,即这些过程如何在物理系统中实现。
And then below that is the level of implementation, which is how is that realized in a physical system.
对吧?
Right?
因此,对于人类来说,这体现在大脑中,对于计算机来说,则体现在硅芯片等介质中。
And so, you know, for humans that's in brains, for computers it's in silicon and so on.
因此,当我们谈论识别思维规律时,我认为最自然的思考层面就是最抽象的计算层次。
And so when we talk about identifying laws of thought, I think the most natural level to think about that is that most abstract computational level.
像逻辑和概率论这样的东西,凸显为我们可以用来说明心智必须解决的各种问题应该如何解决的原则。
And there, things like logic and probability theory stand out as the principles that we can use for saying how it is that you should be solving the kinds of problems that minds have to solve.
我知道我想退一步,谈谈莱布尼茨、逻辑以及类似的东西。
And I know I wanna sort of back up and talk about Leibniz and and logic and things like that.
但为了帮助听众了解我们接下来的讨论方向,你对‘思维的规律是什么’这个问题最简短的回答是什么?
But, you know, just to help the listeners with the road map of where we're going, what is your shortest answer to the question, so what are the laws of thought?
这是个很好的问题,我认为当我刚开始写这本书时,我写过一个引言,内容和我在认知科学课堂上通常说的一样:我们已经做了大量工作,试图理解心智是如何运作的。
So, it's a it's a great question, and I think it's something where, when I started out writing the book, I I wrote a sort of introduction that said what I normally say in my cognitive science classes, which is, you know, we've done a lot of work here trying to understand how minds work.
在很多方面,认知科学家所做的,是找到了更好提问的方式,但并不一定为我们提供了这些问题的答案。
And in many ways, what cognitive scientists have done is kinda figure out better ways of asking the questions that we want to ask without necessarily giving us answers to those questions.
但经过几年撰写这本书、反复修改,再加上外部世界也发生了一些变化,当我写到书的结尾时,我最终觉得,我们其实已经相当不错地刻画了这些抽象规律可能是什么样子。
But then having spent a few years working on the book and and writing it and having, you know, things in the external world having changed a little bit as well, By the time I got to the end of it, I I ended up feeling like, actually, we've done a pretty good job of characterizing what some of those sort of abstract laws might look like.
特别是,在计算层面,我认为很明显,我们应该关注的是逻辑和概率论。
So in particular, at that computational level, I think it's pretty clear that the things that we should be thinking about are things like logic and probability theory.
还有一些其他的东西,虽然我在书中没有详细讨论,但更多是关于这些理论如何转化为行动。
And then some sort of, you know, additional things that I don't really talk about in in detail in the book, but which are more about how that translates into action.
所以像决策理论、与之相关的强化学习,以及这些抽象原则的一些更实际的应用。
So things like decision theory and, associated with that, reinforcement learning, and, you know, some of these kinds of more practical applications of those abstract principles.
然后还有另一个层面,那就是这些原理如何转化为现实世界中可实现的物理系统,比如通过研究人工神经网络和人类大脑所得到的原则,对于理解智能同样重要,只是理解的方式不同。
And then there's another level, which is how does this translate into something which can be actually realized in a, you know, physical system in our world, right, where the principles that come from studying things like artificial neural networks and, you know, correspondingly, human brains are just as important to understanding intelligence, but understanding it in a different way.
对吧?
Right?
这更多不是关于‘我们为什么做这些事’,而是关于‘我们如何做这些事’,以及这些过程如何在物理系统中实现。
It's less the why do we do the things that we do and more the how do we do those things, and what's the way that that can be instantiated inside a physical system.
然后,在这两个层面之间,还有很多我们尚未解决的开放性问题。
And then there are lots of open questions that we have that kinda lie in the territory between those two things.
对吧?
Right?
那么,你如何构建能够用神经网络等工具实现我们抽象思维所设想的那种能力的系统呢?
So, you know, how is it that you can make systems that are able to do something like what we think abstractly thought should be like using things like neural networks and so on?
它们在哪些方面还存在不足?
Where is it that they fall short?
它们与人类思维所找到的解决方案有何不同?
How is it that they differ from the kinds of solutions that human minds find?
作为认知科学家,我们还有很多工作要做,但我认为那些最抽象的问题,我们已经有一些答案了。
So there's plenty for us to do as cognitive scientists, but I think those most abstract questions are ones that we have, you know, some resolution on.
但你有没有,我知道这可能是个极其不公平的问题。
But do you have I know this is probably an extremely unfair question.
有没有一套这样的定律清单?
Is there, like, a list of the laws?
比如在热力学中,我们有第一定律、第二定律之类的,还是说这些并没有被如此系统地归纳?
Like, in thermodynamics, we have the first law, the second law, things like that, or is that a little bit not quite codified like that?
并没有被如此系统地归纳。
It's not quite codified like that.
是的。
Yeah.
我的意思是,我认为我们仍然可以指出一些东西,比如贝叶斯定理,作为概率论的一个普遍原则,它帮助我们描述如何进行归纳推理。
I mean, I think there are things that we can point to, like, you know, Bayes' rule as a general principle of probability theory that allows us to describe how it is that we should go about making inductive inferences.
你可以把这看作是一种候选的法则。
And you could think about that as something that's a candidate kind of law.
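As a rough illustration of the rule being discussed here, the following is a minimal Python sketch of Bayes' rule over a small, discrete set of hypotheses. The function name `bayes_update` and the coin example are made up for illustration and are not taken from the book or the conversation.

```python
from fractions import Fraction

def bayes_update(prior, likelihood):
    """Return the posterior P(h | data) for each hypothesis h.

    prior:      dict mapping hypothesis -> P(h)
    likelihood: dict mapping hypothesis -> P(data | h)
    """
    # Unnormalized posterior: P(data | h) * P(h)
    joint = {h: prior[h] * likelihood[h] for h in prior}
    # P(data), the normalizing constant
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("data has zero probability under every hypothesis")
    return {h: joint[h] / evidence for h in joint}

# Hypothetical example: is a coin fair, or biased toward heads?
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

# Belief after observing a single head
posterior = bayes_update(prior, likelihood_heads)
```

Seeing one head shifts belief toward the biased coin without settling the question, which is the kind of graded, inductive inference the rule is meant to describe.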
然后是逻辑,类似的东西比如假言推理,一种特定的论证形式。
And then logic, there's sort of analogous things like modus ponens, a particular form of argument.
对吧?
Right?
如果p,那么q;p,因此q。
If p then q, p therefore q.
对吧?
Right?
这是一种对某种推理的描述,只要你能代入你的p和q,在任何情况下这种推理都是有效的。
That's a sort of description of a a kind of inference that it's valid to make in in any circumstance where you can substitute in your your p's and q's.
对吧?
Right?
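One way to make "valid in any circumstance" concrete is to check an argument form against every possible truth assignment. The sketch below does that in Python for modus ponens and, for contrast, for a similar-looking form that is not valid. The helper names `implies` and `is_valid` are illustrative assumptions, not anything from the book.

```python
from itertools import product

def implies(p, q):
    """Material conditional: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q

def is_valid(premises, conclusion, n_vars):
    """An argument form is valid if the conclusion holds in every
    truth assignment that makes all of the premises true."""
    for values in product([True, False], repeat=n_vars):
        if all(prem(*values) for prem in premises) and not conclusion(*values):
            return False
    return True

# Modus ponens: if p then q; p; therefore q.
modus_ponens = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: p],
    conclusion=lambda p, q: q,
    n_vars=2,
)

# For contrast, affirming the consequent: if p then q; q; therefore p.
affirming_consequent = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: q],
    conclusion=lambda p, q: p,
    n_vars=2,
)
```

The first form survives every assignment of truth values to p and q; the second fails when p is false and q is true.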
所以这些是具有法则特征的东西,但更准确地说,我们拥有这些数学系统,它们能帮助我们描述在不同情境下思考应该是什么样子。
And so those are things that are sort of appropriately law like, but it's more that we have these mathematical systems that allow us to characterize what it is, you know, thinking should look like in these different circumstances.
好的。
Okay.
你提到了概率论、贝叶斯定理之类的内容,我们稍后会谈到这些。
So you've mentioned probability theory, Bayes' theorem, things like that, and we will get there.
但我想先谈谈在那之前的情况,让大家明白这一转变有多么重要。
But I wanna sort of, you know, impress upon people how important that move was by first talking about what there was before that.
我的意思是,亚里士多德、莱布尼茨,甚至弗雷格和更现代的逻辑体系,都关注的是非真即假的事物,而不是概率。
I mean, Aristotle, Leibniz, even, you know, Frege and the more modern versions of logic, they were really about things that were either true or false, not just having probabilities.
对吧?
Right?
是的。
Yeah.
没错。
That's right.
所以,这本书开篇实际上是讲亚里士多德试图弄清楚什么构成了一个好论证。
So, really, the the place where the the book starts is with Aristotle trying to figure out what makes a good argument.
对吧?
Right?
亚里士多德是通过三段论来研究这个问题的。
And so Aristotle did that by thinking about syllogisms.
对吧?
Right?
这些是简单的论证形式,包含两个前提和一个结论,比如所有A都是B,所有B都是C,因此所有A都是C。
These sort of simple arguments where you'd have two premises and a conclusion, you know, where they're about sets of things like all a's are b's, all b's are c's, therefore all a's are c's.
对吧?
Right?
这是一种经典的三段论。
That's a sort of classic kind of syllogism.
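To make the set reading of this syllogism concrete, here is a small Python sketch that treats "all a's are b's" as the subset relation and searches a tiny universe for counterexamples. That reading, the helper names, and the three-element universe are assumptions made for illustration; an exhaustive search over a small universe is a sanity check, not a proof.

```python
from itertools import combinations, product

def all_subsets(universe):
    """Every subset of the universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def find_counterexample(premises, conclusion, universe):
    """Return sets (a, b, c) that satisfy every premise but not the
    conclusion, or None if no such sets exist in this universe."""
    subsets = all_subsets(universe)
    for a, b, c in product(subsets, repeat=3):
        if all(p(a, b, c) for p in premises) and not conclusion(a, b, c):
            return a, b, c
    return None

# All a's are b's, all b's are c's, therefore all a's are c's.
valid_form = find_counterexample(
    premises=[lambda a, b, c: a <= b, lambda a, b, c: b <= c],
    conclusion=lambda a, b, c: a <= c,
    universe={1, 2, 3},
)

# All a's are b's, all c's are b's, therefore all a's are c's.
invalid_form = find_counterexample(
    premises=[lambda a, b, c: a <= b, lambda a, b, c: c <= b],
    conclusion=lambda a, b, c: a <= c,
    universe={1, 2, 3},
)
```

No counterexample turns up for the first form, while the second form has one, which is the distinction between good and bad syllogisms that a theory has to account for.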
所以他首先进行了一些理论思考,试图找出什么是好的三段论。
And so he he did some theorizing about, first of all, trying to identify what are good syllogisms.
然后第二步,他试图阐明:究竟什么是构成一个好三段论的理论基础?
And then second, trying to say, you know, what's the theory of what makes a good syllogism?
对吧?
Right?
我们可以用哪些属性来观察这些好的三段论,并找出它们的共同点?
What are the properties that we can use to look at these good syllogisms and say what it is that they have in common?
这就是为什么我说他可能是第一个真正尝试更深入地构建关于良好论证或甚至良好思维理论的人。
And that's why I was saying he was maybe the first person to to really try and develop a little bit more of a a theory of of what good argument or maybe even good thinking might look like.
这之所以重要,是因为对于接下来试图真正形式化思维的莱布尼茨和布尔来说,他们所要做的就是形式化亚里士多德的思想。
And the reason why that was important is that for both Leibniz and Boole, who were the next people who who tried to actually formalize thought, what they were trying to do was formalize Aristotle.
因此,他们宣称自己成功建立了思维的数学理论,其证明就在于他们能够重现亚里士多德关于什么构成有效三段论的结论。
So their way of saying, oh, I've succeeded in coming up with a mathematical theory of thought, the proof of that was going to be that they could reproduce the conclusions that Aristotle had produced about what made something a a good syllogism or not.
于是,他们各自以略有不同的方式开始了这项工作。
And so they set out to do that each in slightly different ways.
莱布尼茨认为,算术就足够了。
Leibniz had this idea that arithmetic was gonna be enough.
你知道,那就是他所理解的数学。
You know, that was sort of what he had as math.
对吧?
Right?
他深刻地懂得如何做,你知道的,你可以自动化算术。
And he sort of knew deeply that, you know, you could automate arithmetic.
他确实制造过机械计算器。
He'd sort of built mechanical calculators.
所以,如果你能用算术来表达思想,那么就能让机器为你完成这些工作。
And so if you could express thought in terms of arithmetic, then it was going to be something that you could get a machine to do for you.
他对此有一套完整的构想,想象这一切如何运作。
And he sort of had this vision of how all of this would work.
但他在自己构建的系统中并没有完全成功。
But it didn't quite work out in the the system that he had.
一百年后,乔治·布尔出现了,他接受了不同于英国人的数学训练,几乎学遍了所有欧洲大陆的代数。
And then a hundred years later, George Boole came along, and he had a kind of mathematical training that was unusual for an Englishman, sort of having learned all of this continental algebra.
然后他意识到,哦,真正需要的并不是算术。
And then he recognized that, oh, it wasn't quite arithmetic that you needed.
这是一种稍微不同的代数。
It was a slightly different algebra.
然后,他利用自己引入的这种数学方法,证明了可以将亚里士多德的全部理论以数学方式重新演绎。
And then that became something where he was then able to show using that math that he'd introduced, you could you could actually take all of Aristotle and sort of start doing it mathematically.
而那个标题《思维法则》正是源于十九世纪的这一努力,当时布尔及其同时代人认为,是的,就像自然界有规律一样,思维也可能存在一套平行的法则。
And that title, The Laws of Thought, comes from that nineteenth century effort where Boole and his contemporaries were interested in this idea that, yeah, just like you have the laws of nature, you might have these parallel laws of thought.
这就是我开始追溯至现代的第一条线索。
So that's the first thread that I start to trace through to the modern day.
我对莱布尼茨有点好奇,因为我其实并不了解你提到的他那部分工作——也就是将思维形式化为算术的想法。
I'm kinda curious about Leibniz because I actually don't know about the aspect of his work that you're referring to, the idea of sort of formalizing thought as arithmetic.
一方面,我觉得这可能是现代心智计算理论的前身。
So on the one hand, I can see that it's maybe a precursor to modern computational theories of the mind.
另一方面,我完全不明白他在说什么。
On the other hand, I have no idea what he's talking about.
难道大量的思维活动跟算术毫无关系吗?
Like, doesn't a lot of thought have nothing to do with arithmetic?
他的目标究竟是什么?
What, like, what was his aspiration there?
是的。
Yeah.
这基于他写的一系列未发表的笔记。
So this is based on a a series of unpublished notes that he wrote.
这是一个非常有趣的案例,因为莱布尼茨显然是个天才,但这个问题他始终没有解决。
It's a really interesting case because, you know, Leibniz was clearly a genius, and this was a problem that he never solved.
但你能看到他如何一步步思考,因为他写了一系列笔记,这些笔记原本是他那本关于所谓‘普遍字符’的巨著的一部分。
But you get to see him working through this because there's this series of notes that he writes that are going to be part of his big book, which is about something he called the universal character.
对吧?
Right?
‘普遍字符’这个概念是说,你或许能够以某种方式记录事物,使得从中能清晰推导出其后果。
And the universal character was this idea that you might be able to write things down in such a way that it was clear what the consequences were that followed from those things.
因此,这个想法在当时他的同代人中也是普遍存在的。
And so this was an idea that, you know, his his contemporaries shared.
嗯嗯。
Mhmm.
在我写的书里,我提到了威尔金斯牧师,他发明了一种文字系统和相应的发音方式,使得说出虚假的话成为不可能。
In the book, I talk about the Reverend Wilkins, who came up with this script and corresponding way of pronouncing it in which it was impossible to say something which was false.
因此,你可以从‘鱼’这个词和‘猫’这个词中看出,鱼和猫是不同的东西。
So you could you could tell from the, you know, the word for fish and the word for cat that fish and cats were not the same things.
所以你永远不可能说‘鱼是猫’,因为你使用的符号或发出的声音本身就告诉你了这个物体在分类体系中的位置。
So you could never say a fish is a cat just because the symbols that you use or the sounds that you would make told you the location of this object within a taxonomy.
因此,人们曾有这样的想法:可以以某种方式系统化语言,从而使事物的真理性变得不言自明。
And so there was this idea that you could sort of systematize language in such a way that then, you know, the truth of things would become self-evident.
对莱布尼茨而言,他想更进一步,用数学来实现这一点。
And for Leibniz, he wanted to take that one step further and do that through math.
于是,他把亚里士多德的三段论取过来,为这些三段论中的每个术语,或者每组事物,都关联上一些数字。
And so what he did was take, you know, your Aristotelian syllogisms, and then with each term within those syllogisms or each, you know, set of things, you wanted to associate some numbers.
因此,我们可以认为他发明了向量嵌入,因为他有一个想法:为每个事物关联一串数字。
So we could credit him with inventing the vector embedding because he had this idea that there was a string of numbers that you would associate with each thing.
这种版本只需要两个数字。
One version of this is just two numbers.
抱歉。
Sorry.
向量嵌入是大型语言模型中用来
The vector embedding is what is used in large language models to
是的。
Yeah.
没错。
That's right.
所以它的理念是,你可以用一组数字来表示一个词。
So it's the idea that you can represent a word as a vector of numbers.
于是他有了这样的想法:如果有一组数字代表某个集合,另一组数字代表另一个集合,当第二组数字能整除第一组数字时,我们就可以说第二组包含在第一组中。
And so he had this idea then, okay, so maybe if you have these numbers for this set and then these numbers for this set, if the second set of numbers divides the first set of numbers, then we can say that the second set is contained within the first.
因此,他找到了一种思考方式,即如何利用算术、质数以及尝试让这套系统运作起来。
And so, you know, that sort of gave him a way of thinking about how you could use arithmetic and sort of doing things with prime numbers and trying to figure out how to make this work.
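The divisibility idea can be sketched in a few lines of Python. This is a simplified illustration with one number per term, not a reconstruction of the scheme in Leibniz's notes: the property names, the prime assignments, and the reading of "B's number divides A's number" as "every A is B" are all assumptions made for the example.

```python
# Hypothetical primitive properties, each tagged with a distinct prime.
PRIMES = {"animal": 2, "rational": 3, "winged": 5}

def characteristic_number(properties):
    """Multiply together the primes of the properties that define a term."""
    n = 1
    for prop in properties:
        n *= PRIMES[prop]
    return n

def every(a, b):
    """'Every A is B' is read as: B's number divides A's number."""
    return a % b == 0

animal = characteristic_number(["animal"])             # 2
human = characteristic_number(["animal", "rational"])  # 2 * 3 = 6
bird = characteristic_number(["animal", "winged"])     # 2 * 5 = 10

humans_are_animals = every(human, animal)  # 6 is divisible by 2
animals_are_humans = every(animal, human)  # 2 is not divisible by 6
birds_are_humans = every(bird, human)      # 10 is not divisible by 6
```

Because divisibility is transitive, this encoding handles the "all a's are b's, all b's are c's" pattern by pure arithmetic. It says nothing about the other forms of syllogism that, as described here, caused the scheme trouble.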
然后他这么做了,给各种术语分配了数值,接着展开了一番论证。
And then he does this and he sort of, like, assigns values to various terms and then runs an argument.
他心想,好吧。
He's like, okay.
有效。
It works.
接着他转向亚里士多德的下一个内容,然后说,哦,不行。
Then he goes on to the next one in Aristotle, and he's like, oh, no.
不奏效。
It doesn't work.
然后笔记就戛然而止了。
And then the notes the notes stop.
对吧?
Right?
哦,还有
Oh, and
然后接下来的笔记,过了一会儿,你知道,他又让另一个论点成立,如此继续,但他始终没能完全让整个体系运转起来。
then the next note, a little bit later, you know, he gets another argument to work and so on, but he never quite gets the whole thing to to work out.
是的。
Yeah.
我对这个完全不熟悉。
I I am completely unfamiliar with that.
这太惊人了。
That's amazing.
但我非常熟悉它的精神。
But I'm very familiar with the spirit of it.
很明显,莱布尼茨是个追求万有理论的人。
Like, clearly, Leibniz was a theory of everything guy.
对吧?
Right?
就是那个能解释整个宇宙的简单秘诀。
Like, the one simple trick that will explain the whole universe.
这一直是历史上思想家们屡屡陷入的陷阱。
This is this has been something that thinkers throughout history have fallen for.
是的。
Yeah.
但他也具有远见,意识到如果他能将这套体系发展为算术,那么思考就将成为机器可以为我们代劳的事情。
But but he was also, you know, I think visionary in recognizing that if he could get this to work out as arithmetic, then thought would be something that we could get machines to do for us.
对吧?
Right?
从那以后发生的一切,其实都是在寻找更优的数学方法来实现这一目标,结果就是,我们确实能让机器替我们完成一部分工作。
And then you can think about everything that's unfolded from there is really trying to find better kinds of math to to make this work with with the consequence that, yeah, we can get machines to do some of it on our behalf.
乔治·布尔是另一个非常有趣的人物,因为同样地,我对那段历史了解不够深入,但我觉得人们可能容易低估他的贡献,因为他只是提出:如果所有数字都只有零和一呢?
And then George Boole is another fascinating character because, again, I don't know enough about the history there, but I think it's maybe easy to underestimate his contribution because in some sense, he just says, what if all numbers were zero or one?
对吧?
Right?
如果一切事物都只是真或假,这正是我们唯一关心的呢?
Like, what if everything was just true or false, that's all we cared about?
这比那要稍微复杂一点。
It's, yeah, it's a little more complicated than that.
如果你仔细梳理一下,他写了两本书。
If you work your way through it... so, he wrote two books.
第一本篇幅很短,他是在一种充满远见的灵感状态下写成的。
The first one is quite short, and he wrote it in sort of like a visionary fit.
对吧?
Right?
他其实早在十几岁的时候,就曾在英格兰的一片田野里散步时产生了这个最初的构想。
So he actually had this first vision as a teenager, you know, wandering through a field in England.
哇。
Wow.
他当时有一种顿悟,甚至将这种想法归因于神启,那就是或许可以用代数来描述思维。
And he sort of has this moment, which he really attributed to a divine insight, which was the idea that maybe something like algebra could be used to describe thought.
然后他变得极其忙碌。
And then he was incredibly busy.
他创办了自己的学校,在大半生里同时担任教师和校长,一边管理学校,一边撰写那些属于当时最高水平的数学论文,尽管从未获得过大学职位,仍荣获了皇家学会的奖章。
He, you know, started a school of his own, and he was he was, you know, for most of his life, a teacher and headmaster and running the school at the same time as writing these mathematical papers that were then, you know, the highest level of mathematics and receiving a a medal from the Royal Society despite never having had a university affiliation.
因此,这种数学精神体现在他关于这一主题的第一本书中,这本书正是我所说的,试图将亚里士多德的思想转化为数学。
And so that mathematical spirit was expressed in his first book about this, which did exactly what I was saying about trying to turn Aristotle into math.
他为此提出了一套方法。
And he sort of like had a scheme for doing this.
后来,这套思想进一步发展成一部关于思维规律的长篇论著,上半部分讲的是概率论,下半部分是——抱歉。
Then it got developed further into this long treatise and investigation of the laws of thought, which the first half is about probability theory, and then the second half is about sorry.
上半部分讲的是逻辑,下半部分讲的是概率论。
The first half is about logic, and the second half is about probability theory.
在这本书中,他从亚里士多德出发,开始思考集合,以及如何用数学运算来表达集合之间的关系。
And in that book, the the place he starts because of Aristotle is with thinking about sets and how it is that you could think about expressing the relations between sets in terms of mathematical operations.
对吧?
Right?
如果x是一个集合,y是另一个集合,那么x乘以y就是我们现在所说的这两个集合的交集。
So if x is a a set and y is a set, then x times y is gonna be what we'd now call the the intersection of those sets.
对吧?
Right?
既是x又是y的东西。
The things that are both x and y.
他进而推导出了实现这一点的数学方法。
And he sort of works out the the math for doing that.
然后他进一步发展了这种数学方法,将其应用于推理。
And then he works out how to extend that math to arguments.
对吧?
Right?
所以,那些看起来更像我们所理解的逻辑陈述的东西,比如p为真、q为真等等。
So things that look a little more like what we think of as as as logical statements in terms of, you know, p is true and q is true and so on.
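[Editor's sketch] The "x times y" idea above can be illustrated with a few lines of Python. This is only a minimal sketch, not Boole's own notation: the universe of animals, the `indicator` helper, and the `times` function are all hypothetical names chosen for illustration. Each set is written as a list of 0s and 1s, and multiplying two such lists element by element leaves a 1 only where both sets contain the item, which is the intersection.

```python
# Illustrative sketch: sets as 0/1 indicator vectors over a small, made-up universe.
universe = ["sparrow", "penguin", "bat", "salmon"]

def indicator(members):
    """Return a 0/1 vector marking which elements of the universe are in the set."""
    return [1 if item in members else 0 for item in universe]

def times(x, y):
    """Boole-style product: elementwise multiplication of two indicator vectors."""
    return [a * b for a, b in zip(x, y)]

birds = indicator({"sparrow", "penguin"})
flyers = indicator({"sparrow", "bat"})

# "birds times flyers" keeps only the things that are both birds and flyers.
flying_birds = times(birds, flyers)
as_set = {item for item, bit in zip(universe, flying_birds) if bit}
```

The same arithmetic carries over to statements: if 1 stands for "true" and 0 for "false", then the product of two truth values is 1 only when both p and q are true. Multiplying a set by itself also returns the same set, since 0 and 1 are the only numbers equal to their own squares.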
在这个发展过程中,有没有人曾想过——我知道你已经提到过布尔讨论过概率论,但人们当时是怎么想的,比如:
And did any at any point along this development I I know you already mentioned that Boole talked about probability theory, but what what was the thought about people thinking like, okay.
这确实是逻辑,但现实世界中的人并不总是那么有逻辑的。
This is logic, but actual people in the actual world aren't very logical all the time.
就连布尔也思考过这个问题。
Even Boole thought about that.
所以我认为布尔真正有趣的一点是他某种程度上把自己视为心理学家。
So I think one thing that's really interesting about Boole was that he he kind of thought of himself as a psychologist.
至少有一处地方,他称自己为心理学家。
There's there's at least one place where he describes himself as a psychologist.
对。
Right.
我想是在他提名皇家学会时,他是这样自我描述的。
It's I think in his nomination to the Royal Society, that's how he self describes.
但他绝对不是一位经验主义心理学家。
But he was definitely not an empirical psychologist.
如果他算得上心理学家的话,那他也是一位非常理论化的心理学家。
He was a very theoretical kind of psychologist if he was a psychologist at all.
因此,他对自己的方法毫不掩饰,曾写道:我们没有必要去进行实验,试图找出思维的规律。
And so his his approach, which he's unapologetic about, he he writes saying, you know, there's there's no need for us to go off and do experiments and sort of figure out what are what are the laws of thought.
因为当我们把它们写下来时,我们会觉得这种方式思考显然是合理的。
Because when we write them down, it's sort of self evident to us that this is a good way of thinking.
我们能够识别什么是好的思考方式,然后用数学将其捕捉下来。
We can we can recognize what good thinking looks like, and then we can sort of capture that with mathematics.
因此,他与人类实际的行为方式拉开了距离。
And so he had distanced himself from whatever it actually is that humans do.
而这在二十世纪回过头来给认知科学家带来了麻烦,因为认知科学最初的兴起,正是源于认识到可以利用逻辑和计算机的工作方式,来构建关于人类头脑内部运作的理论。
And that was something that would come back and bite cognitive scientists in the twentieth century, right, as, you know, the the first kinda growth of cognitive science came out of recognizing that it was possible to use something like logic and something like what computers were doing as a way of generating theories about what could be going on inside people's heads.
他们一度大力推动这一思路,但随后开始意识到,这种方法并不能很好地解释许多现象。
And they ran with that for a little while, but then started to realize, oh, there were lots of things that it it didn't describe very well.
这为他们思考其他理论方法打开了大门。
That sort of opened the door to them thinking about other theoretical approaches.
这很可能是一种极大的简化。
Well, it's probably a huge oversimplification.
但认真对待我们对事物并不确定这一事实,即我们对某种信念是真是假赋予一定的概率,这个想法是有价值的。
But the idea of just taking seriously that we're not sure about things, that we do assign a certain probability to something being a true belief or a false belief.
这在我脑海中似乎是下一件大事。
That seems like the next big thing in my head.
是的。
Yeah.
那就是布尔著作的后半部分。
And that was the second half of Boole's book.
对吧?
Right?
它讨论的是不确定性推理。
It was talking about uncertain inference.
所以布尔对归纳和演绎都非常感兴趣。
So so Boole was very much interested in induction as much as deduction.
对吧?
Right?
演绎是从确定的事物推导出不确定的事物。
So deduction is reasoning from certain things to uncertain things.
抱歉。
Sorry.
演绎是从确定的事物推导到其他确定的事物。
Deduction is reasoning from certain things to other certain things.
而归纳是从我们所知道的事物中进行推理,这些事物可能不足以完全确定结论,但仍然基于此做出某种合理的推断。
And induction is reasoning from the things we know, which might not be enough to determine, you know, what the conclusions are, but still nonetheless sort of making some reasonable inference on that basis.
对吧?
Right?
因此,他从科学家如何发现周围世界原理的角度关注归纳,显然他们并不是在进行类似演绎的过程。
And so he he was interested in induction from the perspective of how it is that scientists figure out the principles of the world around them, where it was clear that they're not doing something like deduction.
对吧?
Right?
并不是简单地识别出一堆为真的事物,然后推导出其后果。
Not sort of like being able to identify a bunch of things that are true and deriving the consequences.
也许他们在理论构建时会做一点点这样的事。
Maybe they do a little bit of that when they're they're theorizing.
但提出理论本身,以及对世界做出一般性概括,甚至发现自然规律,都是如此。
But coming up with the theories themselves and coming up with a sort of generalizations about the world and even coming up with the laws of nature.
对吧?
Right?
是的。
Yeah.
这是一种归纳性的事业,他非常想理解这一点。
It's something that's an inductive enterprise, and he really wanted to understand that.
但这确实是一种归纳性的事业。
But it's an inductive enterprise.
让我们把这一点说清楚,因为我整整困惑了几十年。
Let's get this clear because it confused me for literally decades.
我的意思是,有一种逻辑意义上的归纳,它总能得出确定的结论。
I mean, there's a there's a logical kind of induction, which always does get you to certain answers.
对吧?
Right?
数学归纳法。
Mathematical induction.
但还有一种更非正式的归纳方式,你看到很多事情在发生。
But then there's this more informal kind of induction where you see, well, I see a lot of things happening.
也许这种情况一直都在发生。
Maybe that happens all the time.
是的。
Yeah.
没错。
That's right.
而且,同样地,十九世纪的人们试图厘清这些界限并弄清楚这一点。
And and, again, people in the nineteenth century were trying to draw those lines and figure that out.
对吧?
Right?
于是像查尔斯·桑德斯·皮尔士这样的人试图弄清楚:这就是演绎似乎运作的方式。
So you had people like Charles Sanders Peirce who is trying to work out, you know, here's here's the way that deduction seems to work.
让我们看看是否能为不同类型的归纳论证写出类似的模式。
Let's let's see if we can write sort of similar kinds of schemas for different kinds of inductive arguments.
他区分了归纳法,即观察事物的实例并推导出普遍规律,以及溯因法,没错。
He distinguished between induction, which is kind of seeing instances of things and then going to the general law, and abduction, which is... Right.
观察到某事发生,然后为其提出一个解释。
Seeing something happen and then coming up with an explanation for it.
我会把这两种都称为归纳推理。
I would call both of those inductive inferences.
但,好吧。
But but Okay.
你知道,它们都具有某种根本性的特征。
You know, they're they're both have this sort of, like, fundamental.
你得出的结论中存在某种不确定性。
There's something uncertain about the conclusion that you're reaching.
然后我认为,要真正开始运用概率论提供的解决方案来理解人们如何做出归纳推理,还花了一点时间。
And then I think it it took a little longer to to really start to be able to use the solutions to that that are offered by probability theory as a tool for understanding how it is that people make inductive inferences.
而且,这又是二十世纪的一种创新。
And, again, that's a kinda twentieth century innovation.
我非常推崇溯因推理。
I'm a huge fan of abduction.
我不确定我是不是喜欢得有点过头了,但在我看来,它最接近科学实际运作的方式。
I'm not sure if I'm more of a fan than I should be, but it it seems to me to be the closest to the way that science actually works.
对吧?
Right?
而且,它有时会被与最佳解释推论混淆,这其实是在承认:这并不是非黑即白、算法化的过程,但我们仍然在做某种独特合理的事情。
And and sometimes it's mixed up with inference to the best explanation, and it's sort of admitting that this is not clear cut and and algorithmic, but still there is something uniquely sensible that we're doing.
是的。
Yeah.
我认为这是对的。
I think that's right.
我的意思是,对于许多这类归纳推理,概率理论为我们提供了很好的描述,说明你应该如何做出这些推理。
I mean, I think for for many of these kinds of inductive inferences, probability theory gives us a good description of how it is that you should make those inferences.
但对于像溯因推理这样的情况,很多工作实际上是在人们如何真正做出这些推断的层面上完成的。
But for something like abduction, a lot of the work is actually done at the level of how people really do make those inferences.
对吧?
Right?
比如,其中一个挑战是你可能在提出一种前所未有的想法。
Like, one of the challenges is you're probably coming up with a kind of thing that no one's thought about before.
对吧?
Right?
你必须先提出一个假设,才能去思考这个假设。
You have to come up with a hypothesis in order to be able to entertain that hypothesis.
这属于一种算法层面的现象。
And that's something which is an algorithmic-level phenomenon.
对吧?
Right?
这关乎我们的大脑在做什么,而不是我们被告诉用数学如何去做。
It's something which is about something our brains are doing rather than something that we're told how to do using the math.
这其实是在跳到我们对话的未来,但我现在就想提一下。
And this is skipping way ahead to the future of our conversation, but let me just bring it in right now.
我觉得,像我们现在拥有的大型语言模型或某种机器学习算法,在合适的背景下完全有能力求解爱因斯坦的广义相对论方程,但它们很难产生那个最初的创造性时刻——比如提出‘引力是时空弯曲’这样的想法,因为它们只是被训练去学习已经发生过的事情。
Like, I have this feeling that a large language model or some machine learning algorithm that we have right now would be perfectly good at solving Einstein's equations of general relativity in the right context, but it would really struggle to have that first creative moment where it it suggested that gravity is the curvature of space time because simply because they are trained on things that have already happened.
这是不是我太以人类为中心了?还是你觉得这其中确实有道理?
Is this me being anthropocentric, or do you think that there's some truth there?
我觉得这确实有一定道理。
I I think there's some truth to that.
这无疑是一个开放性问题,即这些模型在多大程度上能够做出这类推断性思考。
It's certainly something which I think is an open question about the capacities of these models in terms of the extent to which they're able to make those sorts of extrapolative inferences.
我们预期它们能够做到这一点,前提是它们被训练的内容能为这种能力提供某种基础。
And we would expect that they could do so to the extent that, you know, again, what they've been trained to do provides some kind of, you know, infrastructure for being able to do that.
对吧?
Right?
因此我们发现,这些模型实际上在某些类型的创造性思维上表现得相当不错。
So we find that there are certain kinds of creative thinking that these models can actually do reasonably well.
比如,它们在提出简单的类比时,可能比人类更擅长。
Like, they're maybe better than people at coming up with simple kinds of analogies.
对吧?
Right?
是的。
Mhmm.
你可以把这看作是它们对语言有着极其精细理解的结果。
And you can think about that as being a consequence of having this very fine understanding of language.
对吧?
Right?
这对实现这一点很重要。
That's that's important for doing that.
但类比也是发现的重要组成部分。
But analogy is an important part of discovery too.
有时候,你通过意识到某个领域中的想法可以应用到另一个领域而做出发现。
Sometimes you make a discovery by recognizing that an idea from one domain applies in another domain.
所以我认为,情况不会像他们无法做到我们所认为的溯因推理那样简单。
And so I think it's not gonna be as simple as they're not able to do this thing that we think of as abductive inference.
我认为,他们会有一些擅长的事情,因为这些事情与他们试图完成的任务相契合,而另一些事情可能对他们来说更难,因为这些事情与他们的训练方向相冲突。
I think it's gonna be something where there's gonna be a sort of set of things that they're gonna be able to do well because they align with the kinds of tasks they're trying to do, and then a set of things that maybe are are harder for them because they push against that training.
是的。
Yeah.
好的。
Okay.
这确实有道理。
That does make sense.
回到概率和信念,我的播客听众经常听我谈论贝叶斯推理。
And back to the probabilities and the beliefs, my podcast listeners hear me talk about Bayesian reasoning all the time.
我们这里讨论的就是这个吗?
Is that what we're talking about here?
我知道贝叶斯本人只是……我印象中他的公式是死后才发表的吧?
And I know that Bayes himself just sort of I get was it even posthumously published his his formula?
所以他并不是这场讨论中的主要人物,但我们还是给他一些赞誉。
So he was not a big player in that discussion, but we give him some credit.
是的。
Yeah.
所以我们之前谈到了一条线索,对吧?那就是从莱布尼茨到布尔的逻辑线索。
So so we we talked about one thread, right, which is the the thread of logic through Leibniz and Boole.
而这本书实际上讲的是三种思维脉络。
And then the book is really about three threads of thinking.
对吧?
Right?
第一条是逻辑。
So one is logic.
第二条是神经网络的基础,我大致将其描述为空间、特征和网络。
One is sort of foundations of neural networks, which I sort of characterize in terms of spaces, features, networks.
对吧?
Right?
所以把思想看作是空间中的一个点。
So thinking about thoughts as corresponding to a point in space.
对吧?
Right?
而我们用来思考空间的一些数学工具,比如微积分等,也可以用来思考思想是如何运作的。
And some of the mathematics that we use for thinking about spaces, like calculus and so on, being a tool for then thinking about how thoughts work.
第三个线索是概率论,我认为它对解释思想运作的各个方面都非常有帮助,并且与其他线索相互补充。
And then the third thread is this thread of probability theory, which I think each of them is very helpful and sort of complementary to the other in in explaining various aspects of of how thinking works.
关于概率论,我在书中重点关注的是十八世纪的这种观点,即概率论可以应用于思想,这是一个非常激进的想法。
And so for probability theory, yeah, the the the origin that I focus on in the book is this sort of eighteenth century idea where really the the radical idea is that probability theory could be applied to thought.
对吧?
Right?
在那之前,人们一直在将概率论发展为一种数学理论。
So before that, people had been developing probability theory as a kind of mathematical theory.
嗯。
Mhmm.
这是一门应用于赌博游戏等事物的数学理论。
And it was a mathematical theory that applied to things like gambling games.
对吧?
Right?
因此,你在概率论的最早起源中就能看到这一点。
So you see this in the very earliest origins of probability theory.
最典型的例子就是吉罗拉莫·卡尔达诺,他是一位数学家,同时也是一位沉迷赌博的人。
It's, you know, the best example we have is like Girolamo Cardano, who is a mathematician, but also an addicted gambler.
是的。
Yep.
对吧?
Right?
因此,出于娱乐和财务方面的考虑,他非常想知道如何思考掷骰子或这类概率事件的结果。
And so he he really wants to know for his recreational, you know, and financial reasons how to think about the outcomes of rolling dice or these sort of probabilistic events.
于是,他推导出了如何计算这些结果的数学方法。
And so he works out the mathematics of how to do that.
在概率论的早期发展中,下一个重要时刻是布莱斯·帕斯卡做了类似的事情。
And the next sort of moment that we see in the origins of probability theory is Blaise Pascal doing something similar.
对吧?
Right?
他试图解决一个赌博问题,并由此发展出概率论的一些基础理论。
Sort of like trying to solve a sort of gambling problem and and from that developing some of the the foundations of probability theory.
但那是一种关于掷骰子会发生什么的理论,是的。
But that was a theory of, yeah, what happens when you roll dice.
没错。
Right.
对吧?
Right?
贝叶斯的创新在于提出:也许我们现有的这种数学体系。
And the the innovation that comes with Bayes was saying, well, maybe this mathematical system that we have.
对吧?
Right?
这种描述某个数学对象的公理体系,也同样适用于我们感兴趣的另一个事物。
This sort of set of axioms that characterizes some mathematical object also characterizes another thing that we're interested in.
这不仅仅是关于掷骰子的事情。
It's not just what's going on with dice.
它还关乎我们改变信念时,头脑内部发生的过程。
It's also what's going on inside our heads when we change our beliefs.
对吧?
Right?
所以他当时在思考一些受赌博启发的例子。
So he was thinking about some gambling inspired examples.
对吧?
Right?
比如,如果有一个彩票以某种比率兑奖,你该如何估算它的兑奖率?
So, you know, if there is a lottery which is paying off at some rate, how do you estimate the rate at which it's paying off?
但他设定这个问题的方式,是基于你对这个彩票所持有的信念。
But the way that he sets that up is in terms of the beliefs that you have about that lottery.
比如,基于你迄今为止看到的例子,你对它在下一刻兑奖的概率应该做出怎样的合理估计?
Like, what's the what's a reasonable estimate that you should have for the probability that it's gonna, you know, pay off at the next moment given the the examples that you've seen so far.
随后,皮埃尔·西蒙·德·拉普拉斯进一步发展了这一思想,他独立提出了这一观点,并深入探讨了这种思考方式对更新我们信念的所有后果。
And then that idea is developed further by Pierre-Simon Laplace, who really sort of came up with it independently and really, you know, worked out all of the consequences of that way of thinking about how to update our beliefs.
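[Editor's sketch] One standard way to make the lottery question concrete is the formula usually credited to Laplace and known as the rule of succession. The transcript does not name this rule, so treat it as an illustration rather than a summary of what was said. It assumes you start with no preference among payoff rates (a uniform prior between 0 and 1) and that each draw is independent; the function name and arguments below are hypothetical.

```python
from fractions import Fraction

def next_payoff_probability(payoffs, draws):
    """Estimate the chance that the next draw pays off.

    Assumes a uniform prior over the unknown payoff rate. After seeing
    `payoffs` wins in `draws` independent draws, the posterior mean of
    the rate is (payoffs + 1) / (draws + 2).
    """
    if not 0 <= payoffs <= draws:
        raise ValueError("need 0 <= payoffs <= draws")
    return Fraction(payoffs + 1, draws + 2)

# Before any observations the estimate is an even chance.
no_data = next_payoff_probability(0, 0)
# After 3 payoffs in 10 draws the estimate moves toward the observed rate.
some_data = next_payoff_probability(3, 10)
```

The point of the sketch is that the estimate is a statement about belief: it starts at one half with no evidence and shifts gradually as examples accumulate, never jumping to exactly 0 or 1.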
所以,是否可以大致说,在贝叶斯和拉普拉斯之前,人们可以赌博。
So would it be fair to say that, at least roughly speaking, pre Bayes and Laplace, you could gamble.
当时存在不确定性之类的问题。
There could be uncertainties and things like that.
但在思考时,人们被认为要么对,要么错。
But when it came to thinking, you were supposed to be right or wrong.
而在他们之后,你可以说:我对自己的信念有一定的信心程度。
And after them, you could say, well, I have a certain degree of confidence in my beliefs.
我认为,在贝叶斯和拉普拉斯把这一理论推导出来的时候,确实出现了一个明确的转折点。
You know, I think there's a discrete point that happens with Bayes and Laplace in terms of working out that theory.
但在那之前,你确实能看到一些人用这种说法进行讨论。
But you do see hints of people talking in those terms before that.
对吧?
Right?
所以就连威尔金斯,那个提出了一种你永远不会说错话的语言理念的人,也把概率当作表达信念程度的一种方式。
So even Wilkins, you know, who who came up with this sort of idea of, you know, the the language that you could use where you could never say anything false, talked about probability as a way of talking about a degree of belief.
他并没有深入推导出数学上的后果,但他确实使用了这种语言。
He didn't sort of work out the mathematical consequences, but he sort of used that language.
帕斯卡在离开数学界、转而投身宗教思考后,曾做过一个著名的概率论证,这其实也是一个关于信念的论证。
And Pascal famously, you know, after he departed the world of mathematics and and instead started to think about religion, made a probabilistic argument in that setting, which is really an argument about belief as well.
所以你能看到很多这样的迹象,
So you see lots of hints of this... Good.
早在贝叶斯和拉普拉斯之前就有了,但我认为正是他们真正把它发展成了一种我们可以称之为思维的理论。
Prior to Bayes and Laplace, but I think they're the ones who really developed that into a theory of what we could call thought.
让我们深入探讨一下这种我们可以称之为思维的理论。
So let's dig into that theory of what we could call thought.
我们拥有一些信念。
We have some beliefs.
它们是概率性的。
They're probabilistic.
你该如何像布尔所建议的那样,用对待真假命题的方式来推理它们?
How do you reason with them in the way that Boole would have had us reasoning with true and false statements?
我认为贝叶斯概率最酷的地方在于,一种非常自然的理解方式就是将其视为逻辑的延伸。
I I think the really cool thing about Bayesian probability is that one very natural way to see it is just an extension of logic.
对吧?
Right?
在逻辑中,我们谈论可能的世界。
So in logic, we talk about possible worlds.
对吧?
Right?
如果你有两个命题 p 和 q,你可以想象所有你可能身处的世界。
So if you have two propositions, p and q, you can imagine all of the possible worlds that you could be in.
你可能处于 p 为真且 q 为真的世界,p 为真而 q 为假的世界,p 为假而 q 为真的世界,以及 p 和 q 都为假的世界。
You could be in a world where p is true and q is true, a world where p is true and q is false, a world where, you know, p is false and q is true, and a world where both p and q are false.
对吧?
Right?
这些就是我们可能身处的可能世界,而逻辑真正关注的是,基于你对自身所处世界的了解,你能确定地得出哪些结论。
Those are the possible worlds we could live in. And logic is really about what conclusions you can draw with certainty based on the information you have about what world you might be in.
对吧?
Right?
所以,如果你掌握了足够的信息,能够排除某些可能世界,从而确定你所处的世界必然是q为真的那个,那么得出q为真的结论就是合理的。
So if you have got enough information to rule out some of those possible worlds such that it has to be the case that the world you're in is one where q is true, then it's reasonable to conclude that q is true.
对吧?
Right?
因此,你所熟知的经典逻辑论证,其实就是告诉你:你所掌握的信息对可能世界施加了足够的约束,使得你可以合理地得出关于这个世界的结论。
And so your classic logical arguments are arguments that are telling you, oh, the information you have gives you enough constraints on the world that you might be in that this is a reasonable conclusion that you can draw about that world.
概率论更进一步,它为这些可能世界赋予了一个数值。
Probability theory takes one more step, which is to say, for those possible worlds, we're also going to assign a number to them.
这个数值反映了我们对这个世界为真的信念程度或概率。
And that number reflects our degree of belief about the the probability that that world is true.
一旦你这么做了,并开始遵循概率论的规则,你就会根据到目前为止所掌握的信息,更新你对当前可能处于哪个世界的信念。
And then as soon as you do that and you start following the rules of probability theory, you're then updating your beliefs about, you know, what world is it that we're likely to be in based on the information that I've got so far.
因此,逻辑论证仍然有效。
And so the logical arguments still work.
对吧?
Right?
所以,你可以把逻辑论证理解为告诉你某事以概率一为真。
So, you know, you can think about a logical argument as telling you that something is true with probability one.
也就是说,你必然处于一个该事物为真的世界中。
That with certainty, it has to be the case that you're in a world where this thing is true.
但概率论对此进行了扩展,允许我们说:我们没有足够的信息来确定某事必然为真。
But it's generalized by probability theory and allowing us to say, oh, we don't have enough information to determine that this thing is true, you know, with certainty.
我们可以说:这件事有70%的可能性为真。
We can say, oh, well, there's like a 70% chance this thing is true.
但核心理念是一样的:随着你获得更多信息,你可能会排除一些可能的世界。
But the same kinda idea of as you get information, you're maybe ruling out some possible worlds.
也许,正因为如此,你正在改变你所处的其他世界的概率,或者你获得了新的信息,从而改变了你认为自己处于某个世界而非另一个世界的可能性。
Maybe, you know, and as a consequence of that, you're changing the probabilities of the other worlds that you could be in or you're getting information that that changes the chances that you think you're in one world or another.
然后你就只是在做类似的事情。
And then you're just doing the same kind of thing.
只不过你现在是以一种更加细致入微的方式在做这件事。
It's just that you're now doing it in this much more graded way.
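[Editor's sketch] The possible-worlds picture above can be written out directly. This is a minimal sketch under stated assumptions: the four prior numbers are invented, and `update` and `probability` are hypothetical helper names. Evidence is modelled as a test that each world either passes or fails; worlds that fail are ruled out and the remaining probabilities are rescaled to sum to one.

```python
from itertools import product

# Each world assigns a truth value to p and q; the prior numbers are made up.
worlds = list(product([True, False], repeat=2))  # (p, q) pairs
prior = dict(zip(worlds, [0.4, 0.3, 0.2, 0.1]))

def update(belief, consistent):
    """Drop worlds ruled out by the evidence and renormalize the rest."""
    kept = {w: pr for w, pr in belief.items() if consistent(w)}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("evidence rules out every world")
    return {w: pr / total for w, pr in kept.items()}

def probability(belief, statement):
    """Total probability of the worlds in which the statement holds."""
    return sum(pr for w, pr in belief.items() if statement(w))

def q_is_true(world):
    return world[1]

# Graded case: learning that p is true still leaves q uncertain.
after_p = update(prior, lambda w: w[0])
# Logical case: also learning "if p then q" leaves only worlds where q is true.
after_rule = update(after_p, lambda w: (not w[0]) or w[1])
```

With these numbers, learning p alone puts the probability of q at 4/7, while adding "if p then q" pushes it to one. The classical deduction shows up as the special case where every surviving world agrees.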
就像我们可以质疑人们在多大程度上符合传统的逻辑一样。
And just like we can question how conventionally logical people are.
我们也可以质疑他们在新数据到来时,更新信念的能力有多强。
We can also question how good they are at updating their beliefs when new data come in.
我的意思是,我们是否应该把完美的贝叶斯推理再次视为思维法则的一种理想目标,还是说它本意是描述人们实际的思考方式?
I mean so maybe should we think about perfect Bayesian reasoning as once again aspirational when it comes to laws of thought, or is this meant to be a description of how people actually think?
不。
No.
它是一种在抽象计算层面上的工具,用来说明:我们究竟应该怎么做?
It's it's a it's a tool at that abstract computational level of saying, what is it that we should be doing?
是的
Yeah.
对吧?
Right?
我们心智所面临问题的理想解决方案是什么?
What's the solution the ideal solution to the problem that that our minds face?
因此,当我们的思维面对归纳问题时,概率理论告诉我们这些问题的理想解决方案是什么样子。
And so when our minds face inductive problems, probability theory tells us what the ideal solution to those problems look like.
当然,有很多方式表明这与人们实际的行为并不一致。
And, of course, there's lots of ways that that doesn't line up with the things that people actually do.
在二十世纪,我们开始探索这些差异。
And in the twentieth century, we started to explore those.
其中一些,我认为可以从这种贝叶斯视角来解释,但我们需要考虑人们实际上在做些与被指示内容略有不同的事情。
Some of those, I think, are things that we can we can explain from this sort of Bayesian perspective, but where we have to think about people doing something slightly different from the thing that they've been told to do.
嗯
Mhmm.
我可以再多谈一点这个。
I can talk a little more about that.
而其中一些则处于这些不同分析层次之间。
And then some of them are in between these different levels of analysis.
对吧?
Right?
所以我们有抽象计算层面,即你该做什么?
So we have that abstract computation level, what should you be doing?
然后是算法层面,即什么是接近目标的好策略?
And then there's the algorithmic level, which is what's a good strategy for trying to get close to that?
你可以在这一层提出一个问题,即在给定资源限制的情况下,你能做到的最好程度是什么。
And you can ask a question at that level, which is about, you know, what's the best that you could do at trying to do the thing you're supposed to be doing with particular constraints on the resources that are available to you.
对。
Right.
当我们明确这些限制条件时,实际上就能推导出其具体表现形式。
And then when we characterize what those constraints are, we can actually work out what that looks like.
在许多情况下,这实际上让我们对人们一些看似奇怪的行为有了更深的理解。
And and in many cases, that actually gives us insight into some of the, you know, what might seem like strange things that people do.
我在这方面做了很多研究,我们几周后将出版一本关于这个理念的书,我们称之为资源理性。
And that's something that I've done a bunch of work on, and we actually have a book coming out in a few weeks, which is about that idea of, we call it resource rationality.
这本书探讨了如何在认识到这些约束后,改变我们对理性的理解。
And the the book explores how to change our notions of rationality as a consequence of recognizing those kinds of constraints.
所以你有两本书将在一个月内相继出版吗?
So you have two books coming out, like, within a month of each other?
这真的很尴尬。
It's it's very awkward.
它们的出版时间只相隔一周。
They're coming out within a week of one another.
其中一本面向普通读者,探讨关于思维规律的这些理念。
So where one one is, you know, more for a general audience and sort of exploring these kinds of ideas about the laws of thought.
另一本则更学术一些,专注于我们该如何利用有限的认知资源。
And then the other is a slightly more academic book, which is focused on this idea of what we should do with our limited cognitive resources.
你可以同时为两本书签名,举办签售会。
You should you can still do book signings where you sign both at once.
这样也没问题。
That's okay.
这完全说得通。
That makes perfect sense.
换句话说,我们能不能简化一下:我们希望用贝叶斯定理来更新我们的信念,但这很难,而且需要大量资源。
So in other words, can we can we sort of simplify it down to we would like to use Bayes' theorem to update our beliefs, but that's hard, and that takes a lot of resources.
所以,进化和生物学赋予了我们某些捷径吗?
So evolution and biology have equipped us with certain shortcuts?
我认为我可以这样理解:是的。
I think the the way that I would think about it is yeah.
我认为人类认知的一个有趣悖论在于,从心理学家的角度看,我们是容易出错的决策者,会使用这些启发式方法导致偏见。
I think it's that one of the interesting paradoxes of human cognition is that we are both, you know, from the perspective of a psychologist, error prone decision makers who sort of use these heuristics to result in biases.
但从计算机科学家的角度看,我们又是那些具备我们希望AI系统具备的能力的理想型主体。
And from the perspective of computer scientists, these aspirational agents that are doing the kinds of things that we'd like our AI systems to do.
对吧?
Right?
因此,如果你想解决这个悖论,我的做法是说:我们擅长利用现有的资源解决所面临的问题。
And so if you wanna resolve that paradox, the way that I resolve it is to say, we are, you know, good at solving the kinds of problems that we face with the resources that we have.
对吧?
Right?
你可以将这视为不同形式适应的结果。
And you can think about that as being the consequences of the different kinds of adaptation.
包括进化和学习。
So evolution as well as learning.
对吧?
Right?
因此,在我们的一生中,学会更有效地利用我们的认知资源。
So over the course of our lifetime, learning to use our cognitive resources better.
对吧?
Right?
还包括进行某种规划,或者我们称之为元推理,即思考如何恰当地应对我们试图解决的不同类型的问题。
As well as just sort of engaging in some planning or we call it meta reasoning about how to appropriately approach different kinds of problems that we're trying to solve.
我们之前曾邀请卡尔·弗里斯顿做客播客,他谈到了自由能原理。
We did have Karl Friston on the podcast some time ago talking about the free energy principle.
这是否是一个例子,说明大脑正在以一种高效的方式解决这些难题?
Is that an example of, you know, the brain trying to solve these hard problems in an efficient way?
这是个好问题。
That's a good question.
我还没有从这个角度思考过。
I haven't thought about it in those terms.
我认为他设定这个框架时,更多是从系统所追求的一种目标出发,而不是从资源限制的角度来看。
The I think the way that he sets that up is more in terms of a a kind of objective that the system has rather than in terms of a resource constraint.
而我们的方式则更明确地指出:如果我们想重新定义理性,使其适用于具有有限计算资源的智能体,
And the way that we think about it is a little more explicitly saying, if you are if we want to redefine what rationality is in a way that works for agents with finite computational resources.
这借鉴了人工智能领域斯图尔特·罗素和埃里克·霍维茨的一个观点。
This is drawing on an idea from from the AI literature from Stuart Russell and Eric Horvitz.
对于一个有限资源的智能体来说,定义理性的方式更侧重于:不再关注概率理论告诉你应该采取的行动,而是使用最佳算法来选择你要采取的行动。
The way to define what rationality is for a bounded agent is more in terms of, instead of focusing on taking the action that probability theory and so on tells you you should take, using the best algorithm to choose the action that you're going to take.
这就相当于提升了一个抽象层次。
And so it's it's sort of popping up a level of abstraction
好的。
Okay.
从元层面思考理性,将其作为一种工具,用以确定在具体层面——即你所采取的行动中——如何合理运用你的认知资源。我之前写的《算法来生活》一书,实际上是对这些理念的一个很好的大众化解读。
In terms of thinking about, you know, defining rationality at that meta level as a tool for then generating, you know, what are the sort of appropriate ways of using your cognitive resources at the object level, the actions you take in the... And my previous book, Algorithms to Live By, is actually a pretty good sort of general audience treatment of those ideas.
我们当时并没有用资源理性这一框架来表述,但它核心讲的是:在某种程度上,计算机科学为理性提供了更好的指引。
We we didn't express it in terms of this framework of resource rationality, but it's really about the idea that, you know, in some ways, computer science provides a better guide Right.
比经济学更能指引理性,比如那种基于概率和回报的思维方式。
To rationality than than, you know, economics, right, sort of thinking in terms of probabilities and rewards.
我的意思是,我以前从未这样想过,但当你在编写计算机程序时,资源限制是显而易见的。
I mean, I guess I've never really thought about it this way, but when you are computer programming, the resource limitations are obvious.
你有有限的时间、有限的内存,还有有限的各种数据。
You have a certain amount of time, certain amount of memory, right, a certain amount of whatever data.
但当然,同样的道理也适用于人脑,我们可以想象理想化的思维,但真正实现这些思维可能会消耗过多的资源。
But, of course, the same things are gonna apply to human brains where we can we can imagine ideal thoughts, but actually having them is something that's gonna be probably too resource intensive.
是的。
Yep.
对。
Yep.
我认为这就是我思考这个问题的原因。
I think that's a reason why I think about it.
那么,你刚才提到过一点,但到底我们应该怎么做呢?
And so what so what is you I mean, you said a little bit about this, but what do we do?
我们的策略是什么?
What are our strategies?
生活在一个不完美的世界里,不可能时刻都做到精确的贝叶斯推理,人类实际使用了哪些捷径?
Living in an imperfect world, not being able to be exact Bayesians at all times, do we have what are the shortcuts that actual human people use?
是的。
Yeah.
所以我们使用的一些策略,就是人们所识别出的启发式方法。
So some of the the kinds of strategies we use are the kinds of things people have identified as, you know, heuristics.
对吧?
Right?
启发式方法就是指一种经验法则或解决问题的捷径。
Heuristic just means a rule of thumb or a shortcut for solving a problem.
我认为,从资源合理性角度重新分析这些启发式方法的价值在于,能够说明使用这些启发式方法并不一定是坏事。
I think part of what is valuable about reanalyzing those from the perspective of, you know, resource rationality is is being able to say that using those heuristics isn't necessarily a bad thing.
对吧?
Right?
而这些启发式方法带来的偏见,也可能并非在你现有的认知资源条件下能够避免的。
And the biases that come from those might not necessarily be things that you can avoid given the cognitive resources that you're operating with.
因此,我们可以问一个问题:我们是否在现有认知资源条件下做到了最好?
So instead, we can ask a question like, are we sort of doing the best job we could with the cognitive resources that we have?
然后,我们是否可以通过在特定情境下使用不同的启发式方法,来缓解这些偏见?
And then is there a way that we could mitigate those biases by maybe using a different heuristic in a particular setting or something like that?
因此,我们在资源理性研究中关注的策略包括用于近似贝叶斯推理的采样策略。
And so the kinds of strategies that we focus on in our work on resource rationality are things like sampling strategies for approximating Bayesian inference.
因此,你不必考虑整个概率分布,而只需考虑少数几个样本、少数几种可能的情况。
So instead of thinking about a whole probability distribution, you might think about a few samples, a few possible instances.
对吧?
Right?
当你做决定时,不必考虑所有可能的结果,而只需考虑少数几种可能的结果。
Instead of when you're making a decision considering all of the possible outcomes, you might think about a few possible outcomes.
对吧?
Right?
诸如此类的方法。
And things like that.
我们可以观察人们在这样做时所使用的策略——当他们没有考虑所有可能性,而只考虑部分可能性时,这些策略是否符合在有限资源条件下应考虑的最优选择。
And we can look at the kinds of strategies that people seem to use when they do that. When they don't consider all the possibilities, are the ones that they're considering the ones that they should be considering from the perspective of using limited resources?
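[Editor's sketch] The sampling idea above can be shown with a toy decision. This is only an illustration under stated assumptions, not the method used in the research being discussed: the gamble, its payoffs and probabilities, and both function names are invented. The exact calculation weighs every outcome, while the sampled version looks at only a handful of imagined outcomes and averages them.

```python
import random

# Hypothetical gamble as (payoff, probability) pairs; the numbers are invented.
outcomes = [(10.0, 0.2), (2.0, 0.5), (-5.0, 0.3)]

def exact_expected_value(dist):
    """Weigh every possible outcome by its probability."""
    return sum(value * prob for value, prob in dist)

def sampled_expected_value(dist, n_samples, rng):
    """Approximate the expected value by averaging a few imagined outcomes."""
    values = [value for value, _ in dist]
    weights = [prob for _, prob in dist]
    draws = rng.choices(values, weights=weights, k=n_samples)
    return sum(draws) / n_samples

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
exact = exact_expected_value(outcomes)
rough = sampled_expected_value(outcomes, 5, rng)         # cheap, noisy
careful = sampled_expected_value(outcomes, 100_000, rng)  # costly, close to exact
```

The trade-off is the one described in the conversation: a few samples cost little but can be well off the mark, and more samples buy accuracy at the price of more computation.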
我们思考的其他方面包括设定目标,或者子目标。
And then the other kinds of things that we think about are things like setting goals, right, or sub goals.
所以我认为,从认知研究的历史角度来看,艾伦·纽厄尔和赫伯特·西蒙这样的人提出了这样一个观点:当我们解决复杂问题时,需要将这些问题分解成若干部分。
So I think from the perspective of the sort of history of thinking about cognition, folks like Allen Newell and Herb Simon sort of introduced this idea of, you know, we can think about when we're solving challenging problems, we need to decompose those problems into parts.
而聪明的一部分表现,就在于能够以这种方式进行分解。
And part of what it is to be smart is to be able to decompose them in those ways.
但在很多方面,这种分解问题和设定目标的能力,实际上是资源受限的必然结果。
But in many ways, that ability to kinda break down problems and set goals is really a consequence of a resource constraint.
对吧?
Right?
所以,如果你拥有无限的认知资源,就根本不需要设定目标,因为你可以直接推理出你所做选择最终会导致的所有结果。
So, you know, if you had infinite cognitive resources, you would never need to set goals because you can just reason all the way to the end of the trajectory of, you know, whatever is gonna, arise from from the the choices that you're going to make.
因此,设定目标和子目标,是一种在认知资源有限的情况下推进问题解决的工具。
And so setting goals and sub goals and so on is a tool for being able to make progress on problems with finite cognitive resources.
那么我们可以进一步问:哪些目标是好的目标?
And then we can ask, what are the good goals to set?
从资源理性角度来看,什么样的问题结构才是理想的?
What's a good structure to to sort of give to to a problem from that perspective of resource rationality?
抱歉。
So sorry.
从某种意义上说,我们每个人都有一个目标,可以理想化为过上最好的生活。
In some sense, we all have a goal that could be idealized as live the best possible life.
但你说,作为一种实现这一目标的策略,这并不真正合理。
But you're saying that as a strategy for getting there, that's not really reasonable.
我们不可能像《复仇者联盟》中的奇异博士那样,穿越多元宇宙的每一个可能路径,我们必须设定一些中间目标,以获得一条大致不错的前进轨迹。
We can't actually be Doctor Strange in the Avengers and go through every possible part of the multiverse; we have to sort of have subgoals along the way that give us an approximately pretty good trajectory.
是的。
Yeah.
然后你可以问,显然你不希望你的子目标太近。
And then you can ask, you know, you obviously don't wanna make your subgoal too close.
对吧?
Right?
你也不希望它太远。
And you don't wanna make it too far away.
因此,人们应该在哪里设定他们的子目标,以及他们是否很好地设定了这些子目标,这个问题我们可以从资源理性的角度来探讨。
And so the question of where people should set their sub goals, and whether they do a good job of setting those sub goals, is a question that we can engage with from that perspective of resource rationality.
总会有这样一个问题,即贝叶斯推理:因为整体图景是你有一些先验概率。
There's always this question in Bayesian reasoning, because the whole picture is you have some prior probabilities.
你获得了一些数据。
You get some data.
你计算出一个似然函数。
You calculate a likelihood function.
你更新了你的先验。
You update your priors.
但那么,这些先验又是从何而来的呢?
But so then where did the priors come from?
这个问题在这里是否也涉及其中?
Is that question involved here?
也就是说,人类究竟从哪里获得对不同命题可能性的初步直觉?
Like, where do human beings actually have their rough feelings about the plausibility of different propositions?
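The prior, data, likelihood, update loop described above can be sketched in a few lines of Python. This is only an illustrative sketch: the two hypotheses ("fair" and "biased" coin) and all of the numbers are made-up assumptions, not anything stated in the conversation.

```python
def bayes_update(prior, likelihood):
    """Combine a prior with per-hypothesis likelihoods and renormalize."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # total probability of the data
    return {h: p / evidence for h, p in unnormalized.items()}

# Hypothetical prior: we start out fairly sure the coin is fair.
prior = {"fair": 0.9, "biased": 0.1}

# Hypothetical likelihood of seeing "heads" under each hypothesis.
likelihood_heads = {"fair": 0.5, "biased": 0.9}

# Observe three heads in a row, updating after each observation.
posterior = dict(prior)
for _ in range(3):
    posterior = bayes_update(posterior, likelihood_heads)

print(posterior)
```

The posterior after each observation becomes the prior for the next one, which is why the question of where the very first prior comes from does not go away.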
是的
Yeah.
所以我认为这是一个非常深刻的问题,而且我认为可以从不同角度来提出这个问题。
So this is, I think, a very deep question, and I think there's different ways that you can ask it.
对吧?
Right?
所以,有一种思考方式是,当你把贝叶斯法则视为描述归纳问题理想解决方案的工具时,这种描述既适用于你在感知中做出的推断,比如当你试图解读落在视网膜上的光线时。
So, there's one way of thinking about it. When you think about Bayes' rule as our tool for describing what ideal solutions to inductive problems look like, that characterization applies both to, you know, an inference that you might make in perception when you're trying to interpret the light that's falling on your retina.
你的大脑必须做一些看起来像是归纳推断的事情,才能弄清楚外部世界的结构,比如解读别人说的一句话,你听到或感受到的声音,会转化为对对方所说内容及其含义的推断。
Your brain has to do something that looks like an inductive inference, right, to figure out the structure of the world out there, to interpreting a sentence that somebody says, right, where you're taking the words that you hear or the, you know, the sound that's hitting your eardrum and sort of turning that into an inference about what it is that the person said and maybe what they meant.
但同时也适用于一些根本性的问题,比如我们最初是如何学习语言的?
But also to, you know, like, fundamental things like how do we learn language in the first place?
而且
And
大脑是如何学会解读我们周围物理世界的结构的?
how is it that brains, you know, come to be able to interpret the structure of the physical world around us?
对吧?
Right?
所以所有这些都可以被视为归纳问题。
So all of those things are things you can think about as inductive problems.
因此,在这些不同情况下,先验的来源也会不同。
And so asking where the priors come from is gonna be different in those different cases.
对吧?
Right?
所以更根本的情况,也就是关于我们如何学习语言的情况,如果我们把这看作一个归纳推理问题的话。
So take the more fundamental case, the one which is about, like, you know, how do we learn language, if we think about that as a problem of inductive inference.
在那里,先验将反映我们天生的学习语言的倾向,同时也包括所有其他非语言输入的信息来源。
The priors there are gonna reflect whatever innate predispositions we have to learn language, but also all of the other sources of information that we have, you know, that are sort of not the linguistic input.
对吧?
Right?
因此,我们在世界中的经验等等,都会影响我们从听到的言语中学习语言的方式。
So the experience that we have in the world and so on is stuff that's gonna inform the way that we learn language from the utterances that we hear.
因此,这是一个很好的工具,可以帮助我们思考人类与大型语言模型之间的差异,而人类心智和大脑与当今大型语言模型之间的主要区别就在于归纳偏差。
And so that is a good tool for thinking about what the differences are between, like, humans and large language models, where the big difference between human minds and brains and the large language models that we have today is about inductive bias.
它关乎我们人类能够从极少的数据中学习,而大型语言模型则需要海量的数据进行训练。
It's about being able to learn from the small amounts of data that we get as humans relative to the very large amounts of data that our large language models are trained on.
对吧?
Right?
一个孩子只需五年左右的语言接触就能学会使用语言。
So a human child learns to use language in about five years of exposure.
对吧?
Right?
相比之下,训练大型语言模型所用的数据量,相当于5000到50000年连续不断的语音输入。
By comparison, the data used to train large language models is, you know, the equivalent of between 5,000 and 50,000 years of continuous speech.
对吧?
Right?
因此,这仅仅是数量级上的巨大差异。
So it's just sort of orders of magnitude difference.
而构成这一差距的就是归纳偏差。
And the thing that makes up that gap is inductive bias.
它是我们作为人类所拥有的先验分布(广义上)所带来的东西,正是这些使我们能够弥合这一差距。
It's the sort of thing that comes from our prior distributions, broadly construed, right, as human beings, that allows us to close that gap.
当我们观察日常生活中所做的推断——比如解释一句话或理解视觉信息这类短期任务时,这些先验正是那些学习过程发挥作用的结果。
When we look at, you know, the sort of everyday inferences that we make, these sort of short term things like interpreting a sentence or, you know, making sense of visual information, those priors are things that are really a consequence of those learning processes having worked.
对吗?
Right?
我们已经构建了关于周围世界的模型,这些模型影响着我们解释所经历数据的方式。
It's that we've built models of the world around us that inform the way that we interpret the data that we experience.
这一点上,我认为这些先验的来源没那么神秘,因为它们来自这个世界。
And that's something where, you know, I think it's a little less mysterious where those priors come from because they come from the world.
但它们也来自这个世界,再加上我们作为学习者所具备的更普遍的归纳偏差。
But they also come from the world plus, again, whatever are our sort of, like, more general inductive biases as learners.
所以我们并不是白板。
So we're not blank slates.
对吧?
Right?
我的意思是,我想从康德到诺姆·乔姆斯基,许多思想家都说过,是的,我们生来就带着一些想法。
I mean, I guess various thinkers from Kant to Noam Chomsky have said that, like, yeah, we're born with some ideas in our heads.
而且,我们如今在二十一世纪,显然更擅长分辨哪些想法是与生俱来的,哪些是后天习得的。
And presumably, we're a lot better now in the twenty first century at teasing out which of the ideas we come born with and which we pick up along the way.
是的。
Yeah.
我觉得这已经涉及到二十世纪的认知科学了。
I think this is getting into sort of twentieth century cognitive science.
对吧?
Right?
我们从十九世纪的逻辑和概率理论,跃迁到了二十一世纪的思考:人们是否以正确的方式进行贝叶斯推理、资源理性等等。
So we made a leap from the nineteenth century, you know, which is our logic and probability theory, to sort of, like, twenty first century considerations about, you know, are people Bayesian in the right ways, and resource rationality and so on.
中间缺失的环节是二十世纪,那时人们开始将这些数学工具用于试图理解人类心智。
The missing chunk there is the twentieth century, which is where people began to use these mathematical ideas as a tool for trying to understand human minds.
对吧?
Right?
因此,二十世纪上半叶,心理学专注于成为一门严谨的科学学科,而它最初的基础其实是内省式的,你会问人们
And so, in the first half of the twentieth century, psychology was really focused on trying to be a sort of rigorous scientific discipline, having moved on from its foundations, which were really sort of introspective, where you'd be asking people
是的。
Right.
他们是否看到了什么、听到了什么,或者对它的感受是什么。
Whether they saw something or heard something or what the impression of it was.
对此产生了一种反拨,行为主义心理学家说,不,不,不。
There was a sort of reaction against that, and behaviorist psychologists said, no, no, no.
我们看不到思想,也摸不到感觉。
We can't see thought or touch a feeling.
让我们关注那些我们能看到或触摸到的东西,也就是环境以及它们所产生的行为。
Let's focus on the things we can see or touch, which are environments and the behaviors that they produce.
对吧?
Right?
因此,人们不允许将这些心理状态作为解释人类行为的依据。
And so you were not allowed to really sort of talk about those mental states as explanatory things in accounting for human behavior.
随后,计算机的出现,以及用于思考如何在计算机上执行任务的数学框架,还有由此衍生出的逻辑扩展等,为心理学家提供了全新的理论工具,使他们能够构建关于心智运作的严谨理论。
And then the advent of computers, you know, and the existence of mathematical frameworks for thinking about how to do things on computers and sort of, you know, these sort of, like, extensions to logic and so on which came out of that, that provided a new set of theoretical tools that psychologists could use to come up with rigorous theories of how minds work.
对吧?
Right?
所以,你可以谈论思想了。
So you can talk about thoughts.
也许我们还没完全触及情感,但至少可以谈思想了。
Maybe we haven't quite got to feelings yet, but thoughts.
嗯。
Mhmm.
如果你拥有像逻辑这样精确的数学工具,就可以据此提出关于思想作用的假设。
If you have a precise mathematical device like logic, you can then come up with hypotheses about what it is that thought does.
因此,这一研究领域的主要成就之一,就是艾伦·纽厄尔和赫伯特·西蒙创造了“逻辑理论家”,这是一台能够为逻辑中的数学命题找到证明的机器。
And so the big sort of successes of that enterprise were Allen Newell and Herbert Simon creating the Logic Theorist, which was a machine that could discover, you know, proofs for mathematical propositions in logic.
诺姆·乔姆斯基表明,对形式语言的思考为我们提供了一种工具,用以理解人类所使用的自然语言。
And Noam Chomsky showing that thinking about formal languages gave us a tool for then making sense of sort of natural languages that humans use.
对。
Right.
这些想法确实为认知科学奠定了基础,但也带来了一些有趣的挑战。
And those ideas really sort of provided the foundation of cognitive science, but they also led to some interesting challenges.
乔姆斯基对语言的研究非常有效地说明了,语言比行为主义者所假设的要复杂得多。
So Chomsky's approach to language was very good in sort of illustrating that language was a much more complex object than behaviorists had assumed.
对吗?
Right?
你需要拥有类似内部结构的东西,比如动词短语、名词短语,以及看起来像语法的结构,才能解释人类语言的结构。
That you needed to have kinda like internal structures and things like verb phrases and noun phrases and and things that sort of look like grammars in order to account for the structure of human languages.
但这又带来了新的问题:人类是如何从有限的数据中学会这些极其复杂的结构的呢?
But then it created this new problem, which was how is it that human beings could possibly learn these very complex objects from the limited data that they get?
因此,这促使乔姆斯基提出:也许他们并不是真正地在学习它。
And so that's the thing that then pushed Chomsky to say, well, maybe they're not really learning it.
是的
Yeah.
你知道,这是通过一些非常强的约束来实现的,这些约束限定了他们能够学习的语言范围,因此只需要相对少量的数据,就能确定:哦,原来这就是我实际所讲语言的特定组合方式。
You know, they're acquiring it as a consequence of having some very strong constraints on what it is that they can learn as languages, and then only require relatively small amounts of data to determine, oh, okay, it's this particular configuration of, you know, bits and pieces that characterizes the language that I'm actually speaking here.
我们是否应该以类似的方式设计人工智能?
And is there a thought that we should design our AIs similarly?
我的意思是,从连接主义的人工智能方法中得出的教训——这些方法催生了大型语言模型等——是人类已经完成了所有这些工作。
I mean, it seems like the lesson from connectionist approaches to AI that have led to large language models, etcetera, is that human beings have done all that work.
我们可以让人工智能成为白板,然后用海量数据来训练它们。
We can just, like, let the AIs be blank slates and train them on a huge amount of data.
是的
Yeah.
这正是两者的对比。
So that's exactly the contrast.
对吧?
Right?
AI模型很好地说明了要解决乔姆斯基所识别的问题,需要多少语言数据。
It's that the AI models give us a really good illustration of how much language you need to solve the problem that Chomsky had identified.
对吧?
Right?
去学习如此复杂的一个对象。
To learn something that is this very complex object.
对吧?
Right?
所以,如果我们默认AI已经学到了类似人类语言的东西,那么这就可以看作是对乔姆斯基观点的一个证明。
So if we sort of take as given that they've learned something like human language, then you can think about that as a proof of the point that Chomsky was making.
要学习类似人类语言的东西,你需要海量的数据。
You're gonna need lots and lots and lots of data to learn something like human language.
对吧?
Right?
所以他说得对,在孩子成长的五年里,他们不可能仅凭自身推导出语言的结构。
So he was right that in the five years that the kid gets, they're not gonna be able to figure out the structure of language.
事实上,结果表明,要做得很好,你需要大约5000年或50000年的数据。
And in fact, it turns out you need something more like 5,000 or 50,000 years of data in order to do a really good job.
对吧?
Right?
所以我认为,这是一种非常棒的方式来理解这种差异。
And so I think that's a really nice way of thinking about what that difference is.
它也为我们提供了一个很好的视角,来思考如何构建在学习能力上更接近人类的系统。
And it also gives us a good way of thinking about what the challenge is then if you wanted to make systems that are more human like in their ability to learn.
对吧?
Right?
所以你可以这样想。
So you can think about it.
如果你希望仅用五年数据就能学会,而目前却需要五千年数据,那么你就需要弥补四千九百九十五年的差距——也就是在儿童头脑中的归纳偏置或先验分布上做出改进。
If you wanna be able to learn from five years of data and you're currently learning from five thousand years of data, then you've got four thousand nine hundred and ninety five years to make up in terms of the content of that inductive bias or those prior distributions that are inside the child's head.
对吧?
Right?
是的
Mhmm.
因此,你可以从这些其他来源来思考这一点。
And so you can think about that coming from these other kinds of sources.
对吧?
Right?
所以进化是其中之一,还有一些其他因素。
So evolution is one of those, as well as some other things.
对吧?
Right?
因此,孩子在学习语言过程中所拥有的更广泛的经验。
So the broader set of experiences that the child has as they're learning language.
对吧?
Right?
这有点像构建一个关于周围世界的模型,使得他们所学习的内容不仅仅是任意的词语序列,而是真正与有意义的事物相对应的。
Sort of like building a model of the world around them that means that the things that they're learning aren't just sort of arbitrary sequences of words, but actually things that map onto things that are meaningful.
对吧?
Right?
所以你的大型语言模型必须仅从它所生成的词语序列中,推断出它对世界的所有认知。
So your large language model has to figure out everything that it knows about the world just from the sequences of words that it's saying.
而且,孩子所获得的那些东西,不仅仅源于那些进化带来的约束,也不仅仅源于他们更广泛的经验,还包括他们通过使用语言来产生周围世界中理想结果而获得的内容——也就是说,把语言当作一种工具,而不仅仅是用来预测的东西,对吧?
And also, the kinds of things that a child is getting, you know, not just as a consequence of whatever those evolved constraints are and not just as a consequence of their broader experience, but also, you know, the things that they're getting from being able to engage in using that language to produce, you know, desirable outcomes in the world around them, right, using it as a tool, not just something that you're necessarily learning to predict.
我们在训练大型语言模型的最后阶段,确实也做了一点类似的事情。
And we do a little bit of that in our training of large language models at the end.
有一些基于强化学习的微调之类的方法。
There's some sort of fine-tuning via reinforcement learning and so on.
但我认为,对于认知科学家来说,一个非常有趣的研究方向是:去刻画我们所面临的这种差距。
But I think that's a really interesting kind of project for cognitive scientists: thinking about, yeah, how to characterize that gap that we have.
对吧?
Right?
并把这些模型当作工具,来探索和解决这个问题。
And using these sorts of models as a tool for working that out.
因此,在我的实验室里,我们做了一些工作。
And so in my lab, we've done a bit of work.
这最近是与现在在耶鲁大学的汤姆·麦考伊合作,研究一种称为元学习的神经网络训练方法。
This is most recently with Tom McCoy, who's now at Yale, looking at an approach that's called meta-learning for training neural networks.
元学习的目标是通过调整神经网络的初始权重,使其能够从更少的数据中学习。
And what meta-learning does is it tries to create neural networks where we manipulate the initial weights that the neural network has in such a way that it's able to learn from less data.
因此,这种方法的原理是
And so the way that this works
这在某种程度上提供了一个起点。
A head start, in a little sense.
是的。
Yeah.
没错。
That's right.
但同时也更接近于捕捉这些归纳偏差,是的。
But also coming closer to capturing those sort of inductive biases and, yeah,
先验分布。
Prior distributions.
对吧?
Right?
所以它的原理是,你可以说:我有一系列不同的学习任务需要解决。
So the way that it works, you say, I've got a bunch of different learning problems I want to solve.
在语言学的情况下,你可以想象成:我要学习很多种不同的语言。
In the linguistic case, you could think about this as I'm gonna want to learn lots of different languages.
我想学习英语。
I'm gonna want to learn English.
我想学习韩语。
I'm gonna want to learn Korean.
我想学习乌尔都语。
I wanna, you know, learn Urdu.
我想学习你希望掌握的每一种语言。
I wanna learn, you know, each of the languages that you want to be able to learn.
你知道你只能依靠这五年的输入数据来学习这些内容。
And you know that you're only gonna be able to learn those from, you know, your five years of input.
对吧?
Right?
你会想,我应该在神经网络中设置怎样的初始权重,才能帮助这个网络仅凭这五年的输入数据,利用它自身的学习机制,学会每一种语言?
And you say, what are initial weights that I can put in my neural network that are going to help that neural network learn each of these languages from that five years of input, you know, just using the sort of mechanisms that it has for learning?
我们使用一种叫做模型无关元学习的算法来解决这个问题,这个算法包含一个外层循环和一个内层循环。
And so the way that we solve that problem, using an algorithm that's called model-agnostic meta-learning, is you have a learning process which has an outer loop and an inner loop.
内层循环就是学习某一种具体的语言。
And the inner loop is just learning an individual language.
你只是将神经网络的权重从初始值调整,以便学会每一种语言。
So you're just adjusting the weights of the neural network away from those initial weights to learn each of those languages.
但外层循环要问的是:当我观察自己在所有这些语言上的表现时,如果我改变初始权重,我的表现会如何变化?
But the outer loop is saying, when I look at my performance across all of those languages that I want to learn, how does my performance on those languages change when I change my initial weights?
对吧?
Right?
因此,你可以通过尝试找到那些能提升所有语言表现的初始权重来学习这些初始权重。
And so you can actually learn the initial weights by, you know, trying to find initial weights that help to improve performance across all of the languages.
通过这样做,你就在为神经网络找到一个起点,这个起点能让它们在有限的数据下快速学习,对吧?
And so by doing that, you're finding a starting point for your neural networks, which is one that's going to allow them to learn quickly, right, from the limited data that they're getting.
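The inner-loop/outer-loop idea described above can be sketched on a toy problem. To be clear about the assumptions: this is not the lab's code or a language-learning setup. Each "language" is replaced by a hypothetical one-parameter regression task (fit the slope of a line), the same points are used for adapting and for evaluating, and the names `inner_adapt`, `meta_gradient`, and `meta_train` are invented for illustration. The outer-loop gradient is computed by differentiating through the single inner step, which is the core of model-agnostic meta-learning.

```python
def loss(w, slope, xs):
    """Mean squared error of predicting slope * x with w * x."""
    return sum((w * x - slope * x) ** 2 for x in xs) / len(xs)

def grad(w, slope, xs):
    """Derivative of the loss with respect to w."""
    return sum(2 * (w * x - slope * x) * x for x in xs) / len(xs)

def inner_adapt(w0, slope, xs, inner_lr):
    """Inner loop: one gradient step away from the shared initial weight."""
    return w0 - inner_lr * grad(w0, slope, xs)

def meta_gradient(w0, slope, xs, inner_lr):
    """How the post-adaptation loss changes as the initial weight changes."""
    curvature = sum(2 * x * x for x in xs) / len(xs)  # d(grad)/dw
    w_adapted = inner_adapt(w0, slope, xs, inner_lr)
    # Chain rule through the inner step: d(w_adapted)/d(w0) = 1 - lr * curvature.
    return grad(w_adapted, slope, xs) * (1 - inner_lr * curvature)

def meta_train(task_slopes, xs, w0=0.0, inner_lr=0.1, outer_lr=0.05, steps=200):
    """Outer loop: move the initial weight to help adaptation on every task."""
    for _ in range(steps):
        g = sum(meta_gradient(w0, s, xs, inner_lr) for s in task_slopes)
        w0 -= outer_lr * g / len(task_slopes)
    return w0

XS = [-1.0, 0.5, 1.0, 2.0]       # hypothetical inputs shared by all tasks
TASK_SLOPES = [2.0, 2.5, 3.0]    # three hypothetical "languages"

learned_init = meta_train(TASK_SLOPES, XS)
for slope in TASK_SLOPES:
    from_scratch = loss(inner_adapt(0.0, slope, XS, 0.1), slope, XS)
    from_learned = loss(inner_adapt(learned_init, slope, XS, 0.1), slope, XS)
    print(slope, round(from_scratch, 3), round(from_learned, 3))
```

In this toy setting the learned starting point ends up sitting between the tasks, so a single inner step gets much closer to each task than the same step taken from zero.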
然后我们可以回头看看这一组初始权重。
And then we can go back and we can look at that set of initial weights.
我们可以问,这能告诉我们人类学习者可能具有哪些偏见?
And we can say, oh, what does that tell us about the biases that human learners might have?
对吧?
Right?
你知道,为了能够从我们实际获得的数据量中学习语言,你需要具备什么样的偏见?
You know, what are the kind of biases that you need to have in order to be able to learn language from the amount of data that we actually get?
所以,我对这方面的有限了解可以追溯到AlphaGo和AlphaZero,那是下棋的程序。
So my limited knowledge of this stuff goes back to, you know, AlphaGo and AlphaZero, which was the chess playing program.
我听说,这些程序如果从未接触过人类棋手和围棋手,而只是自己学习,表现会更好。
And I'm told that those programs did better if they never were exposed to human chess players and Go players and just learned it themselves.
所以,与此类似,你这种通过特定初始化方式给模型一个起步优势的捷径,会不会让它们在某种程度上缺乏创造力呢?
So is there a worry, analogously, that your version, where you can sort of do a little bit of a shortcut to give the models a head start by initializing them in a certain way, will make them less creative in some way?
我认为这会让它们更不愿意寻找不同于人类解决方案的路径。
I think it will make them less inclined to find solutions that are not like the human solutions.
对吧?
Right?
这既有好处,也有坏处。
And that's a plus and a minus.
有利有弊。
Pros and cons.
是的。
Yeah.
没错。
Exactly.
是的。
Yeah.
没错。
That's right.
因为人类在很多方面其实很糟糕。
Because there are lots of things that humans are really bad at.
对吧?
Right?
我们或许希望打造能够弥补人类短板的AI系统,去完成那些人类不擅长的事情。
And we might wanna be able to make, you know, AI systems that can complement us by being able to do the things that people are bad at.
我认为,这是一种很好的思维方式,可以设想一个人类与AI共存的未来,这种共存对每个人都有好处。
And that's a really, I think, good way of thinking about what a possible future is where humans and AI get to exist side by side in a way which is sort of good for everybody.
我认为,让AI系统具备更贴近人类的归纳偏差,一个很好的优势是,这不仅能让神经网络用更少的数据进行学习,还可能改善训练模型时涉及的能耗等问题。
I think the thing that might be quite good about making AI systems that have inductive biases that are more aligned with people is that it's not only going to make it possible for those neural networks to learn from less data, so that some of the energy concerns and so on that are involved in training those models might get better.
而且,当你给它们提供五千年的数据时,它们或许还能学会更令人惊叹的东西。
And maybe they'll actually be able to learn even more impressive things when you give them the five thousand years of data.
对吧?
Right?
但这同时也意味着这些系统对人类来说会更有意义。
But it also means that those systems are going to make more sense to humans.
所以,我认为人类思维与当前AI系统之间的两个主要差异之一是归纳偏差,也就是从少量数据中学习的能力。
So the two things I would say are the big differences that we see between human minds and our current AI systems: one of these is about inductive bias, right, the ability to learn from small amounts of data.
另一个差异是泛化能力——一个AI系统可能在解决某个问题上非常出色,但在面对邻近问题时却会彻底失败。
And the other is about generalizability where you can have an AI system that's very good at solving one problem and then fails quite spectacularly on a problem that's right next to it.
嗯。
Mhmm.
对吧?
Right?
我想我们都经历过这样的情形:某个AI系统看起来非常聪明,却做出一些非常奇怪的事情。
I think we've all had that experience of, like, you know, some AI system seems very smart and then does something very weird.
对吧?
Right?
这已经被一些人称为“参差不齐的智能”。
And this has been called jagged intelligence by, you know, various people.
而且,你知道,这对人工智能研究者来说是一个有趣的前沿领域,他们正试图弄清楚这些参差不齐的边界是什么样子,以及如何让我们的系统变得更好。
And, you know, that's a sort of interesting frontier for AI researchers: trying to figure out how do we understand what those jagged boundaries look like and how do we make our systems better.
我认为认知科学实际上是非常好的工具,可以帮助回答这类问题。
I think cognitive science is actually a really good tool for trying to answer those kinds of questions.
但可能发生的情况是,如果我们创建出具有更接近人类归纳偏见的人工智能系统,那么当它们在所获得的数据上进行训练时,找到的解决方案也会更类似于人类找到的方案。
But one thing that might happen is that if we create AI systems that have inductive biases that are more similar to people, then the solutions that they're going to find when they're trained on the data that they get will be more like the kinds of solutions that humans find too.
对吧?
Right?
所以你可以把你的AI系统想象成一块白板。
So you can kind of think about it as your AI system is your blank slate.
好的。
Okay.
它需要五千年的语音数据才能达到你五岁孩子所达到的水平。
It takes, you know, five thousand years of speech to get it to the point where your five year old gets to.
但那可能并不是同一个点。
But that might not actually be the same point.
对吧?
Right?
从外部看,它们可能看起来有点相似,是的。
It might, from the outside, look kinda similar Yeah.
因为它们都在很好地使用和生成语言。
In that they're both doing a good job of sort of using and producing language.
但在内部,可能非常不同。
But on the inside, it might be quite different.
而且它找到了一条奇特的路径,使其能够很好地使用语言。
And there's some weird path that it's found that gets it to the point where it's able to do a good job of using language.
但从我们的角度来看,这确实是一种非常奇特的解决方案。
But it is kind of just a very weird solution from our perspective.
事实上,我们实验室的一些分析表明,情况确实如此。
And in fact, some of the analyses we've done in my lab suggest that that's the case.
因此,如果我们能利用归纳偏差将模型引导至更接近人类的解决方案,这些方案可能也会让我们觉得更合理。
And so if we can use inductive bias to nudge the models towards more human like solutions, they're probably gonna be things that make a little more sense to us as well.
我想多听听你刚才提到的那句插话。
Well, I wanna hear more about that little parenthesis you just said.
我的意思是,我一直以为,生成听起来像人话的句子时,大语言模型的内部机制与人脑的运作方式截然不同。
I mean, I always presumed that the internal machinations of the LLMs that output a human sounding sentence were very, very different than what goes on in an actual human brain.
那么,我们对这一点了解多少呢?
So what do we know about that?
是的。
Yeah.
所以,我实验室里有一些研究例子,揭示了这种奇特之处。
So, some examples of things that we've done in my lab that sort of reveal some of this weirdness.
其中一个例子是,大语言模型对它们所生成输出的概率非常敏感。
One of them is that large language models are very sensitive to the probabilities of the outputs that they're producing.
对吧?
Right?
所以,当人们刚开始对这些模型感到兴奋时,有一篇名为《AGI的火花》的论文指出,GPT-4展现出了一些非凡的能力。
So when people were very excited about these models, there was the "Sparks of AGI" paper that came out that said, you know, GPT-4 sort of exhibits these remarkable abilities.
汤姆·麦考伊和一些同事与我一起写了一篇论文,我们称之为《自回归的余烬》,意思是,尽管顶部会迸发出火花,但底部仍然存在一些余烬,这是这些模型训练方式的结果。
Tom McCoy and some colleagues and I wrote a paper that we called "Embers of Autoregression," which was saying, much as you're getting sparks at the top, there are still these embers at the bottom, which are a consequence of the way these models are trained.
其中一点是,如果你——要知道,在现代系统中,人们已经用了各种技巧来绕过这个问题。
And so one of these is that if you, you know, again, these are things that in modern systems, there's all sorts of tricks that they've used to sort of get around this.
但如果你拿一个像GPT-4那样的原始语言模型,让它解决一些简单的问题,比如计算字符串中字母的数量,它们的表现会受到它们必须生成的答案的概率影响。
But if you sort of take a raw language model of the kind that we were getting with GPT-4, and you ask it to solve sort of simple problems like counting the number of letters that appear in a string, how well it does on that is influenced by the probability of the answer that it would have to produce.
例如,它们计算含有30个字母的字符串要远比计算含有29个字母的字符串表现更好,因为数字30在互联网上出现的频率高于29。
So for example, they're much better at counting strings that have 30 letters in them than strings that have 29 because the number 30 appears on the internet more often than the number 29.
所以这是一种情况:存在一些相近的、相当不错的答案,而其中一些答案的概率更高。
So it's a situation where there are other nearby answers that are pretty good, and some of those have higher probability.
因此,模型最终生成的是概率更高的答案,而不是它本应生成的那个正确答案。
And so as a consequence, it sort of produces the high probability thing rather than the thing that it's supposed to produce.
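One way to picture that effect is as evidence for each candidate answer being weighted by how common that answer is. The toy sketch below is only an illustration of that intuition, not the analysis from the paper: the function name and every number in it are hypothetical.

```python
def most_probable_answer(evidence, answer_prior):
    """Pick the answer with the highest evidence-times-prior score."""
    scores = {a: evidence[a] * answer_prior[a] for a in evidence}
    return max(scores, key=scores.get)

# Hypothetical evidence from "reading" a 29-letter string: the correct
# answer is favored, but nearby counts are also fairly plausible.
evidence = {29: 0.40, 30: 0.35, 31: 0.25}

# Hypothetical prior: round numbers such as 30 are written far more often.
skewed_prior = {29: 0.2, 30: 0.6, 31: 0.2}
uniform_prior = {29: 1 / 3, 30: 1 / 3, 31: 1 / 3}

print(most_probable_answer(evidence, uniform_prior))
print(most_probable_answer(evidence, skewed_prior))
```

With a flat prior the correct count wins; with the skewed prior the more frequent number overrides slightly better evidence.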
这可以说是语言模型的一种奇特的偏见。
And so that's a, you know, sort of like weird idiosyncratic bias of language models.
这是由它们的训练方式所导致的。
It's a consequence of the way that they're trained.
因此,更广泛地说,我是这样看待这些系统的,这里用的是我们的计算层面视角。
And so more generally, here's the way that I think about these systems, and this is applying our computational level lens.
对吧?
Right?
我们应该预期智能系统的行为会受到它们所要解决的问题类型的影响。
We should expect intelligent systems to behave in ways that are shaped by the kinds of problems that they're trying to solve.
当我们设计人工智能系统时,我们明确地选择了它们将要解决的问题类型。
And when we design our AI systems, we're making explicit choices about the kinds of problems that they're going to solve.
对。
Right.
比如预测序列中下一个词或标记的能力。
Things like being able to predict the next word or token that appears in a sequence.
这将会影响它的行为。
And that's gonna be something which influences its behavior.
因此,如果我们训练系统的目标函数与人类大脑在进化过程中所解决的计算问题之间存在差异,那么我们预期它们找到的解决方案也会大不相同。
And so to the extent that there's a difference between the objective function, right, the goal that we have in training that system, and the kinds of computational problems that human minds have evolved to solve, then we're going to expect the kinds of solutions that they find to look quite different.
而这正是我们行为出现不匹配的部分原因。
And that's part of where we get this mismatch in behavior.
你之前提到过一个很有争议的观点,我还没来得及深入探讨,是关于神经网络或神经元的几何结构或空间结构,以及它在思维规律中所起的作用?
You mentioned earlier this provocative thing, which I haven't had a chance to follow up on, about, was it the geometry or the spatial structure of neural networks, or neurons more generally, and the role that that plays in the laws of thought?
所以第三个线索是,我们之前讨论了逻辑和概率理论。
So, the third thread. We talked about logic and probability theory.
对吧?
Right?
所以这里的第三个线索是,把思维看作空间中的点,运用我们用来思考空间的数学方法。
So the third thread here is this kind of idea of thinking about, yeah, thought in terms of, you know, points in space, right, and using the sort of math that we use for thinking about spaces.
这个想法是在二十世纪发展起来的。
And that was an idea that developed in the twentieth century.
对吧?
Right?
所以我说,你知道,纽厄尔和西蒙、乔姆斯基已经证明了逻辑在为我们提供关于思维或语言如何运作的理论表达方式方面非常有效。
So I said, you know, we had Newell and Simon and Chomsky demonstrating that things like logic were really effective for giving us sort of ways of expressing theories about how something like thought or language might work.
但那是五十年代的主要理念,并由此延续下去。
But then, you know, that was sort of like the big idea of the nineteen fifties and carried forward from there.
到了七十年代,心理学家们开始意识到,这种观点存在一些漏洞。
And then in the nineteen seventies, psychologists sort of started to realize that, you know, there are some gaps in this.
对吗?
Right?
于是,我们来到了这样一个问题:当你把这些数学理论(如逻辑)与人类行为进行严格对比时,会发生什么?
And so this is where we get to, okay, what happens when you take these mathematical theories like logic and then start comparing them rigorously against human behavior?
当你开始这样做时,就会发现这些有意义的差异。
And when you start to do that, you start to turn up, you know, these sort of meaningful discrepancies.
其中一组差异来自埃莉诺·罗施的研究,她是一位研究人们如何思考类别的心理学家。
And so one of these sets of discrepancies came from the work of Eleanor Rosch, who was a psychologist who explored how people think about categories.
对吗?
Right?
如果你从逻辑的角度来看类别,你会寻找一个能定义该类别的规则。
If you think about categories from the perspective of logic, you're looking for a rule that characterizes that category.
你寻找的是一种定义,它能明确告诉你成为该类别成员意味着什么。
You're looking for sort of like a definition that tells you, you know, exactly what it is to be a member of that category.
你必须具备这些属性,不能具备那些属性,不管怎样。
You know, you have to have these properties, you have to not have these properties, whatever it is.
这就是定义,它告诉你成为某个类别的标准是什么。
That's the rule that tells you, you know, what it is to belong to a category.
罗施发现,实际上很少有人类类别具有这种结构。
And Rosch showed that it really seems like, you know, very few human categories have that kind of structure.
如果你观察我们对‘什么构成家具’或‘什么构成交通工具’的直觉,你会发现,根本找不到逻辑学家所期望的那种明确界定标准。
So if you look at our intuitions about, you know, what makes something a piece of furniture or, you know, what makes something a vehicle, there aren't definitions that you can find that characterize those in the way that the logician would want you to have.
是的。
Yeah.
相反,这些类别似乎具有更模糊的结构,你可以肯定某些东西绝对是家具,比如一把扶手椅。
And instead, it seems like they're characterized by a much more fuzzy structure where you can say certain things are definitely pieces of furniture, like a, you know, an armchair.
而其他一些东西,比如地毯,则可能是家具。
And other things are maybe pieces of furniture like a rug.
对吧?
Right?
而且从逻辑的角度来看,这种梯度很难被捕捉到。
And there's a sort of gradient that was hard to capture from that logical perspective.
因此,似乎需要一种不同的理论来捕捉这种现象。
And so it seemed like you needed a different kind of theory to be able to capture that.
于是心理学家开始思考,也许可以把物体看作空间中的点,用事物在空间中的接近程度来表征它们属于某个类别(比如家具)的程度。
And psychologists started to think about, well, maybe if you think about objects as points in space, then how close things are in space is a way of characterizing, you know, the extent to which they belong. You have another point that characterizes your category of furniture.
因此,扶手椅离那个点很近,而地毯则离那个点较远。
And so now, you know, an armchair is close to that point, and a rug is further away from that point.
这或许能帮助我们捕捉到这种梯度。
And maybe that gives us a way of capturing that sort of gradient.
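The points-in-space picture above can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the two feature dimensions, the coordinates, and the choice of an exponentially decaying function of distance as the "typicality" score.

```python
import math

def typicality(item, prototype):
    """Graded membership: 1.0 at the prototype, falling off with distance."""
    return math.exp(-math.dist(item, prototype))

# Hypothetical 2-D feature space (say, "you sit or lie on it" and "rigid frame").
furniture_prototype = (1.0, 1.0)
armchair = (0.9, 1.1)
rug = (0.3, 0.2)

print(round(typicality(armchair, furniture_prototype), 3))
print(round(typicality(rug, furniture_prototype), 3))
```

Instead of a yes/no rule, each object gets a score between 0 and 1, so an armchair can count as clearly furniture while a rug counts as only marginally so.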
但接着你会遇到一个新问题:如果概念是空间中的点,那如何进行计算呢?
But then you end up with a new problem, which is if concepts are points in space, then how do you do something like computation?
对吧?
Right?
因此,逻辑学催生了数字计算机、图灵机等概念。
So with logic, that translated into the ideas behind digital computers, Turing machines, all of these things.
我们有了理解思维的方式,因为我们可以说:好吧。
We had a way of thinking about what thought was because we could say, okay.
如果你以逻辑方式表示某事物,那么,为了实现莱布尼茨的那个设想,我们就能制造出一台机器,它能执行我们的规则,并告诉我们结果是什么。
If you represent something logically, then, you know, fulfilling that idea of Leibniz's, we can then make a machine that, like, executes our rules and sort of tells us, you know, what the consequences are.
但如果一个概念是空间中的一个点,那么接下来我们该往哪里走呢?
But if a concept's a point in space, then, you know, where do we go from there?
那我们该怎么做呢?
What do we do with that?
我们如何学习这些概念?又如何弄清楚它们的后果呢?
How do we learn what those concepts are, and how do we sort of, like, work out what the consequences are?
而这个问题的答案来自神经网络。
And the answer to that came from neural networks.
人们从20世纪40年代起就开始思考神经网络了。
So people had been thinking about neural networks, you know, since the nineteen forties.
对吧?
Right?
麦卡洛克和皮茨最初做了一些工作,将布尔电路的思想进行了转化。
There was a sort of initial work by McCulloch and Pitts, which was translating the idea of a Boolean circuit.
对吧?
Right?
乔治·布尔所思考的那种逻辑结构。
The kinds of logical structures that George Boole had thought about.
他们提出了一种用神经元之间的运算来表达布尔电路的方法。
They came up with a way of expressing Boolean circuits in terms of operations between neurons.
对吧?
Right?
你可以以某种方式连接神经元,使它们能够表示逻辑与、或、非等操作。
You could connect neurons up in such a way that they could represent logical ands and ors and nots and so on.
由此,你可以构建出复杂的神经网络结构。
And from that, you could build sort of complex neural circuits.
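The idea of wiring simple units into logical operations can be sketched as follows. This is a simplified threshold unit in the spirit of McCulloch and Pitts, not their original formalism: a negative weight stands in for inhibition here, and the specific weights and thresholds are just one workable choice.

```python
def threshold_unit(inputs, weights, threshold):
    """Fire (1) if the weighted sum of binary inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def AND(a, b):
    return threshold_unit((a, b), (1, 1), 2)   # both inputs must be active

def OR(a, b):
    return threshold_unit((a, b), (1, 1), 1)   # one active input is enough

def NOT(a):
    return threshold_unit((a,), (-1,), 0)      # fires only when the input is off

def XOR(a, b):
    """A small circuit built by composing the units above."""
    return AND(OR(a, b), NOT(AND(a, b)))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), XOR(a, b))
```

XOR is not computable by a single unit of this kind, which is why it is written as a composition of three simpler ones.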