Terence Tao – Kepler, Newton, and the true nature of mathematical discovery

Episode description

We begin the episode with the absolutely ingenious and surprising way in which Kepler discovered the laws of planetary motion. People sometimes say that AI will make especially fast progress at scientific discovery because of tight verification loops. But the story of how we discovered the shape of our solar system shows how the verification loop for correct ideas can be decades (or even millennia) long. During this time, what we know today as the better theory can actually make worse predictions. And the reason it survives this epistemic hell is some mixture of judgment and heuristics that we don't even understand well enough to articulate, much less codify into an RL loop. Hope you enjoy!

Watch on YouTube; read the transcript.

Sponsors

- Jane Street loves challenging my audience with different creative puzzles. One of my listeners, Shawn, solved Jane Street's ResNet challenge and posted a great walk-through on X. If you want to try one of these puzzles yourself, there's one live now at janestreet.com/dwarkesh.
- Labelbox can get you rubric-based evals, no matter your domain. These rubrics allow you to give your model feedback on all the dimensions you care about, so you can train how it thinks, not just what it thinks. Whatever you're focused on (math, physics, finance, psychology, or something else), Labelbox can help. Learn more at labelbox.com/dwarkesh.
- Mercury just released a new feature called Insights. Insights summarizes your money in and out, showing you your biggest transactions and calling out anything worth paying attention to. It's a super low-friction way to stay on top of your business. Learn more at mercury.com/insights.

Timestamps

(00:00:00) – Kepler was a high temperature LLM
(00:11:44) – How would we know if there's a new unifying concept within heaps of AI slop?
(00:26:10) – The deductive overhang
(00:30:31) – Selection bias in reported AI discoveries
(00:46:43) – AI makes papers richer and broader, but not deeper
(00:53:00) – If AI solves a problem, can humans get understanding out of it?
(00:59:20) – We need a semi-formal language for the way that scientists actually talk to each other
(01:09:48) – How Terry uses his time
(01:17:05) – Human-AI hybrids will dominate math for a lot longer

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe

Bilingual subtitles

Speaker 0

好的。

Okay.

Speaker 0

今天，我正在和陶哲轩聊天，他无需多做介绍。

Today, I'm chatting with Terence Tao, who needs no introduction.

Speaker 0

特伦斯，我想先请你复述一下开普勒如何发现行星运动定律的故事，因为我认为这将是一个很好的切入点，来讨论人工智能在数学中的应用。

Terence, I wanna begin by having you retell the story of how Kepler discovered the laws of planetary motion because I think this will be a great jumping off point to talk about AI for math.

Speaker 1

好的。

Okay.

Speaker 1

是的。

Yeah.

Speaker 1

我一直对天文学有业余的兴趣，因此我非常喜爱早期天文学家如何揭示宇宙本质的故事。

So I've always had an amateur interest in astronomy, and so I've loved stories of how the early astronomers worked out the nature of the universe.

Speaker 1

开普勒是在哥白尼工作的基础上进行的，而哥白尼本人又是建立在阿里斯塔克斯的研究之上。

So, Kepler was building on the work of Copernicus, who was himself building on the work of Aristarchus.

Speaker 1

哥白尼非常著名地提出了日心说模型，即行星和太阳并不是围绕地球运转，而是太阳位于太阳系的中心，其他行星围绕太阳运转。

So Copernicus very famously proposed the heliocentric model: that, instead of the planets and sun going around the earth, the sun was at the center of the solar system and the other planets were going around the sun.

Speaker 1

哥白尼提出行星的轨道是完美的圆形。

And Copernicus proposed that the orbits of the planets were perfect circles.

Speaker 1

他的理论在一定程度上符合希腊人、阿拉伯人和印度人几个世纪以来积累的观测数据。

And his theory kind of fit the observations that the Greeks and the Arabs and Indians had worked out over centuries.

Speaker 1

我认为开普勒对此产生了兴趣。

I think Kepler got interested.

Speaker 1

他在学习过程中了解了这些理论，并注意到，根据这些理论预测出的轨道尺寸比例似乎具有某种几何意义。

He learned about these theories in his studies, and he made this observation that the ratios of the sizes of the orbits that Copernicus predicted seemed to have some geometric meaning.

Speaker 1

我想，他开始提出，比如说，如果你取地球的轨道，并用一个立方体将其包围，那么这个立方体的外接球几乎完美地与火星的轨道吻合，以此类推。

I think he started proposing that, you know, if you take, say, the orbit of the Earth and you enclose it in, I think, maybe a cube, the outer sphere that encloses the cube almost perfectly matched the orbit of Mars, and so forth.

Speaker 1

当时有六颗行星，它们之间有五个空隙，而恰好有五种正多面体：立方体、四面体、二十面体、八面体和十二面体。

And there were six planets known at the time, five gaps between them, and there were five perfect Platonic solids: the cube, the tetrahedron, icosahedron, octahedron, and dodecahedron.

Speaker 1

因此，他提出了一个他认为极其优美的理论：他可以在行星轨道的球体之间内接这些正多面体，而且这个模型似乎与观测数据吻合。

And so he had this this theory, which he thought was absolutely beautiful, that he he could inscribe these platonic solids between the spheres of the planets, and it seemed to fit.

Speaker 1

在他看来，这似乎表明上帝设计行星的方式，正是契合了正多面体所体现的数学完美性。

And it it it seemed to him like, you know, god's design of the planets was was matching this mathematical perfection of the platonic solids.

Speaker 1

所以他需要数据来验证这一理论。

So he needed data to, confirm this theory.

Speaker 1

当时，几乎只有一个高质量的数据集，那就是第谷·布拉赫——这位富有的、性格古怪的丹麦天文学家，成功说服了丹麦政府资助建造了这座极其昂贵的天文台。

And at the time, there was only one really high-quality dataset in existence, almost. So, Tycho Brahe, this Danish astronomer, a very wealthy, eccentric astronomer, had managed to convince the Danish government to fund this extremely expensive observatory.

Speaker 1

事实上，那是一座完整的岛屿，他在那里花了数十年时间，对所有行星——火星、木星等——在每一个天气晴朗的夜晚进行观测。

In fact, it was an entire island, where he had taken decades of observations of all the planets, Mars, Jupiter, every night, or at least every night for which the weather was clear.

Speaker 1

实际上，他是最后一位仅凭肉眼观测的天文学家。

With the naked eye, actually. He was the last of the naked-eye astronomers.

Speaker 1

因此，他积累了大量数据，开普勒可以利用这些数据来验证他的理论。

And so he had all this data which Kepler could use to confirm his theory.

Speaker 1

于是开普勒开始与第谷合作，但第谷对自己的数据非常吝啬。

And so Kepler started working with with Tycho, but Tycho was very jealous of the data.

Speaker 1

他每次只给开普勒一点点数据。

He only gave out little bits of it at a time.

Speaker 1

我认为开普勒最终干脆偷走了这些数据。

And I think Kepler eventually just stole the data.

Speaker 1

他抄录了数据，并与布拉赫的后人发生了争执。

He copied it and had to have a fight with Brahe's descendants.

Speaker 1

但他最终还是推导出来了。

But he did work out.

Speaker 1

他得到了数据，结果令他失望的是，他那优美的理论并不完全成立。

He did get the data, and then he worked out to kind of his disappointment that his beautiful theory didn't quite work.

Speaker 1

比如，数据与他的正多面体理论相比，偏差大约有10%左右。

Like, the data was sort of off from his platonic solid theory by, you know, about 10% or something.

Speaker 1

他尝试了各种调整，比如移动圆圈之类的办法。

And he had all kinds of fudges moving the circles around and things.

Speaker 1

但就是不太对劲。

It it didn't quite work.

Speaker 1

但他为此问题钻研了许多年。

But he worked on this problem for for for years and years.

Speaker 1

最终，他找到了利用这些数据推算出行星真实轨道的方法。

And, eventually, he figured out how to use the data to work out the actual orbits of the planets.

Speaker 1

这真是极其聪明，堪称天才的数据分析。

And that was incredibly clever, genius amount of data analysis.

Speaker 1

然后，是的，他最终发现，实际上轨道是椭圆而不是圆形，这让他感到震惊。

And then he eventually worked out that the orbits were actually ellipses, not circles, which was shocking to him.

Speaker 1

接着，他推导出了行星运动的两条定律：椭圆轨道和面积速度相等，即在相同时间内扫过相等的面积。

And so he worked out the first two laws of planetary motion: the orbits are ellipses, and they sweep out equal areas in equal times.

Speaker 1

十年后，是的，在收集了大量数据之后，最远的行星，比如土星和木星，对他来说最难推算。

And then ten years later, after collecting a lot of data, the furthest planets, like Saturn and Jupiter, were the hardest for him to work out.

Speaker 1

但最终，他还是发现了第三条定律：行星绕太阳公转周期的平方与它到太阳距离的立方成正比。

But then he finally worked out this third law also: that the time it takes for a planet to complete its orbit was proportional to some power of the distance to the sun.

Speaker 1

这三条就是著名的开普勒行星运动定律，而他本人并不知道其背后的原理。

And these are the three famous Kepler's laws of motion, and he had no explanation for them.

Speaker 1

这一切完全基于实验数据，直到一个世纪后，牛顿才提出了能够同时解释这三条定律的理论。

It was all just driven by experiment, and it took Newton, a century later, to give a theory that explained all three laws at once.

Speaker 0

我想让你听听我的看法，嗯。

The take I wanna try on you Mhmm.

Speaker 0

我的理解是，开普勒就像一个高温的大语言模型，而牛顿则提出了这三大行星运动定律为何必然成立的解释。

Is that Kepler was a high temperature LLM, where Newton comes up with this, explanation of why the three laws of planetary motion must be true.

Speaker 0

当然，开普勒发现行星运动定律、推算出不同行星相对轨道的方式，正如你所说，是一种天才的杰作。

And, of course, the way that Kepler discovers the laws of planetary motion or figures out the relative orbits of the different planets is, as you say, a work of genius.

Speaker 0

但在他的一生中，他只是在尝试各种随机的关系。

But then, you know, through his career, he's just trying random relationships.

Speaker 0

事实上，在他写下第三大行星运动定律的那本书中，这只是一个关于世界和谐的附带讨论，这本书讲的是，你知道的，所有这些行星各自拥有不同的音律。

And in fact, in the book in which he writes down the third law of planetary motion, it's sort of an aside in The Harmony of the World, which is this book about, you know, how all these different planets have these different harmonies.

Speaker 0

而地球上之所以有如此多的饥荒与苦难，是因为地球的音符是 mi fa mi。

And the reason there's so much famine and misery on earth is because the earth is mi fa mi.

Speaker 0

这就是地球的音符。

That's the note of earth.

Speaker 0

因此，这一切都是随机的占星术。

And so all this random astrology.

Speaker 0

但其中却蕴含着立方平方定律，它揭示了行星公转周期与它到太阳距离之间的关系，正如你所详述的，如果你把这个定律与牛顿的 F=ma 和向心加速度公式结合起来，就能推导出平方反比定律。

But in there is the square-cube law, which tells you what relationship the period has to a planet's distance from the sun. And, as you're detailing, if you add that to Newton's F = ma and then the equation for centripetal acceleration, you get the inverse-square law.
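The step mentioned here can be written out as a short derivation. This is only a sketch under a simplifying assumption that is not in the conversation: the orbit is treated as a circle of radius r traversed at constant speed. The symbols r, T, v, a, m, and the proportionality constant k are introduced purely for illustration.

```latex
% Assumption: circular orbit of radius r, period T, planet mass m.
\begin{align*}
v   &= \frac{2\pi r}{T}
      && \text{(speed on a circular orbit)} \\
a   &= \frac{v^2}{r} = \frac{4\pi^2 r}{T^2}
      && \text{(centripetal acceleration)} \\
T^2 &= k\, r^3
      && \text{(Kepler's third law, with constant } k\text{)} \\
F   &= m a = \frac{4\pi^2 m\, r}{k\, r^3}
     = \frac{4\pi^2 m}{k} \cdot \frac{1}{r^2}
      && \text{(Newton's second law)}
\end{align*}
```

So the force needed to hold the planet on its orbit falls off as 1/r^2, which is the inverse-square dependence referred to above.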

Speaker 1

是的

Mhmm.

Speaker 0

于是牛顿推导出了这一点。

And so Newton works that out.

Speaker 0

但我认为这个故事有趣的原因是，我觉得大语言模型能做的正是这类事情：花上二十年，

But the reason I think this is an interesting story is that I feel like LLMs can do this kind of thing: spend, like, twenty years going,

Speaker 0

尝试一些随机的关系，其中一些毫无意义。

Let's try random relationships, some of which make no sense.

Speaker 0

只要有一个可验证的数据集，比如第谷的数据集。

As long as there's a verifiable data bank, like Brahe's dataset

Speaker 1

是的

Mhmm.

Speaker 0

我会尝试一些关于音符的随机想法。

Where, okay, I'm gonna try out random things about like musical notes.

Speaker 0

我会尝试一些关于柏拉图立体的随机想法。

I'm gonna try out random things about platonic objects.

Speaker 0

我要尝试所有这些不同的几何结构。

I'm gonna try all these different geometries.

Speaker 0

我有一种偏见，认为这些轨道的几何结构中有一些重要的东西。

I have this bias that there's some important thing about the geometry of these orbits.

Speaker 0

然后有一件事奏效了。

And then one thing works.

Speaker 0

只要能够验证它，这些经验规律就能推动真正的深层科学进展。

And as long as you can verify it, these empirical regularities can then drive actual deep scientific progress.

Speaker 1

传统上，当我们谈论科学史时，创意生成一直是科学中最具声望的部分。

Traditionally, when we talk about the history of science, idea generation has always been kind of the prestige part of science.

Speaker 1

我的意思是，一个科学问题包含许多步骤。

So, I mean, a scientific problem comes with many steps.

Speaker 1

你知道的？

You know?

Speaker 1

你必须先识别一个问题，然后还要找到一个值得研究的、有前景的问题。

You you have to identify a problem, and then you have to identify a good problem to work on, a fruitful problem.

Speaker 1

然后你需要收集数据。

And then you need to to collect data.

Speaker 1

你需要制定一种分析数据、提出假设的策略。

You need to figure out a strategy to analyze the data, to make a hypothesis.

Speaker 1

在这一点上，你需要提出一个良好的假设，然后进行验证。

And at this at this point, you need to propose a good hypothesis, and then you need to validate.

Speaker 1

是的。

Yeah.

Speaker 1

所以，接下来你需要整理并解释这些内容。

So this and then you need to write things up and explain.

Speaker 1

这中间有十几种不同的环节。

There's this there's a dozen different components.

Speaker 1

但我们所推崇的，往往是那些灵光一现的天才时刻——也就是创意的产生。

But, yeah, the ones we celebrate are these sort of eureka, genius moments of idea generation.

Speaker 1

对。

And yeah.

Speaker 1

所以，开普勒确实像你说的那样，必须反复尝试许多想法，其中不少都行不通，我猜还有很多他根本没发表，因为它们就是不符合。

So Kepler certainly had to, as you say, cycle through many ideas, several of which didn't work, and I bet many that he didn't even publish at all because they just didn't fit.

Speaker 1

而这也是过程中的重要部分——尝试各种随机的方法，看看哪些能奏效。

And that's an important part of the process, trying all kinds of of of random things and seeing if they worked.

Speaker 1

但正如你所说，这必须有同等程度的验证来匹配。

But as you say, it has to be matched by an equal amount of verification.

Speaker 1

否则，这就只是垃圾内容。

Otherwise, it's slop.

Speaker 1

我的意思是，我们歌颂开普勒，但也应该歌颂第谷，因为他进行了极其细致的数据收集，其精确度比以往任何观测都高出十倍。

I mean, we celebrate Kepler, but we should also celebrate Brahe for his assiduous data collection, which was 10 times more precise than any previous observation.

Speaker 1

正是这多出的一位小数精度，才真正让开普勒得以得出他的结论。

And it was that extra decimal point of accuracy that was actually essential for Kepler to get his results.

Speaker 1

而且，你知道，他当时用的是欧几里得几何，以及他那个时代所能使用的最先进数学工具，来让他的模型与数据相匹配。

And, you know, he was using Euclidean geometry and, like, the most advanced mathematics he could use at the time to match his models with the data.

Speaker 1

所以，所有方面都必须协同发挥作用——数据、理论和假设的生成。

So, like, all aspects had to be in play, you know, the the the data and the theory and the the hypothesis generation.

Speaker 1

我不确定在当今时代，假设生成还是瓶颈所在。

I'm I'm not sure nowadays that hypothesis generation is the bottleneck anymore.

Speaker 1

自那以来，科学已经发生了变化。

Science has changed in the centuries since.

Speaker 1

传统上，科学的两大范式是理论和实验。

So, classically, sort of the the two big paradigms for for for science were theory and experiment.

Speaker 1

到了二十世纪，数值模拟出现了，因此你也可以通过计算机模拟来检验理论。

Then in the twentieth century, numerical simulation came along, and so you can also do computer simulations to test theories.

Speaker 1

但到了二十世纪末，我们迎来了大数据时代。

But then finally, in the late twentieth century, we had big data.

Speaker 1

现在我们进入了数据分析的时代。

Now we have the era of data analysis.

Speaker 1

因此，许多新的进展实际上是通过首先分析海量数据集、收集大型数据集，然后从中提炼出模式来推导结论而实现的，这与过去科学的方式略有不同——过去是先做少量观察或突然冒出一个想法，然后再收集数据来验证你的想法。

And so a lot of new progress is actually driven now by collecting and analyzing massive datasets first, and then drawing patterns from them to deduce hypotheses, which is a little bit different from how science used to work, where you make a few observations or you just have one out-of-the-blue idea, and then you collect data to test your idea.

Speaker 1

这就是经典的科学方法。

That's the classic scientific method.

Speaker 1

现在情况几乎反过来了。

Now it's almost reverse.

Speaker 1

你先收集大数据，然后试图从中得出假设。

You collect big data first, and then you you try to to get hypotheses from it.

Speaker 1

我的意思是，开普勒可能是最早的数据科学家之一，但他甚至没有从第谷的数据集开始分析。

I mean, Kepler was maybe one of the first data scientists, but even he didn't start with Tycho's dataset and analyze it.

Speaker 1

他首先有一些预设的理论。

He he had he had some preconceived theories first.

Speaker 1

但看起来，我们取得进展的方式越来越不是这样了。

But it seems like this is less and less the way we make progress.

Speaker 1

只是因为数据实在太庞大了。

Just because, yeah, the data is is just so much more massive.

Speaker 1

它实在太有用了。

It's just so much more useful.

Speaker 0

哦，有意思。

Oh, interesting.

Speaker 0

我实际上觉得，你所描述的二十世纪科学模式，用开普勒来说明非常贴切，因为他确实有这些想法。

I I I actually feel like the mold of twentieth century science that you're describing is actually very well described with Kepler, where he did have these ideas.

Speaker 0

1595年和1596年，他提出了多边形理论，接着是正多面体理论，但这些理论都是错误的。

1595 and '96 is when he comes up with first the polygons and then the Platonic solids theory, but they were wrong.

Speaker 0

几年后，他才得到了他的数据。

And then a few years later, he gets Brahe's data.

Speaker 0

只有经过二十年不断尝试各种方法后，他才发现了这种经验性的规律。

And it's only after twenty years of just trying random things that he gets this empirical regularity.

Speaker 0

因此，布拉赫的数据实际上与某些大规模的数据模拟非常相似。

And so it actually feels a bit closer to this: Brahe's data is analogous to some massive data bank of simulations.

Speaker 0

而既然你已经有了数据，就可以继续不断尝试各种随机的方法。

And then, now that you've got the data, you can keep trying random things.

Speaker 0

但如果没有这些数据，开普勒可能只会一直在那里写关于和谐与正多面体的书，而没有任何实际依据可以验证。

But if it weren't for that, Kepler would be out there just writing books about harmonics and the Platonic solids, and there would be nothing to actually verify against.

Speaker 1

是的。

Yeah.

Speaker 1

是的

Yep.

Speaker 1

对

Yeah.

Speaker 1

所以数据非常重要。

So the the the data was extremely important.

Speaker 1

但我之前想强调的是，传统上你是先提出假设，然后再用数据来检验它。

But the distinction I was trying to make was that sort of traditionally, you make a hypothesis and then you test it against data.

Speaker 1

是的

Yeah.

Speaker 1

但现在有了机器学习、数据分析和统计学，你可以从数据出发，通过统计方法发现以前不存在的规律。

But now, with machine learning and data analysis and statistics and so on, you can start with data and, through, say, statistics, work out laws that were not present before.

Speaker 1

所以开普勒的第三定律有点像这样，只不过对于第三定律，开普勒拥有的不是第谷那样的上千个数据点，而是只有六个数据点。

So Kepler's third law is a little bit like this, except that, for the third law, instead of having the thousands of data points that Brahe had, Kepler had, like, six data points.

Speaker 1

每个行星，你都知道它的轨道长度和到太阳的距离。

For every planet, you knew the length of the orbit and the distance to the sun.

Speaker 1

当时只有五六个数据点，他做了我们现在称之为回归分析的工作。

And there was, like, five or six data points, and he did, what we would now call regression.

Speaker 1

你知道吗？

You know?

Speaker 1

他能够为这六个数据点拟合出一条曲线，并得出了一个平方立方定律，这真是太惊人了。

He could fit a curve to these six data points, and he got a square-cube law, which was amazing.

Speaker 1

但事实上，他相当幸运，因为这六个数据点让他得出了正确的结论。

But, actually, he was quite lucky, I mean, that these six data points gave him the right conclusion.

Speaker 1

你知道，这些数据量根本不足以保证可靠性。

You know, it's, that's not enough data to be really reliable.
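The six-point regression described here can be sketched as a least-squares line fit in log-log space. The numbers below are modern rounded values for semi-major axis (in AU) and orbital period (in years), which is an assumption made for illustration: they are not Tycho's or Kepler's own figures, and the function name `fit_power_law` is hypothetical.

```python
import math

# (semi-major axis in AU, orbital period in years) for the six planets
# known in Kepler's time. Modern rounded values, used only as an illustration.
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def fit_power_law(points):
    """Fit log(T) = slope * log(a) + intercept by ordinary least squares."""
    xs = [math.log(a) for a, _ in points]
    ys = [math.log(t) for _, t in points]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_power_law(list(PLANETS.values()))
print(f"fitted exponent: {slope:.3f}")
```

A fitted exponent near 3/2 means T is proportional to a^(3/2), that is, T^2 is proportional to a^3. With only six points, the fit alone says little about whether the pattern would hold for a seventh.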

Speaker 1

后来有一位天文学家叫约翰·波得，他使用了同样的数据，实际上就是各行星到太阳的距离。

There was a later astronomer, Johann Bode, who took the same data, actually, the distances to the planets.

Speaker 1

受开普勒启发，我认为他预测行星距离大致构成一种平移的几何级数，他也为此拟合了一条曲线。

And inspired by Kepler, I think, he had a prediction that the distances to the planets formed basically a shifted geometric progression; he also fit a curve.

Speaker 1

但有一个数据点缺失了。

Except that there was one there was one point missing.

Speaker 1

好吧？

Alright?

Speaker 1

所以火星和木星之间有一个很大的空隙。

So there was a big gap between Mars and Jupiter.

Speaker 1

他的定律预测有一颗缺失的行星。

His law predicted that there was a missing planet.

Speaker 1

所以这算是一种古怪的理论，但当赫歇尔发现天王星时，天王星的距离恰好符合这个模式。

So it was kind of a crank theory, except that, when Uranus was discovered by Herschel, the distance to Uranus fit exactly this pattern.

Speaker 1

然后谷神星被发现了，这颗小行星位于小行星带中，我想，它也符合这个模式。

And then Ceres was discovered, this asteroid, I think, in the asteroid belt, and it also fit the pattern.

Speaker 1

所以人们非常兴奋，认为波德发现了这个惊人的自然新定律。

So people got really excited that Bode had discovered this amazing new law of nature.

Speaker 1

但后来海王星被发现了，它的位置完全偏离了这个模式。

But then Neptune was discovered, and it was completely, like, way off.

Speaker 1

嗯。

Mhmm.

Speaker 1

而且，你知道，基本上这只是一个数值上的巧合。

And, you know, and and, basically, it was just a numerical fluke.
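The "shifted geometric progression" can be checked numerically. The sketch below assumes the form in which the rule is usually stated today (commonly called the Titius-Bode law), a = 0.4 + 0.3 * 2^k AU, with the innermost slot taken as 0.4 AU. The distances are modern rounded semi-major axes, and the names `titius_bode_au` and `BODIES` are hypothetical.

```python
def titius_bode_au(k):
    """Predicted distance in AU; k=None marks the innermost (Mercury) slot."""
    return 0.4 if k is None else 0.4 + 0.3 * 2 ** k


# (name, index k in the progression, modern semi-major axis in AU)
BODIES = [
    ("Mercury", None, 0.387),
    ("Venus", 0, 0.723),
    ("Earth", 1, 1.000),
    ("Mars", 2, 1.524),
    ("Ceres", 3, 2.77),      # fills the gap between Mars and Jupiter
    ("Jupiter", 4, 5.203),
    ("Saturn", 5, 9.537),
    ("Uranus", 6, 19.19),
    ("Neptune", 7, 30.07),
]

# Relative error of the rule's prediction for each body.
errors = {
    name: abs(titius_bode_au(k) - actual) / actual
    for name, k, actual in BODIES
}

for name, err in errors.items():
    print(f"{name:8s} relative error: {err:.1%}")
```

Under these assumptions every body through Uranus lands within a few percent of the rule, while Neptune misses by more than a quarter of its actual distance, which is the failure described in the conversation.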

Speaker 1

你知道，当时有

You know, there were

Speaker 0

有六个

there were six

Speaker 1

六个数据点。

six data points.

Speaker 1

是的。

Yeah.

Speaker 1

所以，也许开普勒没有像强调前两条定律那样突出他的第三条定律，原因之一是，即使他没有现代统计学知识，他本能地意识到，只有六个数据点，他对结论的推断必须保持一定的谨慎。

So maybe one reason why Kepler didn't highlight his third law as much as the first two laws is that, maybe instinctively, even though he didn't have modern statistics, he kind of knew that with six data points he had to be somewhat tentative with the conclusions.

Speaker 0

但也许更明确地问一下这个类比。

But maybe to ask the question about the analogy more explicitly.

Speaker 0

如果我们未来拥有越来越智能的AI，并且有数百万个AI，它们可以出去寻找所有这些经验规律，这个类比还成立吗？

Does this analogy make sense if, you know, in the future, we'll have smarter and smarter AIs and we'll have millions of them, and then they can go out and hunt for all these empirical regularities?

Speaker 0

听起来你不认为科学的瓶颈在于为每个领域发现更多类似于行星运动第三定律的东西，以便后来有人可以说：哦，我们需要一种方法来解释这个。

It sounds like you don't think the bottleneck in science is finding more things that are for each given field their equivalent of the third law of planetary motion so that then later on somebody can say, oh, we need a way to explain this.

Speaker 0

让我们来推导出数学公式。

Let's work out the math.

Speaker 0

这就是万有引力的平方反比定律。

Here's the inverse-square law of gravity.

Speaker 1

对。

Right.

Speaker 1

我认为人工智能基本上已经将创意生成的成本降到了几乎为零。

So I think AI has basically driven the cost of idea generation down to almost zero.

Speaker 1

这与互联网将通信成本降至几乎为零的方式非常相似。

In a very similar way to how the Internet drove the cost of communication down to almost zero.

Speaker 1

这确实是一件了不起的事，但它本身并不会创造丰裕。

Which is an amazing thing, but, you know, it doesn't create abundance by itself.

Speaker 1

是的。

Yeah.

Speaker 1

所以现在瓶颈已经不同了。

So now the bottleneck is is is different.

Speaker 1

我们现在面临一种情况，人们可以为某个科学问题生成成千上万种理论。

So we're now in a situation where suddenly people can generate thousands of theories for a given scientific problem.

Speaker 1

现在我们必须验证和评估这些理论。

And now we have to to verify them, evaluate them.

Speaker 1

这要求我们改变科学的结构，才能真正解决这个问题。

And this is something where we have to change our structures of science to actually sort this out.

Speaker 1

所以，传统上，我们建立起了壁垒。

So, you know, in fact, traditionally, we built walls.

Speaker 1

你知道吗？

You know?

Speaker 1

在过去，还没有AI泛滥的时候，我们有一些业余科学家，他们自己提出各种宇宙理论，其中大多数几乎毫无价值。

So in the past, you know, before we had AI slop, we had sort of amateur scientists who had their own theories of the universe, many of which were basically of very little value.

Speaker 1

是的。

Yeah.

Speaker 1

所以我们建立了同行评审和出版系统，试图过滤掉低价值信息，筛选出高价值的创意进行检验。

And so we built these, like, you know, peer review publication systems and things to kind of filter these out and try to isolate the high-signal ideas to test.

Speaker 1

但现在我们能够以海量规模生成这些可能的解释，其中一些很好，但大多数都很糟糕。

But now we can generate these possible explanations at massive scale, and some of them are good and a lot are terrible.

Speaker 1

我的意思是，人类评审员实际上已经不堪重负了。

I mean, human reviewers are already being overwhelmed, actually.

Speaker 1

我的意思是，许多期刊都报告在投稿中发现了AI生成的内容。

I mean, many, many journals are reporting AI-generated submissions.

Speaker 1

这些内容正大量涌入他们的投稿系统。

It's just flooding their submissions.

Speaker 1

虽然现在用AI能生成各种各样的东西很棒，但这意味着科学的其他方面必须跟上步伐。

So it's great that we can generate all kinds of things now with AI, but it means that the rest of the aspects of science have to catch up.

Speaker 1

是的。

Yeah.

Speaker 1

因此，验证、确认以及评估哪些想法真正推动了学科发展，哪些是死胡同或误导性方向，变得至关重要。

So verification, validation, and assessing which ideas actually move the subject forward and which ones are dead ends or red herrings.

Speaker 1

但这并不是我们已经知道如何大规模实现的事情。

And that's not something that we know how to do at scale.

Speaker 1

对于每一篇论文，我们可以让科学家们展开辩论，在几年内达成共识。

You know, for each individual paper, we can have a debate among scientists and get to a consensus in a few years.

Speaker 1

但当我们每天生成上千篇这样的论文时，这种方式就行不通了。

But when we're generating, you know, a thousand of these every day, yeah, this doesn't work.

Speaker 0

是的。

Yeah.

Speaker 0

所以我认为，这里有一个非常有趣的问题：你拥有数十亿个AI科学家。

So I think there is this incredibly interesting question of you have billions of AI scientists.

Speaker 1

嗯。

Mhmm.

Speaker 0

不仅如何判断哪些是真正的进展，而且这其实也是人类科学家曾经面临并以某种方式解决的问题。

Not only how do you gauge which ones are real progress, but, I mean, this is actually a question that human scientists had to face and that we've solved somehow.

Speaker 0

我不确定我们究竟是怎么解决这个问题的，但在任何一个领域，比如20世纪40年代，如果你在贝尔实验室，或者只是普遍地面对新兴技术，比如脉冲编码调制——如何传输信号、如何将信号数字化、如何通过模拟线路传输它们？

And I actually am not sure how we solved this, but in any given field, let's say in the 1940s, if you're at Bell Labs, there's these new technologies coming out, pulse code modulation: basically, how do you transfer signals, how do you digitize signals, how do you transfer them over analog wires?

Speaker 0

但还有大量论文讨论了其中的工程限制和细节。

And then but there's like all these papers about the engineering constraints there and the details.

Speaker 0

然后有一篇论文提出了‘比特’这个概念。

And then there's one which comes up with the idea of the bit.

Speaker 0

嗯。

Mhmm.

Speaker 0

这个概念对许多不同领域都有深远影响。

Which has implications across many different fields.

Speaker 0

你需要一个系统，能够识别出这一点，并说：好吧，我们需要将它应用到概率论中。

And you need some system which can then look at that and say, okay, we need to apply this to probability.

Speaker 0

我们需要将它应用到计算机科学等领域。

We need to apply this to computer science, etcetera.

Speaker 0

在未来，人工智能会提出这种统一性概念的下一代版本。

And in the future, the AIs are coming up with, you know, the next version of this kind of unifying concept.

Speaker 0

而在数百万篇论文中，你如何识别出哪些真正构成了进展，尽管它们的普遍性可能低得多？

And how would you identify it among millions of papers which might actually constitute progress, but which have much less general

Speaker 0

统一性的理念。

Unifying ideas.

Speaker 1

所以很多东西都需要时间的检验。

So a lot of it is the test of time.

Speaker 1

很多伟大的想法在刚提出时并没有得到很好的反响。

So so many great ideas didn't actually get a great reception at the time that they were first proposed.

Speaker 1

直到其他科学家意识到，他们可以进一步发展并将其应用到自己的领域中。

It was only after some other scientists realized that they could take it further and apply it to their own fields.

Speaker 1

你知道，深度学习本身长期只是人工智能中的一个小众领域。

You know, deep learning itself, was, like, a niche area of AI for a long time.

Speaker 1

通过数据训练而非第一性原理推理来获取知识的想法曾经极具争议，花了很长时间才开始取得成果。

The idea of getting answers entirely through training on data, and not through, you know, first-principles reasoning, was very controversial, and it took a long time before it actually started bearing fruit.

Speaker 1

你知道吧？

You know?

Speaker 1

你提到了比特。

You mentioned the bit.

Speaker 1

你知道的。

You know?

Speaker 1

我的意思是，除了当今普遍使用的二进制架构，还有其他计算机架构的提案。

I mean, there are there are other proposals for computer architectures than the zero one that is universal today.

Speaker 1

我记得曾经有过三进制位（trit），不是零和一，而是三值逻辑。

I think there were trits, you know, instead of zero-one, three-valued logic.

Speaker 1

在另一个平行宇宙中，也许会出现不同的范式。

And, you know, in an alternate universe, maybe a different paradigm would have shown up.

Speaker 1

有人认为，例如，Transformer 是所有现代大型语言模型的基础。

People have argued that, you know, the transformer, for example, is is the foundation of all modern large language models.

Speaker 1

它是第一个真正足够复杂以捕捉语言的深度学习架构，但事情本不必如此。

And it was the first, deep learning architecture that really was was sophisticated enough to capture language, but it didn't have to be that way.

Speaker 1

本可能有其他架构率先实现这一点。

There there could have been some other architecture that, was the first to do it.

Speaker 1

一旦某种架构被采纳，它就会成为标准。

And once that was adopted, it would become the standard.

Speaker 1

所以我认为，很难评估一个想法是否会有成果，原因之一是它取决于未来。

So I think one reason why it's hard to assess whether a given idea is gonna be fruitful is that it depends on the future.

Speaker 1

它也取决于文化和社会环境，比如哪些会被采纳，哪些不会。

It also depends on the culture and society, like, which ones get adopted and which ones don't.

Speaker 1

你知道，在数学中，十进制系统非常有用，比罗马数字系统好得多。

You know, the base 10 numeral system in in mathematics is extremely useful, much better than the Roman numeral system, for instance.

Speaker 1

但同样，10本身并没有什么特别之处。

But, again, there's nothing special about 10.

Speaker 1

它之所以对我们有用，是因为别人都在用，我们已经将其标准化，并围绕它构建了所有的计算机和数字表示系统，所以我们现在实际上被它束缚住了。

It's a system that's useful for us because everyone else uses it, and we've standardized it, and we've built all our computers and our number representation systems around it. And so we're stuck with it now, actually.

Speaker 1

你知道，有些人偶尔会提倡使用十进制以外的系统，但根本就缺乏变革的动力。

You know, some people occasionally push for other systems than decimal, but there's just too much inertia.

Speaker 1

因此，你不能孤立地看待任何一项科学成就，而不了解过去和未来的背景就给出客观的评价。

So you you can't look at any given scientific achievement purely in isolation and give it an objective grade without being aware of the context both in the past and the future.

Speaker 1

所以，这可能永远无法像处理更局部的问题那样，直接用强化学习来解决。

And so it may never be something that you can just do reinforcement learning on, the same way that you can for much more localized problems.

Speaker 1

是的

Yeah.

Speaker 0

在科学史上，当一种新理论出现时，我们事后往往发现它是正确的，但它当时似乎会得出一些毫无意义的推论，因为那些推论看起来是错的。

It seems that often in the history of science, when a new theory comes up that in retrospect we realize is correct, it has implications that either just make no sense because they're wrong.

Speaker 0

我们后来才明白它们为什么错，或者为什么对，但在当时这些推论显得极其不可能。

And we realize later on why they're wrong or they're correct, but seem wildly impossible at the time.

Speaker 0

正如你提到的，阿里斯塔克斯早在公元前3世纪就提出了日心说。

So as you've talked about, Aristarchus had heliocentrism in the third century BC.

Speaker 0

当时的古希腊人认为这不可能，因为如果地球围绕太阳运转，我们应当能观察到恒星的相对位置随着地球公转而发生变化。

And then the ancient Athenians were like, this can't be, because if the Earth is going around the sun, we should see the relative position of the stars change as we're going around the sun.

Speaker 0

而唯一能解释这种现象不发生的原因，是恒星距离我们太远，以至于我们察觉不到任何视差——这实际上才是正确的推论。

And the only way that wouldn't be the case is if they're so far away that, that you don't notice any parallax, which is actually the correct implication.

Speaker 0

但有时，推论本身是错误的，我们只是需要提升到更高级的理解层次。

But there's times when actually the implication is incorrect and we just need to graduate to a better level of understanding.

Speaker 0

所以莱布尼茨曾试图反驳牛顿，理由是牛顿的引力理论暗示了超距作用。

So Leibniz would, you know, disagree with Newton's gravity on the basis that it implied action at a distance.

Speaker 1

嗯嗯。

Mhmm.

Speaker 0

然后我们不知道其机制。

And then we don't know the mechanism.

Speaker 0

牛顿本人对惯性质量与引力质量是同一量感到震惊。

Newton himself was sort of stunned that inertial mass and gravitational mass were the same quantity.

Speaker 0

所有这些都由爱因斯坦解决了。

So all these things were resolved by Einstein.

Speaker 1

是的。

Yes.

Speaker 1

是的。

Yes.

Speaker 0

但那仍然是进步。

But it was still progress.

Speaker 0

因此，对于一个系统来说，即使你能证伪一个理论，你如何注意到它相对于之前的东西仍构成进步？

And so the question for a system would be: even if you can falsify a theory, how would you notice that it still constitutes progress relative to the thing before?

Speaker 1

是的。

Yeah.

Speaker 1

所以实际上，最终正确的理论最初往往在很多方面都更差。

So often, actually, the ultimately correct theory is initially worse in many ways.

Speaker 1

是的。

Yeah.

Speaker 1

所以哥白尼的行星理论，其精确度不如托勒密的理论。

So Copernicus' theory of the planets was less accurate than Ptolemy's theory.

Speaker 1

你知道吗？

You know?

Speaker 1

所以地心说在当时已经发展了上千年，人们做了大量调整和越来越复杂的特设性修正，以使其越来越精确。

So geocentrism had been developed for, you know, a millennium by that point, and they had made many, many tweaks and increasingly complicated ad hoc fixes to make it more and more accurate.

Speaker 1

而哥白尼的理论简单得多，但精确度却低得多。

And Copernicus' theory was a lot simpler, but not as accurate.

Speaker 1

只有开普勒才使它比托勒密的理论更精确。

It was only Kepler who made it more accurate than Ptolemy's theory.

Speaker 1

我的意思是，科学永远都在不断发展中。

I mean, science is always a work in progress.

Speaker 1

你知道的？

You know?

Speaker 1

所以，没错，当你只得到了部分解决方案时，它看起来反而比一个错误但已经完整到似乎能回答所有问题的理论更糟糕。

So, yeah, when you only get part of the solution, it looks worse than a theory which is incorrect but has somehow been completed to the point where it kind of answers all the questions.

Speaker 1

正如你所说，牛顿的理论确实存在很多谜团，比如超距作用，这些直到几个世纪后才被一种截然不同的概念方式所解决。

As you say, you know, Newton's theory had big mysteries, you know, action at a distance, which were only resolved with a very conceptually different approach centuries afterwards.

Speaker 1

很多时候，进步并不是通过增加更多理论实现的，而是通过删除你头脑中的一些假设。

Often, progress has been made not by adding more theories, but by deleting some assumptions that you have in your mind.

Speaker 1

所以，地心说能持续这么长时间的一个原因是，我们一直认为物体天然倾向于保持静止。

So, you know, one reason why geocentrism held on for so long is we had this idea that objects naturally want to stay at rest.

Speaker 1

这是亚里士多德的物理学观念。

This is the Aristotelian notion of physics.

Speaker 1

所以，地球在运动这个想法，你知道的，难道不会导致所有东西都东倒西歪吗？

So with the idea that the Earth was moving, you know, how come we aren't all sort of falling over?

Speaker 1

你知道，一旦你有了牛顿运动定律，比如运动中的物体保持运动等等，这一切就说得通了。

You know, once you have Newton's laws of motion, you know, an object in motion remains in motion and so forth, then it makes sense.

Speaker 1

但从概念上讲，要意识到地球是在运动的，这需要巨大的思维跃迁。

But conceptually, it's a very big leap to realize that the Earth is in motion.

Speaker 1

它感觉不到自己在运动。

It doesn't feel like it's in motion.

Speaker 1

而最大的突破之一，就是达尔文的进化论，即物种并非静止不变的。

And, like, one of the biggest advances, you know, Darwin's theory of evolution, is the idea that species are not static.

Speaker 1

但这一点并不明显，因为你一生中看不到进化过程。

But, you know, this is not obvious because you don't see evolution in your lifetime.

Speaker 1

现在我们实际上可以观察到了，但你知道，它看起来似乎永恒而静止。

Well, now we actually can, but, you know, it seems permanent and static.

Speaker 1

现在，我们正经历一场认知层面的哥白尼革命：过去我们以为人类智能是宇宙的中心，但现在我们发现，存在着各种截然不同的智能形式，它们各有优劣。

You know, right now, we're going through a cognitive version of the Copernican revolution, where we used to think that human intelligence is the center of the universe, and now we're actually seeing that there are very different types of intelligence out there with very different strengths and weaknesses.

Speaker 1

因此，我们必须重新评估哪些任务需要智能，哪些不需要。

And so our assessment of which tasks require intelligence and which ones don't has to be reordered quite a bit.

Speaker 1

所以，我们需要尝试把人工智能融入到我们对科学进步的理解中，弄清楚什么难、什么容易。

And so, you know, we're trying to fit AI into our theories of scientific progress and of what is hard and what is easy.

Speaker 1

我们正为此感到非常困惑。

We're struggling quite a lot.

Speaker 1

我们必须提出一些以前从未需要问过的问题——也许哲学家们曾经探讨过，但现在我们都必须面对这些问题。

And we have to ask questions that we've never really had to ask before, or maybe the philosophers had, but now we all have to deal with them.

Speaker 0

这实际上引出了一个我一直非常好奇的话题。

This actually brings up a topic I've been very curious about.

Speaker 0

你提到了达尔文的进化论。

So you mentioned Darwin's theory of evolution.

Speaker 0

有一本书叫《钟表宇宙》，作者是爱德华·多尔尼克（Edward Dolnick），书中涵盖了我们正在讨论的这段历史。

There's this book, The Clockwork Universe by Edward Dolnick, which covers a lot of this era of history we're talking about.

Speaker 1

是的。

Mhmm.

Speaker 0

他在这本书中有一个有趣的观察：《物种起源》出版于1859年。

And he has this interesting observation in there that, The Origin of Species is published in 1859.

Speaker 0

嗯。

Mhmm.

Speaker 0

《自然哲学的数学原理》于1687年出版。

The Principia Mathematica is published in 1687.

Speaker 0

嗯。

Mhmm.

Speaker 0

因此，《物种起源》的问世基本上是在《自然哲学的数学原理》之后两个世纪。

So The Origin of Species comes out basically two centuries after The Principia.

Speaker 0

从概念上讲，达尔文的理论似乎更简单。

And conceptually, it seems like Darwin's theory is simpler.

Speaker 0

达尔文有一位同时代的生物学家，名叫托马斯·赫胥黎，他阅读了《物种起源》。

There's a contemporaneous biologist to Darwin who reads The Origin of Species, Thomas Huxley.

Speaker 0

他说：‘怎么没想到这一点，真是愚蠢。’

And he says, how stupid not to have thought of that.

Speaker 0

嗯。

Mhmm.

Speaker 0

但从来没有人对《自然哲学的数学原理》这么说。

And nobody ever says that about the Principia.

Speaker 0

他们不会责备自己为什么没能比牛顿更早发现

They're not chiding themselves for not having beaten Newton to

Speaker 1

对。

Right.

Speaker 0

万有引力。

To gravity.

Speaker 0

所以这就引发了一个问题：为什么花了这么长时间？

And so there's a question of, well, why did it take longer?

Speaker 0

看起来主要原因之一是，自然选择的证据是累积性和回顾性的，而牛顿可以直接说：看，这是我的方程式。

It seems like a big part of the reason is that the evidence for natural selection is cumulative and retrospective, whereas Newton can just say, here are my equations.

Speaker 0

让我看看月亮的公转周期和它的距离。

Let me see the moon's orbital period and its distance.

Speaker 0

如果数据吻合，我们就取得了进展。

And if it lines up, then we've made progress.
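
The check described here can be sketched numerically. This is a reconstruction under assumed modern values (Earth radius, Earth-Moon distance, sidereal month, surface gravity), not Newton's own figures: it compares the Moon's centripetal acceleration, derived from its period and distance, with surface gravity diluted by the inverse square of the distance. The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6.371e6        # assumed mean radius of the Earth
MOON_DISTANCE_M = 3.844e8       # assumed mean Earth-Moon distance
MOON_PERIOD_S = 27.32 * 86400   # assumed sidereal month, in seconds
G_SURFACE = 9.81                # assumed surface gravity, m/s^2


def observed_moon_acceleration() -> float:
    """Centripetal acceleration implied by the Moon's orbit: 4*pi^2*r / T^2."""
    return 4 * math.pi ** 2 * MOON_DISTANCE_M / MOON_PERIOD_S ** 2


def predicted_moon_acceleration() -> float:
    """Surface gravity scaled down by the inverse square of the distance."""
    distance_in_earth_radii = MOON_DISTANCE_M / EARTH_RADIUS_M
    return G_SURFACE / distance_in_earth_radii ** 2


agreement = observed_moon_acceleration() / predicted_moon_acceleration()
print(f"observed / predicted = {agreement:.3f}")
```

With these inputs the two accelerations agree to within a few percent, which is the kind of quick, quantitative confirmation that a cumulative, retrospective theory like natural selection could not offer.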

Speaker 0

所以卢克莱修实际上在公元前一世纪就提出了物种适应环境的观点，但直到达尔文时代才有人真正讨论它，因为卢克莱修无法做实验来迫使人们关注。

And so Lucretius actually had this idea that species adapted to their environment in the first century BC, but nobody ever, like, really talks about it until Darwin, because Lucretius can't run some experiment that forces people to pay attention.

Speaker 0

因此，我不禁想，从后见之明来看，我们是否会发现，在那些具有紧密数据反馈循环的领域中，即使它们在概念上更复杂，进步却更加显著。

And so I wonder if we'll, in retrospect, end up seeing much more progress in domains which have this kind of tight data loop where you can verify them quite easily, even though they're conceptually much more difficult.

Speaker 1

我认为科学的一个方面不仅是创造新理论并验证它，还包括向他人传达它。

I think one aspect of science is not just creating a new theory and validating it, but communicating it to others.

Speaker 1

所以达尔文实际上是一位杰出的科学传播者。

So Darwin was actually an amazing science communicator.

Speaker 1

他用英语和自然的语言进行写作。

He wrote in English, in natural language.

Speaker 1

我就在这样说话。

I'm speaking like that.

Speaker 1

不是用Lean。

Not in Lean.

Speaker 1

好吧，我得暂时放下我的技术思维。

Okay, I have to sort of get out of my technical mindset.

Speaker 1

他用通俗的英语表达，没有使用公式，并且整合了许多不同的因素。

He spoke in plain English, didn't use equations, and he synthesized a lot of disparate factors.

Speaker 1

过去已经有人对进化的一些小部分进行了研究，但他提出了一个非常有说服力的愿景。

So little pieces of evolution had been worked out in the past, but he had this very compelling vision.

Speaker 1

当然，他仍然遗漏了一些东西，比如他不知道遗传的机制，他没有DNA的概念。

Again, he was still missing things, like he didn't know the mechanism for heredity; he didn't have DNA.

Speaker 1

好的。

Okay.

Speaker 1

而且是的。

And yeah.

Speaker 1

但他的写作风格很有说服力，这帮助很大。

But his writing style was persuasive, and that that helped a lot.

Speaker 1

牛顿用拉丁文写作。

Newton wrote in Latin.

Speaker 1

他为了阐释自己的观点，发明了全新的数学领域。

He had invented, you know, entire new areas of mathematics just to explain what he was doing.

Speaker 1

他所处的时代，科学家们更加保密和竞争激烈。

He was also from an era where scientists were much more secretive and competitive.

Speaker 1

所以，你知道，学术界至今仍然充满竞争，但在牛顿那个时代，情况更糟。

So, you know, academia is still competitive, but it was even worse back in Newton's day.

Speaker 1

因此，他隐瞒了一些最出色的见解，因为他不想让对手获得任何优势。

So he he held back some of his best insights because he didn't want his rivals to get any advantage.

Speaker 1

据我了解，他实际上也是一个有点令人不快的人。

He was also, actually, a somewhat unpleasant person, from what I gather.

Speaker 1

事实上，仅仅在牛顿去世几十年后，其他科学家才用更简单的语言解释了他的工作，这些观点才得以广泛传播。

So it was actually only a couple of decades after Newton, when other scientists explained his work in much simpler terms, that it became widespread.

Speaker 1

所以，是的，表达艺术、论证和构建叙事也是科学中非常重要的部分。

So, yeah, the the art of exposition and making a case and creating a narrative is is also a very important part of science.

Speaker 1

如果你有数据，这当然有帮助，但人们需要被说服。

And if you have the data, it it helps, but but people need to be convinced.

Speaker 1

否则，他们不会继续推动它。

Otherwise, they will not push it further.

Speaker 1

他们不会愿意投入初始精力来学习你的理论并深入探索它。

They won't want to make the initial investment to learn your theory and really explore it.

Speaker 1

这又是另一件很难去强化和掌握的事情。

And that's another thing which is really hard to reinforce and learn on.

Speaker 1

是的。

Yeah.

Speaker 1

你如何衡量自己有多有说服力？

How can you score how persuasive you are?

Speaker 1

好吧。

Okay.

Speaker 1

嗯，好吧。

Well, okay.

Speaker 1

整个市场部门都在努力做这件事。

There are entire marketing departments who are trying to do this.

Speaker 1

所以也许AI尚未被优化为具有说服力，这反而是好事。

So maybe it's good that AIs are not yet optimized to be persuasive.

Speaker 1

所以，是的，科学中存在社会层面的因素。

So, yeah, there's a social aspect to science.

Speaker 1

尽管我们以科学具有客观性而自豪，比如有数据、实验和验证，但我们仍需讲述故事，说服我们的同行科学家。

Even though we pride ourselves on having an objective side to it where there's data and there's experiment and validation, we we still have to tell stories and convince our fellow scientists.

Speaker 1

这是一种柔软、模糊的东西。

And that's a a soft, squishy thing.

Speaker 1

也就是说，你知道，这是数据和构建叙事的结合。

Like, you know, it's a combination of data and, yeah, painting a narrative.

Speaker 1

而且这个叙事是有缺口的。

And it's a narrative with gaps.

Speaker 1

你知道吗？

You know?

Speaker 1

我的意思是，正如你所知，即使达尔文，正如我所说，他的理论中也有一些他无法解释的部分，但他仍然能论证：未来人们会发现过渡形态，会找到遗传机制——而他们确实找到了。

I mean, as you know, even Darwin, as I said, there were pieces of his theory he could not explain, but he could still make a case that, you know, in the future, people would find transitional forms, that they would find the mechanism of inheritance, and they did.

Speaker 1

是的。

Yeah.

Speaker 1

我不明白你怎么能以如此精确的方式量化这一点，从而开始强化某种东西。

I don't know how you can quantify that in such a precise way that you can start to reinforce something.

Speaker 1

也许这永远会是科学中的人性一面。

Maybe that will be forever the human side of science.

Speaker 0

我从阅读和观看你关于宇宙距离阶梯的内容中获得的一个启示是，顺便说一句，我强烈推荐大家观看你与3Blue1Brown合作的《宇宙距离阶梯》系列。

One takeaway I had from reading and watching your stuff on the cosmic distance ladder... by the way, I highly, highly, highly recommend people watch your series with 3Blue1Brown on the cosmic distance ladder.

Speaker 0

但其中一个启示是，许多领域中的演绎性推断空间可能比人们意识到的要大得多；如果你对如何研究某个问题有了正确的洞察，你可能会惊讶于自己能从世界中学到多少东西。

But one takeaway was that the deductive overhang in many fields could be so much bigger than people realize, where if you just had the right insight about how to study a problem, you might be surprised at how much more you could learn about the world.

Speaker 0

我想知道，你认为这是天文学在你所研究的历史特定时期特有的产物，还是仅仅因为目前地球上接收到的数据，我们实际上能推断出比我们已知的多得多的信息？

And I wonder if you think that's sort of a product of astronomy at the particular times in history that you're studying, or is just that based on the data that is incident on the earth right now, we could actually divine a lot more than we happen to know.

Speaker 1

没错。

Right.

Speaker 1

天文学是最早真正拥抱数据分析、并从已有数据中榨取每一丝信息的学科之一，因为数据本身就是瓶颈。

So astronomy was one of the first sciences to really embrace data analysis and squeezing every last possible drop of information out of the data they had, because data was the bottleneck.

Speaker 1

我的意思是，数据至今仍然是瓶颈。

I mean, it still is the bottleneck.

Speaker 1

我的意思是，收集天文数据真的非常困难。

I mean, it's it's really hard to to collect astronomical data.

Speaker 1

所以天文学家在从少量数据痕迹中提取各种结论方面，几乎是世界顶尖的。

So astronomers are almost world class in extracting, you know, all kinds of conclusions from little traces of data.

Speaker 1

我听说很多量化对冲基金都非常倾向于招聘天文学博士。

I hear that for a lot of quant hedge funds, their preferred hire is an astronomy PhD.

Speaker 0

这真有意思。

How interesting.

Speaker 1

他们之所以对从各种零散数据中提取信号也充满兴趣，还有其他原因。

They also are very interested, for other reasons, in extracting signals from various random bits of data.

Speaker 0

明白了。

Okay.

Speaker 0

说到巧妙的点子，我的一位听众肖恩解决了简街为我的观众设计的谜题，并在X上发布了一份精彩的解题指南。

Speaking of clever ideas, one of my listeners, Shawn, solved the puzzle that Jane Street made for my audience and posted a great walkthrough on X.

Speaker 0

背景是，简街训练了一个ResNet模型，然后将全部96层打乱，接着挑战人们仅凭模型的输出和训练数据，将这些层重新排列回正确顺序。

For context, Jane Street trained ResNet and then shuffled all 96 layers and then challenged people to put them back in the right order using only the model's outputs and training data.

Speaker 0

你不可能靠暴力枚举来解决这个问题。

You can't brute force this.

Speaker 0

可能的排列方式比宇宙中的原子还要多。

There's more possible orderings than atoms in the universe.

Speaker 0

所以肖恩把这个问题分成了两个部分：首先，将层配对成48个不同的块。

So Shawn broke the problem into two different parts: First, pair the layers into 48 different blocks.

Speaker 0

其次，把这些块按正确的顺序排列。

And second, put those blocks in the right order.
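
To make the brute-force claim concrete, the sketch below counts the orderings involved. The figure of 10^80 atoms is a commonly quoted order-of-magnitude estimate and is an assumption here, not a number from the puzzle.

```python
import math

ATOMS_IN_UNIVERSE = 10 ** 80  # assumed order-of-magnitude estimate

# Every possible ordering of the 96 shuffled layers.
layer_orderings = math.factorial(96)

# Orderings that remain once the layers are paired into 48 blocks.
block_orderings = math.factorial(48)

print("96! exceeds the atom count:", layer_orderings > ATOMS_IN_UNIVERSE)
print("digits in 96!:", len(str(layer_orderings)))
print("digits in 48!:", len(str(block_orderings)))
```

Splitting the problem shrinks the search space enormously, but 48! is still far too large to enumerate, which is why the ordering step relies on a heuristic plus local swaps rather than exhaustive search.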

Speaker 0

在配对时，肖恩发现，在一个训练良好的ResNet中，残差块中两个权重矩阵的乘积会呈现出明显的负对角线模式。

For pairing, Shawn realized that in a well trained ResNet, the product of two weight matrices in a residual block should have a distinctive negative diagonal pattern.

Speaker 0

这种模式的出现是为了防止残差流无限制地增长。

And this arises as a way for the model to keep the residual stream from growing out of control.

Speaker 0

基于这一洞察，他成功恢复了正确的配对关系。

From this insight, he was able to recover the right pairings.
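
A toy sketch of what a pairing step along these lines could look like, in plain Python. It is not Shawn's actual code: the function names `diagonal_score` and `pair_layers`, the greedy matching, and the assumption that the layers have already been split into "first" and "second" layers of a block are all hypothetical simplifications for illustration.

```python
def diagonal_score(w_in, w_out):
    """Mean of the diagonal of the product w_out @ w_in.

    w_in is a (hidden x d) matrix and w_out is a (d x hidden) matrix,
    both given as lists of rows. A strongly negative score is the
    signature this sketch looks for in a true residual-block pair.
    """
    d = len(w_out)
    hidden = len(w_in)
    total = 0.0
    for i in range(d):
        total += sum(w_out[i][k] * w_in[k][i] for k in range(hidden))
    return total / d


def pair_layers(first_layers, second_layers):
    """Greedily match each first layer to the unused second layer
    whose product with it has the most negative diagonal."""
    remaining = set(range(len(second_layers)))
    pairs = []
    for i, w_in in enumerate(first_layers):
        best = min(remaining, key=lambda j: diagonal_score(w_in, second_layers[j]))
        pairs.append((i, best))
        remaining.remove(best)
    return pairs


# Tiny made-up example: two "first" layers and their shuffled partners.
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]
neg_identity = [[-1.0, 0.0], [0.0, -1.0]]
neg_swap = [[0.0, -1.0], [-1.0, 0.0]]

pairs = pair_layers([identity, swap], [neg_swap, neg_identity])
print(pairs)
```

A real solution would likely need a global assignment rather than a greedy one, and would have to cope with noise in trained weights; the sketch only shows how a "negative diagonal" signal can be turned into a matching score.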

Speaker 0

在排序时，肖恩注意到，如果按各块残差贡献的大小对块进行排序，模型的表现会更好。

For ordering, Shawn noticed that the model seemed to improve if he sorted the blocks by the size of their residual contributions.

Speaker 0

从这个粗略的近似开始，他结合了一种巧妙的排序启发式方法和局部交换，恢复了完全正确的顺序。

Starting with that rough approximation, he combined a clever ranking heuristic with local swaps to recover the exact right order.

Speaker 0

他的完整讲解链接在描述中。

His full walkthrough is linked in the description.

Speaker 0

不过，如果你没来得及解决这个谜题，也不用担心。

Don't worry if you didn't get to this puzzle in time, though.

Speaker 0

还有一个关于后门大语言模型的谜题，连Jane Street都不知道该如何解决。

There's still one up about backdoored LLMs that even Jane Street doesn't know how to solve.

Speaker 0

你可以在janestreet.com/dwarkesh找到它。

You can find it at janestreet.com/dwarkesh.

Speaker 0

好了。

Alright.

Speaker 0

现在回到特伦斯。

Back to Terence.

Speaker 1

我们往往严重忽略了如何从各种信号中提取额外信息。

We do underexplore how to extract extra information from various signals.

Speaker 1

比如，我就随便举一个研究例子，我记得曾经读到过，有人研究了科学家们实际阅读他们所引用文献的频率。

Like, just to pick one random study, I remember reading once that people were trying to measure how often scientists actually read the papers that they cite.

Speaker 1

那么你们是怎么测量这个的呢？

So how do you measure this?

Speaker 1

好的。

Okay.

Speaker 1

你可以尝试去调查不同的科学家，但他们用了一个巧妙的办法。

You could try to survey different scientists, but they had a clever trick.

Speaker 1

很多引用文献里都有一些小错误，比如数字错了，或者标点符号错了。

So many citations have little typos, like, you know, a number is wrong or a punctuation symbol is wrong.

Speaker 1

他们测量了这些错误从一个参考文献复制到下一个参考文献的频率，从而推断出作者是否只是在复制粘贴引用，而没有真正核对原文。

And they measured how often a typo got copied from one reference to the next, and they could infer whether an author was actually just cutting and pasting a reference without actually checking it.

Speaker 1

因此，他们能够据此推断出人们在多大程度上关注了这些引用内容。

And so from that, they were able to infer some measure of how much attention people were paying.
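
The idea behind the typo trick can be sketched as a simple count. This is a simplified illustration, not the method of the study being recalled: it assumes each distinct misprint was introduced independently exactly once, so every further occurrence of the same misprint is treated as a copy. The function name `copied_fraction` and the sample strings are made up.

```python
from collections import Counter


def copied_fraction(misprinted_citations):
    """Estimate the share of misprinted citations that were copied.

    Each distinct misprint is assumed to have one original author;
    every repeat of that exact misprint is counted as a copy.
    """
    counts = Counter(misprinted_citations)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    originals = len(counts)
    return (total - originals) / total


# Made-up misprints of the same reference found in different papers.
sample = [
    "vol. 21, p. 1134",
    "vol. 21, p. 1134",
    "vol. 21, p. 1134",
    "vol. 12, p. 134",
]
print(copied_fraction(sample))
```

In this toy sample, two of the four misprinted citations are repeats of an earlier error, so half of them are flagged as likely copied without checking the original paper.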

Speaker 1

所以，还有许多巧妙的方法可以用来提取信息。

So there are also clever tricks to extract information.

Speaker 1

你知道的。

You know?

Speaker 1

所以，你之前提出的那些问题——比如，我们如何评估一项科学进展是否富有成果、有趣或代表了真正的进步？

So these questions you posed earlier of, you know, how can we assess whether a scientific development is fruitful or interesting or represents real progress?

Speaker 1

也许，这类现象在数据集中确实存在一些有用的数据指标或痕迹。

You know, maybe there are really useful metrics or footprints of this phenomenon in a data set.

Speaker 1

我们可以分析引用情况，以及某项研究在会议中被提及的频率等等。

You know, we can examine citations and, like, how often something is mentioned in a conference or something.

Speaker 1

而且，科学社会学领域还有很多研究空间，或许能真正发现这些现象。

And maybe there's a lot of sociology of science research to be done that could actually detect these things.

Speaker 1

是的。

Yeah.

Speaker 1

也许我们应该让天文学家来研究这个问题

Maybe we should get the astronomers on the case

Speaker 0

实际上。

actually.

Speaker 0

好的。

Okay.

Speaker 0

所以我认为，这很好地引出了外界看来AI在数学领域所取得的进展。

So I think this brings us nicely to the progress that, from the outside, it seems like AI for math is making.

Speaker 0

嗯。

Mhmm.

Speaker 0

我认为你最近发过一篇帖子，指出在过去几个月里，AI程序已经解决了1100多个埃尔德什问题中的50个。

And I think you had a post recently where you pointed out that over the last few months, AI programs have solved 50 out of the 1,100-odd Erdős problems.

Speaker 0

但我觉得，不确定现在是否还准确，一个月前你说过，由于容易摘的果实已经被摘完了，进展暂时停滞了。

But then I think I don't know if it's still correct, but as of a month ago, you said that there had been a pause because the low hanging fruit had been picked.

Speaker 0

首先，我想知道，现在是否仍然如此——我们已经摘完了低垂的果实，目前正处于一个平台期？

First of all, I'm curious if that is actually still the case, that we have picked the low-hanging fruit and we're now at this plateau.

Speaker 1

看起来确实如此。

It it does seem so.

Speaker 1

我的意思是，虽然仍有活动，但确实如此。

I mean, there's still activity at the yeah.

Speaker 1

所以，已经有大约50个问题被AI系统解决了，这很不错，但还有大约600个待解。

So 50-odd problems have been solved with AI systems, which is great, but there's, like, 600 to go.

Speaker 1

对。

Right.

Speaker 1

目前人们仍在一点一点地解决其中的一两个问题。

And people are still chipping away at at one or two of these right now.

Speaker 1

我们现在看到的纯AI解决方案少了很多，那种AI一次性直接解决整个问题的情况已经不多了。

We are seeing a lot fewer sort of pure AI solutions now where, the AI just one shots the problem.

Speaker 1

曾经有一个月里这种事发生过，但现在已经停止了。

So so there was a month where that happened, and and that has stopped.

Speaker 1

并不是因为缺乏努力。

Not for lack of trying.

Speaker 1

我知道有三个独立的尝试，试图让前沿模型AI同时攻击每一个问题。

I know of three separate attempts to get frontier-model AI to just attack every single one of the problems simultaneously.

Speaker 1

它们确实发现了一些细微的观察结果，或者发现某些问题其实已经在文献中被解决了，但至今还没有出现任何进一步的纯AI驱动的解决方案。

And they picked up some minor observations, or maybe they found that some problems were already solved in the literature, but there hasn't been any further purely AI-powered solution yet.

Speaker 1

目前，人们大量使用人工智能。

People are using AI a lot, currently.

Speaker 1

因此，有人可能会使用AI生成一种可能的证明策略，然后另一个人会使用另一个AI工具来批评、重写它，或为它生成数值数据，或进行文献综述。

So someone might use AI to generate a a possible, proof strategy, and then another, person will use a separate AI tool to critique it or rewrite it or generate some numerical data for it or do a literature survey.

Speaker 1

有些问题正是通过大量人类与多个AI工具之间的持续对话得以解决的。

And some problems have been solved by an ongoing conversation between lots of humans and lots of AI tools.

Speaker 1

但看起来这似乎只是一次性的做法。

But it does seem like it was this one-off thing.

Speaker 1

所以，或许可以这样比喻这些问题：想象你身处一片山脉中，到处都是悬崖和峭壁，有的墙可能只有三英尺高，有的六英尺高，有的十五英尺高，还有的高达一英里。

So maybe one analogy for these problems is, like, imagine you're in some sort of mountain range with all kinds of cliffs and walls, and maybe there's a little wall which is, like, three feet high, and one that's six feet high, and then one that's 15 feet high, and then there are some mile-high cliffs.

Speaker 1

你试图攀爬尽可能多的这些峭壁，但周围一片漆黑。

And you're trying to climb as many of these cliffs as possible, but it's in the dark.

Speaker 1

我们不知道哪些墙高，哪些墙矮。

We don't know which ones are tall, which ones are short.

Speaker 1

因此，我们试着点些蜡烛、画些地图，慢慢发现其中一些是可以攀登的。

And so, you know, we try to light some candles and make some maps, and and slowly, we we kind of figure out that some of them are are climbable.

Speaker 1

其中一些，我们能发现墙面上一些可以先触及的局部痕迹。

Some of them, we can identify some some partial track in the wall that you can reach first.

Speaker 1

而这些AI工具就像跳跃机器，能跳到两米高，比任何人类都跳得高。

And then these AI tools, they're kinda like these jumping machines that can jump, you know, two meters in the air, higher than any human.

Speaker 1

有时它们跳错了方向，有时会摔下来，但有时它们能到达那些我们以前够不到的最低墙顶。

And sometimes they jump in the wrong direction, and sometimes they crash, but sometimes they can reach the tops of the lowest walls that we couldn't reach before.

Speaker 1

所以我们基本上把它们放在这片山岭中四处跳跃，然后进入一个令人兴奋的阶段——它们真的找到了所有那些低矮的墙，并成功登顶。

And so we just basically set them loose in this mountain range, hopping around, and, you know, there was this exciting period where they could actually find all the low ones and reach them.

Speaker 1

但此后就没有进展了，我的意思是，也许下次模型有重大突破时，它们会再试一次，或许还能突破几座更高的墙。

But then, I mean, maybe the next time there's a big advance in the models, they will try it again, and maybe a few more will be breached.

Speaker 1

但这是一种与传统数学方法不同的方式；通常我们会像爬山一样逐步推进，做些标记，寻找局部进展。

But it's a different style of doing mathematics. Normally we would hill climb, and we would make little markers and try to identify partial things.

Speaker 1

这些工具要么成功，要么失败。

You know, these tools, they either succeed or they fail.

Speaker 1

它们在创造局部进展或识别出应优先关注的中间阶段方面，一直表现得很差。

And they've been really bad at creating partial progress or identifying intermediate stages that you should focus on first.

Speaker 1

再次回到之前的话题，我们目前没有评估部分进展的方法。

Again, going back to to this this previous discussion, you know, we don't have a way of evaluating partial progress.

Speaker 1

是的。

Yeah.

Speaker 1

就像我们能够评估一次性成功或失败地解决一个问题那样。

The same way we can evaluate a one-shot success or failure at solving a problem.

Speaker 0

所以，对于你刚才说的，有两种不同的思考方式。

So there's two different ways to, think through what you've just said.

Speaker 0

一种对AI进展持更悲观的看法，认为AI只能达到某种墙高，而这种高度还不及人类能达到的水平。

And one of them is more bearish on AI progress and one of them is more bullish. The bearish one being, oh, they're only getting to a certain height of wall, which is not as high as humans are reaching.

Speaker 0

另一种观点是，AI具有一个强大的特性：一旦达到某个水平线，它们就能解决所有处于该水平线上的问题，而人类却做不到——我们无法复制你一百万次，给每个人一百万美元的计算资源，让你在主观时间上同时进行一百年、一百个甚至一百万个问题的研究。

And the second is that, well, they have this powerful property that once they achieve a certain waterline, they can fill every single problem that is available at that waterline, which we simply can't do with humans: we can't make a million copies of you, give each of them a million dollars of inference compute, and have you do a hundred years of subjective-time research on 100 different problems at the same time, or a million different problems at the same time.

Speaker 0

但一旦AI达到了特伦斯·陶的水平，它们就能做到这一点。

But once AIs reach Terence Tao level, they could do that.

Speaker 0

而一旦达到中级水平，它们也能实现相应层级的这种能力。

And then once they reach intermediate levels, they could do the intermediate version of that.

Speaker 0

所以，我们如今应该持悲观态度的原因，也正是我们尤其应该持乐观态度的原因——即使在AI尚未达到超人智能时，只要它们达到人类水平的智能，其人类水平的智能在质上就比我们的更广泛、更强大。

So the same reason that we should be bearish now is the reason we should be especially bullish, not even when they achieve superhuman intelligence, but just when they achieve human level intelligence, because their human level intelligence is qualitatively wider and more powerful than our human level intelligence.

Speaker 1

我同意。

I agree.

Speaker 1

是的。

Yeah.

Speaker 1

所以AI擅长广度，而人类，至少人类专家，擅长深度。

So they excel at breadth, and humans excel at depth, like human experts at least.

Speaker 1

是的。

Yeah.

Speaker 1

所以我认为它们非常互补。

So, I think they're very complementary.

Speaker 1

但我们目前进行数学和科学研究的方式侧重于深度，因为人类的专业能力就在深度上，毕竟人类无法做到广度。

But our current way of doing math and science is focused on depth, because that's where the human expertise is, because humans can't do breadth.

Speaker 1

但确实如此。

But yeah.

Speaker 1

所以我们必须重新设计科学研究的方式，以充分利用我们现在拥有的这种广度能力。

So we have to redesign the way we do science to take full advantage of this breadth capability that we now have.

Speaker 1

正如我所说，我们应该投入更多精力去创造大量广泛的课题，而不是只专注于一两个非常深入的重要问题。

So as I said, we should put a lot more effort into creating very broad classes of problems to work on, rather than one or two really deep, important problems.

Speaker 1

我的意思是，我们仍然应该保留那些深入的重要问题，人类也仍应继续研究它们。

I mean, we should still have the deep important problems, and humans should still be working on them.

Speaker 1

但现在，我们有了另一种开展科学研究的方式。

But now we have this other way of doing science.

Speaker 1

你知道吗？

You know?

Speaker 1

我的意思是，我们可以先让这些具备中等广度能力的AI去探索全新的科学领域，梳理出所有容易发现的观察结果，然后识别出某些难点区域，接着由人类专家介入深入研究。

I mean, we can explore entire new fields of science by first getting these broad, moderately competent AIs to sort of map it out and make all the easy observations, okay, and then identify certain islands of difficulty, which, you know, human experts can then come and work on.

Speaker 1

因此，我看到的未来是科学将呈现出高度互补的形态。

So I very much see a future of very complementary science.

Speaker 1

最终，你自然希望同时实现广度和深度，把两者的优势都发挥到极致。

Eventually, you would hope to get both breadth and depth, you know, and somehow get the best of both worlds.

Speaker 1

但我认为我们需要在广度方面多加练习，因为这太新了。

But I think we need practice with the breadth side, because it's so new.

Speaker 1

我们甚至还没有真正的范式来充分利用它，但我们会的。

We don't even really have the paradigms to take full advantage of it, but we will.

Speaker 1

之后，科学将会变得完全认不出来了，我认为。

And then science will be unrecognizable after that, I think.

Speaker 0

关于互补性这一点，程序员们注意到，由于这些AI工具，他们的工作效率大大提高。

To this point about complementarity, programmers have noticed that they're way more productive as a result of these AI tools.

Speaker 0

我不知道作为数学家的你是否也有同样的感受，但确实，Vibe编程和Vibe研究之间有一个很大的不同：在软件领域，你工作的根本目的就是要通过成果对世界产生影响。

And I don't know if you as a mathematician feel the same way, but it does seem like one big difference between vibe coding and vibe researching is that with software, the whole point of the thing is to have some effect on the world through your work.

Speaker 0

如果你能更好地理解一个问题，或者提出一个清晰的抽象并体现在代码中，这本身就是实现最终目标的关键。

And if it leads to you better understanding a problem or you coming up with some clean abstraction to embody in your code, that is instrumental to the end goal.

Speaker 0

而或许在研究中，我们关心解决千禧年难题的原因，大概是因为在解决这些问题的过程中，我们会发现新的数学对象，或者更好的新方法，从而拓展人类对数学的理解。

Whereas maybe with research, the reason we care about solving the Millennium Prize Problems is presumably that in the process of solving them, we discover new mathematical objects or, better, new techniques, and those advance our civilization's understanding of mathematics.

Speaker 0

因此，证明本身是服务于中间过程的。

And so the proof is sort of instrumental to the intermediate work.

Speaker 0

我不确定你是否同意这种二分法，或者它是否能解释软件与研究领域相对提升的差异。

I don't know if you agree with that dichotomy or if that in any way will explain the relative uplift we'll see in software versus research.

Speaker 1

对。

Right.

Speaker 1

是的。

Yeah.

Speaker 1

所以在数学中，过程往往比问题本身更重要。

So certainly in math, the process is often more important than the problem itself.

Speaker 1

问题某种程度上是衡量你进展的代理指标。

The problem is kind of a proxy for for measuring your progress.

Speaker 1

我认为即使在软件领域，也存在不同类型的软件任务。

I think even in software, there's there's different types of software tasks.

Speaker 1

我的意思是，比如你只是创建一个和成千上万个网页功能相同的网页，那几乎没什么技能可学。

I mean, you know, if you just create a web page that does the same thing that a thousand other web pages do, there's sort of no skill to be learned.

Speaker 1

当然，程序员个人可能还是会学到一些技能。

Well, there's there's still some skill maybe that the individual programmer could pick up.

Speaker 1

但对于那种模板式的代码，毫无疑问，你应该把它交给AI来处理。

But, you know, for kind of boilerplate-type code, it's something that you should definitely offload to AI.

Speaker 1

但你知道，有时候代码写完之后，你仍然需要维护它，还存在升级和使其与其他系统兼容的问题。

But, you know, sometimes once you make the code, you know, you still have to maintain it, and there are issues with upgrading it and making it compatible with other things.

Speaker 1

我认为我听说过，即使是AI能创建出某个工具的首个原型，要让它与所有其他系统整合，并以人们期望的方式与现实世界互动，这仍然是一个持续的过程。

And I've heard that programmers are reporting, you know, that even if an AI can create the first prototype of a tool, making it mesh with everything else and making it interact with the real world in the way they want, I mean, that's an ongoing process.

Speaker 1

如果你没有通过编写代码所培养出的那些技能，可能会影响你未来维护它的能力。

And if you don't have the skills that you pick up from writing the code, that may impact your ability to maintain it down the road.

Speaker 1

所以，当然，数学家们一直通过解决问题来建立直觉，训练人们判断什么是真实的、可预期的、可证明的，以及什么是困难的。

So, certainly, as mathematicians, you know, we've used problems to build intuition and to train people to have a good idea of what's true, what to expect, what is provable, and what is difficult.

Speaker 1

因此，直接得到答案反而可能会阻碍这一过程。

So just getting the answers right away may actually inhibit that process.

Speaker 1

正如我之前区分过理论和实验一样。

I mean, I made a distinction between theory and experiment before.

Speaker 1

在大多数科学领域中，理论和实验两方面是平等划分的。

So in most sciences, there's an equal division between a theoretical side and an experimental side.

Speaker 1

但在数学中，这种情况几乎是独一无二的。

But math has been almost unique.

Speaker 1

它几乎是完全理论性的。

It's almost entirely theoretical.

Speaker 1

我们非常重视构建连贯、清晰的理论，以解释事物为何为真或为假。

We place a premium on trying to have coherent, clean theories of why things are true and false.

Speaker 1

我们很少做实验，比如，你知道，也许我们有两种不同的方法来解决一个问题。

And we haven't done many experiments. Like, you know, maybe we have two different ways to solve a problem.

Speaker 1

哪种方法更有效？

Which one is is more effective?

Speaker 1

我们有一些直觉，但还没有进行大规模研究，比如取一千个问题逐一测试。

We have some intuition, but we haven't done large-scale studies where we take a thousand problems and just test them.

Speaker 1

但现在我们可以做到这一点。

But we can do that now.

Speaker 1

所以我认为，AI工具将真正革新数学的实验层面——在那里，你并不那么关注单个问题及其解决过程，而是希望收集大量数据，了解哪些方法有效、哪些无效。

So I think AI-type tools will actually revolutionize the experimental side of math, where you don't care so much about individual problems and the process of solving them, but, yeah, you wanna gather just large-scale data about what things work and what things don't.

Speaker 1

你知道，就像如果你是一家软件公司，想要推出一千款软件，你当然不想手工打造每一款，从每一款中吸取经验。

You know, the same way that if you're a software company and you wanted to roll out a thousand pieces of software, you know, you don't really wanna handcraft each one and learn lessons from each.

Speaker 1

你只想找到哪些工作流程是可以扩展的？

You just wanna find which workflows scale.

Speaker 1

所以，目前我们还没有实现大规模开展数学研究，这个想法还处于起步阶段，但正是AI将真正革新这一领域。

So the idea of doing mathematics at scale is in its infancy, but that's where AI is really gonna revolutionize the subject.

Speaker 0

有意思。

Interesting.

Speaker 0

我觉得，在这些关于AI在科学中能有多好的讨论中，一个关键点是你刚才提到的。

I feel like a big crux in these conversations about how good AI will be for science is something I think you said.

Speaker 0

就是说，他们只是在使用现有技术并对其进行修改。

It's like, oh, they're using existing techniques and modifying them.

Speaker 0

了解仅靠使用现有技术能取得多少进展，会很有意思。

And it would be interesting to understand how much progress one can make simply from using existing techniques.

Speaker 0

比如，如果我看看顶级数学期刊，其中有多少论文是提出新方法，又有多少是将现有方法应用到新问题上？

Like, if I looked at the top math journals, how many of the papers are coming up with a new technique, whatever coming up with a technique means, versus using existing techniques on new problems?

Speaker 0

而所谓的剩余空间在于，你可以将所有已知的技术应用到每一个开放性问题上，这会极大地提升人类文明的知识水平吗？还是说这种提升并没有那么显著和有用？

And what is the overhang? If you could just apply every known technique to every open problem, would that constitute a humongous uplift in our civilization's knowledge, or would that not be that impressive and useful?

Speaker 1

这是个非常好的问题。

This is a great question.

Speaker 1

我们目前还没有足够的数据来完全回答它。

We don't have the data to fully answer it yet.

Speaker 1

当然，人类数学家所做的很多工作，当你面对一个新问题时，我们首先会去寻找过去在类似问题上成功过的所有标准方法，然后逐一尝试。

Certainly, in a lot of the work that human mathematicians do, when you take a new problem, one of the first things we do is look at all the standard things that have worked on similar problems in the past, and we try them one by one.

Speaker 1

有时候这些方法有效，即使如此，也值得发表，因为这个问题本身很重要。

And sometimes that works, and that's still worth publishing sometimes because the question was important.

Speaker 1

有时候这些方法几乎奏效，你只需要再添加一点调整，这同样很有意思。

Sometimes they almost work, and you have to add one more wrinkle, and that's also interesting.

Speaker 1

但那些发表在顶级期刊上的论文，通常都是这样：现有的方法能解决大约80%的问题，但剩下的20%顽固地无法解决，因此必须发明一种新方法来填补这些空白。

But then, you know, the papers that go into the top journals are usually ones where the existing methods can kinda solve 80% of the problem, but then there's this 20% which is resistant, and a new technique has to be invented to fill in the gaps.

Speaker 1

现在，几乎不可能出现一个问题完全不依赖以往文献、所有思路都凭空而来就能被解决的情况。

It's very, very rare now that a problem gets solved with no reliance on past literature, where all the ideas come out of nowhere.

Speaker 1

你知道，过去这种情况更常见，但现在数学已经非常成熟了，如果不先利用现有文献，反而会成为巨大的障碍。

You know, that was more common in the past, but math is so mature now that it's just so much of a handicap to not use the literature first.

Speaker 1

所以，是的，AI工具在完成这一部分任务上表现得越来越好，就是尝试所有标准方法来解决一个问题，现在实际上在实现这些方法时犯的错误比人类还少。

So, yeah, AI tools are getting really good at the first part of that, just trying all the standard techniques on a problem, often now actually making fewer mistakes in implementing them than humans.

Speaker 1

它们仍然会出错，但我已经测试过这些工具，比如在我能独立完成的小任务上，有时它们能发现我犯的错误，有时我也能发现它们的错误。

They still make mistakes, but I've tested these tools on little tasks that I can do, and sometimes they pick up errors that I make, sometimes I pick up errors that they make.

Speaker 1

目前来看，双方基本上打平。

It's about a tie right now.

Speaker 1

我还没看到它们能迈出下一步，也就是当论证中出现漏洞，所有现有方法都失效时，你该怎么办？

I haven't yet seen them take the next step: when there are holes in the argument where none of the standard things are working, what do you do then?

Speaker 1

然后它们可能会提出一些随机的想法，但往往我发现，试图追着这些想法去实现，最后却发现行不通，反而浪费了更多时间。

And then they can kind of suggest random things, but often I find that trying to chase them down and make them work, and finding they don't work, wastes more time than it saves.

Speaker 1

是的。

Yeah.

Speaker 1

所以我认为，我们目前认为有些困难的问题，将来会通过这种方法被解决。

So I think some fraction of problems that we currently think are hard will fall to this method.

Speaker 1

我的意思是，尤其是那些一直得不到足够关注的问题。

I mean, especially the ones that haven't received enough attention.

Speaker 1

比如埃尔德什（Erdős）问题，你知道，被人工智能解决的50个问题中，几乎都是那些基本上没有任何相关文献的。

So, like, with the Erdős problems, you know, almost all of the 50 problems that were solved by AIs were ones for which basically there was no literature.

Speaker 1

我的意思是，埃尔德什只把这个问题提过一两次。

I mean, Erdős posed the problem once or twice.

Speaker 1

我想可能有些人随便尝试过，但没能解决，也就没写成论文。

I think maybe some people tried it casually and they they couldn't do it, but they never wrote up anything.

Speaker 1

但结果发现，其实是有解的，只是可能把某个不为人知的冷门技巧和文献中的其他结果结合起来而已。

But it turned out that that there was a solution, and it was just, you know, maybe combining this one obscure technique that that not many people know about with some other result in the literature.

Speaker 1

而这正是人工智能所能达到的中等水平的成就。

And that's kind of the median level of what AI can accomplish.

Speaker 1

这真的非常棒。

And that that's really great.

Speaker 1

它清除了这50个问题。

It clears out 50 of these problems.

Speaker 1

所以我认为你会看到一些孤立的成功案例。

So I think you will see some isolated successes.

Speaker 1

但我们发现的情况是，已经有人对这些埃尔德什问题做过大规模的排查。

But here's what we found, because people have done large-scale sweeps of these Erdős problems.

Speaker 1

而且，如果你只关注那些在社交媒体上被广泛传播的成功故事，那看起来简直太棒了。

And, like, if you only focus on the success stories, the ones that that get broadcast on social media, that looks amazing.

Speaker 1

你知道，那些几十年来一直未被解决的问题，现在一个个都被攻克了。

You know, all these problems that haven't been solved for decades, now they're falling.

Speaker 1

但每当我们进行系统性研究时，对于任何一个具体问题，AI工具的成功率可能只有1%到2%。

But whenever we do a systematic study, on any given problem an AI tool has a success rate of maybe 1 or 2%.

Speaker 1

只是因为它们可以大规模尝试，而如果你只挑出赢家，看起来就非常出色。

It's just that they can try at scale, and if you just pick the winners, it looks great.
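The arithmetic behind this selection effect can be sketched in a few lines of Python. This is only an illustration: the 2% figure is the rough rate quoted above, while the pool sizes and the assumption that attempts are independent with one shared success rate are simplifications introduced here, not anything measured in the conversation.

```python
def chance_of_at_least_one_win(p: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds,
    when each attempt succeeds with probability p (assumed identical)."""
    return 1.0 - (1.0 - p) ** n


def expected_wins(p: float, n: int) -> float:
    """Expected number of successes across n attempts at per-attempt rate p."""
    return p * n


if __name__ == "__main__":
    p = 0.02  # hypothetical per-problem success rate, taken from the rough "1 or 2%" above
    for n in (1, 50, 500, 1000):  # hypothetical numbers of problems attempted
        print(
            f"{n:>5} problems: expected wins = {expected_wins(p, n):5.1f}, "
            f"P(at least one win) = {chance_of_at_least_one_win(p, n):.4f}"
        )
```

Under these assumptions a low per-problem rate still yields a steady stream of headline successes once enough problems are attempted, which is why reporting only the winners overstates how well the tools do on any one problem.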

Speaker 1

所以我认为在那些数百个极其著名且困难的数学问题上，也会发生类似的情况。

So I think a similar thing will happen with, you know, the hundreds of really prestigious, difficult math problems out there.

Speaker 1

可能会有那么一两个问题，AI碰巧运气好给解决了，而这些问题是其他人遗漏的某种后门解法，这会获得大量关注。

On a couple, some AI may get lucky and actually solve them, because there was some backdoor to solving the problem that everyone else missed, and that will get a lot of publicity.

Speaker 1

但人们会用自己的最爱问题尝试这些花哨的工具，再次遭遇1%到2%的成功率。

But then people will try these fancy tools on their own favorite problem, and they will, again, experience the one to two percent success rate.

Speaker 0

对。

Right.

Speaker 1

所以在那些有效和无效的时刻之间，会充斥大量噪音。

So there'll be a lot of noise amongst the signal of sort of when they're working, when they're not.

Speaker 1

我们必须这么做。

We have to do yeah.

Speaker 1

收集这些高度标准化的数据集将变得越来越重要。

It will be increasingly important to collect these really standardized datasets.

Speaker 1

目前已有努力在创建一套标准的AI挑战问题，而不仅仅依赖AI公司只公布他们的成功案例，却不披露负面结果。

You know, there are efforts now to create a standard set of challenge problems for AI to solve, and not just rely on the AI companies, who only publish their wins and don't disclose their negative results.

Speaker 1

这或许能让我们更清晰地了解我们实际所处的位置。

So that will maybe give more clarity as to where we're actually at.

Speaker 0

不过我认为值得强调的是，AI已经取得了多大进展，如今的模型已经能够应用一些前所未有的技术

Although I think it's worth emphasizing how much progress in AI it already constitutes to have models that are capable of applying some technique that nobody...

Speaker 1

是的。

Yeah.

Speaker 1

（此前没有人）写明过可以用于这个特定问题。

Had written down as applicable to this particular problem.

Speaker 1

进展既令人惊叹又令人失望。

The progress is simultaneously amazing and disappointing.

Speaker 1

看到这些工具实际运作时，感觉非常奇怪，但人们确实很快就会适应。

It is a very strange feeling to see these tools in action, but also, you know, people acclimatize really quickly.

Speaker 1

你知道吗，我记得二十年前谷歌的网页搜索刚推出时，它彻底超越了其他所有搜索引擎。

You know, I remember when Google's web search came out twenty years ago, and it just blew all the other search engines out of the water.

Speaker 1

你直接在首页就能得到相关的搜索结果，几乎完美地符合你想要的内容。

Like, you were just getting relevant hits on the front page, almost exactly what you wanted.

Speaker 1

那真是太棒了。

And it was amazing.

Speaker 1

但几年后，你就理所当然地认为，你可以随时用谷歌搜索任何东西。

And then after a few years, you just took for granted that you could just Google anything.

Speaker 1

是的。

And yeah.

Speaker 1

所以很多，嗯。

So a a lot of yeah.

Speaker 1

我的意思是，2026年的AI在2021年看来会令人惊叹，而很多技术，比如人脸识别、自然语音，还有解决大学水平的数学题，我们现在都习以为常了。

I mean, 2026-level AI would have been stunning in 2021, and a lot of it, you know, face recognition, natural speech, doing college-level math problems, we just take for granted now.

Speaker 1

对。

Right.

Speaker 1

是的。

Yeah.

Speaker 0

好的。

Okay.

Speaker 0

说到2026年，是的，你曾在2023年做出过一个预测，我认为到2026年，它会像数学领域的同事一样。

So speaking of 2026, you made a prediction in 2023 that by 2026, what was it, it would be like a colleague in mathematics?

Speaker 1

如果使用得当，一个值得信赖的合著者。

A trustworthy coauthor if used correctly.

Speaker 1

也就是说

Which is

Speaker 0

回过头来看，表现相当不错。

looking pretty good in retrospect.

Speaker 1

是的。

Yeah.

Speaker 1

我感到非常满意。

I'm pretty pleased.

Speaker 0

对。

Yeah.

Speaker 0

所以，让我们看看能不能继续保持这个势头。

So, you know, let's see if we can continue the streak.

Speaker 0

你个人因为人工智能的辅助，工作效率提高了两倍。

You personally are 2x more productive as a result of AI.

Speaker 0

你认为是哪一年达到的？

What year would you say that?

Speaker 1

是的。

Yeah.

Speaker 1

所以，我认为生产力并不是一个单一维度的量。

So productivity, I think, is not quite a one dimensional quantity.

Speaker 1

比如，我明显感觉到自己做数学的方式发生了很大变化，我所从事的内容也不同了——例如，我现在论文里的代码和图表多了很多，因为现在生成这些东西太容易了。

Like, I'm definitely noticing that the style in which I do mathematics is changing quite a bit, and the type of things I do. So, for example, my papers now have a lot more code, a lot more pictures, because it's so easy to generate these things now.

Speaker 1

以前需要花几个小时做的图表，现在几分钟就能完成。

So some plot which would have taken me hours to do, I can now do in minutes.

Speaker 1

但过去，我根本不会在论文里放这些图表。

But in the past, I just wouldn't have put the plot in my paper in the first place.

Speaker 1

我只会用文字来描述它们。

I I would just talk about it in words.

Speaker 1

很难衡量‘两倍’到底意味着什么。

It's hard to measure what 2x means.

Speaker 1

所以，一方面，我觉得如果今天我要写的这些论文没有AI辅助，肯定要花五倍的时间。

So, yeah, on the one hand, I think the type of papers that I would write today, if I had to do them without AI assistance, would definitely take five times longer.

Speaker 0

但很有趣。

But Interesting.

Speaker 1

但我不会以这种方式写我的论文。

But I would not write my papers that way.

Speaker 0

五倍？

Five x?

Speaker 0

所以是的。

So Yeah.

Speaker 0

那是

That's

Speaker 1

但这是因为这些都属于辅助性的，我的意思是，你知道，像那些，嗯，比如。

But it's because these are sort of auxiliary things.

Speaker 1

比如，进行更深入的文献调研，提供更多的数值计算。

Things like doing a much deeper literature search, supplying a lot more numerics.

Speaker 1

是的。

Yeah.

Speaker 1

我的意思是，这些内容丰富了论文。

I mean, they enrich the paper.

Speaker 1

所以，是的，我所做工作的核心——比如真正解决数学问题中最困难的部分——并没有太大变化。

So, yeah, the core of what I do, like actually solving the most difficult part of a math problem, hasn't changed too much.

Speaker 1

我仍然用纸笔来做这些。

I still use pen and paper for that.

Speaker 1

但你知道，有很多琐碎的事情。

But, you know, there's lots of silly things.

Speaker 1

我现在使用一个AI代理来重新格式化。

I use an AI agent now to reformat.

Speaker 1

比如，有时候我的所有括号大小都不太对，我会手动调整，但现在我可以让AI代理在后台很好地完成所有这些工作。

Like, sometimes, if all my parentheses were not quite the right size, I would just manually change them by hand, and I can get an AI agent to do all that quite nicely now in the background.

Speaker 1

所以，是的，它们大大加快了许多次要任务的进度。

So, yeah, they've really sped up lots of secondary tasks.

Speaker 1

它们还没有真正加快我核心工作的速度，但它们让我能够为论文添加更多内容。

They haven't yet sped up the core thing that I do, but they've allowed me to add more things to my papers.

Speaker 1

是的

Yeah.

Speaker 1

但同样地，如果我重新写一篇2020年写的论文，不添加这些额外功能，只保留相同水平的功能性。

But by the same token, if I were to write a paper I wrote in 2020 again and not add all these extra features, but just have something at the same level of functionality.

Speaker 1

是的

Yeah.

Speaker 1

那么说实话，这并没有节省太多时间。

Then that hasn't saved that much, to be honest.

Speaker 1

是的

Yeah.

Speaker 1

所以它让论文变得更丰富、更广泛，但不一定更深奥。

So it's made the papers sort of richer and broader, but not necessarily deeper.

Speaker 0

你区分了人工聪明和人工智能。

You made this distinction between artificial cleverness and artificial intelligence.

Speaker 1

嗯

Mhmm.

Speaker 0

我想更好地理解这些概念。

And I would like to better understand those concepts.

Speaker 0

什么是不只属于机巧的智能的一个例子？

What is an example of intelligence that is not just cleverness?

Speaker 1

是的。

Yeah.

Speaker 1

智能一向很难定义。

So intelligence is famously hard to define.

Speaker 1

这是一种你看到时就能认出来的东西。

It's one of these things that you kind of know when you see it.

Speaker 1

但当我跟别人交谈，我们一起尝试合作解决一个数学问题时，会有这样的对话：起初我们俩都不知道怎么解，但其中一人有了一个想法，看起来很有希望。

But when I talk to someone, and we're trying to collaboratively solve a math problem together, there's this conversation where, you know, neither of us knows how to solve the problem initially, but one of us has some idea and it looks promising.

Speaker 1

于是我们形成了一种初步的策略，然后进行测试，发现行不通，接着我们就对其进行修改。

And so then we have some sort of prototype strategy, and then we test it, and it doesn't work, but then we modify it.

Speaker 1

这个过程中体现了某种适应性，以及想法随着时间持续改进。

And there's some adaptivity and continual improvement of the idea over time.

Speaker 1

最终，我们系统地梳理出了哪些方法行不通、哪些方法有效，并且大致看到了前进的方向。

And, eventually, you know, we've systematically mapped out what doesn't work, what does work, and we can kinda see a path forward.

Speaker 1

但这个过程是随着我们的讨论不断演化的。

But it's evolving with our discussion.

Speaker 1

而人工智能只能稍微模仿这种过程。

And this is not quite what the AIs do. The AIs can kind of mimic this a little bit.

Speaker 1

回到这些跳跃机器人这个类比，它们可以不断跳跃、失败、再跳跃、再失败。

So to go back to this analogy of these jumping robots, you know, they can jump and fail and jump and fail and jump and fail.

Speaker 1

但它们做不到的是：稍微跳一下，抓住某个支撑点，然后停在那里，再拉别人上来，接着从那个位置继续跳。

But what they can't do is jump a little bit, reach some handhold, then stay there, pull other people up, and try to jump from there.

Speaker 1

对。

Right.

Speaker 1

这种逐步累积、互动构建的过程并不存在。

There isn't this cumulative process which is sort of built up interactively.

Speaker 1

这更像是反复试错和单纯的重复、蛮力，虽然它能够扩展，并且在某些情境下表现得极其出色。

It seems to be a lot more trial and error and just repetition and brute force, which, you know, scales, and it can work amazingly well in certain contexts.

Speaker 1

但这个想法确实是通过部分进展逐步积累起来的，而这一点目前还尚未完全实现。

But, yeah, this idea of building up cumulatively from partial progress is what's still not quite there yet.

Speaker 0

有意思。

Interesting.

Speaker 0

你的意思是，比如 Gemini 3 或者 Claude 4.5 之类的模型解决了某个问题，

You're saying, like, if Gemini 3 or Claude 4.5 or whatever solves a problem,

Speaker 0

但它的数学理解能力并没有因此提升。

It is not the case that its own understanding of math has progressed.

Speaker 0

即使它没有解决这个问题，它的数学理解也没有因此进步。

Or even if it works on a problem without solving it, it's not that its own understanding of math has progressed.

Speaker 1

是的。

Yeah.

Speaker 1

你开启一个新会话，它就忘了刚才做了什么。

You run a new session, and it's forgotten what it just did.

Speaker 1

对。

Right.

Speaker 1

它并没有获得任何新技能来作为基础，去解决其他不相关的问题。

You know, it has no new skills to build on for unrelated problems.

Speaker 1

也许你刚才做的，会成为下一代训练数据中那0.001%的一部分。

Maybe what you just did is part of 0.001% of the training data for the next generation.

Speaker 1

所以，也许最终有一部分会被吸收。

So maybe, eventually, some of it gets absorbed.

Speaker 1

但确实如此。

But Yeah.

Speaker 0

特伦斯谈到分解复杂问题的重要性，尤其是将棘手的问题拆解成一系列更简单的部分。

So Terence talks about the importance of decomposing particularly gnarly problems into a series of easier chunks.

Speaker 0

即使这并不能直接得出完整解决方案，以这种方式处理问题也有助于你培养直觉，并练习未来继续取得进展所需的技术。

Even if this doesn't result in the full solution, approaching problems in this way helps you build up the intuitions and practice the techniques that you'll need to keep making progress.

Speaker 0

但现在的模型在应对这类问题解决技巧时往往表现不佳。

But models today tend to struggle with these kinds of problem solving techniques.

Speaker 0

这就是Labelbox发挥作用的地方。

That's where Labelbox comes in.

Speaker 0

Labelbox 帮助你训练模型，不仅追求正确的答案，更要学会正确的思考方式。

Labelbox helps you train models not just to get the right answer, but to think the right way.

Speaker 0

他们已将这些推理行为转化为评估标准，使你能够评估模型输出的每一个重要维度。

They've operationalized these reasoning behaviors into rubrics, giving you the ability to evaluate every important dimension of a model's output.

Speaker 0

这些评估标准超越了简单的正确性。

These rubrics go beyond simple correctness.

Speaker 0

模型是否选用了正确的工具？

Did the model reach for the right tools?

Speaker 0

它是否检查了自己的工作并探索了其他路径？

Did it check its own work and explore alternative paths?

Speaker 0

它的回答有多清晰？

How clear was its response?

Speaker 0

这些技能在多个领域都有用：数学、物理、金融、心理学等等。

These skills are useful across domains: math, physics, finance, psychology, and more.

Speaker 0

随着模型开始应对更复杂、更开放的问题——有些问题有多个解，有些甚至我们还不知道答案——这些技能变得越来越重要。

And they're becoming increasingly important as models take on harder, open-ended problems, some of which have multiple solutions and some of which we don't even know the solutions to.

Speaker 0

Labelbox 可以为你提供针对特定领域的评估标准，帮助你系统地衡量和塑造模型的思维方式。

Labelbox can get you rubrics tailored to your domain, helping you systematically measure and shape how your models think.

Speaker 0

了解更多请访问 labelbox.com/dwarkesh。

Learn more at labelbox.com/dwarkesh.

Speaker 0

我有一个很大的疑问：如果我们只是持续训练越来越擅长用 Lean 解决问题的 AI，它们是否能持续解决越来越令人印象深刻的问题？

One big question I have is how plausible it is that if we just keep training AIs that get better and better at, you know, solving problems in Lean, they will continue to solve more and more impressive problems.

Speaker 0

然后，我们事后可能会惊讶于，从某个 Lean 解法中对证明黎曼猜想之类的问题竟获得如此少的洞察。

And then we will, in retrospect, be surprised at how little insight we got from some Lean solution proving the Riemann hypothesis or something.

Speaker 0

或者你认为，即使 AI 完全用 Lean 来解决黎曼猜想，其在 Lean 程序中构建的结构、创建的定义，也必须推动我们对数学的理解，这才是必要条件？

Or do you think it is a necessary condition of solving the Riemann hypothesis, even by an AI that is, like, totally doing it in Lean, that the constructions which are made, the definitions which are created, even in the Lean program, have to advance our understanding of mathematics?

Speaker 0

还是说这可能只是像汇编代码一样的一堆胡言乱语？

Or do you think it could just be assembly-code gobbledygook?

Speaker 1

哦，是的。

Oh, yeah.

Speaker 1

我们不知道。

We don't know.

Speaker 1

我的意思是，有些问题基本上是通过纯粹的暴力计算解决的。

I mean, some problems have been basically solved by pure brute force.

Speaker 1

四色定理就是一个著名的例子。

The four color theorem is a famous example.

Speaker 1

我们至今仍未找到这个定理的概念性优美证明。

We have still not found a conceptually elegant proof of this theorem.

Speaker 1

它基本上可能永远也找不到这样的证明。

And maybe we never will.

Speaker 1

我的意思是，有些问题可能只能通过将情况拆分成海量子类，并对每个子类进行暴力计算和计算机分析来解决。

I mean, some problems may only be solvable by just splitting into some enormous number of cases and doing a brute-force computer analysis on each case.

Speaker 1

我认为我们之所以如此重视像黎曼猜想这样的问题，是因为我们相当确信，要解决它，必须创造出一种全新的数学，或者发现两个此前毫无关联的数学领域之间的全新联系。

I mean, part of the reason that we prize problems like the Riemann hypothesis is that we're pretty sure that something amazing has to happen: a new type of mathematics has to be created, or a new connection between two previously unconnected areas of mathematics has to be discovered, to make this work.

Speaker 1

我们甚至不知道解的形态是什么，但感觉这并不是一个仅靠穷举情况就能解决的问题。

We don't even know what the shape of the solution is, but it doesn't feel like a problem that will be solved just by exhaustively checking cases or something.

Speaker 1

我的意思是，它实际上可能是错误的。

I mean, it could be false, actually.

Speaker 1

是的。

Yeah.

Speaker 1

所以，我们实际上可以，好吧。

So we we could actually, okay.

Speaker 1

有一个不太可能的情形，那就是黎曼假设是错的，你只需计算出一个不在临界线上的零点，然后由超级计算机验证它。

There is an unlikely scenario that the hypothesis is false, and you can just compute, oh, here's a zero off the line, and a massive computer calculation verifies it.

Speaker 1

那将非常令人失望。

That would be very disappointing.

Speaker 1

我不知道。

I don't know.

Speaker 1

我确实觉得，完全自主的一次性方法并不适合解决这类问题。

I do feel that, you know, fully autonomous one-shot approaches are not the right approach for these problems.

Speaker 1

我的意思是，我认为人类与这些工具协作所产生的协同效应会带来更大的进展。

I mean, I think you'll get a lot more mileage out of the interplay of humans collaborating with these tools.

Speaker 1

我可以想象，某个问题会由一些聪明的人类在极其强大的AI工具辅助下解决，但这种互动模式可能与我们目前的设想大不相同。

And I can see one of these problems being solved by some smart humans assisted by some extremely powerful AI tools, but the exact dynamic may be very different from what we envision right now.

Speaker 1

我的意思是，这可能是一种我们目前还不存在的协作方式。

I mean, it could be a collaboration of a type that just doesn't exist yet.

Speaker 1

是的。

Yeah.

Speaker 1

我的意思是，也许我们可以生成一百万种黎曼ζ函数的变体，进行一些AI辅助的数据分析，从而发现它们之间此前未知的关联模式。

I mean, there may be a way to generate, you know, a million variants of the Riemann zeta function and do some AI-assisted data analysis, and we discover some pattern connecting them which we didn't know about before.

Speaker 1

这能让你将问题转化为数学中的另一个领域。

And this lets you transform the problem into a different area of mathematics.

Speaker 1

我的意思是，可能还有各种各样的可能性。

I mean, there could be all kinds of scenarios.

Speaker 0

所以，假设AI找到了答案，而在Lean中隐藏着某种全新的构造，如果你意识到它的意义，我们就能将它应用到所有这些不同的情境中。

So suppose the AI figures it out, and latent in the Lean is some brand new construction which, if we realized its significance, we would be able to apply in all of these different situations.

Speaker 0

我们又该如何识别它呢？

How would we even recognize it?

Speaker 0

对吧？

Right?

Speaker 0

比如，再问一个很天真的问题，如果你像笛卡尔那样提出一个想法——比如可以建立一个坐标系，把代数和几何统一起来。

Like, again, a very naive question, but suppose you come up with the equivalent of Descartes coming up with this idea: oh, you can have this coordinate system where you can unify algebra and geometry.

Speaker 0

但在Lean代码里，它可能就只是看起来像 r → r，根本看不出有什么特别之处。

But in Lean code, it would just look like r to r, and it wouldn't look that significant or something.

Speaker 0

或者类似地，我相信还有很多其他构造也具有这种特性。

Or similarly, I'm sure there's other constructions which have this kind of property.

Speaker 1

将证明形式化到Lean这样的系统中的美妙之处在于，你可以把其中的任何一部分拿出来单独研究。

Well, the beauty of formalizing a proof in something like Lean is that you can take any piece of it and study it atomically.

Speaker 1

所以，当我读一篇论文，看到某个棘手的问题时，通常会有一长串引理和定理。

So, you know, when I read a paper by humans which solves some difficult problem, there's often some big sequence of lemmas and theorems and things.

Speaker 1

理想情况下，作者会一步步解释清楚，哪些部分是关键，哪些不是。

And so, ideally, the author will talk their way through what's important and what's not.

Speaker 1

但有时候他们并不会指出哪些步骤是关键的，哪些只是常规的、模板化的步骤。

But sometimes they don't reveal which steps were the important ones and which ones are just kind of boilerplate, standard steps.

Speaker 1

但你可以单独研究每一个引理，其中一些可能让你觉得，哦，这看起来挺普通的。

But you can study each lemma in isolation, and for some of them you might say, oh, this looks fairly standard.

Speaker 1

这跟我熟悉的东西很相似。

This resembles something I'm familiar with.

Speaker 1

我敢肯定这里没什么有趣的东西。

I'm pretty sure there's nothing interesting going on here.

Speaker 1

但这个引理，哦，这是我以前没见过的。

But this lemma, oh, that's something I haven't seen before.

Speaker 1

我能明白，如果你能得出这个结果，确实会大大有助于证明主要结论。

And I could see why, if you had this result, that would really help prove the main result.

Speaker 1

你可以判断某些内容是否真正对你的论证至关重要。

Like, you can assess whether some things are really key to your argument or not.

Speaker 1

而Lean 正是极大地促进了这一点。

And Lean really facilitates that.

Speaker 1

你知道，每一个步骤都被非常精确地识别出来了。

You know, the individual steps are identified really precisely.
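As a toy illustration of what studying a piece "atomically" can look like, here is a minimal Lean 4 sketch. The theorem names `routine_step`, `key_lemma`, and `main_result` are hypothetical labels invented for this example, and the statements are deliberately trivial facts about natural numbers; nothing here comes from a real formalization.

```lean
-- A "boilerplate" step: commutativity of addition, taken straight from the core library.
theorem routine_step (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A stand-in for the step a reader would single out as the key one
-- (trivial here; in a real proof this is where the new idea would live).
theorem key_lemma (a : Nat) : a + 0 = a :=
  Nat.add_zero a

-- The main result cites both lemmas by name, so each dependency is explicit.
theorem main_result (a b : Nat) : (a + 0) + b = b + a := by
  rw [key_lemma]
  exact routine_step a b
```

Because every lemma has a name and a precise statement, a reader (or another tool) can inspect `key_lemma` on its own, swap in a different proof of it, or check which later results depend on it.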

Speaker 1

我认为未来会出现专门的职业数学家，他们可能会拿一个由Lean生成的庞大证明，然后做一些消融实验之类的操作。

I think in the future, there'll be entire professions of mathematicians who might take a giant Lean-generated proof and maybe, you know, do some ablation on it or something.

Speaker 1

我会尝试去掉一些步骤，看看能不能找到更简洁的方法。

Or try to remove parts of it and find more elegant ways.

Speaker 1

你知道吧？

You know?

Speaker 1

你知道吧？

You know?

Speaker 1

也许其他AI会做一些强化学习。

Maybe some other AIs just sort of do some reinforcement learning.

Speaker 1

怎样才能让证明更优雅？也许其他AI会评估哪个证明看起来更好。

How can you make the proof more elegant? And maybe other AIs will grade whether this proof looks better or not.

Speaker 1

未来不久将发生很大变化的一点是，直到最近，写论文还是这项工作中最耗时、最昂贵的部分。

One thing that will change quite a bit in the near future is that, until recently, writing papers was the most time-consuming and expensive part of the job.

Speaker 1

所以你以前很少这么做。

And so you did it very rarely.

Speaker 1

你知道吗？

You know?

Speaker 1

你只有在所有其他论证部分都确认无误之后，才会把结果整理成文，因为反复重写和重构实在太麻烦了。

You only wrote up your results once all the other parts of your argument were checked out, because rewriting it again and refactoring was just a total pain.

Speaker 1

但如今，现代AI工具让这件事变得容易多了。

But that's one thing that's become a lot easier now with modern AI tools.

Speaker 1

所以，你知道，你不必只保留论文的一个版本。

So, you know, you don't have to have just one version of your paper.

Speaker 1

你知道，一旦你有了一个版本，人们就可以生成几百个其他版本。

You know, once you have one, people can generate hundreds more.

Speaker 1

是的，一个庞大而杂乱的Lean证明本身可能没什么意义或难以理解，但其他人可以对其进行重构，并做各种各样的处理。

So, yeah, one giant messy Lean proof may not be very meaningful or understandable on its own, but other people can refactor it and do all kinds of things with it.

Speaker 1

我们在Erdős问题网站上已经看到过这种情况，你知道，AI会生成一个证明，然后这里有3000行代码来验证这个证明。

We have seen this with the Erdős problems website, you know: an AI will generate a proof, and here's 3,000 lines of code that verify the proof.

Speaker 1

但后来，我们让其他AI来总结证明，它们会写出自己的证明。

But then people got other AIs to summarize the proof, and they will write their own proofs.

Speaker 1

实际上，这属于后处理阶段。

There's actually, post processing.

Speaker 1

一旦你有了一个证明，我们现在就有许多工具可以分解和解读它。

Once you actually have one proof, we have a lot of tools now to deconstruct it and interpret it.

Speaker 1

这还是一个非常新兴的科学或数学领域，但我不太担心，你知道，有些人担心，如果黎曼猜想被一个完全无法理解的证明所证实怎么办？

It's a very nascent area of science or mathematics, but I'm not as worried. You know, some people are concerned: what if the Riemann hypothesis is proven with a completely incomprehensible proof?

Speaker 1

我认为，一旦你拥有了证明的成果，我们就能对它进行大量分析。

I think once you have the artifact of a proof, we can do a lot of analysis on it.

Speaker 0

你最近发帖说，相比于Lean擅长的数学证明，为数学策略建立一种形式化或半形式化语言会很有帮助。

You posted recently that it would be helpful to have a formal or semi formal language for mathematical strategies as opposed to just mathematical proofs, which is what Lean specializes in.

Speaker 0

我很想了解更多，这会涉及什么，或者会是什么样子。

I would love to learn more about what that would involve or look like.

Speaker 1

我们其实还不清楚。

We don't really know.

Speaker 1

我的意思是，数学领域一直很幸运，因为我们已经厘清了逻辑和数学的规律，但这其实是一项相当近期的成就。

I mean, we've been very lucky in mathematics that we have worked out the laws of logic and mathematics, but this is actually a fairly recent accomplishment.

Speaker 1

我的意思是，欧几里得早在数千年前就开始了这项工作，但直到二十世纪初，我们才最终列出了数学的公理体系，也就是我们所说的ZFC公理和一阶逻辑公理，而这就是证明的本质，我们现在已经能够将其自动化并建立形式化语言。

I mean, it was started by Euclid, you know, millennia ago, but only in, like, the early twentieth century did we finally list out the axioms of mathematics, the standard axioms of what we call ZFC, and the axioms of first-order logic, and say this is what a proof is, and this we've managed to automate and have a formal language for.

Speaker 1

但也许存在某种方式来评估某些猜想的合理性？

But there could be some way to assess plausibility of certain you know?

Speaker 1

所以你有一个猜想，认为某件事是正确的。

So you have a conjecture that something is true.

Speaker 1

你测试了几个例子，结果都成立。

You test a few examples and it works out.

Speaker 1

那么，这如何提升你对这个猜想为真的信心呢？

Like, how does this increase your confidence that the conjecture is true?

Speaker 1

我们有一些数学方法可以建模这种情况，比如贝叶斯概率。

We have a few mathematical ways to model this, like Bayesian probability, for example.

Speaker 1

但这些方法通常需要设定一些基础假设，而且这些任务中仍然存在大量主观性。

But you often have to set certain base assumptions, and there's a lot of subjectivity still in these tasks.
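To make the Bayesian picture concrete, here is a minimal Python sketch of updating confidence in a conjecture after a batch of test cases all pass. Everything numeric in it is an assumption chosen for illustration: the prior, and especially `pass_rate_if_false` (how often a false conjecture would still survive a single test), are exactly the kind of base assumptions that have to be set by hand.

```python
def posterior_after_passed_checks(
    prior: float, pass_rate_if_false: float, n_checks: int
) -> float:
    """Posterior probability that a conjecture is true after n_checks
    independent test cases all pass, by Bayes' rule.

    Assumptions (illustrative, not from the conversation):
      - if the conjecture is true, every test case passes;
      - if it is false, each test case still passes with probability
        pass_rate_if_false, independently of the others.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if not 0.0 <= pass_rate_if_false <= 1.0:
        raise ValueError("pass_rate_if_false must be between 0 and 1")
    likelihood_if_true = 1.0
    likelihood_if_false = pass_rate_if_false ** n_checks
    evidence = prior * likelihood_if_true + (1.0 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence


if __name__ == "__main__":
    # Same evidence (10 passing examples), different hand-picked assumptions.
    for prior, rate in [(0.5, 0.5), (0.5, 0.99), (0.01, 0.9)]:
        post = posterior_after_passed_checks(prior, rate, n_checks=10)
        print(f"prior={prior}, pass_rate_if_false={rate} -> posterior={post:.4f}")
```

The same ten passing examples can move the posterior a lot or barely at all depending on the two inputs, which is one way to see where the subjectivity enters.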

Speaker 1

这并不清楚，这更像是一个愿望，而不是开发这些语言的计划。

I mean, this is more of a wish than a plan to develop these languages.

Speaker 1

但看到像Lean这样的正式框架如何让演绎证明变得更容易自动化和训练AI，确实令人印象深刻。

But just seeing how successful having a formal framework in place like Lean has been, it has made deductive proofs so much easier to automate and to train AI on.

Speaker 1

如果能有类似的框架就好了。

If there was some similar framework yeah.

Speaker 1

因此，利用AI来创造策略和提出猜想的瓶颈在于，我们必须依赖人类专家和时间的检验来验证某事是否合理。

So the bottleneck for using AI to create strategies and make conjectures is that we have to rely on human experts, and the test of time, to validate whether something's plausible or not.

Speaker 1

如果能有一个半正式的框架，以一种不容易被滥用的方式半自动地完成这件事，那就好了。

If there was some semi-formal framework where this could be done semi-automatically, in a way that isn't easily hackable, you know.

Speaker 1

当然，对于这些辅助形式化证明的系统来说，绝对不能有任何后门或漏洞，因为强化学习实在太擅长找到这些漏洞，让人在没有真正证明的情况下就能获得认证的证明。

Of course, it's really important with these formal proof assistants that there are no backdoors or exploits that you can use to somehow get your certified proof without actually proving it, because reinforcement learning is just so good at finding these backdoors.

Speaker 1

但，如果有一个框架能模仿科学家之间以半正式方式交流的方式，比如使用数据和论证，同时也能构建叙事——而科学中某些主观层面我们尚不清楚如何捕捉，也就无法以有用的方式让AI介入其中。

But, yeah, if there's some framework that sort of mimics how scientists talk to each other in a semi-formal way, you know, using data and argument, but also constructing narratives. There's some subjective aspect of science that we don't know how to capture in a way that lets us insert AI into it in any useful way.

Speaker 0

有意思。

Interesting.

Speaker 1

所以，这是一个未来的问题。

So, yeah, this is a future problem.

Speaker 1

我的意思是，目前有一些研究正在尝试自动生成猜想，也许还有办法对这些进行评估或模拟，但这完全是全新的科学领域。

I mean, there are research efforts to try to create automated conjectures, and maybe there are ways to benchmark these and get some way to simulate this, but it's all very, very new science.

Speaker 0

你能帮我建立一些直观理解吗？我有两个子问题。

Can you help me get some intuition? I have two sub-questions.

Speaker 0

第一，如果能有一个具体的例子，说明科学家之间那种我们目前还无法形式化的交流方式，会非常有帮助。

One, it would be helpful to have a specific example of what something like this would look like: the way scientists communicate that we can't formalize yet.

Speaker 0

第二，说要构建某种叙事或自然语言解释，同时又能将其形式化，这看起来几乎是定义上的悖论。

And two, it seems almost definitionally paradoxical to say building up some narrative or building up some natural language explanation, and then also having something which you could have formalized.

Speaker 0

我相信这其中一定有某种直觉上的关联，我很想更好地理解这一点。

And I'm sure there's some intuition behind where that overlap is, and I'd love to understand that better.

Speaker 1

好的。

Alright.

Speaker 1

举个猜想的例子。

So, an example of a conjecture.

Speaker 1

高斯对质数感兴趣，并且他计算并创建了最早的数学数据集之一。

So, Gauss was interested in the prime numbers, and he created one of the first mathematical datasets.

Speaker 1

他只是计算了前十万左右的质数，希望找到其中的规律。

He just computed the first 100,000 prime numbers or so, hoping to find patterns.

Speaker 1

他确实发现了一个模式，但也许不是他预期的那种模式。

And he did find a pattern, but maybe not the pattern he was expecting.

Speaker 1

他发现了质数中的一个统计规律：当你统计不超过100、1000、100万等的质数个数时，它们会变得越来越稀疏，而密度的下降幅度与该数值范围的自然对数成反比。

He found a statistical pattern in the primes: if you count how many primes there are up to 100, 1,000, 1,000,000, and so forth, they get sparser and sparser, but the drop-off in the density was inversely proportional to the natural logarithm of the range of numbers.

Speaker 1

因此，他提出了我们现在称为质数定理的猜想。

So he conjectured what we now call the prime number theorem.

Speaker 1

不超过x的质数个数大约等于x除以x的自然对数。

The number of primes up to x is like x divided by the natural log of x.
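
The statement above can be checked numerically on a small scale. The following is a minimal Python sketch, not anything from the conversation: the function names `prime_sieve`, `prime_count`, and `pnt_estimate` are made up for illustration, and the sieve of Eratosthenes is just one convenient way to count primes. It compares the true count of primes up to x with x divided by the natural log of x.

```python
import math


def prime_sieve(limit):
    """Sieve of Eratosthenes: is_prime[n] is True exactly when n is prime.

    Assumes limit >= 2.
    """
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]  # 0 and 1 are not prime
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting at n * n.
            for m in range(n * n, limit + 1, n):
                is_prime[m] = False
    return is_prime


def prime_count(x, is_prime):
    """Number of primes p with p <= x, given a sieve covering at least x."""
    return sum(is_prime[: x + 1])


def pnt_estimate(x):
    """The estimate x / ln(x) from the prime number theorem."""
    return x / math.log(x)


if __name__ == "__main__":
    sieve = prime_sieve(10**6)
    for x in (100, 1000, 10**4, 10**5, 10**6):
        actual = prime_count(x, sieve)
        estimate = pnt_estimate(x)
        print(f"x={x:>8}  primes={actual:>6}  x/ln(x)={estimate:10.1f}  "
              f"ratio={actual / estimate:.4f}")
```

There are 25 primes up to 100, 168 up to 1,000, and 78,498 up to 1,000,000. The estimate never gives these counts exactly; the theorem only says that the ratio of the two columns tends to 1 as x grows, and the convergence is slow.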

Speaker 1

但他无法证明这一点。

And he had no way to prove this.

Speaker 1

这是基于数据得出的结论。

It was data driven.

Speaker 1

所以这只是一个猜想。

So this was a conjecture.

Speaker 1

在当时这具有革命性，因为它可能是数学史上第一个具有统计性质的重要猜想。

It was revolutionary for its time because it was maybe the first really important conjecture in math that was statistical in nature.

Speaker 1

你知道吗？

You know?

Speaker 1

通常我们会谈论某种模式，比如质数之间的间隔具有某种规律性，但这个猜想完全不同，它并没有告诉你在任何给定范围内到底有多少个质数。

So normally, you talk about a pattern like maybe the spacing between the primes has a certain regularity or something, but this was really something different: it didn't tell you exactly how many primes there were in any given range.

Speaker 1

它只是提供了一个近似值，而且随着数值越来越大，这个近似就越精确。

It just gave you an approximation that got better and better as you went further and further out.

Speaker 1

但它极大地推动了我们现在称为解析数论这一领域的诞生。

But it started the field of what we call analytic number theory.

Speaker 1

但它是众多类似猜想中的第一个，其中许多后来被证明了，这些猜想逐渐巩固了一个观点：质数其实并没有明显的规律，它们的表现就像具有某种密度的随机数集。

But it was the first of many conjectures like this, many of which got proved, which sort of started consolidating the idea that the prime numbers actually didn't really have a pattern, that they behaved like random sets of numbers with a certain density.

Speaker 1

我的意思是，它们确实有一些模式，比如它们几乎都是奇数。

I mean, they had some patterns, like, they're almost all odd.

Speaker 1

好的。

Okay.

Speaker 1

所以确实存在一些规律，但它们并不是真正的随机。

So there are some patterns, and they're not actually random.

Speaker 1

它们被称为伪随机。

They're what's called pseudorandom.

Speaker 1

我的意思是，生成质数的过程中并没有涉及任何随机数生成。

I mean, there's no random number generation involved in creating the prime numbers.

Speaker 1

但随着时间推移，人们越来越倾向于把质数看作仿佛有某个神明不断掷骰子，从而生成这样一个随机集合。

But over time, it became more and more productive to think of the primes as if they were just generated by some god rolling dice all the time and just creating this random set.

Speaker 1

这使我们能够做出许多其他预测。

And this allowed us to make all these other predictions.

Speaker 1

因此，数论中至今仍未解决的一个猜想是孪生质数猜想，即存在无穷多对相差为2的质数。

So there is a still-open conjecture in number theory called the twin prime conjecture, that there should be infinitely many pairs of primes that are twins.

Speaker 1

它们相差两个数，比如11和13。

That is, two apart, like eleven and thirteen.

Speaker 1

我们无法证明这一点，而且实际上有充分的理由说明我们为什么无法证明它。

We can't prove that, and there are actually good reasons why we can't prove it.

Speaker 1

但由于对素数的这种统计随机模型，我们绝对相信它是正确的。

But because of this statistical random model of the primes, we are absolutely convinced it's true.

Speaker 1

我们知道，如果素数像是通过抛硬币之类的方式生成的，就像无穷多只猴子在打字机上随机敲击一样，我们会一再看到孪生素数出现。

We know that if the primes were sort of generated by flipping coins or something, then just by random chance, just like infinite monkeys at a typewriter, we would see twin primes appear over and over again.
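
The coin-flipping picture can be turned into a toy experiment. The sketch below is an illustration under stated assumptions, not the speaker's model in detail: it includes each integer n in a random set independently with probability 1/ln(n) (a Cramér-style random model), counts pairs (n, n + 2) that both land in the set, and compares that with the count of actual twin primes. The names `twin_prime_pairs` and `random_model_pairs`, the seed, and the cutoff are all hypothetical choices.

```python
import math
import random


def twin_prime_pairs(limit):
    """Count pairs (p, p + 2) with both prime and p + 2 <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            for m in range(n * n, limit + 1, n):
                is_prime[m] = False
    return sum(1 for p in range(2, limit - 1)
               if is_prime[p] and is_prime[p + 2])


def random_model_pairs(limit, seed=0):
    """Toy random model: keep each n >= 3 with probability 1 / ln(n).

    Returns how many pairs (n, n + 2) are both kept. This naive version
    ignores local facts such as "primes above 2 are odd", so it is only a
    rough stand-in for the refined heuristics number theorists use.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    kept = [False] * (limit + 1)
    for n in range(3, limit + 1):
        kept[n] = rng.random() < 1 / math.log(n)
    return sum(1 for n in range(3, limit - 1) if kept[n] and kept[n + 2])


if __name__ == "__main__":
    limit = 100_000
    print("actual twin prime pairs:", twin_prime_pairs(limit))
    for seed in range(3):
        print(f"random model (seed {seed}):", random_model_pairs(limit, seed))
```

In the random set, "twin" pairs keep appearing as the cutoff grows, which is the intuition behind the conviction that the same holds for the real primes. The simulation is evidence for a heuristic, not a proof, and the naive model is not expected to match the true count exactly.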

Speaker 1

随着时间推移，我们基于统计和概率发展出了一个非常精确的概念模型，用以描述素数应有的行为，但这些大多只是启发式的、非严格的，却极其准确。

And we have, over time, developed this very accurate conceptual model of what the primes should behave like based on statistics and probability, but it's all mostly heuristic and nonrigorous, but extremely accurate.

Speaker 1

因此，每当我们在素数方面真正能证明某些结论时，它们都与我们所谓的素数随机模型的预测完全吻合。

So the few times when we actually can prove things about the primes, it has matched up with the predictions of what we call the random model of the primes.

Speaker 1

因此，我们拥有一个被所有人信奉的、关于素数的猜想性概念框架。

So we have this conjectural conceptual framework for understanding the primes that everyone believes in.

Speaker 1

你知道，这和我们相信黎曼假设为真、相信基于素数的密码学在数学上是安全的，是同样的原因。

And, you know, it's the same reason why we believe the Riemann hypothesis is true, why we believe that cryptography based on the primes is mathematically secure, things like that.

Speaker 1

这一切都属于这种信念的一部分。

It's all part of this belief.

Speaker 1

事实上，我们关注黎曼假设的一个原因在于，如果黎曼假设不成立，如果我们得知它是错的，

In fact, one reason why we care about the Riemann hypothesis is that if the Riemann hypothesis failed, if we knew it was false,

Speaker 1

那将严重打击我们对素数的这种模型，意味着素数背后存在我们尚未察觉的隐藏规律。

it would be a serious blow to this model; it would mean there's a secret pattern to the primes that we were not aware of.

Speaker 1

我想我们会迅速放弃任何基于素数的加密系统，因为如果存在一个我们不知道的规律，很可能还会有更多。

And I think we would very rapidly abandon any cryptography based on the primes, because if there was one pattern that we didn't know about, there's probably more.

Speaker 1

这些模式可能导致加密系统的漏洞，而且，这将是一个巨大的冲击。

And these patterns can lead to exploits in crypto, and, yeah, it would be a big, big shock.

Speaker 1

所以我们真的希望确保这种情况不会发生。

So we really want to make sure that that doesn't happen.

Speaker 1

因此，是的，我们长期以来一直相信像黎曼假设这样的结论，其中一部分是基于实验证据，另一部分则是我们偶尔取得的理论成果。

So, yeah, we've been convinced of things like the Riemann hypothesis over time. Some of it is experimental evidence, and some is the few times we've been able to get theoretical results.

Speaker 1

它们始终是一致的。

They've always aligned.

Speaker 1

当然，共识也可能是错的，也许我们都忽略了一些非常基本的东西。

It is possible that the consensus is wrong, and we've all just missed something very basic.

Speaker 1

在科学史上，过去曾发生过范式转变。

There have been paradigm shifts in the past in scientific history.

Speaker 1

但我们真的没有方法来衡量这一点，我认为部分原因是我们对数学或科学如何发展缺乏足够的数据。

But we don't really have a way of measuring this, I think partly because we don't have enough data on how math or science develops.

Speaker 1

我们只有一条历史时间线，你知道，我们只有大约一百个历史转折点的故事。

We have one timeline of history, and, you know, we have, like, a hundred stories of turning points in history.

Speaker 1

如果我们能接触到一百万个外星文明，了解它们各自不同的历史和科学发展顺序，那么也许我们真正有机会理解如何衡量什么是进步，以及什么是好的策略。

If we had access to a million alien civilizations, each with a different development of history and of science in different orders, then maybe we'd actually have a decent shot at understanding how to measure what is progress and what is a good strategy.

Speaker 1

我们或许可以开始将其形式化，并真正建立一个框架。

And we could maybe start formalizing it and actually having a framework.

Speaker 1

也许我们需要做的，其实是创建大量微型宇宙或模拟，让人工智能解决一些非常基础的问题，比如算术之类的，让它们自己发展出解决这些问题的策略，并把这些小实验室用于测试。

Maybe what we need to do is actually start creating lots of mini universes or simulations of AIs solving very basic problems, you know, in arithmetic or whatever, but coming up with their own strategies for doing these things, and having these little laboratories to test.

Speaker 1

我的意思是，已经有人在研究，比如，能完成十位数乘法的最小神经网络究竟是什么样的，诸如此类的问题。

I mean, there are people who investigate things like: what's the smallest neural network that can do 10-digit multiplication?

Speaker 1

我认为，我们仅仅通过在简单问题上让小型人工智能不断进化，就能学到很多东西。

I think we could actually learn a lot just from evolving small AIs on simple problems.

Speaker 1

我们可以学到很多。

We could learn a lot.

Speaker 0

当Mercury联系我赞助这个播客时，我非常兴奋，因为我已经用他们家的银行服务好几年了。

I was super excited when Mercury reached out about sponsoring the podcast because I've been banking with them for years.

Speaker 0

我想我是在2023年开了第一个账户。

I think I opened my first account with them in 2023.

Speaker 0

在过去的几年里，我逐渐意识到Mercury一直在不断更新功能并添加新特性。

Something I've come to appreciate over the last few years is that Mercury is constantly updating things and adding new features.

Speaker 0

比如他们最新的功能——Insights。

Take their newest feature, Insights.

Speaker 0

Insights会汇总你的收支情况，显示你最大的交易，并提醒你需要特别关注的项目。

Insights summarizes your money in and out, showing you your biggest transactions and calling out anything that deserves extra attention.

Speaker 0

比如，某个合作伙伴的收入下降了，或者你有一笔未分类的大额支出需要调查。

Like, maybe your revenue from a particular partner has gone down, or you've got a big uncategorized purchase that needs to be investigated.

Speaker 0

这让我能以极低的摩擦成本随时掌握业务动态，并快速做出决策。

It's a super low friction way for me to keep tabs on my business and make quick decisions.

Speaker 0

例如，我会尝试投资那些不需要用于日常运营的现金。

For example, I try to invest any cash that I don't need on hand to keep running the business.

Speaker 0

通过洞察功能，我只需点击几下，就能清楚地看到2025年每个月的支出情况。

With Insights, with just a couple of clicks, I was able to see exactly how much money I spent in each month of 2025.

Speaker 0

这让我能准确知道未来一两年运营所需的资金，然后把剩下的钱拿去投资。

And that lets me know exactly how much cash I'll need for the next year or so of operations, and then I can go invest the rest.

Speaker 0

Mercury 不断推出这样的新功能。

Mercury just keeps adding new features like this.

Speaker 0

前往 mercury.com 了解详情。

Go to mercury.com to check it out.

Speaker 0

Mercury 是一家金融科技公司，而非FDIC承保的银行。

Mercury is a fintech company, not an FDIC insured bank.

Speaker 0

银行服务由Choice Financial Group和Column NA提供，均为FDIC成员。

Banking services provided through Choice Financial Group and Column NA, Members FDIC.

Speaker 0

你必须不仅快速，而且深入地学习新领域，才能推动前沿发展。

You have to learn about new fields, not only very rapidly, but deeply enough to contribute to the frontier.

Speaker 0

所以某种程度上，你也是世界上最好的自学高手之一。

So in some sense, you're also one of the world's greatest autodidacts.

Speaker 0

你是如何学习数学中的一个新子领域的？你的过程是怎样的？

What is your process of learning about a new subfield in math?

Speaker 0

那具体是什么样的？

What does that look like?

Speaker 1

是的。

Yeah.

Speaker 1

所以，我确实认同我们之前讨论过的深度与广度这一观点。

So, I certainly identify with this. We talked about depth and breadth before.

Speaker 1

这并不仅仅是人类与人工智能之间的区别。

And it's not purely a human-AI distinction.

Speaker 1

我的意思是，人类也会这样划分，我想是伯林把他们分成了刺猬和狐狸。

I mean, humans also split this way. I think it was Berlin who split them into hedgehogs and foxes.

Speaker 1

他说，刺猬精通一件事，而狐狸则样样通一点。

And he said the hedgehog knows one thing very, very well, and a fox knows a a little bit about everything.

Speaker 1

所以我肯定，你知道，我认为自己是一只狐狸。

So I definitely, you know, I think of myself as a fox.

Speaker 1

你知道，我经常和刺猬合作，有时如果需要，我也可以当一只刺猬。

You know, I mean, I work with hedgehogs a lot, and sometimes I can be a hedgehog if need be.

Speaker 1

但确实如此。

But yeah.

Speaker 1

我一直有点偏执的倾向。

So I've always had a little bit of an obsessive streak.

Speaker 1

如果我读到某样东西，觉得我应该理解，我有能力理解它，但我搞不懂它为什么有效。

If there's something I read about that I feel I should understand, that I have the capability to understand, but I don't understand why it works...

Speaker 1

这里面有种神奇之处——有人用了一种我不熟悉的数学方法，解决了我本想自己证明的问题，而我却做不到，但他们用他们的方法做到了。

There's a magic in it, you know. Someone was able to use a type of mathematics I'm not familiar with and get a result that I would like to prove, and I can't do it by myself, but they could do it by their method.

Speaker 1

于是我想弄清楚他们的诀窍是什么。

Then I wanted to find out what was their trick.

Speaker 1

别人能做我本以为自己也能做到的事，却偏偏我做不了，这让我很困扰。

It bugs me that someone else can do something which I think I can do, but I can't.

Speaker 1

所以我一直有这种执着于完成任务的倾向。

So I've always had that kind of obsessive, completionist streak.

Speaker 1

我不得不戒掉电脑游戏，因为我一旦开始玩，

I've had to wean myself off computer games because if I start a game,

Speaker 1

就一定要玩到完全通关。

I wanna play it to completion.

Speaker 1

所有的关卡都要打完。

So all the levels.

Speaker 1

这正是我学习新领域的一种方式。

And so that's one way in which I learn new fields.

Speaker 1

我经常与许多人合作，他们教会了我其他类型的数学。

I collaborate with a lot of people who have taught me other types of mathematics.

Speaker 1

我只是结识另一位研究不同数学领域的数学家，觉得他们的课题很有趣，但他们得教我一些基本技巧、已知和未知的内容，我从中学到了很多。

I just make friends with another mathematician who is working on another area of mathematics, and I find their problems interesting, but they have to teach me some of the basic tricks and what's known and what's not known, and I learned a lot from that.

Speaker 1

我发现写下我所学的东西很有帮助，我有一个博客，有时会记录我学到的内容。

I've found that writing about what I've learned helps. I have a blog where I sometimes record things that I've learned.

Speaker 1

因为过去当我年轻的时候，我会学一些东西，然后用一个很酷的技巧展示出来，并告诉自己：我要记住这个。

Because in the past, when I was younger, I would learn something, some cool trick, and say, "I'm gonna remember this."

Speaker 1

但六个月后，我就忘了。

And then six months later, I'd forgotten.

Speaker 1

我记得自己曾经记得它，但我无法重新推导出我的论证过程。

I remember remembering it, but I can't reconstruct my arguments.

Speaker 1

前几次明明理解了某个东西，却再次失去它，这让我非常沮丧。

And the first few times, it was so frustrating to have understood something and then lost it.

Speaker 1

于是我下定决心，凡是学到的有趣东西，一定要记下来。

I sort of resolved that I should always write down anything cool that I've learned.

Speaker 1

这也就是为什么我会开这个博客的原因之一。

And this is part of how this blog came about.

Speaker 0

写一篇博客文章需要多长时间？

How long does it take you to write a blog post?

Speaker 1

我常常在不想做其他工作的时候写博客。

It's something I often do when I don't want to do other work.

Speaker 1

你知道，比如某份审稿意见之类的。

You know, like, there's some referee report or something.

Speaker 1

有些事情做起来当时让我觉得有点不舒服。

There's something that feels slightly unpleasant for me to do at the time.

Speaker 1

所以写博客让我觉得很有创意，也很有趣。

And so writing a blog feels creative and fun.

Speaker 1

这是我为自己做的事情。

Like, it is something that I do for myself.

Speaker 1

所以，根据主题不同，可能只需要半小时，也可能要好几个小时。

So depending on the topic, it could be a quick, you know, half an hour, or several hours.

Speaker 1

但因为这是我自愿做的事，所以写这些内容时时间过得飞快，而那些出于行政原因不得不做的工作，纯粹就是苦差事。

But because it's something that I do voluntarily, time flies when I write these things, as opposed to doing something which I have to do for administrative reasons, where it's just drudgery.

Speaker 1

好的。

Okay.

Speaker 1

实际上，现在AI正在帮我们处理这些任务。

Those are tasks that AI is really helping with nowadays, actually.

Speaker 1

是

Speaker 0

如果文明能从第一性原理出发，决定如何使用陶哲轩的时间，那会怎样呢？

it, if, like, civilization could from first principles decide how to use Terry Tao's time?

Speaker 0

你知道，这是一种有限的资源。

You know, it's like a limited resource.

Speaker 0

如果无知之幕来决定如何使用陶哲轩的时间，与现在的情况相比，最大的区别是什么？

What is the biggest difference between how the veil of ignorance would decide to use Terry Tao's time and how it's used now?

Speaker 0

好吧。

Okay.

Speaker 0

那么这个播客就不会存在了。

So this podcast wouldn't be happening.

Speaker 1

是的。

Yeah.

Speaker 1

尽管我经常抱怨那些我不愿做但又不得不做的任务。

So, as much as I complain about certain tasks that I don't want to do but have to do...

Speaker 1

随着你在学术界地位越来越高，你的责任也会越来越多。

So as you get more senior in academia, you get more responsibilities.

Speaker 1

比如要参加更多的委员会和其他各种事务。

Like, it's more committees and whatever.

Speaker 1

但我发现，很多我因为某种原因被迫参加的活动，其实并不情愿。

But I have also found that a lot of events that I kind of reluctantly went to because I was obliged to for one reason or another.

Speaker 1

由于这些活动超出了我的舒适区，我常常能遇到平时不会交流的人，比如你。

And because this is outside my comfort zone, I often have interactions with people who I wouldn't normally talk to, like you, for instance.

Speaker 1

我从中学习到了有趣的东西，获得了独特的体验，也有了机会去结识那些我原本根本不可能接触到的人。

And I would learn interesting things and have interesting experiences, and I would have opportunities to then network with other people that I would never have had before.

Speaker 1

所以我非常相信机缘巧合。

So I do believe a lot in serendipity.

Speaker 1

我的意思是，我确实在某些时候会精心安排自己的时间，有些时段我会非常仔细地规划。

I mean, I do optimize my time. There are some portions of my day that I do schedule very carefully.

Speaker 1

但我愿意留出一些时间，就让它这样放着，无所谓。

But I have been willing to sort of leave some portions just, okay.