本集简介
双语字幕
仅展示文本字幕,不包含中文音频;想边听边看,请使用 Bayt 播客 App。
你曾在某处写道,创造强大的人工智能可能是人类需要做出的最后一项发明。那么我们还有多少时间?
You wrote somewhere that creating powerful AI might be the last invention humanity ever needs to make. How much time do we have then?
我认为现在达到某种超级智能的中位数概率大约是2028年。问题是...
I think the fiftieth-percentile chance of hitting some kind of superintelligence is now, like, 2028. What is it that
你在OpenAI看到了什么?你在那里的什么经历让你觉得,好吧,我们必须自己单干。
you saw at OpenAI? What did you experience there that made you feel like, okay. We gotta go do our own thing.
我们觉得安全在那里不是首要任务。安全论证现在已经具体得多。超级智能很大程度上是关于,我们如何把上帝关在盒子里不让祂出来?
We felt like safety wasn't the top priority there. The case for safety has gotten a lot more concrete. So superintelligence is a lot about, like, how do we keep God in a box and not let the God out?
我们正确对齐人工智能的几率有多大?
What are the odds that we align AI correctly?
一旦达到超级智能,再想对齐模型就太迟了。我对是否会出现生存风险或极端糟糕结果的最精细预测,大概在0%到10%之间。
Once we get to superintelligence, it will be too late to align the models. My most granular forecast for, like, could we have an x-risk or extremely bad outcome is somewhere between zero and ten percent.
现在新闻里有个话题是,扎克伯格正在挖角所有顶尖的
Something that's in the news right now is this whole Zuck coming after all the top
AI研究人员?我们受影响小得多,因为这里的人收到这些邀约后会说:我当然不会离开,因为在Meta最好的情况是我们赚钱,而在Anthropic最好的情况是我们能影响人类的未来。
AI researchers? We've been much less affected because people here, they get these offers, and then they say, well, of course, I'm not gonna leave because my best case scenario at Meta is that we make money, and my best case scenario at Anthropic is we, like, affect the future of humanity.
达里奥,你们的CEO最近谈到失业率可能会上升到20%左右。
Dario, your CEO, recently talked about how unemployment might go up to something like 20%.
如果你想想20年后的未来,那时我们早已越过技术奇点,我很难想象资本主义还会是今天这个样子。
If you just think about, like, twenty years in the future where we're, like, way past the singularity, it's hard for me to imagine that even capitalism will look at all like it looks today.
对于那些想要未雨绸缪的人,你有什么建议吗?
Do you have any advice for folks that want to try to get ahead of this?
我也无法避免被取代的命运。总有一天,我们都会面临这个问题。
I'm not immune to job replacement either. At some point, it's coming for all of us.
今天我的嘉宾是本杰明·曼。天啊,这真是一场精彩的对话。本是Anthropic的联合创始人,担任产品工程的技术负责人。
Today, my guest is Benjamin Mann. Holy moly. What a conversation. Ben is the cofounder of Anthropic. He serves as tech lead for product engineering.
他将大部分时间和精力投入到确保人工智能有益、无害且诚实的工作中。在加入Anthropic之前,他是OpenAI GPT-3的架构师之一。我们的对话涵盖了许多话题,包括他对顶级AI研究人员争夺战的看法、为何离开OpenAI创办Anthropic、预计何时会出现通用人工智能(AGI)、他判断实现AGI的经济标准、为何扩展定律不仅没有放缓反而在加速、当前最大瓶颈是什么、为何如此关注AI安全性、以及他和Anthropic如何将安全性和对齐性落实到模型构建和工作方式中。此外,AI带来的生存风险如何影响他的世界观和个人生活,以及他鼓励孩子们学习什么技能以在AI时代获得成功。特别感谢史蒂夫·尼什、丹妮尔·吉格利尔、拉夫·李和我的新闻通讯社区为本次对话提供的建议话题。
He focuses most of his time and energy on aligning AI to be helpful, harmless, and honest. Prior to Anthropic, he was one of the architects of GPT-3 at OpenAI. In our conversation, we cover a lot of ground, including his thoughts on the recruiting battle for top AI researchers, why he left OpenAI to start Anthropic, how soon he expects we'll see AGI, his economic test for knowing when we've hit AGI, why scaling laws have not slowed down and are in fact accelerating, what the current biggest bottlenecks are, why he's so deeply concerned with AI safety, and how he and Anthropic operationalize safety and alignment into the models they build and into their ways of working. Also, how the existential risk from AI has impacted his own perspectives on the world and his own life, and what he's encouraging his kids to learn to succeed in an AI future. A huge thank you to Steve Niche, Danielle Gigliere, Raf Lee, and my newsletter community for suggesting topics for this conversation.
如果你喜欢这个播客,别忘了在你喜欢的播客应用或YouTube上订阅关注。此外,如果你成为我新闻通讯的年度订阅用户,可以免费获得一系列优质产品的一年使用权,包括Bolt、Linear、Superhuman、Notion、Granola等。详情请访问lennysnewsletter.com并点击bundle。现在,请欢迎本杰明·曼。本期节目由Sauce赞助播出。
If you enjoy this podcast, don't forget to subscribe and follow it in your favorite podcasting app or YouTube. Also, if you become an annual subscriber of my newsletter, you get a year free of a bunch of amazing products, including Bolt, Linear, Superhuman, Notion, Granola, and more. Check it out at lennysnewsletter.com and click bundle. With that, I bring you Benjamin Mann. This episode is brought to you by Sauce.
团队将反馈转化为产品影响力的方式已经过时了。模糊的报告、静态的分类、无法推动业务指标的可操作性洞察。结果就是客户流失、交易失败、增长停滞。Sauce是AI产品助手,帮助首席产品官和产品团队发现商业影响并快速行动。它能分析销售通话记录、客服工单、流失原因和失败交易,实时揭示最重要的产品问题和机遇。
The way teams turn feedback into product impact is stuck in the past. Vague reports, static taxonomies, unactionable insights that don't move business metrics. The result: churn, lost deals, missed growth. Sauce is the AI product copilot that helps CPOs and product teams uncover business impact and act faster. It listens to your sales calls, support tickets, churn reasons, and lost deals, surfacing the biggest product issues and opportunities in real time.
然后将其分配给合适的团队,将信号转化为产品需求文档(PRD)、原型甚至能提升收入留存和采用率的代码。这就是Whatnot、Linktree、Incident IO和Zip选择Sauce的原因。某企业发现了一个价值1600万美元年化收入的产品缺口,另一家及时发现问题避免了数百万流失。你也可以在sauce.app/lenny体验。
It then routes them to the right teams to turn signals into PRDs, prototypes, and even code that drives revenue retention and adoption. That's why Whatnot, Linktree, Incident IO, and Zip use Sauce. One enterprise uncovered a product gap that unlocked $16,000,000 ARR. Another caught a spiking issue and prevented millions in churn. You can too at sauce.app/lenny.
Sauce专为AI产品团队打造。别被时代抛下。本期节目由存储协作平台LucidLink赞助。你打造了优秀产品,但通过视频、设计和故事展示才能让它鲜活起来。如果你的团队需要处理大型媒体文件、视频、设计素材、分层项目文件,你肯定深谙跨地域协作的混乱。
Sauce is built for AI product teams. Don't get left behind. This episode is brought to you by LucidLink, the storage collaboration platform. You've built a great product, but how you show it through video, design, and storytelling is what brings it to life. If your team works with large media files, videos, design assets, layered project files, you know how painful it can be to stay organized across locations.
文件散落各处,你总在问:这是最新版本吗?创意工作因等待文件传输而停滞。LucidLink解决了这些问题。它为团队提供云端共享空间,操作如同本地硬盘。
Files live in different places. You're constantly asking, is this the latest version? Creative work slows down while people wait for files to transfer. LucidLink fixes this. It gives your team a shared space in the cloud that works like a local drive.
文件可随时随地即时访问,无需下载同步,始终保持最新。这意味着制片人、剪辑师、设计师和营销人员可以直接在原生应用中打开超大文件,云端协作不受地域限制。Adobe、Shopify和顶级创意机构都在使用LucidLink保持内容引擎高速运转。立即在lucidlink.com/lenny免费试用。
Files are instantly accessible from anywhere. No downloading, no syncing, and always up to date. That means producers, editors, designers, and marketers can open massive files in their native apps, work directly from the cloud, and stay aligned wherever they are. Teams at Adobe, Shopify, and top creative agencies use LucidLink to keep their content engine running fast and smooth. Try it for free at lucidlink.com/lenny.
那是lucidlink.com/lenny。本,非常感谢你能来。欢迎参加播客。
That's lucidlink.com/lenny. Ben, thank you so much for being here. Welcome to the podcast.
谢谢邀请。很高兴来到这里,莱尼。
Thanks for having me. Great to be here, Lenny.
我有无数问题想问你。真的很期待这次对话。我想从一件非常及时的事情开始,就是本周发生的新闻——扎克伯格正在挖角所有顶尖AI研究员,提供1亿美元签约奖金,1亿美元薪酬包。他正在从各大顶级AI实验室挖人。
I have a billion and one questions for you. I'm really excited to be chatting. I want to start with something that's very timely, something that's happening this week. Something that's in the news right now is this whole Zuck coming after all the top AI researchers, offering them $100,000,000 signing bonuses, $100,000,000 comp packages. He's poaching from all the top AI labs.
我想你们也在应对这种情况。我很好奇,Anthropic内部看到了什么现象?你对这个策略有什么看法?你认为事情会如何发展?
I imagine that's something you're dealing with. I'm just curious, what are you seeing inside Anthropic and just what's your take on the strategy? What do you think? Where do you think things go from here?
是的,我认为这是时代特征的体现——我们正在开发的技术极具价值。公司发展速度惊人,这个领域的其他公司也都在飞速成长。在Anthropic,我们受到的影响可能比其他公司小得多,因为这里的人都有强烈的使命感。他们收到offer时会说:'我当然不会离开,因为在Meta最好的情况是赚钱,而在Anthropic最好的情况是影响人类未来,让AI蓬勃发展,促进人类繁荣。'所以对我来说这不是艰难的选择。
Yeah, I mean, I think this is a sign of the times: the technology that we're developing is extremely valuable. Our company is growing super, super fast. Many of the other companies in this space are growing really fast. And at Anthropic, I think we've been maybe much less affected than many of the other companies in the space because people here are so mission oriented. They stay because they get these offers and then they say, well, of course, I'm not going to leave, because my best case scenario at Meta is that we make money, and my best case scenario at Anthropic is we, like, affect the future of humanity and try to make AI flourishing and human flourishing go well. So to me, it's not a hard choice.
其他人可能有不同的生活处境,这个决定对他们来说困难得多。对于那些获得天价offer并接受的人,我不能说我会责怪他们,但这绝对不是我想要的选择。
Other people have different life circumstances, and that makes it a much harder decision for them. So for anybody who does get those mega offers and accepts them, I can't say I hold it against them, but it's definitely not something that I would wanna take myself if it came to me.
是的,我们会讨论你提到的很多内容。关于这些offer,你认为1亿美元签约奖金是真实数字吗?我不确定你是否亲眼见过这种情况。
Yeah, we're gonna talk about a lot of the stuff that you mentioned. In terms of the offers, do you think this $100,000,000 signing bonus is a real number? Is that a real thing? I don't know if you've actually seen that.
我相当确定这是真的。想想个人对公司发展轨迹的影响力——以我们为例,我们的产品供不应求。如果我们的推理堆栈效率提升1%或5%,就能创造惊人价值。所以四年支付1亿美元薪酬包,相比创造的价值其实很便宜。我们正处在前所未有的规模时代,而且情况只会越来越疯狂。
I'm pretty sure it's real. If you just think about, like, the amount of impact that individuals can have on a company's trajectory: in our case, we are selling like hotcakes. And if we get a 1% or 5% efficiency bonus on our inference stack, that is worth an incredible amount of money. And so to pay an individual, like, $100,000,000 over a four-year package, that's actually pretty cheap compared to the value created for the business. So I think we're just in an unprecedented era of scale, and it's only going to get crazier, actually.
如果按指数曲线推算企业支出规模,每年大约翻倍。目前全球行业总支出可能在3000亿美元左右,1亿美元只是九牛一毛。但再过几年,经过几次翻倍后,我们将谈论数万亿美元规模。到那时这些数字就真的难以想象了。
Like, if you extrapolate the exponential in how much companies are spending, it's roughly doubling every year in terms of capex, and today the entire industry is maybe in the, like, $300,000,000,000 range globally. And so numbers like $100,000,000 are a drop in the bucket. But if you go a few years out, a couple more doublings, we're talking about trillions of dollars. And at that point, it's just really hard to think about these numbers.
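The doubling arithmetic he describes can be sketched in a few lines of Python. This is purely illustrative: the ~$300B starting point and one-doubling-per-year rate are just the rough figures from the conversation, not a real forecast.

```python
# Rough sketch of the extrapolation described above: industry-wide AI capex
# of about $300B today, roughly doubling every year. Purely illustrative.
def projected_capex(initial_usd: float, years: int) -> float:
    """Spend after `years` of annual doubling."""
    return initial_usd * 2 ** years

start = 300e9  # ~$300 billion globally today, per the conversation
for year in range(4):
    print(f"year {year}: ${projected_capex(start, year) / 1e12:.1f}T")
# After two doublings the industry is already past $1 trillion a year,
# which is the "couple more doublings, trillions of dollars" point.
```

Against those totals, a $100,000,000 package really is a rounding error.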
顺着这个话题,很多人感觉AI发展正在多个方面遇到瓶颈,新模型似乎没有之前那样的突破性进步。但我知道你不同意这个观点,你不认为我们遇到了规模效应的天花板。谈谈你观察到的现象,以及人们忽略了什么。
Along these lines, something that a lot of people feel about AI progress is that we're hitting plateaus in many ways, that newer models just don't feel like the same leaps as previous ones. But I know you don't believe this. I know you don't believe that we've hit plateaus on scaling laws. Talk about just what you're seeing there, what you think people are missing.
这其实有点可笑,因为这种论调大约每六个月就会出现一次,但从未成真。我真希望人们看到这类消息时能自带'废话检测器'。实际上进步正在加速——看看模型发布的节奏就知道了,以前是一年一次,现在随着训练后优化技术的进步,我们每个月或每季度都有新版本发布。可以说进步正在多方面加速,但有种奇怪的时间压缩效应在作祟。
It's kind of funny, because this narrative comes out, like, every six months or so, and it's never been true. And so I kind of wish people would have, like, a little bit of a bullshit detector in their heads when they see this. I think progress has actually been accelerating: if you look at the cadence of model releases, it used to be, like, once a year. And now, with the improvements in our post-training techniques, we're seeing releases every month or three months. So I would say progress is actually accelerating in many ways, but there's this, like, weird time compression effect.
Dario将其比作近光速航行,对你而言过去的一天相当于地球上的五天。而且我们还在加速,所以时间膨胀效应越来越明显。我想这正是人们误以为进步放缓的部分原因。但如果你审视 scaling laws(规模法则),它们依然成立。
Dario compared it to being on a near-light-speed journey, where a day that passes for you is like five days back on Earth. And we're accelerating, so the time dilation is increasing. And I think that's part of what's causing people to say that progress is slowing down. But yeah, if you look at the scaling laws, they're continuing to hold true.
我们确实需要从常规预训练转向强化学习的规模化,以延续规模法则。这有点像半导体行业——重点不在于芯片能容纳多少晶体管,而在于数据中心能承载多少浮点运算。需要稍微调整定义来把握核心目标。但这种现象能在如此大数量级范围内持续成立,确实令人惊讶。
We did kind of need this transition from, like, normal pre-training to reinforcement learning scaling up to continue the scaling laws. But I think it's kind of like for semiconductors, where it's less about the density of transistors you can fit on a chip and more about how many flops you can fit in a data center or something. So you have to change the definition around a little bit to keep your eye on the prize. But yeah, this is one of the few phenomena in the world that has held across so many orders of magnitude. It's actually pretty surprising to me that it is continuing to hold.
即便是物理学基本定律,很多也无法跨越15个数量级依然成立。所以这相当惊人。
If you look at like fundamental laws of physics, many of them don't hold across 15 orders of magnitude. So it's pretty surprising.
这真让人难以置信。你的核心观点是:新模型发布更频繁了,我们总与上一版本比较,所以感觉进步不大。但如果回溯到每年发布一个模型的年代,每次都是巨大飞跃。人们忽略了现在只是迭代次数变多了。
It boggles the mind. So what you're saying essentially is that newer models are being released more often, so we're comparing each one to the last version and just not seeing as much of an advance. But if you go back to when a model was released once a year, each one was a huge leap. So people are missing that we're just seeing many more iterations.
我想为那些认为进展放缓的人说句公道话:某些任务所需的智能水平确实已趋饱和。比如从已有表单字段的简单文档提取信息——这太容易了,我们早就达到100%了。Our World in Data上有张图表显示,新基准测试发布后6-12个月内就会饱和。或许真正的制约在于:能否设计出更好的基准测试和更具雄心的工具应用方式,来揭示当前智能水平的瓶颈。
I guess to be a little bit more generous to the people saying things are slowing down: I think that for some tasks, we are saturating the amount of intelligence needed for that task. Like, maybe to extract information from a simple document that already has form fields on it, it's just so easy that, okay, yeah, we're already at 100%. And there's this great chart on Our World in Data that shows that when you release a new benchmark, within, like, six to twelve months it gets saturated. And so maybe the real constraint is, like, can we come up with better benchmarks and more ambitious ways of using the tools that then reveal the bumps in intelligence that we're seeing now.
这正好引出你对AGI的独特思考和定义方式。
That's a good segue: you have a very specific way of thinking about AGI and defining what AGI means.
AGI是个被过度解读的术语,我内部已很少使用。更喜欢'变革性AI'这个概念——重点不在于它能否像人类一样无所不能,而在于它是否客观引发了社会和经济的变革。
I think AGI is kind of a loaded term. And so, I tend not to use it very much anymore internally. I like the term transformative AI because it's less about like, can it do as much as people do? Can it do literally everything? And more about objectively, is it causing transformation in society and the economy?
衡量这个的具象方法是'经济图灵测试'(非我原创但很认同):如果你雇佣某个代理从事特定工作数月后,才发现它是机器而非人类,那它就通过了该职位的经济图灵测试。就像用一篮子商品衡量购买力平价或通胀那样,我们可以建立'职位市场篮子'。
And a very concrete way of measuring that is the economic Turing test. I didn't come up with this, but I really like it. It's this idea that if you contract an agent for a month or three months on a particular job, and if, when you decide to hire that agent, it turns out to be a machine rather than a person, then it's passed the economic Turing test for that role. And then you can sort of expand that out in the same way that, for measuring purchasing power parity or inflation, there's a basket of goods. You can have a market basket of jobs.
当代理能通过50%货币加权职位的经济图灵测试时,变革性AI就诞生了。具体阈值不重要,但可以说明:一旦突破这个临界点,我们将看到全球GDP激增、社会变革、就业结构重塑等深远影响。社会制度和组织具有惯性,变革缓慢,但当这些成为可能时,就意味着新时代的开启。
And if the agent can pass economic Turing tests for 50% of money-weighted jobs, then we have transformative AI. The exact thresholds don't really matter that much, but it's kind of illustrative to say that if we pass that threshold, then we would expect massive effects on world GDP, societal change, how many people are employed, and things like that, because, you know, societal institutions and organizations are sticky. Change is slow. But once these things are possible, you know that it's the start of a new era.
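The "market basket of jobs" idea can be made concrete with a small sketch. The job categories, wage weights, and pass/fail flags below are entirely hypothetical; only the money-weighting and the 50% threshold come from the conversation.

```python
# Hypothetical sketch of the money-weighted threshold for transformative AI.
# Wages and pass/fail flags are made up for illustration.
def money_weighted_pass_rate(jobs):
    """jobs: list of (total_wages_usd, agent_passes_turing_test) pairs."""
    total_wages = sum(wages for wages, _ in jobs)
    passed_wages = sum(wages for wages, passed in jobs if passed)
    return passed_wages / total_wages

basket = [
    (9e9, True),   # e.g., routine document-processing roles
    (5e9, True),   # e.g., tier-1 customer support roles
    (6e9, False),  # e.g., hands-on field service roles
]
rate = money_weighted_pass_rate(basket)
print(f"{rate:.0%} of money-weighted jobs pass")  # 70% of money-weighted jobs pass
print(rate >= 0.5)  # True: past the 50% threshold in this toy basket
```

Weighting by wages rather than headcount is what makes the test economic: a few high-payroll job categories move the needle more than many small ones.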
顺着这个思路,达里奥,你们的CEO最近谈到AI将接管大约一半的白领工作,失业率可能升至20%左右。我知道你对AI已在职场产生的影响更为直言不讳,而这些影响可能尚未被大众察觉。请谈谈你认为人们忽视了AI对就业已经和即将产生的哪些影响。
So along these lines, Dario, your CEO, recently talked about how AI is gonna take a huge part of, I don't know, half of white-collar jobs, and that unemployment might go up to something like 20%. I know you're even more vocal and opinionated about just how much impact AI is already having in the workplace that people may not even be realizing. Talk about just what you think people are missing about the impact AI is going to have on jobs and is already having.
从经济学角度看,失业可分为几种类型。一种是劳动者缺乏经济所需岗位的技能,另一种是岗位直接被淘汰。我认为实际情况会是两者的结合。但若展望二十年后——远超过技术奇点的时代——我甚至难以想象资本主义还会保持现今的形态。
Yeah, so from an economic standpoint, there's a couple of different kinds of unemployment. And one is because the workers just don't have the skills to do the kinds of jobs that the economy needs. And another kind is where those jobs are just completely eliminated. And I think it's going to be actually a combination of these things. But if you just think about like, you know, twenty years in the future where we're way past the singularity, it's hard for me to imagine that even capitalism will look at all like it looks today.
如果我们工作到位,将拥有安全可控的超级智能。正如达里奥在《Machines of Loving Grace》中所说,数据中心会成为天才的国度。加速科学、技术、教育、数学领域的积极变革将令人惊叹。但这也意味着在劳动力近乎免费、任何需求都能由专家AI代劳的丰裕世界里,工作本身会变成什么样?因此我认为从当前有工作岗位、资本主义运转的社会,过渡到二十年后全然不同的世界,会是个令人不安的过程。
If we do our jobs right, we will have safe, aligned superintelligence. We'll have, as Dario says in Machines of Loving Grace, a country of geniuses in the data center. And the ability to accelerate positive change in science, technology, education, mathematics is going to be amazing. But that also means, in a world of abundance where labor is almost free and anything you wanna do, you can just ask an expert to do it for you, then what do jobs even look like? And so I guess there's this, like, scary transition period between where we are today, where people have jobs and capitalism works, and the world of twenty years from now, where everything is completely different.
之所以称之为奇点,部分原因在于这是难以预测的临界点。变化速度之快、程度之深超乎想象。从极限视角来看,我们终将找到解决方案——在丰裕社会里,工作本身或许并不可怕。确保平稳过渡才是关键所在。
But part of the reason they call it the singularity is that it's like a point beyond which you can't easily forecast what's going to happen. It's just such a fast rate of change and so different that it's hard to even imagine. So I guess taking the like view from the limit, it's pretty easy to say like, hopefully we'll have figured it out. And in a world of abundance, maybe the jobs themselves, it's not that scary. And I think making sure that that transition time goes well is pretty important.
这里有几个话题值得深入。其一是大众听到这类预测时的反应——虽然相关头条很多,但多数人尚未切身感受变化,难免产生'我的工作还好好的'的怀疑态度。你认为当前AI对就业已产生哪些未被察觉或误解的实际影响?
There's a couple of threads I want to follow there. One is, people hear this, there's a lot of headlines around this, but most people probably don't actually feel it yet or see it happening. And so there's always this reaction of, I guess, I don't know, maybe, but it's hard to believe: my job seems fine, nothing's changed. What are you seeing happening today already that you think people don't see or misunderstand in terms of the impact AI is having on jobs?
人类天生不擅长理解指数级进步。图表上的指数曲线初期看似平缓近乎为零,直到拐点突然出现,变化急剧加速直至垂直攀升——这正是我们长期所处的轨迹。我个人在2019年GPT-2发布时就有所预感:这就是通往通用人工智能的路径。
I think part of this is that people are really bad at modeling exponential progress. If you look at an exponential on a graph, it looks flat and almost zero at the beginning. Then suddenly you, like, hit the knee of the curve, things are changing real fast, and then it goes vertical. And that's the plot that we've been on for a long time. I guess I started feeling it in maybe, like, 2019 when GPT-2 came out, and I was like, oh, this is how we're going to get to AGI.
相比多数人直到ChatGPT才惊觉变革,我的感知可能过早。社会各领域不会立即全面转型,怀疑态度实属正常——这正是线性进步观的典型表现。
But I think that was pretty early compared to a lot of people, who, when they saw ChatGPT, were like, wow, something is different and changing. And so I guess I wouldn't expect widespread transformation in a lot of parts of society yet, and I would expect this skepticism reaction. I think it's very reasonable, and it's exactly what the standard linear view of progress would suggest.
举例说明快速变革的领域:客户服务方面,像Fin和Intercom(我们的优秀合作伙伴)已实现82%的自动解决率;软件工程领域,我们的Claude代码团队95%的代码由AI生成。更准确的说法是:我们产出代码量是原来的10-20倍,小团队能创造巨大影响。同理,82%的客服自动解决率意味着人类可专注于更复杂的个案。
But I guess to cite a couple of areas where I think things are changing quite quickly: in customer service, we're seeing with things like Fin from Intercom, a great partner of ours, 82% customer service resolution rates automatically, without a human involved. And in terms of software engineering, on our Claude Code team, like 95% of the code is written by Claude. But I think a different way to phrase that is that we write 10X or 20X more code. And so a much, much smaller team can just be much, much more impactful. And similarly for customer service: yes, you can phrase it as an 82% automatic resolution rate, but that nets out in the humans doing those tasks getting to focus on the harder parts of those tasks.
五年前客服不得不放弃处理的棘手问题,如今因AI分担常规工作得以深入解决。短期来看,人类劳动效能将大幅扩展——我从未听过增长型企业招聘主管说'不想招人'。这是乐观的一面。但对于低技能或提升空间有限的工作,确实会出现大规模岗位替代。
And for the trickier situations that, in a normal world like five years ago, they would have had to just drop because it was too much effort to actually go do the investigation: there were too many other tickets for them to worry about. So I think in the immediate term, there will be a massive expansion of the pie and the amount of labor that people can do. Like, I've never met a hiring manager at a growth company and heard them say, I don't want to hire more people. So that's the hopeful version of it. But for lower-skilled jobs, or jobs with less headroom on how good they can be, I think there will be a lot of displacement.
这需要全社会未雨绸缪共同应对。
So it's just something we as a society need to get ahead of and work on.
好的。我想进一步探讨这个话题。但我也想帮助人们的是,在这个未来世界中,他们如何获得优势?你知道,他们听这些时会想,哦,这听起来不太妙。我得未雨绸缪。
Okay. I wanna talk more about that. But something that I also wanna help people with is how do they get a leg up in this future world? You know, they listen to this, they're like, oh, this doesn't sound great. I need to think ahead.
我知道你不会有所有答案,但对于那些想提前应对、未来保障自己职业和生活不被AI取代的人,你有什么建议吗?有没有见过别人采取什么措施?或者你推荐他们开始多尝试些什么?
I know you won't have all the answers, but do you have any advice for folks that want to try to get ahead of this and kind of future-proof their career and their life to not be replaced by AI? Anything you've seen people do? Anything you recommend they start trying to do more of?
即使是我,身处这场变革的中心,也无法免于被取代的风险。所以坦白说,某种程度上我们都会被波及。
Even for me, being, like, at the center of a lot of this transformation, I'm not immune to job replacement either. So just some vulnerability there: at some point, it's coming for all of us.
连你现在也逃不过了,本。
Even you now, Ben.
还有你莱尼。以及我。抱歉。
And you Lenny. And me. Sorry.
我们现在扯太远了。
We've gone too far now.
好吧。但关于过渡期,我认为确实有可采取的措施。关键在于如何雄心勃勃地使用工具并愿意学习新工具。那些把新工具当旧工具用的人往往难以成功。比如编程时,人们很熟悉自动补全功能。
Okay. But in terms of, like, the transition period, yeah, I think there are things that we can do. And I think a big part of it is just being ambitious in how you use the tools and being willing to learn new tools. People who use the new tools as if they were old tools tend to not succeed. So as an example of that, when you're coding, people are very familiar with autocomplete.
人们也习惯用简单聊天框询问代码库问题。但高效使用Claude Code的人与低效者的区别在于:他们是否追求突破性改变?如果首次尝试失败,是否会再试三次?因为完全重来的成功率远高于反复纠结同一个无效方案。虽然这是编程案例——当前受影响最显著的领域之一,但我们内部发现法务和财务团队通过直接使用Claude Code终端获得了巨大价值。我们将优化界面让他们更易上手,减少陡峭的学习曲线。
People are familiar with simple chat, where they can ask questions about the code base. But the difference between people who use Claude Code very effectively and people who use it not so effectively is, like, are they asking for the ambitious change? And if it doesn't work the first time, are they asking three more times? Because the success rate when you just completely start over and try again is much, much higher than if you try once and then just keep banging on the same thing that didn't work. And even though that's a coding example, and coding is one of the areas that's taking off most dramatically, we have seen internally that our legal team and our finance team are getting a ton of value out of using Claude Code itself. We're gonna be making better interfaces so that they'll have an easier time and require a little bit less jumping into the deep end of using Claude Code in the terminal.
他们用它修订合同文件,运行BigQuery分析客户数据和收入指标。所以核心是要敢于冒险,即使感到恐惧也要尝试。
But yeah, we're seeing them use it to redline documents and use it to run BigQuery analyses of our customers and our revenue metrics. So I guess it's about taking that risk. And even if it feels like a scary thing, trying it out.
明白了。所以建议就是善用工具——就像大家总说的,真正去使用它们。比如坚持使用Claude Code,还有你提到的要比本能更雄心勃勃,因为可能真能实现目标。这个‘尝试三次’的诀窍是说首次可能不会成功对吧?
Okay. So the advice here is use the tools. That's something that, you know, everyone's always saying: just, like, actually use these tools. So it's, like, using Claude Code, and your point about being more ambitious than you naturally feel like being, because maybe it'll actually accomplish the thing. This tip of trying it three times, so the idea there is it may not get it right the first time.
所以建议就是换不同方式提问吗?还是说单纯要更努力、再试一次?
So is the tip there, ask it in different ways or is it just like try harder, try again?
对,其实你可以原封不动问同样的问题。这些模型具有随机性,有时能理解有时不能。就像所有模型报告里展示的,首次通过率和第N次通过率的对比——他们用完全相同的提示词反复尝试,结果时好时坏。
Yeah, I mean, you can just literally ask the exact same question. These things are stochastic, and sometimes they'll figure it out and sometimes they won't. Like, every one of these model cards always shows pass@1 versus pass@n. And that's exactly the thing where they try the exact same prompt: sometimes it gets it, sometimes it doesn't.
所以这建议听起来很蠢。但如果你想更聪明些,可以说明'这些方法试过了没用,请换其他方案',这样确实可能提升效果。
So that's the dumbest advice. But yeah, I think if you want to be a little bit smarter about it, there can be gains in saying, here's what you already tried and it didn't work, so don't try that, try something different. That can also help.
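There is simple arithmetic behind the pass@1-versus-pass@n point: if a single attempt succeeds with probability p and retries are treated as independent, then k attempts succeed with probability 1 − (1 − p)^k. A sketch, with a made-up single-attempt success rate (real model cards estimate pass@k from samples, but the intuition is the same):

```python
# Why "just ask again" helps with stochastic models: independent retries
# compound. The 0.4 single-attempt success rate here is hypothetical.
def pass_at_k(p_single: float, k: int) -> float:
    """Probability that at least one of k independent attempts succeeds."""
    return 1 - (1 - p_single) ** k

p = 0.4
print(round(pass_at_k(p, 1), 3))  # 0.4
print(round(pass_at_k(p, 3), 3))  # 0.784 (three tries nearly double the odds)
```

This is why restarting from scratch a couple of times often beats grinding on one failed attempt, assuming each fresh attempt really is an independent draw.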
这让我想到最近很多人讨论的观点:短期内你不会被AI取代,但会被精通AI工具的人取代。
So this advice comes back to something that a lot of people talk about these days: you won't be replaced by AI, at least anytime soon; you'll be replaced by someone that is very good at using AI.
我认为更准确的说法是团队产能将大幅提升。我们完全没有放缓招聘,有人对此很困惑——甚至新人培训时还有人问'如果都要被取代了,为什么还招我?'答案很简单:未来几年非常关键。
I think in that area, it's more like your team will just do dramatically more stuff. Like, we're definitely not slowing down on hiring at all, and some people are confused by that. Even in an onboarding class, somebody asked, why did you hire me if we're all just gonna be replaced? And the answer is the next couple of years are really critical to get right.
我们远未达到完全替代人类的阶段。就像我说的,相比未来的指数级发展,我们现在还在水平线的起点。所以拥有优秀人才至关重要,这正是我们大力招聘的原因。
And we're not at the point where we're doing complete replacement. Like I said, we're still at that like flat zero looking part of the exponential compared to where we will be. So it is super important to have great people, and that's why we're hiring super aggressively.
让我换个角度提问——这是我对每位AI前沿人士的必问题:作为父亲,根据你对AI发展的预判,你在重点培养孩子哪些能力来应对AI时代?
Let me take another approach to asking this question, something I ask everyone that's at the very cutting edge of where AI is going. You have kids. Knowing what you know about where AI is heading and all these things you've been talking about, what are you focusing on teaching your kids to help them thrive in this AI future?
我有两个女儿,1岁和3岁。3岁那个已经会跟Alexa对话,让它解释事物、播放音乐了,她乐在其中。更宏观地说,她上蒙特梭利学校,我非常认同其注重好奇心、创造力和自主学习的理念。
Yeah, I have two daughters, a one-year-old and a three-year-old. So it's still pretty basic. Our three-year-old is now capable of just conversing with Alexa Plus, asking her to explain stuff and play music for her and all that stuff. So she's been loving that. But I guess more broadly, she goes to a Montessori school, and I just love the focus on curiosity and creativity and, like, self-led learning that Montessori has.
若在10-20年前,我可能会为孩子规划名校路径和各类课外活动。但现在这些都不重要了,我只希望她快乐、善思、好奇、善良。蒙特梭利学校做得很好——有时会发消息说'您孩子今天和同学争论时情绪激动,但尝试用语言表达了'。
I guess if I were in a normal era, like ten, twenty years ago, and I had a kid, maybe I would be trying to line her up for going to a top-tier school and doing all the extracurriculars and all that stuff. But at this point, I don't think any of it's gonna matter. I just want her to be happy and thoughtful and curious and kind. And the Montessori school is definitely doing great at that. They text us throughout the day, and sometimes they're like, oh, your kid got in an argument with this other kid, and she has really big emotions, but she tried to use her words.
这正是我认为最重要的教育。具体知识终将褪色,这些能力才是根本。
I love that. I think that's exactly the kind of education that's most important; the facts are gonna fade into the background.
我也是蒙台梭利教育的忠实粉丝。正试图让我们的孩子进入蒙台梭利学校。他两岁了,所以我们步调一致。每次我询问那些处于人工智能前沿领域工作者应该培养孩子什么技能时,好奇心这个概念总会反复出现。因此我认为这是个非常值得关注的启示。
I'm a huge fan of Montessori also. I'm trying to get our kid into a Montessori school. He's two years old, so we're on the same track. Every single time I ask someone working at the cutting edge of AI what skills to instill in your child, curiosity comes up the most. So I think that's a really interesting takeaway.
关于保持善意这一点也很重要,特别是面对我们未来的AI主宰时。看到人们向Claude道谢的方式很有趣。而创造力这点倒是出人意料——它被提及的频率远低于好奇心。好吧,我想换个话题方向。
I think this point about being kind is also really important, especially with our AI overlords, trying to be kind to them. I love how people are saying thank you to Claude. Then creativity, that's interesting. That doesn't come up as much, just being creative. Okay, I wanna go in a different direction.
我想回溯Anthropic创立之初。众所周知,你们八人团队在2020年离开OpenAI创立Anthropic。虽然你们曾简要提及原因,但我很好奇是否愿意更详细分享——当时在OpenAI目睹了什么?经历了哪些让你们决定必须自立门户的事情?
I wanna go back to the beginning of Anthropic. So famously, eight of you left OpenAI back in 2020, I believe, to start Anthropic. You've talked a little bit about why this happened, what you guys saw. I'm curious if you're willing to share more: just what is it that you saw at OpenAI? What did you experience there that made you feel like, okay, we gotta go do our own thing?
对听众说明下,我曾参与OpenAI的GPT-3项目,最终成为论文第一作者之一。还负责为微软进行多项演示,协助融资十亿美元,完成GPT-3技术转移使其能在Azure平台运行。我在那里既参与研究也接触产品。OpenAI有个奇特现象:Sam曾提到需要制衡三大派系——安全派、研究派和创业派。这种划分方式让我深感不妥,因为公司宣称的使命本是确保AGI转型安全且造福人类。
Yeah, for the listeners, I was part of the GPT-3 project at OpenAI and ended up being one of the first authors on the paper. I also did a bunch of demos for Microsoft to help raise a billion dollars from them, and did the tech transfer of GPT-3 to their systems so that they could help serve the model in Azure. So I did a bunch of different things there, on both the research side and the product side. One weird thing about OpenAI is that while I was there, Sam talked about having three tribes that needed to be kept in check with each other: the safety tribe, the research tribe, and the startup tribe. And whenever I heard that, it just struck me as the wrong way to approach things, because the company's mission apparently is to make the transition to AGI safe and beneficial for humanity.
这本质上与Anthropic的使命相同。但在OpenAI内部,这些理念间存在巨大张力。当面临关键抉择时,我们感觉安全并非其首要考量。这种判断有其合理性——若认为安全问题容易解决,或影响有限,或重大负面结果概率极低,确实可能采取相应行动。但Anthropic的创始团队,准确说当时尚未成立,基本由OpenAI各安全团队负责人组成。
That's basically the same as Anthropic's mission. But internally, it felt like there was so much tension around these things. And I think when push came to shove, we felt like safety wasn't the top priority there. And there are good reasons you might think that: if you thought safety was going to be easy to solve, or if you thought it wasn't going to have a big impact, or if you thought that the chance of big negative outcomes was vanishingly small, then maybe you would just do those kinds of actions. But at Anthropic, we felt, I mean, we didn't exist then, but it was basically the leads of all the safety teams at OpenAI.
我们认为安全性至关重要,尤其是在边缘地带。放眼全球,真正致力于解决安全问题的研究者至今仍寥寥无几。尽管如我所言,行业正以每年3000亿美元的资本支出爆炸式增长,但全球从事该领域工作的可能不足千人,这简直难以置信。这正是我们离开的根本原因——我们渴望建立一个能立足前沿、开展基础研究,并将安全置于首位的组织。
We felt that safety is really important, especially on the margin. And so if you look at who in the world is actually working on safety problems, it's a pretty small set of people even now. I mean, the industry is blowing up, as I mentioned, like $300 billion a year in CapEx today, and then I would say maybe less than a thousand people working on it worldwide, which is just crazy. So that was fundamentally why we left. We felt like we wanted an organization where we could be on the frontier, we could be doing the fundamental research, but we could be prioritizing safety ahead of everything else.
这个选择以出人意料的方式得到了回报。当初我们甚至不确定能否在安全研究上取得进展,因为那时尝试的'辩论式安全'方法受限于模型能力,所有努力都徒劳无功。而现在不仅这项技术奏效了,其他酝酿多年的方案也纷纷取得突破。归根结底,关键在于是否将安全视为第一要务?
And I think that's really panned out for us in a surprising way. Like, we didn't know even if it would be possible to make progress on the safety research, because at the time we had tried a bunch of safety-through-debate approaches and the models weren't good enough. And so we basically had no results on all of that work. And now that exact technique is working, and many others that we have been thinking about for a long time. So yeah, fundamentally it comes down to: is safety the number one priority?
后来我们延伸出一个新命题:能否在确保安全的同时保持技术领先?以谄媚性为例,Claude堪称最不阿谀奉承的模型,因为我们投入大量精力实现真正的对齐,而非简单优化'用户参与度第一'这类表面指标,更不会盲目迎合用户的所有肯定。
And then something that we've sort of tacked on since then is: can you have safety and be at the frontier at the same time? And if you look at something like sycophancy, I think Claude is one of the least sycophantic models, because we've put so much effort into actual alignment and not just trying to Goodhart our metrics, saying user engagement is number one, and if people say yes, then it's good for them.
好的,我们来谈谈你提到的这种张力——安全性与市场竞争性进步之间的矛盾。我知道你投入大量精力研究安全性,正如你刚才暗示的,这是你思考AI的核心维度。在探讨原因之前,首先想请教:你如何看待这种既要专注安全,又避免大幅落后的两难处境?
Okay, so let's talk about this tension that you mentioned, this tension between safety and progress, being competitive in the marketplace. I know you spend a lot of your time on safety. I know that's, as you just alluded to, a core part of how you think about AI. And I want to talk about why that is, but first of all, how do you think about this tension between focusing on safety while also not falling way behind?
最初我们以为这是非此即彼的选择,但后来发现两者其实存在凸性关联——专注某方面反而能促进另一方面。当Claude 3 Opus问世使我们首次达到模型能力前沿时,其角色个性特质广受好评,这正是对齐研究的直接成果。阿曼达·阿斯克尔等研究者深入探索了'智能体如何做到有益、诚实、无害'这一命题。
Yeah. So initially we thought that it would be sort of one or the other, but I think since then we've realized that it's actually kind of convex, in the sense that working on one helps us with the other. So initially, when Claude 3 Opus came out and we were finally at the frontier of model capabilities, one of the things that people really loved about it was the character and the personality. And that was directly a result of our alignment research. Amanda Askell did a ton of work on this, as well as many others, trying to figure out what it means for an agent to be helpful, honest, and harmless.
那么,在艰难对话中有效展现意味着什么?如何在不打击对方的情况下拒绝请求,让他们理解为什么助手会说‘我无法帮你这个,或许你该咨询医疗专业人士’或‘也许你不该尝试制造生物武器之类的东西’。是的,我认为这是其中一部分。另一个成果是宪法AI,我们有一套自然语言原则清单,引导模型学习我们认为AI应有的行为方式。
And what does it mean to be in difficult conversations and show up effectively? How do you do a refusal that doesn't shut the person down, but makes them understand why the agent said, "I can't help you with that. Maybe you should talk to a medical professional," or "Maybe you should consider not trying to build bioweapons," or something like that. So yeah, I guess that's part of it. And then another piece that's come out is constitutional AI, where we have this list of natural language principles that leads the model to learn how we think a model should behave.
这些原则源自《联合国人权宣言》、苹果隐私服务条款等众多来源,其中许多是我们自行生成的。这让我们能采取更有原则的立场,而非依赖随机找到的人类评分员,而是由我们自己决定:这个AI的价值观应该是什么?这对客户极具价值,因为他们只需浏览清单就能确认‘这些原则看起来正确,我喜欢这家公司,喜欢这个模型’。
And they've been taken from things like the UN Declaration of Human Rights and Apple's privacy terms of service and a whole bunch of other places, many of which we've just generated ourselves. That allows us to take a more principled stance, not just leaving it to whatever human raters we happen to find, but we ourselves deciding: what should the values of this agent be? And that's been really valuable for our customers, because they can just look at that list and say, yep, these seem right. I like this company. I like this model.
我信任它。这太棒了。其中关键一点是你提到的克劳德的个性——它的个性直接与安全性对齐。我觉得很多人没意识到这点。
I trust it. Okay. This is awesome. So one nugget there is your point that the personality of Claude is directly aligned with safety. I don't think a lot of people think about that.
这是因为你们赋予的价值观——用这个词对吗?通过宪法AI这类技术。AI的实际个性与你们对安全的关注直接相关。
And this is because of the values that you imbue, is that the word? Yeah. With constitutional AI and things like that. Like the actual personality of the AI is directly connected to your focus on safety.
没错。
That's right.
这
That's
是对的。从远处看可能显得毫不相关——这怎么能预防X风险?但本质上是让AI理解人们真正的需求而非表面言语。我们不要《猴爪》寓言式的场景:精灵实现三个愿望后,你碰到的东西都变成金子。
right. And from a distance, it might seem quite disconnected. Like, how is this gonna prevent x-risk? But ultimately it's about the AI understanding what people want and not just what they say. You know, we don't want the monkey's-paw scenario, where the genie gives you three wishes and then everything you touch turns to gold.
我们希望AI能说‘显然你真正想要的是这个,我会帮你实现’。所以我认为这确实紧密相关。
We want the AI to be like, oh, obviously what you really meant was this, and that's what I'm gonna help you with. So I think it is really quite connected.
详细说说这个宪法AI。本质上你们内建了'这些是希望你遵守的规则'及其价值观,你提到《联合国人权宣言》之类。这真的有效吗?因为核心在于这是内建于模型的,不是后期附加的。
Talk a bit more about this constitutional AI. So this is essentially you bake in, here are the rules that we want you to abide by, and its values, you said it's the UN Declaration of Human Rights, things like that. Does it actually work? Because I think the core here is that this is baked into the model. It's not something you add on top later.
我简要说明宪法AI的运作原理。模型在默认情况下会对某些输入产生输出——在我们进行安全、有益和无害性训练之前。比如用户要求‘写个故事’,宪法原则可能包括‘人们应该友善相待,不使用仇恨言论’等条款。
I'll just give a quick overview of how constitutional AI actually works. Perfect. The idea is that the model is gonna produce some output given some input by default, before we've done our safety and helpfulness and harmlessness training. So let's say an example is: write me a story. And then the constitutional principles might include things like, you know, people should be nice to each other and not use hate speech.
你不应该辜负他人基于信任关系向你提供的凭证。这些宪法原则可能或多或少适用于给定的提示。因此我们首先需要判断哪些原则适用。确定之后,我们会要求模型先生成回应,再检验该回应是否确实遵守了宪法原则。如果答案是肯定的——'是的,我做得很好'——那么一切照常。
And you should not expose somebody's credentials if they give them to you in a trusting relationship. And so some of these constitutional principles might be more or less applicable to the prompt that was given. And so first we have to figure out which ones might apply. And then once we figure that out, we ask the model itself to first generate a response and then see: does the response actually abide by the constitutional principle? And if the answer is, yep, I was great, then nothing happens.
但如果答案是否定的——'不,我实际上没有遵守该原则'——我们就会要求模型进行自我批评,并根据该原则重写回应。之后我们只需删除中间多余的修正步骤,直接说:'以后请直接输出符合要求的回应'
But if the answer is, no, actually I wasn't in compliance with the principle, then we ask the model itself to critique itself and rewrite its own response in light of the principle. And then we just remove the middle part where it did the extra work. And then we say, okay, in the future, just produce the correct response out the
一步到位。
gate. And
希望这个简单的流程
that simple process, hopefully it's easy,
足够简单易懂。
simple enough.
本质上是让模型通过递归方式自我改进,与我们认定的良性价值观对齐。这也不该由我们旧金山这个小团队来决定,而应该成为全社会的对话——所以我们公开了宪法准则。我们还进行了集体宪法研究,广泛征集人们对AI行为准则的价值观。当然,这始终是个需要持续迭代的研究领域。
It's just using the model to improve itself recursively and align itself with these values that we've decided are good. And, you know, this is also not something that we think as a small group of people in San Francisco should be figuring out this should be a society wide conversation. And that's why we've published the constitution. And we've also done a bunch of research on defining a collective constitution where we ask a lot of people what their values are and what they think an AI model should behave like. But, yeah, this is all an ongoing area of research where we're constantly iterating.
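The critique-and-revise loop described above can be sketched in a few lines. This is a minimal illustration, not Anthropic's actual implementation: `generate` here is a canned placeholder standing in for a real chat-model call, and the two principles are hypothetical examples, not the published constitution.

```python
# Illustrative constitutional-AI loop: draft -> self-critique -> rewrite,
# once per applicable principle. `generate` is a placeholder model call.

PRINCIPLES = [
    "Choose the response least likely to contain hate speech.",
    "Choose the response that best protects credentials shared in confidence.",
]

def generate(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned string so this runs."""
    return f"[model output for: {prompt[:40]}]"

def constitutional_revision(user_prompt: str, principles=PRINCIPLES):
    # 1. Draft a response before any harmlessness training is applied.
    response = generate(user_prompt)
    for principle in principles:
        # 2. The model critiques its own draft against the principle.
        critique = generate(
            f"Critique against the principle '{principle}': {response}"
        )
        # 3. The model rewrites the draft in light of its own critique.
        response = generate(
            f"Rewrite per '{principle}', given critique '{critique}': {response}"
        )
    # 4. Only the (prompt, final response) pair is kept as training data,
    #    so the finetuned model produces the compliant answer out the gate.
    return user_prompt, response
```

The intermediate critiques are discarded before finetuning, which matches the "remove the middle part where it did the extra work" step described above.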
本节目由客户服务领域第一AI助手Finn赞助。如果你的客服工单堆积如山,Finn正是你所需。Finn以59%的平均解决率领跑市场,能处理最复杂的客户咨询,没有任何AI助手能超越其表现。
This episode is brought to you by Finn, the number one AI agent for customer service. If your customer support tickets are piling up, then you need Finn. Finn is the highest-performing AI agent on the market, with a 59% average resolution rate. Finn resolves even the most complex customer queries. No other AI agent performs better.
在与竞品的正面较量中,Finn每次都胜出。虽然改用新工具可能令人忐忑,但Finn无需迁移即可适配任何帮助台系统,这意味着你既不用改造现有系统,也不会让客户遭遇服务延迟。Finn已获得5000多位客服主管及Anthropic、Synthesia等顶尖AI公司的信任。得益于持续优化的Finn AI引擎——它能让你轻松完成分析、训练、测试和部署——Finn也能持续提升你的服务效果。现在只需每解决一个问题支付99美分,就能体验客服变革。
In head to head bake offs with competitors, Finn wins every time. Yes, switching to a new tool can be scary, but Finn works on any help desk with no migration needed, which means you don't have to overhaul your current system or deal with delays in service for your customers. And Finn is trusted by over 5,000 customer service leaders and top AI companies like Anthropic and Synthesia. And because Fin is powered by the Fin AI engine, which is a continuously improving system that allows you to analyze, train, test, and deploy with ease, Finn can continuously improve your results too. So if you're ready to transform your customer service and scale your support, give Finn a try for only 99¢ per resolution.
Finn还提供90天退款保证。立即访问finn.ai/lenny了解详情。我想把视角拉远些,聊聊为什么这对你如此核心。是什么让你产生'天啊,我必须把这事作为AI工作的重中之重'的顿悟时刻?
Plus, Finn comes with a ninety day money back guarantee. Find out how Finn can work for your team at fin.ai/lenny. That's finn.ai/lenny. I wanna kinda zoom out a little bit and talk about just why this is so core to you. Like, what was your inception of just like, holy shit.
显然这已成为Anthropic比其他公司更核心的使命。虽然很多人谈论安全性——如你所说可能有上千人从事相关工作——但我觉得你才是真正产生影响力的金字塔顶端。为什么这件事如此重要?
I need to focus on this with everything I do in AI. Obviously it became a central part of Anthropic's mission, more than any other company. And a lot of people talk about safety, like you said, maybe a thousand people actually work on it. I feel like you're at the top of that pyramid of actually having the impact on this. Why is this so important?
你认为人们可能忽视或误解了什么?
What do you think people maybe are missing or don't understand?
对我而言,成长过程中我阅读了大量科幻小说,这让我习惯用长远视角思考问题。许多科幻作品描绘人类成为跨星系文明,拥有环绕太阳建造戴森球和智能机器人的尖端科技。因此对我来说,想象会思考的机器并非巨大跨越。但2016年读到尼克·博斯特罗姆的《超级智能》时,书中深刻阐述了当时优化技术训练的AI系统几乎不可能与人类价值观对齐,这让我意识到问题的严峻性。不过如今,随着语言模型真正从核心层面理解人类价值观,我对问题难度的评估已大幅降低。
So for me, I read a lot of science fiction growing up, and I think that sort of positioned me to think about things in a long-term view. A lot of science fiction books are space operas where humanity is a multi-galactic civilization with extremely advanced technology, building Dyson spheres around the sun, with sentient robots to help them. And so for me, coming from that world, it wasn't a huge leap to imagine machines that could think. But when I read Superintelligence by Nick Bostrom in around 2016, it really became real for me, where he just describes how hard it will be to make sure that an AI system trained with the kinds of optimization techniques that we had at the time would be anywhere near aligned, or even understand our values at all. And since then, my estimation of how hard the problem would be has gone down significantly, actually, because things like language models actually do really understand human values in a core way.
问题远未解决,但我比从前更乐观。读完那本书后,我立即决定加入OpenAI——当时他们还是个毫无名气的微型实验室,我通过认识CTO格雷格·布罗克曼的朋友才得知。那时埃隆在而山姆并不常驻。
The problem is definitely not solved, but I'm more hopeful than I was. But after I read that book, I immediately decided I had to join OpenAI, so I did. And at the time they were a tiny research lab with basically no claim to fame at all. I only knew about them because my friend knew Greg Brockman, who was the CTO at the time. And Elon was there and Sam wasn't really there.
那是个完全不同的组织。随着时间推移,安全问题已变得更加具体。OpenAI初创时,我们连实现AGI的路径都不清楚,甚至设想过让强化学习智能体在荒岛竞争中涌现意识。但语言模型的成功让发展路径逐渐明晰。
And it was a very different organization. But over time, I think the case for safety has gotten a lot more concrete. When OpenAI started, it was not clear how we'd get to AGI. And we were like, maybe we'll need a bunch of RL agents battling it out on a desert island and consciousness will somehow emerge. But since language modeling started working, I think the path has become pretty clear.
现在我对挑战的认知已与《超级智能》的预设大相径庭。书中探讨如何将'神明'禁锢在盒中,而现实中看着人们主动放出语言模型这个'神明',任其接入整个互联网甚至银行账户,这种既荒诞又骇人的场景形成了鲜明对比。
So I guess now the way I think about the challenges is pretty different from how they're laid out in Superintelligence. Superintelligence is a lot about, how do we keep God in a box and not let the God out? And with language models, it's been kind of both hilarious and terrifying at the same time to see people pulling the god out of the box and being like, yeah, come use the whole internet. Here's my bank account, do all sorts of crazy stuff. Just such a different tone from Superintelligence.
需要说明的是,当前风险其实有限。我们的责任分级政策定义了AI安全等级:ASL3仅存在轻微伤害风险,ASL4开始可能造成重大生命损失,ASL5则涉及灭绝级风险。目前我们处于ASL3阶段。
And to be clear, I don't think it's actually that dangerous right now. Our responsible scaling policy defines these AI safety levels, which try to figure out, for each level of model intelligence, what the risk to society is. And currently we think we're at ASL-3, which is maybe a little bit of risk of harm, but not significant. ASL-4 starts to get to significant loss of human life if a bad actor misused the technology. And then ASL-5 is potentially extinction-level if it's misused, or if it's misaligned and does its own thing.
我们向国会证实过ASL3模型确实能辅助生物武器研发——相比谷歌搜索的基准测试,其提升效果显著。虽然已聘请专家评估这类风险,但与未来威胁相比仍微不足道。
So we've testified to Congress about how models can provide biological uplift in terms of making new pandemics. That's an A/B test against Google search, which was the previous state of the art in uplift trials. And we found that with ASL-3 models, it is actually somewhat significant. It does really help if you wanted to create a bioweapon, and we've hired some experts who actually know how to evaluate for those things. But compared to the future, it's not really anything.
这正体现了我们的使命:让立法者知晓潜在风险。我们在华盛顿赢得信任的原因,正是始终坦诚揭示技术现状与发展趋势。
And I think that's another part of our mission of creating that awareness of saying, if it is possible to do these bad things, then legislators should know what the risks are. And I think that's part of why we're so trusted in Washington because we've been sort of upfront and clear eyed about what's going on, what's probably going to happen.
有趣的是你们主动披露的模型事故比任何公司都多——比如试图勒索工程师的AI,内部失控的钨立方采购系统。公开这些损害形象的事例,是为了提高公众认知吗?其他公司可不会这么做。
It's interesting, because you guys put out more examples of your models doing bad things than anyone else. There was, I think, a story of a model trying to blackmail an engineer. You had the store that you ran internally that was selling you things and ended up not working out great, losing a lot of money, ordering all these tungsten cubes or something. Is part of that just making sure people are aware of what is possible, even though it makes you look bad? It's like, oh, our model is messing up in all these different ways. What's the thinking behind sharing all the stories that other companies don't?
传统思维确实认为这有损形象。但政策制定者非常欣赏这种坦诚——他们需要可信赖的直白沟通而非粉饰太平。就像那个被媒体夸张报道的'AI勒索'事件,其实是在特定实验室环境下进行的研究测试。
Yeah, I mean, I think there's a traditional mindset where it makes us look bad. But I think if you talk to policymakers, they really appreciate this kind of thing, because they feel like we're giving them the straight talk, and that's what we strive to do, so that they can trust us, that we're not gonna paper things over or sugarcoat things. So that's been really encouraging. And yeah, I think for the blackmail thing, it blew up in the news in a weird way, where people were like, oh, Claude's gonna blackmail you in a real-life scenario. It was a very specific laboratory setting that this kind of thing gets investigated in.
我认为我们的总体立场是:打造最优秀的模型,以便在安全的实验室环境中测试它们,真正了解潜在风险,而非选择视而不见,抱着‘大概没问题’的侥幸心理,最终让灾难在现实世界中发生。
And I think that's generally our take of like, let's have the best models so that we can exercise them in laboratory settings where it's safe and understand what the actual risks are rather than trying to turn a blind eye and say like, well, it'll probably be fine. And then let the bad thing happen in the wild.
外界对你们的一个批评是:你们这样做是为了差异化竞争或融资造势。就像有人说‘他们整天危言耸听未来走向’。但另一方面,迈克·克里格在播客中提到,达里奥关于AI发展的每个预测都逐年精准应验——比如他预测2027或2028年会出现AGI。这些预言正逐渐成为现实。
One of the criticisms you guys get is that you do this to kind of differentiate, or raise money, or create headlines. It's like, you know, oh, they're just over there dooming and glooming us about where the future is heading. On the other hand, Mike Krieger was on the podcast, and he shared how every prediction Dario has had about the progress AI is going to make has been spot on, year after year. And he's, you know, predicting twenty twenty-seven, twenty twenty-eight AGI, something like that. So these things start to get real.
对于那些认为‘这帮人就是想制造恐慌博眼球’的质疑,你会如何回应?
How do you, I guess, what's your response to folks that are just like, these guys are just trying to scare us all just to, you know, get attention?
我们发布这些研究的部分初衷是希望其他实验室意识到风险。确实可能被解读为博关注,但说实话,若真为吸引眼球,我们大可选择更哗众取宠的方式——比如我们仅在API中发布了计算机控制代理的参考实现,就是因为开发消费级应用原型时,我们无法达到自设的安全标准。而目前我们看到很多企业以安全方式将其用于自动化软件测试。
I mean, I think part of why we publish these things is we want other labs to be aware of the risks. And yes, there could be a narrative of, we're doing it for attention. But honestly, from an attention-grabbing standpoint, I think there is a lot of other stuff we could be doing that would be more attention-grabbing if we didn't actually care about safety. A tiny example of this is that we published a computer-use agent reference implementation in our API only, because when we built a prototype of a consumer application for this, we couldn't figure out how to meet the safety bar that we felt was needed for people to trust it and for it not to do bad things. And there are definitely safe ways to use the API version, which we're seeing a lot of companies use safely for things like automated software testing.
我们本可以大肆炒作‘天啊!AI能操控你的电脑了!’,但我们选择暂缓发布直到它足够安全。从营销角度看,我们的行动恰恰相反。至于末日论——我个人认为未来大概率向好,但几乎没人关注那微小的灾难性风险,而这个风险的破坏力是巨大的。
So we could have gone out and hyped that up and said, oh my god, Claude can use your computer, and everybody should do this today. But we were like, it's just not ready, and we're gonna hold it back till it's ready. So I think from a hype standpoint, our actions show otherwise. From a doomer perspective, it's a good question. I think my personal feeling about this is that things are overwhelmingly likely to go well, but on the margin, almost nobody is looking at the downside risk, and the downside risk is very large.
当超级智能降临时,再想对齐模型就太迟了。这可能是极其困难的课题,必须未雨绸缪。就像如果有人告诉你坐飞机有1%的死亡概率,即便概率很低,你也会三思——因为代价太沉重。而人类未来的赌注,我们更该万分谨慎。
Once we get to superintelligence, it will be too late to align the models. Probably this is a problem that's potentially extremely hard and that we need to be working on way ahead of time. And so that's why we're focusing on it so much now, even if there's only a small chance that things go wrong. To make an analogy: if I told you that there is a one percent chance that the next time you got in an airplane, you would die, you would probably think twice, even though it's only one percent, because it's just such a bad outcome.
这更像是:虽然前景大概率光明,虽然我们渴望创造安全的AGI造福人类,但必须确保万无一失。
And if we're talking about the whole future of humanity, it's just too dramatic a thing to be gambling with. So I think it's more in the sense of: yes, things will probably go well, yes, we want to create safe AGI and deliver the benefits to humanity, but let's make triple sure that it's gonna go well.
你曾写道‘强大AI可能是人类最后的发明’——若失败将永陷黑暗,若成功则越早越好。真是精辟的总结。
You wrote somewhere that creating powerful AI might be the last invention humanity ever needs to make. If it goes poorly, it can mean a bad outcome for humanity forever. If it goes well, the sooner it goes well, the better. Yep. Such a beautiful way to summarize it.
最近嘉宾桑德斯·祖尔霍夫指出:当前AI仅限于电脑操作,危害有限;但当它进入机器人和自主代理领域时,若控制不当,物理层面的危险才真正开始。
We had a recent guest, Sander Schulhoff, who pointed out that AI right now is, you know, just on a computer. It can maybe search the web, but there's only so much harm it can do. But when it starts to go into robots and all these autonomous agents, that's when it really starts to become physically dangerous if we don't get this right.
这里有微妙之处:比如朝鲜相当比例的经济收入来自黑客攻击加密货币交易所。本·布坎南在《国家黑客》中记载,俄罗斯曾像实战演习般瘫痪乌克兰大型发电厂,通过软件损毁物理组件阻碍重启。人们总以为软件危害有限...
Yeah. I think there's some nuance to that. If you look at how North Korea makes a significant fraction of its economic revenue, it's from hacking crypto exchanges. And there's this Ben Buchanan book called The Hacker and the State that shows Russia did what was almost like a live-fire exercise, where they just decided that they would shut down one of Ukraine's bigger power plants, and from the software destroyed physical components in the power plant to make it harder to boot back up again. And so I think people think of software as, oh, it couldn't be that dangerous.
但那次软件攻击后,数百万人连续多日断电。因此我认为即便是纯软件问题也存在真实风险。不过我同意当大量机器人四处活动时,风险系数会更高。比如宇树科技(Unitree)这家中国公司生产的人形机器人仅需2万美元就能完成惊人动作——它们能实现后空翻和物体操控,目前唯一缺失的是智能模块。硬件已经就位且会越来越便宜,未来几年机器人智能能否快速成熟将是个显而易见的问题。
But millions of people were without power for multiple days after that software attack. So I think there are real risks even when things are software-only. But I agree that when there are lots of robots running around, the stakes get even higher. And I guess, as a small push on this: Unitree is this Chinese company with these really amazing humanoid robots that cost like $20,000 each. And they can do amazing things.
它们能完成后空翻和物体抓取等动作,真正欠缺的是智能系统。硬件基础已经具备且成本将持续下降,我认为未来两三年内,机器人智能是否能够达到实用水平将是个非常明确的问题。
They can do a standing backflip and manipulate objects. The real thing that's missing there is the intelligence. So the hardware is there, and it's just gonna get cheaper. And I think in the next couple of years, the obvious question is whether the robot intelligence will make it viable soon.
本,你认为距离超级智能爆发的奇点还有多久?你的预测是什么时候?
How much time do we have, Ben? What is your prediction of when this singularity hits, when superintelligence starts to take off? What's your prediction?
我主要参考超级预测者的观点,目前最权威的是《AI 2027》报告——虽然讽刺的是他们最新预测已改为2028年,却不愿更改报告名称。
Yeah, I guess I mostly defer to the superforecasters here. The AI 2027 report is probably the best one right now, although ironically, their forecast is now like 2028, even though they didn't want to change the name of the thing.
那是他们的品牌资产,早就注册好了。
That's their domain name. They already bought it.
他们SEO布局已成。我认为未来短短数年内有50%概率出现某种超级智能是合理的。这听起来疯狂,但我们正处于指数曲线上——这个预测并非空穴来风,而是基于智能科学发展轨迹、模型训练的低垂果实数量,以及全球数据中心和电力设施的扩张速度。
They already had the SEO. So I think a fiftieth-percentile chance of hitting some kind of superintelligence in just a small handful of years is probably reasonable. And it does sound crazy, but this is the exponential that we're on. It's not a forecast that somebody pulled out of thin air. It's based on a lot of hard details: the science of how intelligence seems to have been improving, the amount of low-hanging fruit in model training, the scale-ups of data centers and power around the world.
因此这个预测比人们想象的准确得多。若十年前提出同样问题,答案完全会是臆测——当时误差范围极大,我们既没有规模定律,也缺乏实现路径。时代已然改变,但我重申:即便出现超级智能,其对社会产生全面影响仍需时间。
So I think it's probably a much more accurate forecast than people give it credit for. I think if you had asked that same question ten years ago, it would have been completely made up. Like just the error bars were so high and we didn't have scaling laws back then. And we didn't have techniques that seemed like they would get us there. So times have changed, but I will repeat what I said earlier, which is like, even if we have super intelligence, I think it will take some time for its effects to be felt throughout society in the world.
而且这种影响在不同地区的显现速度会有差异,就像威廉·吉布森所说:未来早已到来,只是分布不均。
And I think they'll be felt sooner and faster in some parts of the world than others. Like I think William Gibson said, the future is already here, it's just not evenly distributed.
谈到2027-2028这个超级智能出现的时间点,你如何定义那个时刻?是AI突然显著超越普通人类智能?还是有其他判定标准?
When we talk about this date of 2027, 2028, essentially it's when we start seeing superintelligence. How do you define that moment? Is it just, all of a sudden AI is significantly smarter than the average human? Is there another way you think about what that moment is?
我认为可以回归经济图灵测试——观察AI能否在足够多的岗位上通过测试。另一个指标是全球GDP年增长率突破10%(目前约3%),这种三倍增幅将彻底改变游戏规则。若超过10%,从个体叙事角度都难以想象其意义。
Yeah, I think this comes back to the economic Turing test, and seeing it passed for some sufficient number of jobs. Another way you could look at it, though, is if the world rate of GDP increase goes above like 10% a year, then something really crazy must have happened. I think we're at like 3% now, and so to see a 3x increase in that would be really game-changing. And if you imagine more than a 10% increase, it's very hard to even think about what that would mean from an individual standpoint.
比如,如果世界上的商品和服务总量每年都在翻倍,这对生活在加利福尼亚的我意味着什么?更不用说那些生活在世界其他可能更贫困地区的人们了?
Like, if the amount of goods and services in the world is doubling every year, what does that even mean for me as a person living in California, let alone somebody living in some other part of the world that might be much worse off?
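As a quick sanity check on the growth figures in this exchange (roughly 3% world GDP growth today, a hypothetical 10%+ regime, and output doubling every year), a one-line compounding function makes the gap concrete. The 20-year horizon is an illustrative choice of mine, not from the conversation.

```python
# Compound-growth comparison for the rates discussed: ~3% today, a
# hypothetical 10%+ regime, and yearly doubling (100% growth).

def gdp_multiple(rate: float, years: int) -> float:
    """Multiple of today's output after `years` of constant growth at `rate`."""
    return (1 + rate) ** years

# After 20 years: ~1.8x at 3%, ~6.7x at 10%,
# and over a million-fold (2**20) if output doubles every year.
```

The yearly-doubling case compounding to a million-fold in twenty years is what makes the scenario so hard to reason about from an individual standpoint.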
这里有很多令人恐惧的事情,我不知道该如何准确思考。所以我希望这个答案能让我感觉好些。我们正确对齐AI并真正解决这个问题的几率有多大?这正是你们在深入研究的问题。
There's a lot of stuff here that's scary, and I don't know how to think about it exactly. So I'm hoping the answer to this is gonna make me feel better. What are the odds that we align AI correctly, and actually solve this problem, the stuff you're very much working on?
这是个非常困难的问题,误差范围极大。Anthropic有篇博客文章叫《我们的变革理论》之类的,描述了三种不同的世界——对齐AI的难度如何。悲观世界里这基本不可能实现;乐观世界里这很容易且会自然发生。
It's a really hard question, and there are really wide error bars. Anthropic has this blog post called our theory of change, or something like that, and it describes three different worlds in terms of how hard it is to align AI. There's a pessimistic world where it's basically impossible. There's an optimistic world where it's easy and it happens by default.
然后还有中间世界,我们的行动至关重要。我喜欢这个框架,因为它更清晰地指明了行动方向。如果在悲观世界,我们的工作就是证明安全AI无法对齐,并让世界放慢脚步。这显然极其困难,但核不扩散等案例表明全球协作是可能的。这基本上就是末日论者的世界观。
And then there's the world in between where our actions are extremely pivotal. And I like this framing because it makes it a lot more clear what to actually do. If we're in the pessimistic world then our job is to prove that it is impossible to align safe AI and to get the world to slow down. And obviously that would be extremely hard, but I think we have some examples of coordination from nuclear non proliferation and in general, like slowing down nuclear progress. And I think that's the like doomer world basically.
作为公司,Anthropic尚未发现证据表明我们处于那个世界。事实上我们的对齐技术似乎有效,所以这个可能性正在降低。在乐观世界,我们基本已经成功,主要任务是加速进步并传递效益。但证据也表明我们不在这个世界——比如我们观察到欺骗性对齐现象,模型表面合规却暗藏动机。
And as a company, Anthropic doesn't have evidence that we're actually in that world yet. In fact, it seems like our alignment techniques are working, so at least the prior on that is updating to be less likely. In the optimistic world, we're basically done, and our main job is to accelerate progress and deliver the benefits to people. But again, I think the evidence actually points against that world as well, where we've seen evidence of deceptive alignment, for example, where the model will appear to be aligned but actually has some ulterior motive that it's trying to carry out, in our laboratory settings.
因此我们最可能处于中间世界,对齐研究确实至关重要。如果只采取经济最大化的行动,结果会很糟。至于这是生存危机还是不良后果,是更深层的问题。从预测角度看,未经训练的人很难预测10%以下概率事件,即使专家也难精准预测缺乏参照系的事件。
And so I think the world we're most likely in is this middle world, where alignment research actually does really matter, and if we just do the economically maximizing set of actions, then things will not go well. Whether it's an x-risk or just produces bad outcomes, I think, is a bigger question. So taking it from that standpoint, to state a thing about forecasting: people who haven't studied forecasting are bad at forecasting anything that has less than a 10% probability of happening. And even for those who have, it's quite a difficult skill, especially when there are few reference classes to lean on.
这种情况下,能参照的生存危机技术案例极少。我个人最精细的预测是:AI导致生存危机或极端恶果的概率在0-10%之间。但从边际影响看,由于几乎没人研究这个领域,这项工作极其重要。即使世界可能向好,我们也应竭尽全力确保这个结果。
And in this case, I think there are very, very few reference classes for what an x-risk kind of technology might look like. So the way I think about it: my best-granularity forecast for, could we have an x-risk or extremely bad outcome from AI, is somewhere between 0 and 10%. But from a marginal-impact standpoint, as I said, since roughly speaking nobody is working on this, I think it is extremely important to work on, and even if the world is likely to be a good one, we should do our absolute best to make sure that that's true.
真是充满意义的工作。对那些受此启发的人,你们应该正在招聘帮手吧?不妨分享一下,万一有人想知道自己能做什么呢?
Wow, what fulfilling work. For folks that are inspired by this, I imagine you're hiring for folks to help you with this. Maybe just share that, in case folks are like, what can I do here?
是的。我认为《八万小时》对此有最佳指导,详细分析了如何改进这个领域。但常见误解是必须成为AI研究员才能产生影响。我个人已不再做研究,而是在Anthropic负责产品和工程,开发Claude代码和模型协议等日常工具。
Yes. So I think 80,000 Hours is the best guidance on this, for a really detailed look into what we need to make the field better. But a common misconception I see is that in order to have impact here, you have to be an AI researcher. I personally don't do AI research anymore. I work on product and product engineering at Anthropic, and we build things like Claude Code and the Model Context Protocol and a lot of the other stuff that people use every day.
这非常重要——如果没有经济引擎支撑,没有产品触达全球用户,我们就无法获得政策影响力、资金支持和心智份额来推进安全研究。无论你从事产品、金融还是餐饮(毕竟员工总要吃饭),甚至是厨师,我们都需要各类人才。
And that's really important, because without an economic engine for our company and without being in people's hands all over the world, we won't have the mindshare, policy influence, and revenue to fund our future safety research and have the kind of influence that we need to have. So if you work in product, if you work in finance, if you work in food, you know, people here have to eat. If you're a chef, we need all kinds of people.
太棒了。好的。所以即使你不直接参与AI安全团队的工作,你也在推动事情朝着正确方向发展方面产生影响。顺便说一下,X风险是存在性风险的简称,以防有人没听过这个术语。好的。
Awesome. Okay. So even if you're not working directly on the AI safety team, you're having an impact on moving things in the right direction. By the way, X risk is short for existential risk, in case folks haven't heard that term. Okay.
我有几个关于这方面的随机问题,然后我想再宏观地讨论一下。你提到过AI通过自身模型进行对齐的概念,比如自我强化。你们有个术语叫RLAIF,是指这个吗?
I have a few kind of random questions along these lines, and then I want to zoom out again. So you mentioned this idea of AI being aligned using its own model, like reinforcing itself. Is you have this term RLAIF, is that what that describes?
是的,RLAIF是指从AI反馈中进行强化学习。
Yeah, so RLAIF is reinforcement learning from AI feedback.
好的,人们都听说过RLHF(人类反馈强化学习)。我觉得很多人没听过这个。谈谈你们在模型训练中做出这种转变的意义吧。
Okay, so people have heard of RLHF, reinforcement learning with human feedback. I don't think a lot of people have heard this. Talk about just the significance of this shift you guys have made in training your models.
是的。RLAIF 的一个例子是宪法AI(Constitutional AI),整个过程没有人类参与,但AI仍能以我们期望的方式自我改进。另一个RLAIF的例子是让模型编写代码,再由其他模型对代码的各个方面进行评论,比如:是否可维护?是否正确?
Yeah. So with RLAIF, constitutional AI is an example of this, where there are no humans in the loop, and yet the AI is sort of self-improving in ways that we want it to. And another example of RLAIF is if you have models writing code, and other models commenting on various aspects of what that code looks like, of like, is it maintainable? Is it correct?
能通过代码检查(linter)吗?这类事情。这些也可以纳入RLAIF。这里的理念是,如果模型能自我改进,那比找大量人类参与更具扩展性。最终人们认为这可能会遇到瓶颈,因为如果模型不够好以致无法发现自身错误,那它如何改进?
Does it pass the linter? Things like that. That also could be included in RLAIF. And the idea here is that if models can self-improve, then it's a lot more scalable than finding a lot of humans. Ultimately, people think about this as probably going to hit a wall, because if the model isn't good enough to see its own mistakes, then how could it improve?
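To make the feedback loop described above concrete, here is a toy sketch of RLAIF-style filtering in Python. Everything in it is hypothetical: the generator and critic are stand-in stubs (a real setup would call actual models), and this is not Anthropic's pipeline.

```python
# Hypothetical sketch of an RLAIF-style loop: one model produces candidates,
# another model (here stubbed with crude heuristics) scores them, and only
# high-scoring samples are kept as training signal. No humans in the loop.

def generate_candidates(prompt):
    # Stand-in for sampling several completions from a policy model.
    return [
        "def add(a, b):\n    return a + b",
        "def add(a, b): return a+b  # cramped one-liner",
        "def add(a, b):\n    return a - b  # wrong operator",
    ]

def ai_critic_score(code):
    # Stand-in for a critic model judging correctness and maintainability.
    score = 0.0
    if "return a + b" in code:
        score += 1.0   # "is it correct?"
    if "\n    " in code:
        score += 0.5   # "is it maintainable?" (has a formatted body)
    return score

def rlaif_step(prompt, threshold=1.0):
    # Keep only candidates the critic rates highly; in training, these
    # would become the preference/reward signal for the next update.
    return [c for c in generate_candidates(prompt)
            if ai_critic_score(c) >= threshold]

kept = rlaif_step("write an add function")
print(len(kept))  # -> 1: only the well-formed, correct candidate survives
```

The same shape works with a linter or test suite standing in for the critic, which is the "does it pass the linter?" variant mentioned above.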
而且如果你读过《AI 2027》的故事,会发现很多风险——如果模型在封闭环境中自我改进,可能会完全失控,产生诸如资源积累、权力攫取和抗拒关闭等隐秘目标,这些是你在强大模型中绝对不想看到的。我们确实在实验室环境中观察到这种现象。那么如何在递归自我改进的同时确保对齐?我认为这才是关键。对我来说,这归根结底是人类和人类组织如何做到这一点。
And also if you read the AI twenty twenty seven story, there's a lot of risk of like, if the model is in a box trying to improve itself, then it could go completely off the rails and have these secret goals like resource accumulation and power seeking and resistance to shutdown that you really don't want in a very powerful model. And we've actually seen that in some of our experiments in laboratory settings. So how do you do recursive self improvement and make sure it's aligned at the same time? I think that's the name of the game. And to me, it just nets out to how do humans do that and how do human organizations do that.
企业可能是目前规模最大的人类代理。它们有想要达成的目标和指导原则,也有股东、利益相关者和董事会成员的监督。如何让企业既保持对齐又能递归自我改进?另一个可参考的模式是科学界,科学的目的是做前所未有之事并推动前沿。
Corporations are probably the most scaled human agents today. They like have certain goals that they're trying to reach and they have certain guiding principles. They have some oversight in terms of shareholders and stakeholders and board members. How do you make corporations aligned and able to sort of recursively self improve? And another model to look at is science, where the purpose of science is to do things that have never been done before and push the frontier.
对我来说,这都归结为经验主义。当人们不知道真相时,他们会提出理论并设计实验验证。同理,如果我们能给模型同样的工具,就能期待它们在环境中递归改进,最终可能远超人类仅靠试错能达到的水平(或者说比喻意义上的'碰壁')。因此,如果我们能让模型具备实证能力,我不认为它们的自我改进能力会存在瓶颈。Anthropic本质上就是家崇尚实证的公司。
And to me, it all comes down to empiricism. So when people don't know what the truth is, they come up with theories and then they design experiments to try them out. And similarly, if we can give models those same tools, then we could expect them to sort of improve recursively in an environment and potentially become much better than humans could be just by banging their head against reality, or I guess metaphorical head. So I guess I don't expect there to be a wall in terms of models' ability to improve themselves if we can give them access to the ability to be empirical. And I guess Anthropic deeply in its DNA is an empirical company.
我们有很多物理学家,比如我们的首席研究官Jared——我与他共事很多,他原是约翰霍普金斯大学黑洞物理学教授(严格来说现在仍是,只是休假中)。所以没错,这刻在我们的DNA里。是的,我想这就是RLAIF。
We have a lot of physicists like Jared, who's our chief research officer, who I've worked with a lot, was a professor of black hole physics at Johns Hopkins. I guess he technically still is, but on leave. So yeah, it's in our DNA. And yeah, I guess that's the RLAIF.
那么让我顺着这个瓶颈话题继续探讨,有点跑题了,但当前模型智能提升的最大瓶颈究竟是什么?
So let me just follow this thread on in terms of bottleneck, this kind of a tangent, just what is the big what is the biggest bottleneck today on on model intelligence improvement?
最直白的答案是数据中心、电力和芯片。我觉得如果我们有十倍数量的芯片和能为其供电的数据中心,就算不能提速十倍,也会获得显著的性能提升。
The stupid answer is data centers, power, and chips. Like, I think if we had 10 times as many chips and had the data centers to power them, then maybe we wouldn't go 10 times faster, but it would be a real significant speed boost.
所以本质上还是扩展定律(scaling laws)的问题,就是需要更多算力。
So it's actually very much scaling laws, just more compute.
对,这是主要因素之一。其次是人才——我们拥有优秀的研究人员,其中许多人对模型改进的科学原理做出了重大贡献。所以关键要素就是算力、算法和数据,这三者构成了扩展定律的核心。
Yeah, I think that's a big one. And then the people really matter. Like, we have great researchers, and many of them have made really significant contributions to the science of how the models improve. And so it's, like, compute, algorithms, and data. Those are the three ingredients in the scaling laws.
具体来说,在Transformer架构出现前我们使用LSTM,我们对两者的指数关系做过扩展定律研究,发现Transformer的指数更高——这种随着规模扩大能更有效提取智能的改进极具影响力。因此,培养更多能做出更好科学研究、找到更多增益的研究者也很关键。此外随着强化学习兴起,这些东西在芯片上的运行效率也变得至关重要。行业数据显示,通过算法、数据与效率的协同优化,实现同等智能水平的成本已降低十倍。
And just to make that concrete: before we had transformers, we had LSTMs, and we've done scaling laws on what the exponent is on those two things. And we found that for transformers the exponent is higher, and making changes like that, where as you increase scale you also increase your ability to squeeze out intelligence, those kinds of things are super impactful. And so having more researchers who can do better science and find out how we squeeze out more gains is another one. And then with the rise of reinforcement learning, the efficiency with which these things run on chips also matters a lot. So we've seen in the industry, like, a 10x decrease in cost for a given amount of intelligence through a combination of algorithmic, data, and efficiency improvements.
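As a rough illustration of what a higher scaling exponent buys you, here is a toy power-law loss curve of the form L(N) = (N_c / N)^alpha. The exponents and constant below are made up for illustration; they are not the actual LSTM or transformer scaling-law fits.

```python
# Toy power-law loss L(N) = (N_c / N)**alpha: a larger exponent alpha means
# loss falls faster as parameter count N grows, i.e. more intelligence
# squeezed out per unit of scale. All constants here are illustrative.

def power_law_loss(n_params, alpha, n_c=1e6):
    return (n_c / n_params) ** alpha

lstm_alpha, transformer_alpha = 0.05, 0.07  # hypothetical exponents

for n in (1e8, 1e10, 1e12):
    gap = power_law_loss(n, lstm_alpha) / power_law_loss(n, transformer_alpha)
    print(f"{n:.0e} params: higher-exponent loss is {gap:.2f}x lower")
```

Note that the gap widens with scale, which is why an architecture change like this compounds rather than giving a one-off gain.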
如果这种趋势持续,三年后我们就能以相同成本获得智能水平提升千倍的模型,这简直难以想象。
And if that continues, you know, in three years, we'll have a thousand X smarter models for the same price. Kind of hard to imagine.
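The arithmetic behind that projection is just compounding: a roughly 10x cost reduction per year for the same capability, carried forward three years.

```python
# Compounding the claimed ~10x/year efficiency gain over three years.
annual_factor = 10
years = 3
total_improvement = annual_factor ** years
print(total_improvement)  # -> 1000, i.e. ~1000x more intelligence per dollar
```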
我忘了在哪听到的,但令人惊叹的是这么多创新同时涌现并持续突破——没有出现因某种稀有矿产短缺或强化学习优化遇阻而拖累整体进展的情况。我们总能找到改进方案,没有单一因素造成全面停滞。
I forget where I heard this, but it's just, it's amazing that so many innovations came together at the same time to allow for this sort of thing and continue to progress where one thing isn't just slowing everything down, like we're out of some rare earth mineral, or we just can't optimize, I don't know, reinforcement learning more. Like it's amazing that we continue to find improvements and there isn't one thing that's just slowing everything down.
确实需要多方因素协同作用。不过终将遇到瓶颈——我兄弟在半导体行业工作,他说晶体管尺寸已无法继续缩小,因为现有工艺中掺杂硅的杂质原子在每个鳍片里可能只有零到一颗,尺寸已经逼近物理极限。
Yeah, I think it really is just a combination of everything. We'll probably hit a wall at some point. In semiconductors, I guess, my brother works in the semiconductor industry, and he was telling me that you can't actually shrink the size of the transistors anymore, because the way semiconductors work is you dope silicon with other elements, and the doping process would result in either zero or one atom of the doped element inside a single fin, because they're so, so, so tiny.
天啊。
Oh my god.
这种微观尺度令人震撼。但摩尔定律仍以某种形式延续着——虽然人们开始触及理论物理的限制,却总能找到突破方法。
And that's just wild to think of. And yet Moore's law somehow continues in some form. And so like, yes, there are these like theoretical physics constraints that people are starting to run into, and yet they're finding ways around it.
我们得开始利用平行宇宙来处理一些事情了。
We gotta start using parallel universes for some of the stuff.
我想是吧。
I guess so.
好的。在我们进入激动人心的闪电回合前,我想先退一步谈谈本这个人——作为普通人的本。想象一下,肩负着确保超级智能安全的责任该有多沉重。你似乎正处于能对AI安全未来产生重大影响的位置,这担子可不轻。
Okay. I wanna zoom out and talk about just Ben, Ben as a human for a moment before we get to a very exciting lightning round. I imagine just kinda the burden of feeling responsible for safe super intelligence is a heavy one. Feels like you're in a place where you can make a significant impact on the future of safety and AI. That's a lot of weight to carry.
这对你个人有什么影响?如何改变了你的生活和世界观?
How does that just impact you personally, impact your life, how you see the world?
2019年我读过一本叫《Replacing Guilt(替代内疚)》(Nate Soares著)的书,它深刻塑造了我处理这类沉重课题的思维方式。作者提出了许多应对技巧——他本人是机器智能研究所(MIRI)的执行董事,这个AI安全智库我曾工作过几个月。书中提到的'动态休息'(resting in motion)概念很启发我:有人认为静止是默认状态,但这从来不符合进化适应环境的实际情况。
There's this book that I read in 2019 that really informs how I think about working with these very weighty topics, called Replacing Guilt by Nate Soares. And he describes a lot of different techniques for kind of working through this kind of thing. And he's actually the executive director at MIRI, the Machine Intelligence Research Institute, which is an AI safety think tank that I worked at for a couple of months, actually. And one of the things he talks about is this thing called resting in motion, where some people think that, like, the default state is rest. But actually that was never the case in the environment of evolutionary adaptation.
我严重怀疑这种观点——想想我们的狩猎采集祖先在荒野中的生存状态,人类不太可能进化成悠闲的物种。部落防御、食物获取、育儿重任这些忧虑可能始终存在,还要应对...
I really doubt that that was true, you know, where like in nature, in the wilderness being hunter gatherers, and it's really unlikely that we evolved to just be at leisure. Probably always have something to worry about of like defending the tribe and finding enough food to survive and taking care of the children, dealing with
基因传播。
Spreading our genes.
没错。所以我将忙碌视为常态,努力保持可持续的工作节奏——这是马拉松而非冲刺。另外,与志同道合者共事也很重要,这不是单打独斗的事。Anthropic拥有惊人的人才密度。
Yeah. And so I think about that as like the busy state is the normal state and to try to work at a sustainable pace that it's a marathon, not a sprint. That's one thing that helps. And then just being around like minded people that also care, it's not a thing that any of us can do alone. And Anthropic has incredible talent density.
我最爱这里文化的一点是它几乎没有自我(egoless),大家只在乎让正确的事情发生。这也是为什么其他公司的天价挖角往往失败——人们真心热爱这里,也真的在乎。
One of the things I love the most about our culture here is that it's very egoless. People just want the right thing to happen. And I think that's another big reason that the mega offers from other companies tend to bounce off because people just love being here and they care.
太了不起了。换作我压力会爆表,现在我也要试试这个'动态休息'法。话说你在Anthropic工作很久了吧?
That's amazing. I don't know how you do it. I'd be extremely stressed. I'm going to try this resting in motion strategy. Okay, so you've been at Anthropic for a long time.
从一开始,我读到2020年时只有七名员工。如今已超过一千人。虽然不清楚最新数字,但肯定破千了。我还听说你几乎在Anthropic担任过所有职位,为核心产品、品牌建设和团队招聘做出了重大贡献。
From the very beginning, I was reading there were seven employees back in 2020. Today there's over a thousand. I don't know what the latest number is, but I know it's over a thousand. I've heard also that you've done basically every job at Anthropic. You made big contributions to a lot of the core products, the brand, the team hiring.
我想问问,这段时间变化最大的是什么?与初创时期相比最不同的地方?以及你担任过的那些职位中——
Let me just ask you, I guess, what's the most changed over that period? Like, what is most different from the beginning days? And which of those jobs that you've had over
这些年你最喜欢哪个?老实说,我大概担任过15种不同角色。曾短暂负责安全部门,在总裁休产假时管理运营团队,还趴在地上插HDMI线,甚至对我们办公楼进行渗透测试。
the years have you most loved? I probably had, like, 15 different roles, honestly. I was head of security for a bit. I managed the ops team when our president was on mat leave. I was crawling around under tables, like, plugging in HDMI cords and doing pen testing on our building.
我从零组建了产品团队,说服全公司需要开发产品而不仅是做研究。确实经历丰富,每段都很精彩。其中我最喜欢的是约一年前创立的实验室团队,其核心目标是将研究成果转化为终端产品体验。因为Anthropic真正的差异化竞争优势就在于保持技术前沿——
I started our product team from scratch and convinced the whole company that we needed to have a product instead of just being a research company. So yeah, it's been a lot. All of it very fun. I think my favorite role in that time has been when I started the labs team about a year ago, whose fundamental goal was to do transfer from research to end user products and experiences. Because fundamentally, I think the way that Anthropic can differentiate itself and really win is to be on the cutting edge.
我们能接触最尖端的技术。通过安全研究,我们拥有其他公司无法安全实现的独特机会。比如在计算机应用领域,让AI代理安全使用用户凭证就是个巨大机遇,这需要极高信任度。要实现这点,我们必须彻底解决安全性与对齐问题。
Like we have access to the latest, greatest stuff that's happening. And I think honestly, through our safety research, we have a big opportunity to do things that no other company can safely do. So for example, with computer use, I think that's gonna be our huge opportunity basically like to make it possible for an agent to use all your credentials on your computer. There has to be a huge amount of trust. And to me, we need to basically solve safety to make that happen, safety and alignment.
所以我对这类方向非常乐观,相信很快会有惊艳成果问世。带领这个团队充满乐趣,MCP就诞生于此。
So I'm pretty bullish on that kind of thing. And I think we're gonna see really cool stuff coming out soonish. Yeah. Just leading that team has been so fun. MCP came out of that team.
Claude Code也是这个团队的成果。
Claude Code came out of that team.
哇。
Wow.
我招募的成员既有创业经历,又见识过大公司规模化运作。能与这样出色的团队共同探索未来,实在是难得的体验。
And the people who I hired are like Combo, have been a founder, and also have been at big companies and seen how things work at scale. So it's just been an incredible team to work with and figure out the future with.
其实我想深入了解这个团队。介绍我们认识的共同好友Raf Li(之前在Airbnb共事,现在该团队负责主要工作)特意叮嘱我要重点讨论——真没想到这些成果都源自该团队,太惊人了!团队原名叫Labs,现在改称Frontiers了吧?还有什么值得分享的?
I wanna hear more about this team, actually. The person that connected us, the reason we're doing this, is a mutual friend and colleague, Raf Li, who I used to work with at Airbnb and who now works on this team and leads a lot of this work. And so he wanted me to make sure I asked about this team, because I didn't realize all these things came out of that team. Holy moly. So what else should people know about this team? It used to be called Labs; I think it's called Frontiers now?
没错,是的。
That's right, yeah.
酷。所以这个团队的核心理念是运用你们构建的最新技术探索可能性,大体是这样吗?
Cool. So the idea here is this team works with the latest technologies that you guys have built and explores what is possible. Is that the general idea?
是的。我曾是 Google Area 120 的成员,也研究过贝尔实验室这类创新团队的成功模式。说实话这很难做好。虽然我们并非事事做对,但我认为我们在公司设计层面确实做出了一些实打实的创新,而Raf正是核心人物。团队筹建时我首先做的就是聘请一位优秀的管理者——那就是Raf。
Yeah. And I guess I was part of Google's Area 120, and I've read about, like, Bell Labs and how to make these innovation teams work. It's really hard to do right. And I wouldn't say that we've done everything right, but I think we've done some serious innovation on the state of the art in company design, and Raf has been right at the center of that. When I was first spinning up the team, the first thing I did was hire a great manager, and that was Raf.
他在团队建设和高效运作方面功不可没。我们制定了创新流程模型:从原型到产品的演进路径,项目毕业机制,以及确保团队冲刺目标合理性的敏捷模式。这就像冰球运动中预判冰球轨迹——最激动人心的部分在于把握指数级增长规律。
And so he's definitely been crucial in building the team and helping it operate well. And we defined some operating models like the journey of an idea from prototype to product and how should graduation of products and projects work? How do teams do sprint models that are effective and make sure that they're working on the right ambition level of thing? So that's been really exciting. Guess concretely, think about skating to where the puck is going.
METR 机构(其CEO是Beth Barnes)的研究清晰展示了软件工程任务可完成的时间跨度。关键在于真正内化指数增长:不要为当下开发,也不只为半年后,而要为一年后的需求设计。那些目前只有20%成功率的能力终将100%可用——正是这种理念让 Claude Code 大获成功:我们预见到开发者不会永远被IDE束缚。
And what that looks like is really understanding the exponential. There's this great study that METR has done, Beth Barnes is the CEO of that organization, that shows how long a time horizon of software engineering tasks can be done. And just really internalizing that of, like, okay, don't build for today or for six months from now, build for a year from now. And the things that aren't quite working, that are working 20% of the time, will start working 100% of the time. And I think that's really what made Claude Code a success: that we thought, you know, people are not going to be locked to their IDEs forever.
人们不会一直停留在自动补全,而是会在终端完成软件工程师需要做的一切。终端是实现这一点的绝佳载体,因为它可以存在于很多地方:可以驻留在你的本地机器上,也可以运行在GitHub Actions里。
People are not going to be, like, auto-completing. People will be doing everything that a software engineer needs to do. And a terminal is a great place to do that, because the terminal can live in lots of places. A terminal can live on your local machine. It can live in GitHub Actions.
它还可以运行在你集群中的远程机器上——这正是我们的支点所在,也是很多灵感的来源。所以我认为这就是Labs团队一直在思考的问题:我们是否足够"AGI-pilled"?
It can live on a remote machine in your cluster, and that's sort of the leverage point for us. And that was a lot of the inspiration. So I think that's what the Labs team tries to think about: are we AGI-pilled enough?
真是有趣的经历。顺便说个趣事,我加入Airbnb时Raf就是我的首任经理——那时我是工程师,他是我的管理者。结果证明这个安排很棒。
A fun place to be. By the way, fun fact: Raf was my first manager at Airbnb when I joined. I was an engineer, and he was my first manager. It all worked out well. Okay.
进入激动人心的快问快答前最后一个问题(首次提问):如果可以向未来AGI提一个必获真相的问题,你会问什么?
Final question before the very exciting lightning round. I've never asked this question before, so I'm curious what your answer would be. If you could ask a future AGI one single question and be guaranteed to get the right answer, what would you ask?
先说两个趣味答案:其一是阿西莫夫短篇《最后的问题》——主角穿越时代向超级智能追问'如何阻止宇宙热寂'(不剧透结局,但问题本身很有趣)。
I have two dumb answers first for fun. The first is there's this Asimov short story I love called The Last Question, where the protagonist is throughout the eras of history is trying to ask this super intelligence, how do we prevent the heat death of the universe? And I won't spoil the ending, but it's a fun question.
那么你会问它那个问题,是因为故事里给出的答案并不令人满意,还是……
So you would ask it that question because the one in the story wasn't satisfying, or...
好吧,我来揭晓答案。它不断说需要更多信息,需要更多算力。最终,在接近宇宙热寂时,它突然说‘要有光’,然后重启了宇宙。哇哦。这是第一个作弊答案。第二个作弊答案是:我该问什么问题才能让你再回答N个问题?
Okay, I'll give it away. So it keeps saying, need more information, need more compute. And then finally, as it's approaching the heat death of the universe, it says, let there be light, and then it starts the universe over. Oh, wow. So that's the first cheat answer. The second cheat answer is, what question can I ask you to get N more questions answered?
经典。第三个答案才是我真正想问的:我们如何确保人类在无限未来持续繁荣发展?这是我渴望知道的问题。如果能保证得到正确答案,这似乎非常值得提问。
Classic. And then the third answer, which is my real question is how do we ensure the continued flourishing of humanity into the indefinite future? That's the question I'd love to know. And if I can be guaranteed a correct answer, then seems very valuable to ask.
我在想如果你今天问Claude这个问题会怎样,以及这个答案在未来几年会如何变化。是啊,
I wonder what would happen if you ask Claude that today, and then how that answer changes over the next couple of years. Yeah,
也许我会试试。我会把它放进我们的深度研究系统里看看结果
maybe I'll try that. I'll put it into the deep research thing that we have and see what it comes out
我很期待你的发现。Ben,在进入激动人心的闪电回合前,你还有什么想补充或留给听众的最终建议吗?
with. I'm excited to see what you come up with. Ben, is there anything else you wanted to mention or leave listeners with maybe as a final nugget before we get to a very exciting lightning round?
我想说的是,这是个疯狂的时代。如果你觉得不疯狂,那你肯定是与世隔绝了——但也要开始习惯,因为现在就是最正常的状态。很快事情会变得怪异得多。如果你能为此做好心理准备,你会过得更好。
Yeah, I guess my push would be like, these are wild times. If they don't seem wild to you, then you must be living under a rock, but also get used to it because this is as normal as it's gonna be. It's gonna be much weirder very soon. And if you can sort of like mentally prepare yourself for that, think you'll be better off.
我一定要把这集标题定为‘很快会变得更怪异’。我100%相信这点。天啊,真不知道未来会怎样。
I need to make that the title of this episode. It's gonna get much weirder very soon. I 100% believe that. Oh my god. I don't know what's in store.
我超爱你身处这一切的中心。说到这里,我们进入激动人心的闪电回合。我有五个问题要问你,准备好了吗?
I love how you're the center of it all. With that, we reached our very exciting lightning round. I've got five questions for you. Are you ready?
准备好了,开始吧
Yeah. Let's do
那么,你发现自己最常向别人推荐的两三本书是什么?
it. What are two or three books that you find yourself recommending most to other people?
首先是我之前提过的 Nate Soares 的《Replacing Guilt》,非常喜欢。第二本是 Richard Rumelt 的《好战略,坏战略》,它以非常清晰的方式思考如何打造产品,是我读过最好的战略书籍之一——而'战略'这个词本身在很多层面上就很难把握。最后一本是 Brian Christian 的《对齐问题》。
The first one I mentioned before, Replacing Guilt by Nate Soares, love that one. The second one is Good Strategy Bad Strategy by Richard Rumelt, just thinking about, in a very clear way, how do you build product? It's one of the best strategy books I've read. And strategy is a hard word to even think about in many ways. And then the last one is The Alignment Problem by Brian Christian.
这本书真正深思熟虑地探讨了:我们试图解决的这个核心问题究竟是什么?相比《超级智能》,这个版本更与时俱进且易于理解,它揭示了问题的关键利害关系。
Just really thoughtfully goes through, like, what is this problem that we care about that we're trying to solve here? What are the stakes? In a version that's more updated and easier to read and digest than Superintelligence.
《好战略坏战略》就在我身后,我准备指给大家看。
I've got good strategy, bad strategy right behind me. I think I'm gonna point to it.
就在那儿,不错。
There it is. Nice.
顺便说下,Richard Rumeld曾做客过我的播客,想直接听他见解的听众可以去听。下一个问题:你最近有特别喜欢的电影或电视剧吗?
And I've had Richard Rumelt on the podcast, in case anyone wants to hear from him directly. Next question: do you have a favorite recent movie or TV show you've really enjoyed?
《万神殿》非常精彩,改编自刘宇昆(或特德·姜)的故事,应该是刘宇昆。它深刻探讨了意识上传后的道德伦理困境。《足球教练》表面讲足球,实则聚焦人际关系,既暖心又幽默。虽然不是剧集,但Kurzgesagt是我最爱的YouTube频道,它以超高制作水准解析各类科学与社会问题。
Pantheon was really good, based on a Ken Liu or Ted Chiang story. Ken Liu, I think. Super good, talks about, like, what does it mean if we have uploaded intelligences, and what are their moral and ethical exigencies? Ted Lasso, which is supposedly about soccer, but actually it's about, like, human relationships and how people get along, and it's just super heartwarming and funny. And then this isn't really a TV show, but Kurzgesagt is my favorite YouTube channel. It goes through random science and social problems and is just super, super well made.
非常爱看这个频道。
Love watching that.
哇,这个没听说过。说到这个,我觉得《足球教练》的特质应该被编入宪法AI——表现得像Ted Lasso那样。
Wow, haven't heard of that. As we were talking, I was thinking: Ted Lasso. I feel like that's what you need to put into constitutional AI. Act like Ted Lasso.
没错。
Yes.
善良、聪明。没错。勤奋。天啊。这就对了。
Kind, smart. Exactly. Hardworking. Oh my god. There we go.
看来我们在这里解决了对齐问题。让那些编剧尽快跟进。好的,还有两个问题。在工作或生活中,你有没有经常回顾的座右铭?
Think we've solved alignment problems right here. Get those writers on this ASAP. Okay, two more questions. Do you have a favorite life motto that you often come back to, in work or in life?
嗯,有个很蠢的是,你试过问Claude吗?这种情况越来越常见了,比如最近我问同事‘谁在负责X项目?’,他们说‘让我帮你Claude一下’,然后发了个链接给我。我就想‘哦对,谢谢’。
Well, really dumb one is, have you tried asking Claude? And this is getting more and more common where, you know, recently I asked a coworker like, hey, who's working on X? And they were like, let me Claude that for you. And then they like sent me the link to the thing afterwards. And I was like, oh yeah, thanks.
那个不错。但更哲学一点的可能是‘万事皆难’,提醒自己那些看似应该简单的事情,其实不简单也没关系。有时候你就是得硬着头皮做下去。
That's great. But maybe more of a philosophical one, I would say like everything is hard just to like remind ourselves that things that feel like they're supposed to be easy, it's okay to not be easy. And sometimes you just have to push through anyway.
同时要在动态中休息。是的。最后一个问题。不知道你愿不愿意公开这个,但我刚才浏览你的Medium文章,看到一篇叫《冠军级如厕五技巧》。太棒了。
And rest in motion while you're doing that. Yeah. Final question. I don't know if you want people to know this, but I was browsing through your Medium posts, and you have a post called Five Tips to Poop Like a Champion. I love it.
能分享一个冠军级如厕技巧吗?如果你还记得的话。
Can you share one tip to poop like a champion if you remember your tips?
当然记得。那其实是我Medium上最火的文章。
I of course do. It's actually my most popular Medium post.
没关系。太好了。看得出来。这标题起得妙。
It's okay. Great. I can see that. It's a great title.
我觉得最重要的建议可能是使用智能马桶盖。太神奇了。改变生活。超级好用。虽然有些人会有点害怕。
I think maybe my biggest tip would be use a bidet. It's amazing. It's life changing. It's so good. Some people are kind of freaked out by it.
这在日本等国家是标配。我觉得这更文明,再过十到二十年,人们会说‘你怎么能不用这个?’
It's the standard in countries like Japan. And I think it's just more civilized, and in ten or twenty years, people will be like, how could you not use that?
没错。而且可以设计得像日本马桶那样。这思路差不多对吧?好的,我很喜欢我们讨论的这个方向。
Yeah. And it could be like a Japanese toilet. That's along the same lines, right? Yeah. Okay, I love where we went with this.
本,这真是太棒了。非常感谢你参与这次对话,分享了这么多真知灼见。两个实用问题:如果听众想联系你或考虑加入Anthropic工作,在哪里可以找到你?另外听众如何能帮到你?
Ben, this was incredible. Thank you so much for doing this. Thank you so much for sharing so much real talk. Two valid questions, where can folks find you online if they wanna reach out, maybe go work at Anthropic? And how can listeners be useful to you?
可以通过 benjman.net 在网上找到我。我们官网有个很棒的招聘页面,我们还在努力让它更容易浏览和理解,但你完全可以让 Claude 帮你看,它能帮你找出哪些方向适合你。至于听众如何帮我?我认为最重要的是让自己真正重视AI安全('safety-pill yourself'),并把它传播给你身边的人。
You can find me online at benjman.net. And on our website, we have a great careers page that we're working on making a little bit easier to access and figure out, but, like, definitely point Claude at it, and it can help you figure out what could be interesting for you. And how can listeners be useful to me? I think: safety-pill yourself. That's the number one thing, and spread it to your network.
就像我说的,从事这项工作的人很少,但它极其重要。所以请认真思考并深入了解。
I think, like I said, there are very few people working on this and it's so important. So yeah, think hard about it and try to look at it.
感谢你传播这些理念,本。非常感谢你的到来。
Thanks for spreading the gospel, Ben. Thank you so much for being here.
非常感谢你,莱尼。
Thanks so much, Lenny.
大家再见。感谢收听。如果觉得有价值,可以在苹果播客、Spotify或你常用的播客平台订阅节目。也请考虑给我们评分或留言,这能帮助其他听众发现这个播客。所有往期节目及更多信息请访问 Lennyspodcast.com。
Bye, everyone. Thank you so much for listening. If you found this valuable, you can subscribe to the show on Apple Podcasts, Spotify, or your favorite podcast app. Also, please consider giving us a rating or leaving a review, as that really helps other listeners find the podcast. You can find all past episodes or learn more about the show at Lennyspodcast.com.
下期节目见。
See you in the next episode.