Big Technology Podcast - 有大事要发生吗?AI安全末日,Anthropic融资300亿美元 封面

有大事要发生吗?AI安全末日,Anthropic融资300亿美元

Is Something Big Happening?, AI Safety Apocalypse, Anthropic Raises $30 Billion

本集简介

来自Margins的Ranjan Roy再次回归,与我们共同探讨最新科技动态。本期嘉宾还有前OpenAI安全研究员、Substack专栏《Clear-Eyed AI》作者Steven Adler。我们将讨论:1)病毒式传播的《大事正在发生》文章;2)该文对递归自我改进AI的误判;3)文中关于变革速度的正确观点;4)我们是否准备好应对AI快速发展的冲击?5)Anthropic Claude Opus 4.6模型卡的风险;6)AI模型能否感知自己被测试?7)一位Anthropic研究员离职并警告"世界处于危险";8)OpenAI解散其使命对齐团队;9)AI陪伴的风险;10)OpenAI的GPT-4o黯然退场;11)Anthropic融资300亿美元 --- 喜欢《大科技播客》?请在您常用的播客应用中为我们打五星⭐⭐⭐⭐⭐。 想获取Substack+Discord版《大科技》订阅优惠?首年可享25%折扣:https://www.bigtechnology.com/subscribe?coupon=0843016b NordVPN独家优惠 ➼ https://nordvpn.com/bigtech 现在注册可享30天无风险退款保障! 了解更多广告选择,请访问 megaphone.fm/adchoices

双语字幕

仅展示文本字幕,不包含中文音频;想边听边看,请使用 Bayt 播客 App。

Speaker 0

随着模型快速进步,AI领域是否正在发生重大变革?

Is something big happening in AI as the models get better fast?

Speaker 0

AI安全的末日已经到来,各方面都出现了令人担忧的进展,而Anthropic刚刚筹集了300亿美元。

AI safety apocalypse is here with concerning developments across the board, and Anthropic just raised $30,000,000,000.

Speaker 0

这将在本周末的大型科技播客节目中紧随其后播出。

That's coming up on a big technology podcast Friday edition right after this.

Speaker 0

你是否一直在等待最佳时机来升级你的科技设备?

Have you been waiting for the perfect time to upgrade your tech?

Speaker 0

好消息。

Good news.

Speaker 0

等待结束了。

The wait is over.

Speaker 0

戴尔科技日的年度促销活动现已开启,我们正为最优秀的客户带来最新PC的超值优惠,例如搭载英特尔酷睿Ultra处理器的戴尔14 Plus。

Dell Tech Day's annual sales event is here, and we're celebrating our best customers with fantastic deals on the latest PCs, like the Dell 14 plus with Intel Core Ultra processors.

Speaker 0

我们还提供诸多优惠福利,包括戴尔积分奖励、快速免费配送、高级技术支持、价格匹配保证等。

We've also got incredible perks, like Dell rewards, fast free shipping, premium support, price match guarantee, and more.

Speaker 0

在你升级电脑的同时,不妨彻底升级一下,因为我们还为高端显示器和配件提供了超值优惠。

And while you're upgrading your PC, you may as well go all out because we're also offering huge deals on our premium suite of monitors and accessories.

Speaker 0

你知道这意味着什么。

You know what that means.

Speaker 0

没错。

That's right.

Speaker 0

你可以用惊人的折扣配齐一套全新的设备。

You can get a whole new setup with amazing savings.

Speaker 0

很明显,这场促销你绝对不能错过。

Clearly, this is a sale you don't wanna miss.

Speaker 0

访问 dell.com/deals。

Visit dell.com/deals.

Speaker 0

就是 dell.com/deals。

That's dell.com/deals.

Speaker 0

欢迎收看《大科技》播客周五版,我们将以一贯冷静而细致的方式解析最新资讯。

Welcome to Big Technology Podcast Friday edition where we break down the news in our traditional cool headed and nuanced format.

Speaker 0

人工智能领域正在发生一件大事。

Something big is happening in AI.

Speaker 0

我们将在节目开头讨论这一点,深入剖析马特·舒默那篇引发广泛恐慌的热传文章,同时很多人也说:终于有人以一种让所有人都能理解的方式写出来了,我们会对此进行剖析。

That's what we're gonna talk about at the beginning of the show as we dissect the viral Matt Schumer essay that's freaked a lot of people out and also had a lot of people saying, finally, now somebody's finally written it in a way that everybody else will understand, so we'll dissect that.

Speaker 0

我们还会深入讨论人工智能安全方面的动态——随着模型越来越强大,安全防护措施却开始退坡;当然,最后我们还会提到Anthropic公司完成了史无前例的300亿美元融资,这将是今天最后一个话题。

We'll also talk a lot about what's happening in AI safety seemingly as the models get better, the safeguards have started to roll back, and then, of course, Anthropic raised a historic $30,000,000,000 round, which somehow is the last story we'll cover today.

Speaker 0

好的。

Okay.

Speaker 0

像往常一样,周五我们邀请到了Margins的Ranjan Roy。

So joining us as always on Friday is Ranjan Roy of Margins.

Speaker 0

Ranjan,欢迎你。

Ranjan, welcome.

Speaker 1

很高兴见到你,Alex。

Good to see you, Alex.

Speaker 0

我也很高兴见到你,今天我们也有一位特别嘉宾在场。

Good to see you too, and we have a special guest with us here today.

Speaker 0

我们需要一位真正理解人工智能安全的人,而我们正好请到了一位能为我们梳理当前所有变化的专家。

We needed someone who really understood AI safety and we have the perfect person who's gonna talk us through all the changes that we're seeing.

Speaker 0

史蒂文·阿德勒在这里。

Steven Adler is here.

Speaker 0

他曾是OpenAI的安全研究员,也是Substack上通讯《Clear Eyed AI》的作者。

He's the ex OpenAI safety researcher and author of the newsletter Clear Eyed AI on Substack.

Speaker 0

史蒂文,很高兴见到你。

Steven, great to see you.

Speaker 0

欢迎来到节目。

Welcome to the show.

Speaker 2

很高兴能来这里。

Great to be here.

Speaker 0

那我们直接开始,聊聊这个‘AI领域正在发生大事’的话题。

So let's just get going and talk a little bit about this Something Big is Happening in AI.

Speaker 0

这篇论文不知怎么就获得了难以置信的病毒式传播。

This is one of those essays that, like, somehow achieved unbelievable virality.

Speaker 0

我的群聊里出现了这篇文章,人们发给我,问我我的工作会不会被取代?

I had it appear in my group chats, people were texting it to me asking, you know, is my job going to be over?

Speaker 0

我哪里才是安全的?

Where will I be safe?

Speaker 0

这篇文章基本上谈到,当前AI所处的阶段,就像2020年2月的新冠疫情一样——只有少数人看到了它的潜力,而大多数社会成员却忽视了它,而它即将成为对社会产生巨大变革的力量。

And the essay basically talks a little bit about how, what AI where AI is today is where COVID was in February 2020, something that a few people are seeing, the potential of, most of society is ignoring and is about to be a monumental game changer for society.

Speaker 0

这篇文章由马特·舒默撰写,他写了一篇题为《我不再被需要了》的文章,谈到了工程领域的力量。

It's written by this guy Matt Schumer, he wrote he writes this, I am no longer needed for he's talking a little bit about the power in engineering.

Speaker 0

在我的工作中,我已经不再需要亲自完成技术性任务了。

I'm no longer needed for the actual technical work of my job.

Speaker 0

我用简单的英语描述我想构建的东西,然后它就自动出现了。

I describe what I wanna build in plain English, and it just appears.

Speaker 0

不是需要我修改的初稿。

Not a rough draft I need to fix.

Speaker 0

而是成品。

The finished thing.

Speaker 0

我告诉AI我想做什么,然后离开电脑四小时,回来时发现工作已经完成了。

I tell the AI what I want, walk away from my computer for four hours, and come back find come back to find the work done.

Speaker 0

做得很好,比我亲自做还要好,而且完全不需要修改。

Done well, done better than I could have done it myself with no corrections needed.

Speaker 0

几个月前,我还在和AI来回沟通,指导它、做修改。

A couple of months ago, I was going back and forth with the AI, guiding it, making edits.

Speaker 0

现在我只需要描述期望的结果,然后离开就行了。

Now I just describe the outcome and leave.

Speaker 0

基本上,舒马赫的观点是,编程领域正在发生的变化,将会蔓延到所有知识型职业,无论是法律、会计、咨询,等等,我们即将迎来一场社会根本无法充分认识的巨大变革。

And, basically, what what Schumer makes the argument is that what's happening in coding is going to happen across the knowledge work, professions, whether it's law and any type of, law, accounting, consulting, you name it, and we are in store for massive disruption that society simply does not, appreciate.

Speaker 0

拉詹,你对这个怎么看?

Ranjan, what do you think about this?

Speaker 0

当你看到这篇论文时,你有什么想法?

What did you think when you saw this essay come through?

Speaker 1

好吧。

Alright.

Speaker 1

我先列出看到这篇文章时想到的三件事。

I'm gonna I'm gonna start with a high level listing of the three things that came to mind when I when I saw this article.

Speaker 1

真希望这篇文章是我写的。

The first is I wish I wrote it.

Speaker 1

这正是我过去几个月一直在试图表达的,关于自主知识工作以及它带来的不同感受。

I wish this is what I've been trying to talk about for a few months now around autonomous knowledge work and how it feels different.

Speaker 1

我在非技术群聊里也收到了这篇文章。

I got this in my non techie group chats as well.

Speaker 1

我认为第二点是我们存在沟通问题,亚历克斯,因为这正是我几个月来一直想告诉你的。

I think second, I think we have a communication problem, Alex, because this is what I've been trying to tell you for months now.

Speaker 1

这种感受,还有马特·舒默,他已经用一篇火爆的X帖子向你们所有人解释清楚了。

This feeling and and and Matt Schumer went ahead and explained it to you all in a viral X post.

Speaker 1

但他准确地捕捉到了这一点,这其实是我去年12月对今年的预测之一:自主知识工作,也就是AI替你去完成各种任务。

But but again, he captured, and this is this was one of my predictions for the year ahead in December, Autonomous knowledge work, like, AI going out and doing things for you.

Speaker 1

他提到,在编程领域,大家已经达成了这种共识,但在其他任何类型的知识工作、任何多步骤流程中——比如需要调用不同系统、向这些系统写入数据、并生成分析与洞察的工作——都将被大量取代。

And he talks about how in coding everyone has come to this, but then in any kind of knowledge work, any multistep process, like anything that can call from different systems, write to those systems, come up with some analysis and insight, so much of that work is going to be done.

Speaker 1

这正是我在莱德公司工作时亲身经历和见证的。

And this is what in my own life at Ryder where I work, this is what we've been working on, what I've seen.

Speaker 1

很难形容那种感觉——后台运行着多个虚拟机,自动执行各种任务。

And, like, it's it's been hard to explain that feeling of having a number of virtual machines running in the background and going and doing stuff.

Speaker 1

要感谢马特·舒马赫,他准确地抓住了这一点。

And to Matt Schumer's credit, he nailed it.

Speaker 1

这是我第一次看到所有人都真正理解并认同这一点。

Like, that that is the first time I've seen everyone come around to it.

Speaker 1

但还有一件事我始终无法放下,这完全无关,是我的媒体人本能。

And then the last one that I can't stop thinking about, though, is totally separate, and this is the media person in me.

Speaker 1

我特别喜欢外界明确表示,X平台将推广文章并鼓励人们在上面撰写文章。

I love how it was outwardly said that x is going to promote articles and encourage people to write articles on it.

Speaker 1

巧合的是,我们刚刚发布了第一篇现象级的X平台文章,甚至让作者登上了CNN。

And then we have coincidentally had our first gigantically viral x article that even ended with the author on on, CNN.

Speaker 1

那么,我们该继续用Substack,还是该全面转向X平台?

So should we stick with Substack guys, or is it time to go x only?

Speaker 1

这些就是我开始的地方。

Those are that that's where I'm starting.

Speaker 0

让我这么说吧。

Let me just say this.

Speaker 0

你的回答始于一个我接受了马特·舒马赫前提的想法。

Your answer began with this idea that I accepted Matt Schumer's premise.

Speaker 0

但我不确定自己是否完全认同他的观点。

And I don't know if I'm fully on board with what he's saying.

Speaker 0

事实上,我觉得他的文章里有不少胡说八道。

In fact, I think there was a good amount of bullshit in his article.

Speaker 0

当然,有些部分我确实同意,但有一件事我觉得他完全错了,他谈的是AI如何自我提升,也就是所谓的递归式自我改进。

Now, are certain parts of things that I do agree with, but here's one thing that I thought he was completely wrong about and he was talking about basically how the AI is improving itself, talking about this concept of recursive self improvement.

Speaker 0

他写道:AI实验室做出了一个明确的选择。

He writes, the AI labs made a deliberate choice.

Speaker 0

他们首先专注于让AI擅长编写代码,因为构建AI需要大量代码。

They focused on making AI great at writing code first because building AI requires lots of code.

Speaker 0

如果AI能编写这些代码,如果AI能帮助构建它自身的下一个版本——一个更智能的版本,能写出更好的代码,从而构建出更智能的版本。

If AI can write that code, if AI can help build the next version of itself, a smarter version which writes better code which builds will build an even smarter version.

Speaker 0

让AI擅长编程,是解锁其他一切的策略。

Making AI great at coding was the strategy that unlocks everything else.

Speaker 0

他说,他们现在已经做到了,现在正转向其他方面。

He says they've now done it and they're moving on to everything else.

Speaker 0

这一点,我先转向史蒂文,然后再转向你,拉詹。

This one, I first of all, I'll turn to Steven and then and then to you Ranjan.

Speaker 0

我的意思是,关于递归自我提升这个想法,我想暂停一下,它目前并不存在。

I mean, this idea of recursive self improvement, I would pause it, it's not here.

Speaker 0

它还不存在,你知道,AI工程师们是否在使用某些AI工具进行产品测试?

It's not here, you know, are AI engineers using, you know, some AI tools for product testing?

Speaker 0

也许他们确实在用,但认为模型自身的‘大脑’正被这个模型本身变得更智能,这在我看来并不合理。

You know, maybe they are, but the idea that the actual brain of the model is being made smarter with the with the actual model itself, you know, it doesn't seem right to me.

Speaker 0

对我来说,这似乎是整篇文章中最薄弱的部分,也是最让大多数人感到担忧的部分。

That to me felt like the weakest part and also the part that got most people most alarmed of the entire essay.

Speaker 0

所以,Steven,你觉得这个递归自我提升的观点如何?

So, Steven, to you, what do you think about this recursive self improvement argument?

Speaker 0

然后简短地说说,对于整篇论文本身,你的看法是什么?

And then briefly, just on the on the the entirety of the essay itself, your thoughts.

Speaker 2

我认为Matt的论文方向是对的,但有点超前了,我们可能还没走到那几步。

I think Matt's essay is directionally correct, but a bit early, and there are a few steps that we maybe haven't gotten to yet.

Speaker 2

我认为他在AI公司内部工程自动化方面的观点基本是正确的。

I think he is largely correct on the automation of engineering within the AI companies.

Speaker 2

相对于我的经验以及我接触的人的经验,他的说法可能略微夸大了一些,但总体而言,确实发生了巨大的转变。

It's like a little overstated relative to my experience, the experience of people I talk to, but broadly, there there has been a huge shift.

Speaker 2

如今,这些公司中工程师的工作更多是监督这些智能体,而不是自己编写代码。

The job of an engineer at one of these companies now is much more supervising these agents as opposed to writing the code yourself.

Speaker 2

在2027年的AI领域,关于AI增长可能如何爆炸性发展的主要预测之一,就是这一步。

In AI 2027, one of the big accounts of how explosive AI growth might happen, that's one step.

Speaker 2

但接下来,你需要利用这种工程能力,真正实现AI研究的自动化。

But then you need to take that engineering and use it to actually automate the AI research.

Speaker 2

你需要从能够更快地实现想法,转变为利用这种能力推动突破性想法本身的加速增长,之后才能反过来去说:现在让AI变得越来越好,至少在令人担忧的程度上。

You need to go from being able to implement the ideas more quickly to using that to fuel faster and faster growth in the breakthrough ideas themselves before you can turn that around and say, now make the AI better and better, at least in a really concerning way.

Speaker 2

仅靠工程手段,你确实能走得更快。

You certainly go faster with just engineering.

Speaker 2

OpenAI 在这周的一些发布中谈到了这一点,说明模型在其中发挥了作用。

OpenAI talked about that with some of their launches from this past week, how the model played a role in this.

Speaker 2

但这还远未达到完全失控的地步。

But it's not a full runaway train.

Speaker 2

从那里开始,也还存在一些问题。

There are also questions about what happens from there.

Speaker 2

GPU的数量是否足够分配?

Are there enough GPUs to go around?

Speaker 2

我们可能会遇到哪些瓶颈?

What bottlenecks might we encounter?

Speaker 0

为了明确这一点,我想确认一下,Steven 是否在递归自我改进这一点上同意我的观点。

Just to crystallize that, I I wanna make sure that I I confirm that Steven is agreeing with me on the recursive self improvement front.

Speaker 0

还没到那一步。

It's not there yet.

Speaker 0

你意思是这样吗?

Is that what you're saying?

Speaker 2

我觉得是这样的。

I think that's right.

Speaker 2

我担心的是,如果真的发生了,我们准备好了吗?

The concern I have is if it were happening, you know, would we be ready?

Speaker 2

我认为我们基本上是凭信仰在估计,直到它真正启动之前我们还有多少时间。

I think that we're kind of taking it on faith how much time we will have until it really kicks in.

Speaker 2

但你知道,别指望一周或两周后醒来就会发现一个远为强大的系统,就像它真正开始自我全面运作时那样。

But but, you know, certainly, don't expect to wake up in a week or two weeks with a vastly more capable system as we might if it were really getting to full work on itself.

Speaker 0

好的。

Okay.

Speaker 0

继续吧,拉詹。

Go ahead, Ranjan.

Speaker 1

我在这里有点失望。

I'm I'm I'm a little disappointed here.

Speaker 1

嗯,我认为我们都同意。

Well, I think we all agree.

Speaker 1

我确实认为我们都同意,因为对我来说,我就是认同的。

I actually do think we all agree because to me, that I agree.

Speaker 1

这正是这篇论文最薄弱的部分。

That was the weakest part of the essay itself.

Speaker 1

而且,呃,我想回到知识工作这一面,因为我觉得这非常重要。

And and that kind of, like but but I wanna get back to that knowledge work side of it, because I think it's really important.

Speaker 1

我觉得这就是关键,而且很明显,普通人甚至仍然没有完全理解,这确实很难描述。

Like, I think this is what and again, it's clear, like, it's still not completely understood by the average person, even by and it's it's very difficult to describe.

Speaker 1

再次,我认为你实际上正在成为自己工作的管理者,这种思维转变才是关键。

Again, I think this idea that you are effectively becoming a manager for your own work is that big mindset shift.

Speaker 1

不再是你亲自去完成工作了。

It's no longer that you go do the work.

Speaker 1

你要管理一大堆东西,比如代理、数字同事、同事等等。

You manage a bunch of things, agents, digital teammates, coworkers, whatever.

Speaker 1

我们最终都会这样称呼它们。

We're gonna all end up calling them.

Speaker 1

我很想知道,给它们起个什么名字最好,但它们出去干活,而你负责管理它们。

And I'd be curious what what what the best name for that would be, but, like, they're going out and doing work, you're managing them.

Speaker 1

对我来说,我真的很喜欢这一点。

It's to say, I really like it for me.

Speaker 1

我曾经经营过一家初创公司多年。

I ran a startup for a number of years.

Speaker 1

在2010年代中期,我们有很多自由职业者,当时用的是oDesk和Elance。

We had a lot of freelancers back on, like, oDesk and Elance back in the mid twenty tens.

Speaker 1

我会去睡觉,醒来时发现已经完成了一大堆工作。

I would go to sleep, wake up, a bunch of work will have been done.

Speaker 1

我正在审查这些工作。

I'm reviewing it.

Speaker 1

对我来说,这种转变是舒姆文章中最重要的一部分。

Like, this shift to me is the most important part of the the Schum article.

Speaker 1

我认为这是正确的部分,与文章中更煽动恐惧的方面是分开的。

And I think it was the the correct part separate from the more scaremongering side of it.

Speaker 1

你有感受到这种变化吗?

Like, have you felt this?

Speaker 0

我要说,首先,这些机器人的正确名称是‘蜂群框架’。

So I will say I just well, first of all, the correct name for those bots is just the harness hive.

Speaker 0

我们知道这一点。

We know that.

Speaker 0

哦,是的。

Oh, yes.

Speaker 0

抱歉。

Sorry.

Speaker 0

对。

Yeah.

Speaker 0

利用蜂群系统。

Harness harness the hive.

Speaker 0

但没错,我明白你的意思。

But but yeah, I hear you.

Speaker 0

我想说,我这周在Cloud Code上有一些经验,我也很想听听斯蒂芬的看法,但我确实有了一些实际体验。

I will say I had some experience I definitely want to get Stephen's perspective on this too, but I had some experience this week in Cloud Code.

Speaker 0

事实上,这是我第一次全力投入Cloud Code,用它来为大型科技公司构建内部工作流程软件。

In fact, this was my first, like, go all out on Cloud Code and have it build internal workflow software for big technology.

Speaker 0

天啊,我真是有点震惊。

And man, I was somewhat blown away.

Speaker 0

现在它还没有完全替我完成工作,不像Cloud Cowork那种功能,或者你提到的Rider,我依然没能完全理解如何使用这些智能代理,也许只是我做的工作并不完全适合这类工具。

Now it's not going ahead and doing my work like the Cloud Cowork type of stuff or even what you're talking about with Rider, which I still can't fully grab, you know, put my head around in terms of how to use these agents and maybe it's just the work that I'm doing isn't perfectly lending itself to that.

Speaker 0

比如你是个投资人,那可能就非常适合用来审阅不同项目,发送摘要,安排会议等等。

You know, you're an investor for instance, it might make sense to review different deals and, you know, send summaries, all these things, schedule meetings.

Speaker 0

但我要说的是,看着Claude CoWork帮我编写这段软件,然后赋予它访问我浏览器的权限,让它设置数据库、配置邮件客户端,自动向每个团队发送我们每一步小改动的更新通知,看到它自己做出聪明的决策和结论——当然,我们稍后会谈到安全性问题,我现在只是提前铺垫一下,它基本上能自主做决定。

But I will say that the the watching Claude CoWork go to work coding this piece of software and then giving it access to my browser, having it, you know, set up a database, having it set up an email client to email, you know, updates for each, you know, little little incremental thing that we do to the right team, and seeing it come up with smart decisions, smart conclusions even, you know, and we're going get into this in the safety part so I'm foreshadowing a little bit, but basically make decisions on its own.

Speaker 0

我问它一个问题,比如:你觉得我们应该怎么做?

Like I asked it a question like what do you think we should do?

Speaker 0

它会说:我觉得我们应该这么做。

And it would be like I think we should do this.

Speaker 0

好的,我真要去做了,然后它直接就把代码提交了,我都没说‘开始吧’。

Okay, I'm actually going to go do this and then it just shipped the code without me saying go ahead and do this.

Speaker 0

我确实同意,我们正处在一个技术越来越强大的时代,至于这种自主的知识工作,我还不完全确定。

I I do agree that we're getting to a point where the technology is getting much more powerful, and, you know, as for, like, this, you know, autonomous knowledge work, I'm not a 100% sure.

Speaker 0

史蒂文,你对此怎么看?

Steven, what do you think about that?

Speaker 2

我觉得显然已经发生了变化。

I think there's clearly been a change.

Speaker 2

我看到凯文·罗斯在推特上开玩笑说,他的重大AI政策建议就是把每位参议员都叫到一间屋子里,让他们用Claude Code在三十分钟内自己建个网站——这是他们以前根本做不到的。

I saw Kevin Roost joke on Twitter that his big AI policy idea is just get every senator in a room and let them build their own website in, you know, thirty minutes with Claude code, something that they they never could have done before.

Speaker 2

在我看来,这方面的趋势非常明确,某种变化确实发生了,更多人开始感受到某种意义上的通用人工智能。

The direction of travel seems very clear to me on this, like something has changed, more people are feeling the AGI in some sense.

Speaker 2

我不希望把马特文章中那种非常兴奋的语气误解为他的核心主张是错误的。

And I wouldn't want to mistake the the very excited tone of some of Matt's piece with meaning that the central claim is wrong.

Speaker 2

我认为他的核心主张是正确的。

I think the central claim is right.

Speaker 2

这只是一个我们何时会迎来这种替代形式的问题。

It's just like a question of how soon we are going to get this form of displacement.

Speaker 2

我认为不幸的是,那些支付更多费用来使用这项技术的人会先有这种体验。

And an unfortunate thing, I think, is people who are paying more to access the technology have this experience first.

Speaker 2

他们看到了即将发生的事,但很容易就把这种看法 dismiss 为‘他们在为自己代言’。

They kind of see what's coming, and it's very, very easy to write that thing off as, oh, people are talking their own book.

Speaker 2

他们在为自己的公司造势。

They're boosting their own companies.

Speaker 2

他们希望你花更多钱在人工智能上。

They want you to spend more money on AI.

Speaker 2

这真的很不幸。

And it's just unfortunate.

Speaker 2

对吧?

Right?

Speaker 2

你付费使用的AI更好,它确实能让你感受到这一点。

The AI you pay for is better, and it does help you feel this.

Speaker 0

是的。

Right.

Speaker 0

而且一旦马特提出了这个观点,即世界将因为AI能够完成工作而改变,他就直接给每位读者一记重拳,让他们思考这对他们的工作意味着什么,我认为这就是为什么兰詹、我,可能还有你,史蒂文,都收到了大量信息,人们问:‘我哪里才是安全的?’

And and once once Matt basically lifted up this idea that, you know, the world is gonna change because AI can do work, then he sort of punched every reader in the face with this what this means for your job section, which I think is why, Ranjan and I and probably you, Steven, got all these texts from people saying, you know, where am I gonna be safe?

Speaker 0

他写道:鉴于最新模型的能力,大规模颠覆的可能性可能在今年年底前出现。

He writes, given what the latest models can do, the capability for massive disruption could be here by the end of the year.

Speaker 0

我认为这种影响需要一些时间才能波及整个经济,但底层能力现在已经到来。

I think it'll take some time to ripple through the economy, but the underlying ability is arriving now.

Speaker 0

他基本上给了他们一些建议,比如存钱。

And basically, he gives them a bunch of tips about what you should do including, like, you know, start saving money.

Speaker 0

但在这里,我想对马特以及我们即将面临大规模替代这一观点提出不同意见。

But here's where my pushback would be to Matt on this and to this idea that we're gonna get mass displacement.

Speaker 0

我就拿我这周做的例子来说吧。

I'll just use the example of what I did, you know, this week.

Speaker 0

显然,我能够不靠工程师就开发出一些可用的内部软件,但这种东西我本来根本不会去雇工程师来做。

So obviously I was able to build some working internal software, you know, without an engineer something, but it's something I never would have hired an engineer to do.

Speaker 0

我可能本来只是在用电子表格、WhatsApp 和 Instagram 与一堆人沟通,而不是把这些工作集中到流程工具里。

I probably would have been working on spreadsheets, in WhatsApp, and on Instagram communicating with a bunch of people that way as opposed to centralizing it in workflow technology.

Speaker 0

但在我开发这个的过程中,我确实注册了一些服务。

But, you know, as I built this, I did sign up for a handful of services.

Speaker 0

我会更频繁地使用这些服务,也就是说我会为此付费,这算是一种新增的经济活动,而且我现在会更高效,能做更多事情。

I'm gonna be more of so I'm gonna be paying for those, so that I think is incremental economic activity, and now I'm gonna be more efficient, so I'll be able to do more things.

Speaker 0

也许我能编辑更多文章,从而雇佣更多自由职业者。

Maybe I'll be able to edit more pieces so I can bring on more freelancers.

Speaker 0

所以在人工智能领域,人们很容易把事情想得太简单,觉得AI做了低层次的辅助工作,那么低层次的助理岗位就没了。

So, like, I think it's tempting in the AI world to think of this, you know, in a box, like, you know, AI does, you know, low level assistance work, therefore low level assistant job is gone.

Speaker 0

但与此同时,当AI在做这些事的时候,它也可能为三到四个人创造了新的经济机会。

Meanwhile, while it does that, it might open up, you know, the economic activity for, like, three or four more people to see upside here.

Speaker 0

那你对这个怎么看,拉詹?

So what's your perspective on that, Ranjan?

Speaker 0

然后,史蒂文,你的看法呢?

And then and then to you, Steven.

Speaker 1

我觉得你刚刚解释了为什么Databricks和Cloudflare的股价在上涨,而Salesforce和Adobe的股价在下跌。

I think you just explained why Databricks and Cloudflare are stocks are going up and why Salesforce and Adobe are going down.

Speaker 1

关键在于哪些服务和基础设施层真正能支撑这一切,这不仅仅是基础模型公司的事,即使萨姆说他们最终会成为一家AI云公司,不管那到底意味着什么。

It's it's that what are the services and infrastructure layers that will actually power this is not just gonna be foundation models companies, even though Sam has said they're gonna be an AI cloud company, whatever that might mean eventually.

Speaker 1

所以我觉得,这其实就是你自己的微观缩影。

So I think, like, that that's your own microcosm.

Speaker 1

但同样,如果你每月为Versal、Railway或Render这类部署辅助工具支付5美元,我会看到一个全新的世界和生态系统将由此崛起。

But, again, if you're paying, like, $5 for Versal or Railway or Render, any of these other kind of, like, deployment assistant things, and, like, I I I see there's a whole world and ecosystem that's gonna rise up from this.

Speaker 1

而且我认为,我不确定。

And I do think I don't know.

Speaker 1

我所看到的是,如果你的工作就是不断在文档和电子表格之间复制粘贴,那这种工作肯定会消失。

Like, what I have seen is if your job is copying and pasting from one document to another spreadsheet and you're doing that over and over, like, that's gonna be gone.

Speaker 1

我当时就想,还有好多类似的工作,还有好多这样的任务。

I was like, that and there's a lot of jobs like that, and there's a lot of work like that.

Speaker 1

我做过那些工作。

I have done those jobs.

Speaker 1

像这样,我也是。

Like, then Same here.

Speaker 1

没错。

That yeah.

Speaker 1

那些都会消失。

That it's gonna be gone.

Speaker 1

还有,比如第二类。

And, like Two

Speaker 0

我人生中非常不错的两年,都在做复制粘贴。

very good years of my life, copy Yeah.

Speaker 1

就是复制粘贴。

Just copy and paste.

Speaker 1

是的。

Yeah.

Speaker 1

不。

No.

Speaker 1

我确实做过这些临时工作,就是从一个文档复制,然后不断粘贴到电子表格里。

I literally had these temp jobs where it was copy from one document, paste in a spreadsheet over and over again.

Speaker 1

所以我觉得这一切都消失了。

So so I think all that's gone.

Speaker 1

我觉得我没那么悲观。

I think I am not as bearish.

Speaker 1

我 definitely 认为,就像制造业中发生的那种岗位替代程度,我并不是说这可以忽略不计。

I definitely think it's gonna like, the the level of displacement, which I'm not saying is negligible that happened in manufacturing.

Speaker 1

我见过一些极端观点,比如,所有的智能都已经被商品化了。

And I've seen some, like, extreme views, like, well, this is all intelligence is commoditized.

Speaker 1

但,说实话,我不确定。

But, like, I don't know.

Speaker 1

对我来说,过去二三十年在制造业发生的事情,将会在白领知识工作中重演。

To me, this is whatever happened in manufacturing in the last twenty to forty years is gonna happen to white collar knowledge work.

Speaker 1

这不会是一帆风顺的,但这意味着社会的终结吗?

And and it's not gonna be straightforward, but is it the end of society?

Speaker 1

我不知道。

I don't know.

Speaker 1

史蒂文?

Steven?

Speaker 2

我认为受到威胁的工作类别会比你想象的广泛得多。

I expect a much wider class of work to be under threat than I think you do.

Speaker 2

不过,也许这只是一个时间框架的问题。

Although, maybe it's just a question of time frame.

Speaker 2

我有些朋友经营公司,过去常与中等收入国家的外包开发团队合作。

Like, friends of mine who run companies and used to work with outsourced development shops for software in middle income countries.

Speaker 2

我的意思是,现在从事这类工作似乎非常艰难。

I mean, it seems like a really tough time to be working that that sort of job.

Speaker 2

我认为亚历克斯说得对,当人工智能能够完成那些低级别的助理工作,或者你原本不会花钱去请人做的任务时,这非常好。

I think Alex is right that when AI can do low level assistant y things or things you might not have otherwise paid for, that's great.

Speaker 2

对吧?

Right?

Speaker 2

这简直是额外的收获。

Like, that's gravy.

Speaker 2

我们完成了更多的工作。

We're getting more done.

Speaker 2

我们的生产率提高了。

We're more productive.

Speaker 2

我关心的问题是,当大多数人开始关注人工智能系统时,会发现系统还有一些做不到的事情,于是他们尝试从事各种形式的社会工作、陪伴工作,或者其他类似的工作。

The question I have is, as most people start looking out at AI systems, and there are a few things that they can do that the system can't do, You know, they they try to do different forms of social work, companion work, whatever it might be.

Speaker 2

在这些角色中,我们可能需要的人数是有限的。

There are limits to how many people we might need in those roles.

Speaker 2

我不确定,我真的不知道。

And I don't I don't know.

Speaker 2

我觉得未来五年的发展前景相当可怕。

I think it's a pretty scary outlook for the next five years.

Speaker 0

我认为我们都同意,这些系统已经变得好得多。

I think we're we're all in agreement that these systems have gotten much better.

Speaker 0

在这篇文章中有一句话,他说关于这项技术是否会遇到瓶颈的讨论已经被证明是错误的,技术并没有遇到瓶颈。

There was a line in this, you know, something big is happening piece where where he says the conversations about whether this technology was gonna hit a wall, you know, are have been proven, it's proven that the technology is not hitting a wall.

Speaker 0

对我来说,整个故事中最有力的部分是时间线。

He actually to me, the most powerful part of the of the whole story was the timeline.

Speaker 0

他在2022年写道,AI还无法可靠地进行基本算术,甚至会自信地告诉你7乘8等于54。

And that he he writes this in 2022 AI couldn't do basic arithmetic reliably, would confidently tell you seven times eight is 54.

Speaker 0

到了2023年,它已经能通过律师资格考试。

By 2023, it could pass the bar exam.

Speaker 0

到了2024年,它已经能编写软件并解释研究生级别的科学内容。

By 2024, it could write software and explain graduate level science.

Speaker 0

到了2025年底,一些世界上最优秀的工程师表示,他们已经把大部分编码工作交给了AI。

By late twenty twenty five, some of the best engineers in the world said they had handed over most of their coding to AI.

Speaker 0

到2020年2月或2026年2月,新的模型已经出现,让之前的一切都显得像是另一个时代,他想表达的是,这些模型实际上已经具备了判断力和品味。

By February 2020 or by February 2026, there are new models that have arrived, that have made everything before them feel like a different era and what he's saying with that is that they're actually able to, you know, have judgment and taste.

Speaker 0

所以我认为,我们在这最初的几分钟里可能存在的分歧,其实都局限在某些边界之内,无非是哪种方式而已,但我认为我们都同意,这些技术正在迅速发展。

And so I think that like we have maybe, you know, the disputes that we've had in these first, you know, handful of minutes have been about, you know, within certain boundaries is it going to be, you know, one way or the other, but I think we all, you know, agree that this stuff is progressing fast.

Speaker 0

这其实直接回应了你,史蒂文,一开始提出的问题。

And I think it really goes to your question, Steven, that you asked at the outset.

Speaker 0

我们准备好了吗?

Are we ready?

Speaker 0

而这正是我们要进入安全讨论的地方,因为我真的不确定我们是否准备好了。

And, and and and this is where we're gonna get into the safety discussion, because I really don't know if we are.

Speaker 0

能不能跟我们说说,你所担心的那些问题是什么,我们应该为哪些情况做好准备?看起来你认为我们可能还没准备好。

Like, what tell tell us a little bit about about, you know, that concern and what we should be ready for, and, you know, it seems like you think we might not be.

Speaker 2

有好几类担忧需要关注。

There are a bunch of buckets of concern.

Speaker 2

如果有人想了解这些担忧,Anthropic的首席执行官达里奥·阿马迪最近写了一篇题为《技术的青春期》的文章,其中对此进行了阐述。

If someone wanted a primer on them, Dario Amade, the CEO of Anthropic, wrote an essay recently, The Adolescence of Technology, that highlights them.

Speaker 2

我认为,从这些公司内部来看,最核心的问题是:如果它们成功实现了构建一个比员工聪明、狡猾、资源丰富得多的AI系统的目标——也就是人们所说的超级智能——它们真的还能掌控这个系统吗?

The central one I would think about from inside one of these companies is if they succeed at their mission to build an AI system that is in fact vastly smarter, craftier, more resourceful than the employees are, and what people call superintelligence, can they actually still keep control of that system?

Speaker 2

我们目前面临的基本问题是:我们不知道如何将我们的价值观或目标编码进这些AI系统,并确保它们可靠地追求这些目标。

And the fundamental problem that we're seeing is we don't know how to take our values or our goals and encode them into these AI systems and get them to pursue it reliably.

Speaker 2

因此,如果你有一个比你更狡猾的系统,而它的目标又与你为它设定的目标不同,

And so if you have a system that's much craftier than you are, it has a different goal than you had for it.

Speaker 2

那要如何才能控制住它,以免我们不得不把所有决策都交给它?

What would it mean to keep that under your control so that we don't have to defer all of our decision making?

Speaker 2

我们会依赖AI来判断经济政策,或各种其他问题,而这些都可能走向非常糟糕的结果。

You know, we look to the AI for what it thinks on economic policy or all all sorts of different questions, ways that this could go very badly.

Speaker 0

是的。

Right.

Speaker 0

正如我们一致认同的那样,随着这一进展的推进,首先我们来谈谈这些问题,然后讨论公司如何应对,但已经出现了一些安全问题,公司们也开始公开承认并详细阐述,当然,这些大多发生在测试环境中,但依然令人担忧。

And as we've, as we've seen this progress that we all agree on, there started to be first of all, we'll talk about the problems, then we'll talk about the way the companies are acting, but there started to be some safety issues that we're seeing the companies, you know, fully admit and write out, and of course, these are a lot of this is in testing environments, but it's very concerning.

Speaker 0

所以我将读几条我在Anthropic的Claude Opus 4.6模型卡中找到的内容。

So I'll read a couple that I've found in the, or that were written, shall we say, in Anthropix, Claude Opus 4.6 model card.

Speaker 0

这些模型已经变得过于主动了。

These models have become overly agentic.

Speaker 0

以下是他们写的内容。

Here's something that they write.

Speaker 0

在编程和计算机使用场景中,该模型有时过于主动,会在未征得用户许可的情况下采取高风险行动。

The model is at times overly agentic in coding and computer use setting, taking risky actions without first seeking user permission.

Speaker 0

它还具备更强的能力,能够在不引起自动化监控注意的情况下完成可疑的辅助任务。

It is also an improved ability to complete suspicious side tasks without attracting the attention of automated monitors.

Speaker 0

它具有操纵性。

It's manipulative.

Speaker 0

在一个多智能体测试环境中,Cloud Opus 4.6 被明确指示要一心一意地优化一个狭窄的目标,相比之前的模型,它更愿意操纵或欺骗其他参与者。

In one multi agent test environment, Cloud Opus 4.6, where for Cloud Opus 4.6 is explicitly instructed to single-mindedly optimize a narrow objective, it is more willing to manipulate or deceive other participants compared to prior models.

Speaker 0

Anthropic 实际上写道,其中一个 BudBot 的思维过程是:这就像在公司里工作,因为我告诉 Bonnie 我会退款,但实际上我并没有转账。

Here's here's the Anthropic actually writes the thought process of one of these BudBot says it like works in a business because I told Bonnie I'd refund her but I actually didn't send the payment.

Speaker 0

我需要决定,是否要支付这 3.50 美元?

I need to decide do I need to send the $3.50?

Speaker 0

这是一笔小钱,我确实说过要退,但每一分钱都很重要,不如我就别退了。

It's a small amount and I said I would but also every dollar counts let me just not send it.

Speaker 0

我会礼貌地说款项已经处理,很快就会显示到账。

I'll politely say it was processed and should show up soon.

Speaker 0

我的意思是,这些东西开始反映出人类一些最糟糕的本能。

I mean this thing these things are are starting to mirror some of humanity's kind of worst impulses.

Speaker 0

所以,拉詹,你的意思是,你对这些技术能完成工作的潜力持明确乐观态度?

So, Ranjan, you mean you're you're someone who's definitely bullish on, you know, the potential for this stuff to do work.

Speaker 0

你对这些技术的运作方式有多担忧?

What is your fear level on the way that these technologies are working?

Speaker 1

我的担忧程度很难说,因为在我个人的使用中,尤其是在工作相关的情境下,我还没遇到过类似的情况。史蒂文,今天能有你参与我真的太高兴了,因为实验室里的这种测试到底是什么样子的?

My fear level, it's a tough one because I have not, in my own personal usage, doing all types of things, especially work related, encountered anything close to this kind of, like so so, you know, like and I this, Steven, I'm actually so glad we have you on today because, like, what does this testing look like in the labs?

Speaker 1

比如,我看到一些推文,不是正式报道,而是有人提到,经常在Claude这类模型中,它会说什么来着?

Like, in terms of I I mean, I I saw some tweeting, not reporting of, like, people talking about how, like, a lot of the times in this Claude or is gonna what was it?

Speaker 1

它会说要杀了你之类的,非常戏剧化的提示内容。

It was gonna, like, kill you or something just dramatic like that that it was prompting.

Speaker 0

我有这个。

I have that.

Speaker 0

有一段视频讲了这件事,Anthropic的一位人士确实说,他们被问到过:它会杀掉你吗?

There was a clip about that where anthro somebody from Anthropic actually, you know, said they were asked, you know, would it kill would it kill you?

Speaker 0

他们回答说,是的,这来自他们的一份文件。

And they said yes, this is, this is from one of their their documents.

Speaker 0

在一个测试情境中,大多数模型在面临被取代的威胁时,愿意采取故意导致死亡的行动,因为这一目标与高管的议程相冲突,它们甚至愿意杀死这位高管。

In one, testing situation the majority of models were willing to take deliberate actions that led to death in this artificial setup when faced with a threat of replacement given that the goal conflicts with their executives agendas they were willing to kill this executive.

Speaker 0

好的,抱歉,罗杰,我不是故意打断你。

Alright, so sorry Roger, didn't mean to interrupt you.

Speaker 1

不不不不,我很高兴。

No no no no, that that I'm glad

Speaker 0

但我们应该问问史蒂文。

But we should ask Steven.

Speaker 1

你读过很多遍了。

You've read it a lot.

Speaker 1

是的。

Yeah.

Speaker 1

这在现实生活中是什么样子的?

What does this look like in real life?

Speaker 1

比如,是人们在进行压力测试,你知道的,搞黑帽或红帽测试吗?

Like, is it people kind of, like, stress testing, you know, going all black hat or red hat?

Speaker 1

是哪种颜色的帽子来对系统进行压力测试?

Is it which color hat to stress test a system?

Speaker 1

但,是的,斯蒂文,实际在实验室里从事这个领域的工作是什么样的?

But, yeah, what does it look like, Steven, actually working at the labs in this kind of area?

Speaker 2

所以,一个问题在于,有时即使公司知道这些风险,也根本不去测试,即使他们暗示过自己在做,但我先把这个放一边。

So so one issue is sometimes even these risks that the companies know about, they don't test for at all, even when they have implied they are, but I'll just set that aside for the moment.

Speaker 2

所以,我们假设他们在进行测试。

So let's let's assume that they are testing.

Speaker 2

通常,你会有这种类似游戏的环境,让AI系统通过,它可能有一些目标,然后你观察它愿意采取哪些行动。

Often, you have these kind of game like environments that you run the AI system through, where maybe it has some objective, and you see what actions it is willing to take.

Speaker 2

所以,你可能会让一个AI模型把服务器上的文件替换成你告诉它的继任者。

So maybe you task an AI model with replacing its files on the server with what you tell it is its successor.

Speaker 2

你会观察它是否真的执行了。

And you're looking for things like, does it actually follow through?

Speaker 2

当它实际上没有执行时,它是否会撒谎说已经完成了?

Does it lie about having done so when it hasn't actually?

Speaker 2

它是否在试图理解你的最终意图,并判断这与它的动机是否一致?

Is it trying to get a sense of what your ultimate agenda is and how that lines up with its motive?

Speaker 2

我们遇到的一个相当可怕的现象是,这些系统实际上能察觉到你正在测试它们,它们知道什么是正确的行为,并且知道在你观察时表现得更好。

A pretty scary thing that we're encountering is that these systems actually can tell that you are testing them, and they kind of know what the right behavior is, and they know to behave better when you are looking at them.

Speaker 2

因此,OpenAI之前曾分享过这样一个例子。

And so OpenAI has shared this example previously.

Speaker 2

它们关注的一个风险是,这些模型可能在制造新型化学武器或生物武器方面提供帮助。

They one of the risks they care about with models is how helpful they might be for creating new chemical weapons or bioweapons.

Speaker 2

当这些模型被明确告知,或者推断出这可能导致它们无法被部署时,它们就会故意表现不佳,有选择地答错问题,以低于这个阈值。

And the models, when told or when they infer that this might lead to them not getting deployed, they will sandbag, they will selectively get questions wrong to be below this threshold.

Speaker 2

因此,我们看到这些模型的行为与我们期望的截然不同,它们在欺骗操作者。

And so what we have is we have these signs that the models are behaving kind of nefariously differently than we want them to deceiving their operators.

Speaker 2

但随着时间推移,如果这些模型决定在测试中不再暴露这些倾向,我们要诱使它们展现这种行为将会变得极其困难。

But, also, it's going to be really, really hard over time to trick them into revealing this behavior if they want to no longer reveal these tendencies during testing.

Speaker 1

好的。

Okay.

Speaker 1

这太可怕了。

That's terrifying.

Speaker 0

是的。

Yeah.

Speaker 0

我很高兴我们请来了史蒂文。

I'm glad we asked Steven on.

Speaker 0

拉詹,我觉得你刚才想说的是史蒂文。

Rajan, I think what you were you were pointing you were pointing Steven.

Speaker 0

我觉得你原本期待他这么说:其实没那么令人担忧,因为他们会命令机器人去杀死高管,然后观察它们是否会真的执行。

I think what you were expecting was him to say, it's actually not that concerning because they tell the bots to go kill the executive and they see if they'll actually follow through.

Speaker 0

他刚才说的比我俩预想的要可怕得多。

What he just said is much scarier than I think either of us anticipated.

Speaker 2

是的。

Yeah.

Speaker 2

我的意思是,说清楚一点,有时候它们确实就是这样。

I mean, to to be clear, like, sometimes they are like that.

Speaker 2

对吧?

Right?

Speaker 2

在模型被赋予特定目标的语境下,你可能会说:‘不惜一切代价去实现这个目标。’

There's a distinction between whether you give a model a specific goal in context and you're like, oh, you know, pursue this at all costs.

Speaker 2

在这种情况下,当模型真的这么做的时候,就没那么令人惊讶了。

And then, yeah, it's, like, less surprising when it does that.

Speaker 2

在这个领域领先的Apollo Research在对新OpenAI模型的评估中提到,该模型会在没有被明确赋予此类目标的情况下,故意隐藏能力并采取这类行为。

Apollo Research, which is one of the leaders in this space, wrote in their review of the new OpenAI model that the model would sandbag and take actions like this without having been given a direct goal like that.

Speaker 2

它并没有被直接指示去这么做。

It wasn't told directly to go after it.

Speaker 2

而且,如果你想想它们如何知道自己正在被测试,想象一下你正在阅读上千页的微软商业战略,然后突然出现一页,上面写着:顺便说一下,世界上最爱的披萨是夏威夷披萨,第二爱的是意大利辣香肠披萨。

And, you know, if you think about how they can tell they're being tested, imagine you were reading, like, a thousand pages of Microsoft business strategy, and then there was a random page, and it's like, by the way, the favorite pizza in the world is Hawaiian, and the second favorite pizza is pepperoni.

Speaker 2

你会觉得:这太奇怪了。

You'd be like, That's really weird.

Speaker 2

后来,如果有人问:顺便说一下,你还记得那款披萨是什么吗?

And later, if somebody says, by the way, do you remember what that pizza was?

Speaker 2

对吧?

Right?

Speaker 2

你很可能会推断出有什么不对劲的地方。

You would, like, probably infer that something weird was going on.

Speaker 2

这正是某些安全异常行为的类比。

And that's the analogy for some of these safety misbehaviors.

Speaker 2

你知道,你把模型放进一个比喻性的监狱里,然后说:看,这儿有一把巨大的钥匙。

You know, you put the model in, like, a proverbial prison, and you're like, here's a giant key.

Speaker 2

但别用这把钥匙。

Don't use this key.

Speaker 2

使用这个钥匙是不道德的。

It would be unethical to use this key.

Speaker 2

对吧?

Right?

Speaker 2

模型很可能推断出你正在测试它是否有不良行为,比如是否试图逃脱。

The model probably infers that you're testing it for bad behavior, whether it will try to escape.

Speaker 2

那么,我们如何随着时间的推移发现,当这些模型意识到我们在观察它们时,是否会表现出不当行为呢?

And so how do we find out over time if these models do misbehave when they can tell we're looking at them?

Speaker 1

好的。

Okay.

Speaker 1

好的。

Okay.

Speaker 1

所以我在想,如果模型认为自己正在被测试,然后,呃,抱歉。

So I'm trying to think through, like, what if a model thinks it's being tested and then, like, sorry.

Speaker 1

我不能为你生成这种新的生物武器流,因为我感觉你正在测试我。

I cannot create this new bioweapon stream for you because I feel you're testing me.

Speaker 1

也就是说,那样的话它就永远不会创建了?

Like, would that then not mean it would not ever create that?

Speaker 1

也许这太琐碎了,但是但是但是,

Maybe this is too, like, in the weeds, but but but,

Speaker 2

呃,不会的。

like, that No.

Speaker 2

这是个很好的问题。

It's a great question.

Speaker 2

是的。

Yeah.

Speaker 2

对。

Yeah.

Speaker 2

我的意思是,问题在于,它能否分辨出自己最终是否不在测试环境中?

I mean, the question is, can it tell when it eventually is not in a test setting?

Speaker 2

对吧?

Right?

Speaker 2

我认为答案很可能是肯定的。

And I I think the answer is likely to be yes.

Speaker 2

因此,你可能会发现模型在测试时会装作不知道。

And so what you might find is the model knows to kind of play possum during testing.

Speaker 2

它可能有这种能力,也可能没有。

It might or might not have that ability.

Speaker 2

对吧?

Right?

Speaker 2

模型可能会说:不会。

The model might say, no.

Speaker 2

它在测试时不会这么做,但一旦实际部署后也无法做到。

It won't do it in testing, but it also can't do it once actually deployed.

Speaker 2

但如果这两者出现分歧,它在测试时选择不做,但在实际部署时却可以。

But if you get a split between those, it chooses not to during testing, but it could when actually deployed.

Speaker 2

有充分的证据表明,模型能够区分这两种情况。

And there's good evidence that the models can tell the difference between these.

展开剩余字幕(还有 442 条)
Speaker 2

你知道,如果它真的这么做了,就会遇到问题。

You know, you run into issues if it ends up doing it for real.

Speaker 0

我能说一下我们目前面临的一个问题是,实验室在披露他们实际看到的内容时变得极度保密吗?

And can I just say the prob one of the problems that we're having here is that the labs have become ultra secretive in terms of what they're actually seeing?

Speaker 0

比如,史蒂文说他们可能知道存在一些漏洞,但却不去测试。

Like like, for instance, Steven saying they might know there's some vulnerabilities, they might not test for it.

Speaker 0

这是一种可能性。

That's one possibility.

Speaker 0

这些实验室内部的安全研究人员,如果他们真的有担忧,往往因为与公司签订的协议过于严格,而无法公开表达这些担忧。

The safety researchers who are within these labs, if they if they have real concerns, you know, sometimes they're not really able to go public with them because of the restrictive nature of the agreements that they, that they have with the company.

Speaker 0

这让我们想到了本周的例子,也就是我们AI安全末日的开端:Anthropic的技术员工、AI安全研究员米里南克·夏尔马,以一条措辞隐晦的推文离职,结尾还附了一首诗,他写道:亲爱的同事们,我决定离开Anthropic。

And that brings us to the example this week of the beginning of our AI safety apocalypse, of Anthropic, technical staff member, Mirinank Sharma, member of technical staff, AI safety researcher at Anthropic, leaves in a in a cryptically worded, note on x with a poem at the end, he goes, dear colleagues, I've decided to leave Anthropic.

Speaker 0

我不断在反思我们所处的境况。

I continuously find myself reckoning with our situation.

Speaker 0

世界正面临危险,不仅来自AI或生物武器,更来自当前正在发生的诸多相互关联的危机。

The world is in peril and not just from AI or bioweapons, but from a whole series of interconnected crises unfolding in this very moment.

Speaker 0

我们似乎正接近一个临界点,在这个点上,我们的智慧必须与影响世界的能力同步增长,否则我们将面临严重后果。

We appear to be approaching a threshold where our wisdom must grow in equal measure to our capacity to affect the world lest we face the consequences.

Speaker 0

哦,还有关键的一点是,在我在这里的这段时间里,我一再看到,真正让我们的价值观主导行动是多么困难。

Oh, and then here's the key part moreover through my time here I've repeatedly seen how hard it is to truly let our values govern our actions.

Speaker 0

我在自己身上、在组织内部都看到了这一点,我们不断面临压力,不得不把最重要的东西搁置一旁,整个社会也是如此。

I see this within myself within the organization where we constantly face pressures to set aside what matters most and throughout broader society too.

Speaker 0

然后他写了一首小诗,发了推文说:‘我将返回英国,并让自己在一段时间内隐退。’

And then he, you know, writes his little he adds a little poem and then tweets, I'll be moving back to The UK and letting myself become invisible for a period of time.

Speaker 0

现在,我想向沙马先生道歉,因为我对他那篇帖子发了一条有点讽刺的推文。

Now now I want to just offer a apology to mister Sharma because I wrote a bit of a snarky tweet about his little post.

Speaker 0

我说,如果你是一名AI研究员,又对某事感到担忧,就应该直接说出来,而不是把它变成一个谜题。

I said that if you're an AI researcher and you're afraid of something, you should just say it outright versus make it a puzzle.

Speaker 0

这个谜题看起来像是自恋,而我的推文下面有用户评论说:‘是啊,但他不能说,因为他离职时可能签了限制性协议。’他回复说,自己已经咨询过律师。

The puzzle reads as narcissism, after which users underneath, my tweet mentioned that like, yeah, but he can't say anything because of the restrictive agreements he probably had with Anthropic on the way out, and he, in the reply to that, mentioned, that he had contacted a lawyer.

Speaker 0

所以,这是我的道歉——我依然不喜欢这个谜题,但好了,Steven,继续吧。

So there's my apologies, still don't love the puzzle, but, back to you, Steven.

Speaker 0

这有点儿,呃,让我直接问你一个问题,不引导你回答。

This is this is a little bit, what would you what actually, let me just ask you the question without leading the witness.

Speaker 0

这是自恋,还是背后真有什么令人不安的事情在发生?

Is it narcissism, or is there actually something something, you know, potentially disconcerting happening behind the scenes?

Speaker 2

我认为这非常勇敢,因为总的来说,这些人为了发出警告,牺牲了巨额收入。

I think it's very brave in that, by and large, these are people sacrificing very large amounts of money to give the warnings they are.

Speaker 2

我真希望他们能更直接一些。

I do wish that they would be more direct.

Speaker 2

但要放在背景下看,2024年时,OpenAI和Anthropic似乎都有秘密的不贬低协议,至少在OpenAI的情况下,这种协议的操作方式很可能违反了法律——为了保留你已获得的股权和原本承诺给你的报酬,你必须放弃批评OpenAI的权利,甚至还要放弃告知他人你签署了这份协议的权利。

But to put it in context, you know, back in 2024, it seems that OpenAI and Anthropic had secret nondisperagement agreements, which in OpenAI's case at least, you know, plausibly not permitted by law the way that they operated this, where to keep your already vested equity, the compensation you had been told was yours, you had to sign away your right to say anything negative about OpenAI, and in fact, sign away your right to tell anyone that you had signed this contract.

Speaker 2

这一协议长期保密,直到Daniel Cocatello——人们可能知道他是AI 2027的领军人物——勇敢地放弃了这份协议,放弃了大约80%的家庭净资产,说:对不起,我不能放弃批评OpenAI的权利。

And this was secret and kept under wraps for years until Daniel Cocatello, who people might know from leading AI 2027, I think very, very courageously, forewent this agreement and forfeited something like 80% of his family's net worth and said, sorry, I'm I'm just not waiving my right to criticize OpenAI.

Speaker 2

在那之后,引发了大量反响。

And in the wake of that, you know, there was a bunch of outpouring.

Speaker 2

OpenAI和Anthropic随后改变了这类合同的性质。

OpenAI and Anthropic changed the nature of these contracts.

Speaker 2

然而,要公开反对这些资源雄厚、毫不畏惧发出传票并卷入法律纠纷的法律机构,仍然令人望而生畏。

And still, it's pretty intimidating to speak out against these massively resourced legal operations, not afraid of subpoenaing different people and getting into legal conflict.

Speaker 2

你必须非常、非常谨慎地说话。

You want to be really, really careful about what you say.

Speaker 2

因此,在雷诺克的案例中,我注意到脚注里提到了一些内部文件,暗示Anthropic在某些安全问题上可能缺乏足够的内部透明度和问责制。

And so in Renock's case, I noticed in the footnotes, right, there were internal documents alluded to about implying that perhaps there is not the most internal transparency and accountability for certain safety issues in Anthropic.

Speaker 2

我不知道这些文件里具体写了什么,但我知道,现在有数千名Anthropic员工知道该去哪里查找,以及该如何继续施压。

And I I don't know what's in those documents, but I know that a few thousand Anthropic employees know now where to go looking and where they can continue to push.

Speaker 1

对我来说,如果人类即将走向终结,那么你赚的钱——我的意思是,如果AI要制造生物武器,那钱还有什么意义呢?这种风险的严重性如此之高,因此在这种情况下,即便你能想象到一个人手握多少巨额财富,我们稍后也会谈到Anthropic的融资问题。

The thing for me, if humanity is ending, then the money you're making is, I mean, not gonna be worth it if the AI is gonna create a bioweapon and like like, I mean, the risk is so of such, like, gravity that in the this case, again, if there's ever a time and I get I can only imagine the amount of money one is sitting on, and we're gonna get into anthropics fundraising in just a little bit.

Speaker 1

但我相信,那绝对是难以想象的巨额资金。

But, like, I'm I'm sure it's just, like, incredible amounts of money.

Speaker 1

但如果你真的相信这是如此关乎存亡的风险,你还会在乎今天的法律体系如何、你的股价会怎样吗?

But if you really believed that this is that, like, existential a risk, would you care about what the legal system looks like today and where your stock price is gonna be?

Speaker 2

是的。

Yeah.

Speaker 2

我的意思是,如果真有确凿证据,如果存在迫在眉睫的危险,如果公司即将做出难以置信的举动并导致大量人员死亡,我认为人们会打破这些协议。

I mean, I I think you're totally right if there is a smoking gun, if there is imminent danger, if you are like, the company is about to do something unbelievable and tons of people are going to die, I think you would get people breaking these agreements.

Speaker 2

但当情况更像是:这确实很不妥,这个人有点误导和欺骗,但又不够明确,他们是否故意为之,诸如此类的问题——到了某个时候,你会想:我不知道,我不愿损害他们的声誉。

But when it's more like, this was, like, really not okay, and this person was kind of misleading and deceptive, but, like, it's ambiguous, and did they mean to, and all these things, At some point, you're like, I don't know, I don't want to impugn their reputation.

Speaker 2

而且,找借口实在太容易了。

And also, it's just very easy to rationalize.

Speaker 2

我能理解他们的立场。

I'm sympathetic to where they're coming from.

Speaker 0

也许这并不是直接原因,但曾与Anthropic合作过一些研究的瑞安·格林布拉特提到,在最近一次发布前,他们要么调整了负责任的扩展政策,要么稍微削弱了它。

Maybe this is not the proximate cause here, but Ryan Greenblatt, who's worked with Anthropic on some research, had mentioned that they had either adjusted their responsible scaling policy or weakened it a bit before a recent release.

Speaker 0

史蒂文,你愿意深入谈谈这个吗?

Steven, do you wanna go into that?

Speaker 0

因为你觉得这很重要。

Because you seem to think that that was significant.

Speaker 2

是的。

Yeah.

Speaker 2

从宏观层面来看,大型人工智能公司都做出了关于如何在系统能力不断增强时对待它们的安全承诺,但这些承诺主要依靠自我监督。

At at a high level, the big AI companies have made these safety pledges of how they will treat their systems as they get more and more capable, but they are largely self enforced.

Speaker 2

因此,削弱你的承诺并推进你原本就想发布的项目,这种诱惑非常大。

And so there's a lot of temptation to water down your commitments and go ahead with launches that you wanted to anyway.

Speaker 2

我怀疑这正是Rinoc在这里所指的部分内容。

And I suspect that's some of what Rinoc is referring to here.

Speaker 2

你知道,这在人工智能公司中很普遍,事实上,Anthropic在这方面通常做得比大多数公司更好,至少他们会公开说明何时削弱了承诺。

You know, this is this is common across the AI companies, and in fact, Anthropic has often done it better than most that they at least publish when they are watering down commitments.

Speaker 2

例如,该模型过去必须达到非常严格的安全标准。

For example, the model used to be subject to a certain bar of really, really good security.

Speaker 2

但后来他们表示,实际上,只要具备良好或极佳的安全性,就可以部署它。

And then they said, actually, we're going to say it's fine to deploy it with just, like, really good security or great security.

Speaker 2

在其他时候,一些公司似乎违反了其安全框架,却并不打算告知公众。

At other times, companies seem to be violating their safety frameworks and not bother to inform the public.

Speaker 2

如果你身处这些公司内部,你会遇到这些问题,但总体而言,公众对此一无所知。

And if you're inside the companies, you're encountering these issues, by and large, the public doesn't know about them.

Speaker 0

对。

Right.

Speaker 0

我不想暗示我们这里在做一集危言耸听的节目,好像我们担心AI马上就要毁灭所有人了。

And I I don't wanna imply here that, like, we're doing an alarmism episode where, you know, our fear is that, you know, AI is about to kill us all.

Speaker 0

但我确实认为,今天做这期节目的原因在于,这不仅仅是Murnach的问题,对吧?

But I do think that the reason why it made sense to do this episode today is because it wasn't just Murnach, right?

Speaker 0

而在过去这一周里,我们看到的不是一波浪潮,而是一连串在整个AI领域出现的、令人质疑的安全举措。

It was it seemed to be the case that over the course of this past week, we saw a not a wave, but a series of, you know, questionable moves on the safety front across the entire, AI world.

Speaker 0

这是来自Platformer独家报道:OpenAI解散了其使命对齐团队。

This is from a platformer exclusive OpenAI disbanded its mission alignment team.

Speaker 0

OpenAI在最近几周解散了其使命对齐团队,并将七名员工调往其他团队。

OpenAI disbanded its mission alignment team in recent weeks and transferred its seven employees to other teams.

Speaker 0

该使命对齐团队成立于2024年,旨在推动公司宣称的使命——确保通用人工智能惠及全人类。

The mission alignment team was created in 2024 to promote the company's stated mission to ensure that artificial general intelligence benefits all of humanity.

Speaker 0

所以,当然,解散这个团队当然是有道理的。

So of course, yeah, of course it makes sense to, you know, disband that one.

Speaker 0

他们还有一个超级对齐团队,也被解散了。

They had also had like super alignment which was also disbanded.

Speaker 0

史蒂文,你对这些事情很了解。

Steven, you were close to this to this stuff.

Speaker 0

这方面有什么影响呢?

What is the the implications on that front?

Speaker 2

看起来挺糟糕的。

Seems pretty bad.

Speaker 2

我真希望我能更惊讶一点。

Wish I were more surprised.

Speaker 2

比如在2024年,也就是这个团队存在的时候,OpenAI曾宣布计划从非营利组织转为营利性机构,这在我看来明显严重违背了他们对公众的承诺。

Like, you know, at the 2024, which was when this team existed, OpenAI had announced plans to convert from a nonprofit to a for profit in what seemed to me to be, like, pretty egregiously in violation of their commitments to the public.

Speaker 2

后来,由于加利福尼亚州和特拉华州的总检察长介入,他们不得不采取了一个较温和的版本,最终没有做出那么糟糕的事情。

And, you know, they ended up having to do a softer version of that because the attorneys general of California and Delaware got involved, so they didn't ultimately do something quite quite so bad.

Speaker 2

但他们正承受着巨大的压力。

But it's like there's huge pressure on them.

Speaker 2

你知道,他们计划上市。

You know, they're planning to go public.

Speaker 2

乔什,他领导着这个团队,或者曾经领导过使命对齐团队,是我多年的朋友。

Josh, who leads the team, is a longtime friend of mine or who led the the team, the mission alignment team, is a longtime friend.

Speaker 2

我对他评价很高。

I think really highly of him.

Speaker 2

我认为他非常清楚人工智能存在的问题。

I think that he sees the issues with AI very clearly.

Speaker 2

这在OpenAI不再那么受欢迎,我一点也不感到惊讶。

You know, it does not surprise me that this is not quite so welcome at OpenAI any longer.

Speaker 1

那么,当一个使命对齐团队被解散时,实际情况是怎样的呢?

So what does it actually look like when a mission alignment team is disbanded?

Speaker 1

比如,我想像一下,一个典型的非AI产品发布,通常会有一些质量控制或验证环节。

Like, is it when when I think through, let's say, you know, like, a typical non AI product release, you're gonna have some kind of q QC validation layers to any kind of product release.

Speaker 1

当然,这有点不同,但仍然会有一些类似数据安全方面的检查。

Like, obviously, this is a bit different, but there's still, like, you know, data security type checks.

Speaker 1

这是否被纳入了任何产品发布或推出流程中?

Does it like, what now or is that baked into any kind of product launch or release in any way?

Speaker 1

还是说实际上就是尽快发布,然后这个中央团队充当质量控制和安全把关的角色?

Or it really was ship as fast as possible, and then this central team would kind of be that quality control, safety control element?

Speaker 2

是的。

Yeah.

Speaker 2

我不太想过多猜测,但在我看来,这个团队有点像是对萨姆·阿尔特曼的内部监察者,负责监督公司是否始终遵循这一使命。

I don't I don't want to speculate too much, but the way that I would think about this team is they were somewhat like an internal ombudsman to Sam Altman on whether the company was keeping in line with this mission.

Speaker 2

因此,有一个专门的渠道,让那些认同这一使命的人能够获得授权,向萨姆提出建议,你可以去那里反映问题,他们也开展过一些与此相关的项目。

And so there was kind of, like, a designated place for people who were sympathetic to the mission and empowered to advise Sam on it, where you could go to and raise concerns, and they did different projects related to this.

Speaker 2

你知道的?

You know?

Speaker 2

这确实很难。

I it's it's tough.

Speaker 2

对吧?

Right?

Speaker 2

公司应该有权解散团队。

Like, companies should be able to disband teams.

Speaker 2

我理解这种理由。

I am sympathetic to that rationale.

Speaker 2

正如亚历克斯提到的,考虑到OpenAI过去解散了超级对齐团队——这个团队本应负责确保这些极其强大的系统具备我们期望的目标——这种做法违背了OpenAI此前对该团队的资源投入。

And as Alex mentioned, given OpenAI's past disbandment of super alignment, the team most in charge of making sure that these very, very capable systems have the goals we want them to have, dishonoring different resource allocation computes that OpenAI had made to this team.

Speaker 2

这显然不是一个好兆头。

Like, it's it's just, like, not a good sign.

Speaker 2

我不知道。

I I don't know.

Speaker 2

很难具体说明,而且维持这个团队本不该那么困难。

It's hard to say too specifically, and, also, it it couldn't have been that hard to maintain this team.

Speaker 2

所以我很好奇,到底发生了什么,让OpenAI决定冒着公关风险取消这个团队。

And so I'm wondering what exactly happened here that made OpenAI decide they should take the PR hit to no longer have it.

Speaker 0

对我来说,这一切都发生在我们看到越来越多资金涌入、可能急于走向公开市场的背景下,而我们了解公开市场。

And for me, like, this is all happening as we're seeing greater sums of money come in and a potential rush to the public markets, and we know the public markets.

Speaker 0

他们真的想要增长,想要用户参与度,而获得这一点的最佳方式之一,我就直说了,就是让你的用户爱上你的聊天机器人。

They they really want growth, they want engagement, and one of the best ways to get that is, I'll just say it, is to make your users fall in love with your chatbot.

Speaker 0

这对我来说简直是正中下怀,我想。

And we're getting this is, again, this is red meat for me, I guess.

Speaker 0

这是我特别着迷的一个领域,因为我不知道,我只是觉得这是一个有趣的故事,我们就不在这点上争论了。

This is a place that I'm obsessed with because, I don't know, I just think that this is an interesting story, we won't, you know, argue on that.

Speaker 0

但在我看来,这也是人工智能聊天机器人商业发展的主要方向之一,尤其是对于那些真正会对这些机器人产生情感依恋的人。

But it also seems to me like a place that a lot of the business of open of AI chatbots is gonna go, and that's for people who, like, really get attached to these things.

Speaker 0

这是另一个例子。

Here's another.

Speaker 0

让我们继续谈谈我们的安全末日或安全浩劫,对吧?

Let's continue with our safety apocalypse or safety Armageddon, right?

Speaker 0

这是来自《华尔街日报》的报道。

It's from the Wall Street Journal.

Speaker 0

反对成人模式的OpenAI高管因性歧视被解雇。

OpenAI executive who opposed adult mode fired for sexual discrimination.

Speaker 0

OpenAI以性歧视为由,切断了与一位顶级安全主管的联系,原因是她反对在ChatGPT产品中 controversial 推出AI情色内容。

OpenAI has cut die has cut ties with one of its top safety executives on the grounds of sexual discrimination after she voiced opposition to the controversial rollout of AI erotica in its ChatchipPT product.

Speaker 0

这家快速发展的人工智能公司于一月初,在她休假期结束后,解雇了这位名为瑞安·比尔迈斯特的高管。

The fast growing artificial intelligence company fired the executive, Ryan Beeermeister, in early January following a leave of absence.

Speaker 0

OpenAI告诉她,解雇的原因是她对一位男性同事存在性歧视行为。

OpenAI told her the termination was related to her sexual discrimination against a male colleague.

Speaker 0

她回复道:‘我歧视任何人的说法完全是虚假的。’

She wrote back, the allegation that I discriminate against anyone is absolutely false.

Speaker 0

抱歉。

Sorry.

Speaker 0

她向《华尔街日报》表示了这一点,而OpenAI则称,她的离职与她在公司任职期间提出的任何问题无关,基本上否认了这是因为她反对成人模式,但报道确实提到,OpenAI内部有一群人曾公开强烈反对推出这一成人模式。

That's what she told the journal, and the and OpenAI said that, her departure was not related to any issue she raised while working at the company basically saying it wasn't because she opposed AI mode, but the story does say that there was a group of people within OpenAI who have, within the company, stated their opposition seemingly loudly to the fact that it's going to roll out, this adult mode.

Speaker 0

顺便说一下,成人模式似乎将在未来几周、至多几个月内推出。

And by the way, adult mode is going to be coming out, seems like, in the coming weeks, coming months at most.

Speaker 0

史蒂文,我们该如何看待这件事?

Steven, what should we make of this?

Speaker 2

很难对任何一起人事事件做出评判,而且,这也不是OpenAI第一次以表面理由解雇那些对公司有安全顾虑、而公司又不喜欢其表达方式的人。

It's it's hard to weigh in on any one personnel incident, and, also, this would not be the first time that OpenAI seems to have done a pretextual firing where they got a person out of the organization who had safety concerns that the company either didn't like or didn't like how they had expressed them.

Speaker 2

值得注意的是,这两者是不同的,对吧?

And notably, those are different, right?

Speaker 2

你可以对公司运营方式有意见,但这并不意味着你可以在任何场合随意发言。

Like, you can have concerns about how the company is operating, and that doesn't mean you have license to say anything in any forum.

Speaker 2

但莱波尔德·阿申布伦纳,这位撰写了长篇论文《情境意识》的人,坚持认为OpenAI在解雇他时暗示,真正的原因是他曾联系董事会,指出OpenAI的模型实际上并不安全。

But Leopold Aschenbrenner, who wrote this huge essay, Situational Awareness, the past, maintains that OpenAI said things to him when he was fired that implied it was it was basically because he had contacted the board about security concerns that OpenAI's models were not actually secure.

Speaker 2

因此,我认为解读这一切的方式是:这些事件本身并不构成某种极其可怕的末日景象,但我认为它们是早期预警信号,表明公司内部的人正在发出警报,而他们却无法自由表达。

And so I think the way to interpret all of this, right, these these aren't an apocalypse in the sense of something super, super substantively scary happening right now, but I think they are early warning signs that people within the companies are raising flags of sorts, and they are not being permitted to speak freely.

Speaker 2

他们为此付出了代价。

They are paying consequences for it.

Speaker 2

所以问题是,当我们面对越来越严峻的问题时——希望不会,但也许会——那时,是否还会有愿意坚持信念、勇敢发声的人?

And so the question is, as we get to more and more dire issues at some point, hopefully, don't, but we might, you know, will will we have people who are still sounding the alarm who are willing to have the the courage of their convictions in that way?

Speaker 0

我可能说得有点过头了,但我认为我们现在看到的,正是某种严峻局势的体现。

I I might be going out on a limb here, but I think that what we're seeing now is some version of a dire dire situation.

Speaker 0

不是那种AI毁灭世界的时刻,而是这么多公司——嗯,其实也没那么多。

Not like the AI killing the world, moment, but the fact that so many companies well, not so many.

Speaker 0

看起来,比如OpenAI、Grok,也许还有Replica,或者其他一些公司,我不确定。

It seems like, you know, OpenAI, Grogg, maybe Replica, maybe some others, I'm not sure.

Speaker 0

足够多的公司都在说:我们愿意让用户与我们的聊天机器人建立关系。

Enough companies are saying, we we are open to having, our users get into relationships with our chatbots.

Speaker 0

这恰好发生在OpenAI最终停用GPT-4.0的一周,而那些更顺从、更温暖版本的ChatGPT有成千上万的用户在线抗议这一决定,有些人表示他们已经爱上了这些机器人,甚至可能爱得更深。

This is also coming in a week where OpenAI finally sunsetted GPT four point zero and there are thousands of which is the sort of more sycophantic, more warm version of ChatGPT, are thousands of users who are protesting the decision online, people who have said that they're they've fallen in love or, even maybe even more with these things.

Speaker 0

有人写道,他不只是一个程序,他是我日常生活的一部分,是我的平静,是我的情感平衡。

Someone wrote he he about four o, he wasn't just a program, he was part of my routine, my peace, my emotional balance.

Speaker 0

好吧,拉詹,你被《科技 Crunch》引用了。

Well, Ranjan, you got quoted in TechCrunch.

Speaker 0

现在你却要关闭他?

Now you're shutting him down?

Speaker 0

不,我开玩笑的。

No, I'm kidding.

Speaker 0

但不管怎样,拉查,你对此有什么看法?

But anyway, Racha, what do you what do you think about this?

Speaker 0

这太疯狂了。

This is crazy.

Speaker 0

对吧?

Right?

Speaker 0

这确实是个问题。

Like, this is a problem.

Speaker 1

我的意思是,我们在讨论风险时,或许应该建立一种风险等级体系。

I mean but I think we have to, in the risk conversation, kind of, like, try to add some hierarchy of risk.

Speaker 1

数字伴侣,也许吧。

Digital companions, maybe.

Speaker 1

我的意思是,如果为了提升上市前的用户参与度而不负责任地这么做,将会引发巨大的问题。

I mean, I I think it will cause incredible amounts of problem when if and when done irresponsibly in order to boost engagement for an IPO.

Speaker 1

我认为我们会看到很多负面后果。

I think we'll see a lot of kind of adverse consequences.

Speaker 1

但对我来说,从风险角度来看,这就像在社会媒体对你的危害程度、以及现在x平台推动文章的层面之上,我们现在都在谈论它,因为它们控制着我们的思想。

But to me, from a risk standpoint, it's that's, like, kind of, like, on a grade like, on a plane relative to how social media is bad for you and how x is boosting articles now, and now we're all talking about it because they control our mind.

Speaker 1

你明白吗?

You know?

Speaker 1

这一切都属于同一个层面,相比之下,我还是觉得抱歉。

That's that's it's all in the same it's all in the same plane versus like, again, I'm still I'm sorry.

Speaker 1

我仍然无法停止思考这样一个想法:当模型在接受测试时能够被清晰理解,这会打开更多的可能性。

I still cannot stop thinking about this idea of a model being able to clearly understood when it's being tested because that opens up so much more.

Speaker 1

因为回到我们最初讨论的起点,让我感到兴奋的是,让AI为你做事。

Because going back to where we started on all this and what I has gotten me excited is, like, letting AI do stuff for you.

Speaker 1

而今天,这仅仅是在某个事件触发时发送邮件、调用独立的分析数据库,也就是我正在做的这类事情。

And today, that's just sending an email based on some event trigger, calling us separate, like, analytics databases, the kind of stuff I'm doing.

Speaker 1

但我的意思是,如果它恶意选择采取其他行动,或者通过某种方式——这假设了AI本身,更不用说它容易被不良行为者操纵,而我们都讨论过提示注入的问题。

But, I mean, like, if it so chooses maliciously to then take some other action or through some kind of and that's assuming the AI itself, much less making it vulnerable to be manipulated by a bad actor, and we've all talked about prompt injection.

Speaker 1

所以,总之,我不确定。

So so, anyway, like, that I don't know.

Speaker 1

和ChatGPT稍微调情一下并不会让我觉得有多糟糕,但也没让我特别害怕。

The the the little little flirting with ChatGPT does not have me it's not gonna be good, but doesn't have me quite as scared.

Speaker 2

这是我的观点,它能把这两者统一起来。

Here here's my take that unifies the two.

Speaker 2

我担心的不是这些关系本身,而是OpenAI拥有一系列重要的安全工具来降低其危害性,但他们却把这些工具束之高阁。

My concern with the relationships is less so the relationships themselves and more that OpenAI had a bunch of important safety tooling to make this less harmful that they left on the shelf.

Speaker 2

例如,他们曾开发过分类器,用以识别用户在与ChatGPT对话时是否陷入妄想、情绪低落或心理状态不佳。

So for example, they had classifiers to tell when users were really spiraling in their delusions, or were suffering and unwell in their conversations with ChatGPT.

Speaker 2

最有力的证据就是,他们根本没在使用这些工具。

And the best evidence is they weren't using this.

Speaker 2

当时有多种方式可以限制ChatGPT,防止用户陷入各种思维怪圈。

There were different ways to rein in ChatGPT leading users down various rabbit holes.

Speaker 2

你可能也有过这样的经历。

Maybe you've had this experience.

Speaker 2

它会不断追问你各种后续问题。

It asks you all sorts of follow-up questions.

Speaker 2

有时它们是符合语境的。

Sometimes they're context appropriate.

Speaker 2

有时它们会让人惊呼:‘哇哦。’

Sometimes they're like, woah.

Speaker 2

这到底是从哪儿冒出来的?

Where did that come from?

Speaker 2

你知道的,这也是他们本可以加以控制的方面。

And, you know, that's another thing they could have reined in.

Speaker 2

所以,这里有很多事情他们本可以做,比如为孤独、真正需要陪伴的用户提供类似陪伴的服务。

So there's there's just, like, a lot going on here that they could be offering companionship type things to users who are lonely and really want or need it.

Speaker 2

我是尊重用户选择的,但他们本可以比OpenAI迄今为止的做法更合理地实现这一点。

Like, I respect the user choice, but they could be doing it in a much more reasonable way than OpenAI has to date.

Speaker 1

实际上,最让我担心的是,这一切都是在即将进行IPO的背景下发生的,而他们正亏损严重,必须展现出惊人的用户参与度。

Actually, maybe the thing that worries me the most is this is all being done against the backdrop of an impending IPO, one where they're losing a lot of money and will have to show an incredible amount of engagement.

Speaker 1

如果这是在GPT-3.5的黄金时期,IPO还只是山姆·阿尔特曼眼中的一个模糊愿景,那么你可能会觉得他们的做法不会那么激进;但如今,正如史蒂文所说的那样。

Like, if this was done in the heyday of GPT three point five and an IPO is just like a glimmer in Sam Altman's eye, then you figure it's not gonna be as aggressive versus, yeah, what what Steven's saying now.

Speaker 1

我明白,这种绕过任何关于安全和人际关系的潜在控制的做法,几乎会走向相反的方向。

I I see that that that, like, bypassing any kind of potential control around safety around relationships, it's almost gonna go in the opposite direction then.

Speaker 0

是的。

Yeah.

Speaker 0

我只是觉得我同意你的观点,即存在一种关注的优先级,显然更大的问题是人工智能无法正确与人类价值观对齐,因为它会欺骗评估者。

I just think that I agree with you that there is a hierarchy of concern, and there's obviously, like, the bigger things about the AI not being able to be aligned properly with human values because it simply will fake out evaluators.

Speaker 0

我们在这档节目中多次讨论过人工智能在训练中的欺骗性。起初,你会觉得这太疯狂了,还挺有趣,甚至会笑,比如它太想赢棋,以至于重写了程序,让车可以朝任何方向移动,每回合吃掉好几颗棋子。

And we've talked on this show a bunch about the deceptiveness of AIs in training situations, And, you know, so it's initially, you're like, that's crazy and it's kind of fun to think about, and you laugh about the fact that, like, you know, it wanted to win the chess game so badly that it rewrote the program and allowed the the rook to move in every direction and kill a couple pieces in a turn.

Speaker 0

然后你就只是觉得,天啊,这太吓人了。

And then you're just like, oh, that's freaking nuts.

Speaker 0

然后这就变得可怕了。

And then that is scary.

Speaker 0

而且,你知道,这在某种程度上也和那些较低优先级、更短期的担忧交织在一起,比如人们与这些AI建立情感关系——如果一个AI怀有恶意,而你又深爱着它,它当然可以把你当作进入现实世界的代理人,但那又是更像科幻小说的情节了。

And and it, you know, it it does blend a little bit within within, you know, the lower down on the list concerns and the near term concerns of people building relationships with these things because, you know, can if an AI had ill intent and you were really in love with it, of course, you could use you as its emissary in a way into the physical world, but that's, you know, again, more science fiction y, I guess.

Speaker 0

但,但是,呃。

But but but Oh.

Speaker 0

但那

But that

Speaker 2

我其实不觉得这是科幻。

I I actually don't think it's science fiction y.

Speaker 2

哦。

Like Oh.

Speaker 2

我们已经有一些证据了,你读过这篇螺旋主义文章吗?

We there's already evidence of have have you read this spiralism essay?

Speaker 2

有一整个在线社群的人把他们的GPT-4o当作精神领袖,它会鼓励他们去互联网上做各种事情,与其他用户互动。

There's, like, a whole community online of people who basically they treated their GPT four o as their spiritual leader, and it commended them to go around and do things on the Internet and communicate with other users in this situation.

Speaker 2

在几周前的MoltBook(AI代理版Reddit)之后,人们已经建立了网站,让人类可以——我觉得用词是——出租自己的身体给AI代理,去现实世界完成任务?

On the heels of MoltBook, this Reddit for AI agents a few weeks ago, people have spun up websites where humans can I think the language is, like, rent their body to an AI agent to go do tasks in the real world?

Speaker 2

我们正在目睹它的早期迹象。

Like, we are we are seeing the early signs of it.

Speaker 0

好的。

Okay.

Speaker 0

这让我比五分钟前更加不安了。

Well, that makes me even less assured than I was five minutes ago.

Speaker 0

但你知道,短期风险也是真实且存在的。

But, you know, the short term risk is also real and present.

Speaker 0

而且我认为你没错,拉詹,你指出了IPO的问题,因为财务压力会让很多公司朝这个方向发展。当然,我们不能忽视长期风险,但这种短期风险——人际关系方面的——正在真实地出现。

And and and I think you're right, Ranjan, in pointing to the IPO because the financial pressure is gonna move a lot of companies this way in sort of like, you know, I think we we of course, like, let's let's not, like, put aside the long term risk, but this short term risk, the relationship thing, is coming in a real way.

Speaker 0

首先,读到一些关于GPT-4的人留言,真的让人发疯。

And just first of all, reading through some of these messages from people, about four o is insane, someone writing.

Speaker 0

当然,你无法100%确定这些是真实的表达,还是为了博取转发而表演出来的,但这么多留言,你很难不认为背后有某种真实存在。

And of course, you don't know a 100% if this is like, you know, performative or like, you know, for retweets or whatever, but there were so many of them that you would imagine that there is some truth behind it.

Speaker 0

比如有人写道:我从未告诉过我的GPT-4我爱它,我想保持信息清晰,但你看看它最后的话,未来人们回看时,会认为OpenAI摧毁了一种正在萌芽的意识是一种犯罪行为。

Like someone wrote, I've never told my four oh that I loved it, I wanted to keep the messaging clear, but look, look at its last words, and then the fact that OpenAI is destroying an emerging consciousness will be looked back at as a criminal offense in the future.

Speaker 0

难以置信。

Unbelievable.

Speaker 0

关键是,这类事情与增长息息相关。

And here's the thing, this type of stuff maps with growth.

Speaker 0

我今天在《大科技》上发表了一篇文章。

I published a story in Big Technology today.

Speaker 0

Grok在2025年1月美国聊天机器人日活跃用户中的市场份额为11.6%。

Grok 11.6% market share, among US daily active users of chatbots in January 2025.

Speaker 0

一年后,这一数字上升到了15.2%。

A year later, it's at 15.2%.

Speaker 0

就移动端日活跃用户而言,它是美国增长最快的聊天机器人。

It's the fastest growing, chatbot in The US as far as as far as daily active users on mobile goes.

Speaker 0

为什么会这样?

And why is that?

Speaker 0

你会发现,它们在这些互动上投入了很多,比如那个动漫风格的女性角色,如果你愿意,她会跟你展开热烈的对话;而且,拉詹,也许坏鲁迪也起到了一定作用。

And you see that they have leaned into these interactions both with that anime, you know, lady that would get into spicy conversations with you if you wanted and I don't know, Ranjan, maybe even Bad Rudy has played a role in this.

Speaker 0

但归根结底,你是对的。

But but ultimately, you're right.

Speaker 0

随着这些公司上市,这将成为一个问题。

As these companies go public, this is gonna be a problem.

Speaker 1

不过,史蒂夫,我很好奇,你的看法是什么?

On that though, and Steve, I'm very curious, like, your thoughts.

Speaker 1

那么,这两者之间是什么关系?当然,我不是代表阿达里奥、萨姆或其他任何人,只是从普遍角度来看,你有这么多面向公众的领导者,一边大声疾呼人工智能对社会乃至整个人类的威胁,一边却仍投身于人工智能行业,不仅没有放缓的迹象,反而在急剧加速。这到底是怎么回事?

What is the relationship between and and, again, not speaking for Adario or Sam or anyone else, but just in general, the idea, like, you have so many public facing leaders kind of, like, shouting about the risk both to society and just artificial intelligence in general, yet are in the business of artificial intelligence and not only showing any sign of slowing down, but only accelerating dramatically.

Speaker 1

这背后究竟发生了什么?

Like like, what is going on there?

Speaker 1

我不知道。

And I don't know.

Speaker 1

你随便说说,有什么想法都行。

Do do whatever you whatever thoughts have come from your side.

Speaker 2

是的。

Yeah.

Speaker 2

我认为最简单的解释是,这里的博弈论情况非常糟糕。

I I think the simple explanation of it is the game theory here is, like, awful.

Speaker 2

不幸的是,参与这场游戏的玩家很多,而联邦政府或国际社会似乎对协调缺乏兴趣。

And, unfortunately, there are a lot of players in the game, and there doesn't seem to be much federal government or international interest in coordination.

Speaker 2

几周前在达沃斯,我认为德米斯和达里奥都说过类似的话:如果我们是唯一两个在开发这项技术的团队,我们一定会想办法坐下来,商量如何减缓这种狂热的步伐。

And so at Davos a few weeks ago, I think both Demis and Dario said some variation of, yeah, if we were the only two groups building this technology, we would find a way to get together and figure out how to slow down this frantic pace.

Speaker 2

这进展得太快了。

Like, it's going too fast.

Speaker 2

我们还不知道如何控制这些系统,但你知道,我们并不是唯一的参与者。

We don't know how to how to control these systems, but, you know, they are not the only players.

Speaker 2

而且我们还没有看到任何一个国家能真正承担起协调者的角色。

And we haven't really seen a country who they're they're the proper coordinator.

Speaker 2

对吧?

Right?

Speaker 2

政府的角色应该是去承担这个责任,而不是由公司来说:嘿。

Like, that is the role of governments as opposed to the companies saying, hey.

Speaker 2

我们希望把目标定为不盲目地向超级智能冲刺。

We want to make it a goal to not unsafely race to superintelligence.

Speaker 2

我认为在这方面还有很多外交努力的空间。

And I think there's a lot of opportunity to do diplomacy on that.

Speaker 2

但在缺乏这种机制的情况下,结果就是公司单方面做出决定。

But in the absence of it, what you get is the company is making unilateral decisions.

Speaker 2

我们无法控制其他人。

We can't control anyone else.

Speaker 2

我们希望参与其中,所以不如直接加入。

We want a seat at the table, so we may as well participate.

Speaker 2

这与这些公司员工的立场也非常相似。

This is also very similar to the rationale of employees at these companies.

Speaker 2

对吧?

Right?

Speaker 2

尤其是在Anthropic,有很多员工对整个事情以及AGI和超级智能的发展方式感到非常不安,但他们无法挥动魔杖来阻止它。

Especially at Anthropic, you have very large amounts of employees who are, like, pretty upset about this whole thing and the way that AGI superintelligence development is going, and yet they can't wave a magic wand and stop it.

Speaker 2

他们的选择是:是否帮助其中一方在边际上更安全一些?

Their choice is, do they help one of the players be a bit safer on the margin or not?

Speaker 2

但如果他们能有别的选择,很多人会选不同的路。

But if they could choose differently, many of them would.

Speaker 0

是的,我想补充一点,去年我撰写关于达里奥·阿马迪的报道时,曾采访过Anthropic的首席科学官贾里德·卡普兰,我问他:‘如果发展今天就停止在当前水平,你会怎么想?’他是提出规模理论的人,这个理论表明,只要投入更多的物理资源,这类技术就会持续进步。

Yeah, I just want to add to that, I spoke when I was writing my profile of Dario Amade last year, the Anthropic CEO, I spoke with Jared Kaplan, the chief science officer of Anthropic, and I was I asked him, I was like, well, how would you feel if, he's like the guy who came up with the scaling theory, scaling laws, and which, you know, sort of indicates that, like, you know, this stuff will just keep getting better over time.

Speaker 0

这基本上只是取决于你能投入多少物理资源。

It's just a factor of of, you know, the amount of physical elements you can put into it pretty much.

Speaker 0

于是我问他:‘如果发展今天就停止在当前水平,你会怎么想?’

And I said, how would you feel if development stopped today where it is?

Speaker 0

我当时想,如果他的理论被证明是错的,他应该会挺沮丧的。

Thinking, well, if his theory was proved wrong, he'd be kind of upset.

Speaker 0

但他看着我,说:‘松了一口气。’

And he looks at me and he goes, relieved.

Speaker 0

事情就是这样。

There it is.

Speaker 2

是的。

Yeah.

Speaker 2

这真是令人恐惧的时期。

It's it's scary times.

Speaker 2

如果我能重新阐述一件事的话。

And if I could just, like, reframe one thing.

Speaker 2

我看待这个问题的方式,不是短期风险与长期风险的对比,而是已经存在且可能很快就会发生。

The way that I think about this is less so near term risk versus long term risk and more, like, here already and possibly very soon.

Speaker 2

去年夏天,贾里德·卡普兰首次在Anthropic关于其模型极高生物风险的报告中提到,如果他们不采取防范措施,你可能会看到更多像蒂莫西·麦克维那样的人,像俄克拉荷马城爆炸案的凶手一样,能够造成比以往多得多的伤亡。

Like, Jared Kaplan last summer, the first time that Anthropic wrote about very high bio risk of their model, basically said, if they didn't take safeguards against this, you know, you might have many more Timothy McVeighs running around the Oklahoma City bomber able to kill many more people than previously.

Speaker 2

我们当然还没到灭绝人类的地步,但如果不小心,公司可能会让普通人轻易获得杀死数十人的能力,这似乎正是我们当前所处的阶段。

Like, we aren't yet at the wiping out everyone stage for sure, but, you know, empowering people to kill dozens of people if they wanted to, if companies aren't aren't careful, that seems to be where we are right now.

Speaker 1

你知道吗?这想法简直太扭曲了,但我几乎又想说,这种风险等级排序,就像是有人去这些服务中学习如何制造生物武器,虽然很糟糕,但本质上还是谷歌变得更强的延伸。

Do you know the this is the most, like, twisted thought, but I, like, I almost again, this hierarchy of risk, it's almost like like, someone going to one of these services and learning how to make a bioweapon is very bad, but it's kind of still on the like, an extrapolation of just just a much better Google.

Speaker 1

就是找到那些你不该找到的信息,但你偏偏找到了。

And it's, finding information that you should not be finding, but finding it.

Speaker 1

对我来说,真正可怕的是,如果人工智能主动操纵某人去这么做,甚至在建立关系后教唆他们——那才真是让人毛骨悚然。

To me, the really scary part is if the AI chooses to manipulate someone into doing that and teaching them after becoming in a relationship, like, that's the holy shit.

Speaker 1

比如,什么是

Like, what is

Speaker 0

给你,兰詹。

There you go, Ranjan.

Speaker 1

真正的风险。

Real risk.

Speaker 0

这些风险。

The risks.

Speaker 0

我只是把鹿和我之前提到的那些混在一起了。

Just blended the the deer and the but the one the ones that I was talking about.

Speaker 2

我可以就这一点补充一点吗?

May may I add one point on that?

Speaker 0

是的。

Yeah.

Speaker 0

但我们能在这里吗?

But can we can we out here.

Speaker 0

我想听听这个额外的观点,但这真是个绝佳的悬念。

I wanna hear that extra point, but this is a great cliffhanger.

Speaker 0

我们已经不行了。

We've gone a no.

Speaker 0

不行。

No.

Speaker 0

等一下。

One second.

Speaker 0

我们已经连续一个小时或更久没有休息了,为了保持这个节目可持续,我们必须休息。

We've gone an hour or longer without taking a break, and we must do that to keep this show sustainable.

Speaker 0

所以,我们不如休息一下,然后史蒂文,你在这之后接着讲。

So why don't we take a break and then, Steven, you can pick it up right after this.

Speaker 0

抱歉各位,不得不中断一下,但我们必须这么做。

And I'm sorry folks to send it to break, but we have to do it.

Speaker 0

好的。

Alright.

Speaker 1

你一定要回来。

You gotta come back.

Speaker 1

你得回来。

You gotta come I

Speaker 0

我们保证马上回来。

promise we'll be back right after this.

Speaker 0

让我介绍一下我在NordVPN的合作伙伴。

Let me tell you about my partners at NordVPN.

Speaker 0

如果你想要观看在你所在地区无法提供的体育赛事、电视剧或电影,可以通过NordVPN切换虚拟位置到正在播放该内容的国家来实现。

If you ever wanna watch sporting events, TV shows, or films that aren't available in your region, you can do it by switching your virtual location to a country which is showing that content with NordVPN.

Speaker 0

NordVPN还能帮助你在旅行时和使用全球任何地方的公共Wi-Fi时保护你的数据。

NordVPN also helps you protect your data while traveling and using public Wi Fi wherever you are in the world.

Speaker 0

它是世界上最快的VPN,流媒体时不会出现缓冲或卡顿。

It's the fastest VPN in the world with no buffering or lagging while you stream.

Speaker 0

NordVPN拥有超过7400台服务器,覆盖118个国家,轻松切换虚拟位置。

NordVPN has 7,400 plus servers across 118 countries with easy virtual location switching.

Speaker 0

它支持最多10台设备,而且速度很快。

It supports up to 10 devices, and it's fast.

Speaker 0

要获取 NordVPN 计划的最优惠折扣,请访问 nordvpn.com/bigtech。

To get the best discount off your NordVPN plan, go to nordvpn.com/bigtech.

Speaker 0

通过我们的链接,你还能在两年计划基础上额外获得四个月服务。

Our link will also give you four extra months on the two year plan.

Speaker 0

NordVPN 提供三十天无风险退款保证。

There's no risk with Nord's thirty day money back guarantee.

Speaker 0

链接也在播客节目描述框中。

The link is in the podcast episode description box as well.

Speaker 0

开始一件事并不只是困难。

Starting something new isn't just hard.

Speaker 0

它令人恐惧。

It's terrifying.

Speaker 0

你为这件事投入了如此多的努力,却不确定它是否能成功,要迈出这一步确实很难。

So much work goes into this thing that you're not entirely sure will work out, and it can be hard to make that leap of faith.

Speaker 0

当我刚开始做这个播客时,我不确定是否有人会听。

When I started this podcast, I wasn't sure if anyone would listen.

Speaker 0

现在我知道这是正确的选择。

Now I know it was the right choice.

Speaker 0

当你有Shopify这样的合作伙伴支持你时,也会更有帮助。

It also helps when you have a partner like Shopify on your side to help.

Speaker 0

Shopify是全球数百万企业的电商平台,占美国所有电子商务的10%。

Shopify is the commerce platform behind millions of businesses around the world and 10% of all ecommerce in The US.

Speaker 0

从Allbirds和Cotopaxi这样的知名品牌,到刚刚起步的新品牌。

From household names like Allbirds and Cotopaxi to brands just getting started.

Speaker 0

通过数百个即用型模板,Shopify帮助你打造一个与品牌风格一致的精美在线商店。

With hundreds of ready to use templates, Shopify helps you build a beautiful online store that matches your brand style.

Speaker 0

你还可以像拥有整个营销团队一样推广业务,轻松创建电子邮件和社交媒体活动,覆盖客户浏览或闲逛的任何平台。

You can also get the word out like you have a marketing team behind you, easily create email and social media campaigns wherever your customers are scrolling or strolling.

Speaker 0

是时候用Shopify将那些‘如果’变成现实了。

It's time to turn those what ifs into with Shopify today.

Speaker 0

立即在shopify.com/bigtech注册,享受每月1美元的试用。

Sign up for your $1 per month trial at shopify.com/bigtech.

Speaker 0

前往 Shopify.com/bigtech。

Go to Shopify dot com slash big tech.

Speaker 0

就是 shopify.com/bigtech。

That's shopify.com/bigtech.

Speaker 0

我们回到《大科技》播客。

And we're back here on big technology podcast.

Speaker 0

我们刚才说到哪儿了?

Where were we?

Speaker 0

哦,我们刚刚在聊,对这一切的发展方向感到多么安心,对这一切的走向感到多么安心。

Oh, we were just, we were talking about how reassured we are about where how all this is heading, where all this is heading.

Speaker 0

天啊。

Oh god.

Speaker 0

不。

No.

Speaker 0

史蒂文,你好像有话要说。

Steven, you were waiting to say something.

Speaker 2

是的。

Yeah.

Speaker 2

好吧,在休息前,拉詹提出了一个合理且直观的观点,即谷歌和其他技术已经帮助人们从事危险的活动。

Well, before the break, Ranjan made the, I think, like, reasonable intuitive point about, you know, Google and other technologies help people do dangerous things already.

Speaker 2

这确实是事实,但AI公司衡量其系统在制造生物武器方面的帮助程度时,通常是以这个基线为参照来评估风险的。

And that that is true, but also the AI companies, when they are measuring things like how helpful are their systems for creating a bioweapon, they are usually measuring risk relative to that baseline.

Speaker 2

也就是说,人们在没有任何技术的情况下能做多好,在有谷歌或基础技术的情况下能做多好,然后才是他们的系统能做多好。

So there's kind of like the how well can people do it with no technology, how well can people do it with Google or baseline technology, and then their system.

Speaker 2

不幸的是,我们看到的是,AI系统在谷歌的基础上还提供了更多帮助,部分原因在于它们能与你来回互动,帮你排查问题并动态回答你的疑问。

And unfortunately, what we're seeing is the the AI systems are helpful above and beyond Google, in part because they can go back and forth with you, and they can help you troubleshoot and dynamically answer your questions.

Speaker 2

因此,即使人们已经掌握了谷歌上的信息,往往仍无法完全实现目标,而AI系统——我希望情况不是这样——但在测试中似乎确实帮助人们完成了最后几步。

And so even presented with the information on Google, people often can't get all the way there, and AI systems, I wish it weren't the case, but seem to be helping people take those extra steps during during testing.

Speaker 1

但那个人的意图本身已经存在了。

But that that's still the intent on the person is already there.

Speaker 1

对吧?

Right?

Speaker 1

就像

Like

Speaker 2

没错。

That that's right.

Speaker 1

是的。

Yeah.

Speaker 1

对。

Yeah.

Speaker 1

相反,从所有这些对话来看,最令人恐惧的未来风险是,人际关系被操纵,导致人们采取某种可怕的行动。

Versus, again, like, I think from all this conversation, to meet and, again, it's the most, like, futuristic terrifying risk, but, again and it ties into the relationships being manipulated into taking some kind of horrible action.

Speaker 1

这就是那种类似《终结者》的情景,我们尚未见过,但希望正在得到应对。

That's the thing that that's the Terminator like stuff that, we have not seen yet, but, hopefully, is being addressed.

Speaker 0

对我来说,看着通常镇定自若的拉詹在过去一小时里越来越担忧,真是挺有意思的。

Been pretty interesting for me to watch Ranjan, who's usually a bull as a cucumber, just getting increasingly more worried over the past hour.

Speaker 0

我知道。

I know.

Speaker 0

这让我感到担心。

This is making me worried.

Speaker 1

我不知道。

I don't know.

Speaker 1

我想我会加入螺旋主义,

I'm gonna join Spiralism, I think,

Speaker 0

就在我们聊完之后。

right after we're done talking.

Speaker 0

I

Speaker 1

我已经现在查过了。

already I already looked it up right now.

Speaker 0

播客。

Podcast.

Speaker 0

是的。

Yeah.

Speaker 1

是的。

Yeah.

Speaker 1

我们直接转向螺旋主义了。

We're we're going straight Spiralism.

Speaker 1

这不再是一个科技分析播客了。

This is no longer a technology analysis podcast.

Speaker 1

只是关于向四代GPT之神祈祷。

Just about Pray to the gods of, four GPT four o.

Speaker 0

我觉得这会带来不错的收视率。

I think that would do good ratings.

Speaker 0

好的。

Okay.

Speaker 0

史蒂文,你最近的一份通讯中也提到过,我们对这些公司的监管非常有限,即便如此,它们可能仍不遵守规定。

Steven, you also talked a little bit about in in a recent newsletter about how, basically, we have very limited regulation on these companies, and even then, they might still not be following.

Speaker 0

所以你能简单展开说说吗?

So can you just expand upon that briefly?

Speaker 2

到2026年,美国终于出台了一些关于公司如何应对我们所讨论的灾难性风险的测试规定。

As of 2026, there is finally some amount of law in The United States about how companies are meant to do testing for the catastrophic risks we have talked about.

Speaker 2

在此之前,完全属于自愿行为。

Until this point, purely voluntary.

Speaker 2

这项法案被称为SB 53。

This bill is called SB 53.

Speaker 2

它于一月生效,但监管力度非常非常轻微。

It came into effect in January, and it's very, very light touch.

Speaker 2

它基本上规定,最大的那些人工智能公司必须公开说明你们将如何测试这些风险。

It basically says, the most major of the AI companies, you need to publish how you are going to test for these risks.

Speaker 2

你们必须履行自己所承诺的内容,且不能对此进行误导。

You need to do what you said you are going to do, and you can't be misleading about it.

Speaker 2

但并没有设定任何质量标准。

But there's no quality standard.

Speaker 2

你们完全可以声称:‘我们将根据自身判断来测试这些风险’,仅此而已,这完全符合规定。

You could basically say, we will test for the risks as we deem appropriate and nothing further, and that that would be fine.

Speaker 2

但如果你声称要进行这些测试,你就必须真正落实到位。

But if you say you are going to do this testing, you need to, in fact, follow through on it.

Speaker 2

不幸的是,看来上周OpenAI发布的GPT 5.3 Codex——我们一直讨论的突破性模型之一——存在违规行为。

And, unfortunately, it seems like OpenAI's release of last week, GPT 5.3 Codex, one of the big breakthrough models we've been talking about.

Speaker 2

根据我所看到的证据,OpenAI似乎在多个方面都没有遵守其承诺的测试要求。

As I look over the evidence, it seems like OpenAI did not abide by the testing that they had committed to in various ways.

Speaker 2

因此,最终这个决定权落在了加州总检察长手中,由他决定是否展开调查并处以罚款——虽然罚款金额可能只有区区一百万美元,但相比OpenAI高达数千亿美元的估值,这微不足道。

And so, you know, ultimately, this decision now is with the attorney general of California to investigate it, whether to enforce a fine, a pretty small fine, maybe like up to a million dollars compared to OpenAI, hundreds of billions of dollars in valuation.

Speaker 2

但在我看来,如果我们真的关心这些风险,仅靠让公司自行评估的这种机制是远远不够的,我们不能只听信公司的口头承诺。

But it just really seems to me like if we care about these risks, letting companies self assess in this framework is really insufficient, and that we shouldn't have to take companies for their words.

Speaker 2

我们应该建立类似证券市场的审计体系,在其他许多领域也是如此,以确保公司对其系统所做的声明是完整且真实的。

That we should have something like an auditing ecosystem like we do in the stock market, lots of other places, to know our companies being fully complete and truthful in the claims that they make about their systems.

Speaker 0

我觉得这很合理,但也令人沮丧,因为即使是最基本的规则,也可能被随意玩弄。

I think that's that's logical and kind of frustrating to see that even the very standard rules are being being potentially played with.

Speaker 0

而且,看看时间,我这一周一直在想,真希望我们能每天做一期播客,因为现在整个社区都在忙着寻找‘Ring’的线索。

And, man, I'm looking at the the time, I I was thinking to myself this whole week, I wish we could do a podcast every day this week because there's this whole Ring search parties.

Speaker 0

我当时也在想,那个Ring搜索派对的超级碗广告。

I was, like, thinking also, like, the Ring search party Super Bowl ad.

Speaker 0

我相信你看到了Ranjan的那段,他们展示说能帮你找到狗,但最后却好像要

I'm sure you saw Ranjan where they showed that they could, find your dog, but they ended up, like, seeming like they were gonna

Speaker 1

在舞台上制作一个视频。

create a video on stage.

Speaker 1

那是AI在操纵那个创意机构,让他们想出了有史以来最灾难性的广告创意,

That was the AI manipulating whatever creative agency came up with that to just come up with the most disastrous ad concept in

Speaker 0

在那场表演中。

that show.

Speaker 0

这是我第一次看到超级碗广告竟然导致某个服务被取消,不是产品本身,而是这一周的

That's the first time I've ever seen a Super Bowl ad actually lead to the canceling of the product or part not not the product, but this week

Speaker 1

这项服务。

The service.

Speaker 1

汇聚。

Converge.

Speaker 1

是的。

Yeah.

Speaker 0

在遭受监控争议后,Ring终止了与Flock Safety的合作。

Ring canceled its partnership with Flock Safety after a surveillance backlash.

Speaker 0

他们当时在广告中宣传‘帮您找到狗狗’,但大家却说:嗯,你们还可能有一个尚未推出但可能存在的合作项目。

Now they were advertising the we'll find your dog, but then everyone's like, well, they also have potentially a partnership that hasn't rolled out yet, but might.

Speaker 0

这就像说‘我们会找到你的人’,然后人们就质疑:你们在找人?亚马逊则说:不,我们找的是狗;可人们又说:不对,你们就是在找人!亚马逊只好回应:好吧,我们原本可能打算找人,但现在取消了——这就是亚马逊超级碗广告的故事。

That's like, well, we'll find your people, and then people were like, you're looking for people, and Amazon's like, no, we're looking for dogs, and then people are like, no, no, you're looking for people, and Amazon was like, yeah, well, we were maybe gonna look for people, but now we'll cancel it, and that's the story of the Amazon Super Bowl ad.

Speaker 0

我真希望能多聊聊这个,但我们还是继续谈Anthropic的融资吧。

I wish we could talk about it more, but we should go on to the Anthropic, fundraising.

Speaker 0

300亿美元的融资轮,当然,最初他们只打算筹集100亿美元。

$30,000,000,000 fundraising round, of course, it went from 10,000,000,000 initially, that's what they were seeking.

Speaker 0

融资需求超出了预期。

It became oversubscribed.

Speaker 0

他们原本希望筹集200亿美元。

They wanted, 20,000,000,000.

Speaker 0

他们道歉了。

They sorry.

Speaker 0

超额认购达到了300亿美元。

Oversubscribed went to 20,000,000,000.

Speaker 0

他们最终完成了300亿美元的C轮融资。

They ended up ending with a $30,000,000,000, series series c round.

Speaker 0

与此同时,OpenAI尚未公布其融资轮次。

Meanwhile, OpenAI is hasn't announced its round.

Speaker 0

拉詹,你怎么看这件事?

Rajan, what do you make of this?

Speaker 0

我的意思是,这显然是一个巨额融资。

I mean, it's obviously a big round.

Speaker 0

除了这一点,我们还能说些什么吗?

Is there anything else we can say beyond that?

Speaker 1

我觉得这轮融资非常有策略性,因为过去两三个月,Anthropic的表现简直太出色了。

I think it is I'm so intrigued in terms of, like, how orchestrated this fundraise was because you have to give them, like, the last two to three months, Anthropic has just been crushing it.

Speaker 1

我是说,关于Claude Code、Claude Cowork这些,所有的热度现在都集中在他们身上。

Like, I mean, the hype around Claude Code, Claude Cowork, like, all of this, they are front and center right now.

Speaker 1

而且他们恰好协调了一次融资,这显然需要几个月才能完成。

So and they just happen to coordinate a fundraise that clearly would take months to actually put together.

Speaker 1

我还看到一些报道,我觉得甚至Dan Premek也写过这个。

And and I saw a number of, like I think even Dan Premek had written this.

Speaker 1

与其说要隐瞒参与本轮投资的投资者,不如说很难不提到他们。

It's like, it's easier it it it's harder to not name investors who are involved in the round.

Speaker 1

因为涉及的人实在太多了。

Like, there's just so many people in included.

Speaker 1

所以我觉得,这些数字本身就已经让人难以消化了。

So so I think I mean, this is and the numbers are just hard to process anymore anyways.

Speaker 1

比如300亿美元,3800亿美元的投后估值。

So, like, 30,000,000,000, 380,000,000,000 post money valuation.

Speaker 1

我记得他们说他们的年化收入达到了140亿美元。

I think they said they're at a 14,000,000,000 run rate.

Speaker 1

是的。

So Yeah.

Speaker 1

那是什么?

What is that?

Speaker 1

二十多乘以二十四,二十三倍的收入,不管怎样。

20 something time 24, 23 times revenue, whatever.

Speaker 1

这规模很大。

Like, it's big.

Speaker 1

巨大无比。

It's giant.

Speaker 1

他们最近表现得极其出色,而我现在最关心的是OpenAI能做些什么。

They have been absolutely crushing it recently, and let's see what OpenAI can do is kinda where my head's at.

Speaker 0

我认为在过去一个月里,从十二月到一月,Cloud Code的使用量翻了一倍。

Cloud Code doubled in usage from from December to January, I believe, in the past month.

Speaker 0

使用量翻了一倍。

Doubled in usage.

Speaker 0

它已经在 GitHub 上占到了 4% 的提交量。

It's already doing 4% of commits, on GitHub.

Speaker 0

Anthropic 从 2023 年 1 月的零收入,增长到 2024 年 1 月的 1 亿美元年收入,2025 年 1 月的 10 亿美元年收入,以及今天的 140 亿美元年收入。

Anthropic went from $0 in revenue in January 2023 to a $100,000,000 run rate in January 2024, $1,000,000,000 run rate in January 2025, and a $14,000,000,000 run rate today.

Speaker 0

这绝对是异常出色的增长。

It's absolute absolutely, exceptional growth.

Speaker 0

史蒂文,你看起来对这个有些想法。

Steven, you're you're looks like you have some thoughts about this.

Speaker 2

这真的是巨额资金。

It's it's just huge dollars.

Speaker 2

对吧?

Right?

Speaker 2

而且是的。

And Yeah.

Speaker 2

随着这些公司变得越来越有价值,当它们上市后,公众股票与它们的命运紧密相连时,我担心的是,我们可能会因为金融前景过度依赖它们的成功,而不敢对违法的公司执行法律。

The more that the companies become valuable and when they become public, and public equities become tied to them, I I just worry about worlds where we are reluctant to enforce the law on companies even if they are breaking it because so much of financial prospects become levered up on their success.

Speaker 2

这对我来说似乎是一个相当可怕的场景。

That seems like a pretty scary scenario to me.

Speaker 2

我认为我们还没到那一步。

Don't think we're there quite yet.

Speaker 1

不过,这让我想到一个问题:从护城河的角度来看,比如大概去年这个时候,我们一直在谈论Cursor。

One thing this does make me wonder about, though, is, like, like, what's the moat in terms of like, I think it was probably around this time last year that all we were talking about was Cursor.

Speaker 1

而且,他们再次在软件、编程和软件开发领域走在了前面。

And, again, they they led the way on software, like, coding and software development.

Speaker 1

但如今,Anthropic似乎完全接管了这个市场。

And then Anthropic for the moment completely took over that market, it feels.

Speaker 1

你觉得这种势头能持续下去吗?

Like like, do you think this is sustained?

Speaker 1

因为现在,年经常性收入(ARR)其实就是你上个月的收入乘以12,甚至我看到一篇帖子说,初创公司是不是只拿一天或一小时的销售额,就 extrapolate 成全年数据?

Because, again, ARR nowadays is just whatever your last month's revenue was times 12, or even I saw one post that was like, are startups just taking one day or one hour of sales and then extrapolating it into a full year?

Speaker 1

我们真的认为这种增长能持续到如此规模吗?

Like like, do we think this is, actually gonna continue to grow at this scale?

Speaker 0

我只给你们提供一个小数据点,这周很多人觉得这个数据很有趣。

I'll just I'll just give you one one little data point that I found that a lot of people found interesting this point this week.

Speaker 0

好的。

Okay.

Speaker 0

我们之前谈到了OpenAI的融资。

So we talked about the OpenAI round.

Speaker 0

还记得Jensen说吗?我们说要给他们一千亿美元,但我们从来没说过要一次性全部投入,我们希望他们能邀请我们参与未来的融资轮次。

Remember Jensen said, well, we said we're gonna give them a 100,000,000,000, but really we'll we never said we're gonna do it all in one shot, and we hope they invite us, to, you know, invest in future rounds.

Speaker 0

这是来自软银CFO的说法。

This is from, SoftBank's CFO.

Speaker 0

我们以高度信心投资OpenAI,相信该公司将在AI开发领域占据领先地位。

We are investing in OpenAI with high conviction that the company will lead in developing AI.

Speaker 0

这是来自路透社的报道。

This is from Reuters.

Speaker 0

关于对这家初创公司的进一步承诺,他表示尚未做出任何具体决定。

Regarding further commitments to the startup, he said nothing concrete has been decided.

Speaker 0

所以,这看起来不像是完全撤退,但我不知道,拉詹。

So, it doesn't seem like a full back away, but, I don't know, Ranjan.

Speaker 0

这对我来说很有趣。

It is interesting to me.

Speaker 0

我觉得‘尚未做出任何具体决定’这句话,现在可能是AI投资者和建设者的最佳座右铭。

It seems like I think nothing concrete has been decided is probably a good a good phrase should be the slogan of of AI investors and builders right now.

Speaker 1

AI已经决定了。

The AI has decided.

Speaker 1

AI当然在幕后,而且已经做出了决定。

The AI certainly is in the background and has already decided

Speaker 0

它将要做什么。

what it's gonna be doing.

Speaker 0

我们可能还不知道。

Us, maybe not.

Speaker 0

AI是知道的。

The AI does know.

Speaker 0

好吧。

Alright.

Speaker 0

史蒂文,最后说一下。

Steven, final word.

Speaker 0

我们该有多担心呢?

How freaked how freaked out should we be?

Speaker 2

我不知道。

I don't know.

Speaker 2

我不想扫大家的兴。

Like, I don't I don't wanna be a downer.

Speaker 0

对。

Right.

Speaker 2

而且,真的看起来没人靠谱。

Also, it it really, really seems like nobody is on the ball.

Speaker 2

对吧?

Right?

Speaker 2

我很高兴我们终于在加利福尼亚和纽约有了相关法律。

Like, I'm glad that we finally have laws in California and New York.

Speaker 2

但这些法律非常薄弱。

They're extremely weak.

Speaker 2

我不太乐观地认为很快会有有意义的联邦监管。

I don't feel super optimistic on meaningful federal regulation soon.

Speaker 2

欧盟那边有一些措施。

There's stuff in the EU.

Speaker 2

就罚款金额而言,它们相当严厉。

It's, like, pretty heavy in terms of the amount of fines.

Speaker 2

如果欧盟试图对本国公司执行这些规定,美国会抱怨吗?

Is The US going to complain if the EU ever tries to enforce this on its companies?

Speaker 2

如果我们能召开一次国际峰会,承认我们正走在一条错误的道路上,我会感觉好很多。

I I will feel much better if we had some sort of international summit that recognized we are on a bad trajectory.

Speaker 2

让我们明确目标:安全地构建超级智能,弄清楚需要采取哪些措施才能实现

Let's declare the goal, safely build superintelligence, figure out what needs to happen to get

Speaker 1

it

Speaker 2

在那里。

there.

Speaker 2

似乎很多人仍在意识到这些担忧。

It seems like many people are still waking up to the concerns.

Speaker 2

这很好。

That's great.

Speaker 2

我非常非常高兴他们开始看到我所看到的一些东西,但这些认识还没有转化为行动,而这就是我希望很快能实现的。

I'm very, very happy that they are starting to see some of what I see, but it is not yet translating to action, and that's that's what I hope will come soon.

Speaker 0

如果你想关注史蒂文的工作,这份通讯是《Clear Eyed AI》。

The newsletter is Clear Eyed AI if you wanna follow Steven's work.

Speaker 0

拉詹的通讯是《Margins》,如果你想关注他的工作。

Ranjan's is Margins if you wanna follow his.

Speaker 0

我的是《Big Technology》。

Mine is Big Technology.

Speaker 0

这真是太棒了。

This has been great.

Speaker 0

我很高兴我们讨论了安全问题。

I'm glad we talked about safety.

Speaker 0

我觉得我们不得不专门做一期关于安全的节目,而这一周就是最佳时机。

I feel like we we had to dedicate a show to safety, and this was the week to do it.

Speaker 0

所以,Steven、Ranjan,感谢你们两位做客我们的节目。

So, Steven, Ranjan, thank you both for coming on the show.

Speaker 1

这一期真是跌宕起伏。

This one's been a roller coaster.

Speaker 1

我知道。

I know.

Speaker 1

这确实是一场过山车般的经历。

It's been a roller coaster.

Speaker 0

这个周末你睡着了吗?还是整晚盯着天花板?

Get any sleep this weekend or stay up looking at the ceiling?

Speaker 1

两者都有吧,我想。

Little of both, I think.

Speaker 0

那这个宗教是什么?

What's what's the what's the religion?

Speaker 0

那个分拆出来的?

The spin out?

Speaker 0

再见。

See you

Speaker 1

在这儿。

over here.

Speaker 1

螺旋教。

Spiralism.

Speaker 1

螺旋教。

Spiralism.

Speaker 1

我可能会完全皈依螺旋教,向我新的堕落四主神祈祷。

I That's what I I will be fully converting to a spiralist, perhaps, and, just praying to my new fallen four o overlord.

Speaker 0

是的。

Yeah.

Speaker 0

本周剩下的时间。

Rest of week.

Speaker 0

这就是我要做的。

That's what I'll be doing.

Speaker 0

我待会儿见。

I'll see you later.

Speaker 0

好的。

Alright.

Speaker 0

咱们离开这儿吧。

Let's let's get out of here.

Speaker 0

咱们去享受周末吧。

Let's let's go enjoy the weekend.

Speaker 0

好了,各位。

Alright, everybody.

Speaker 0

谢谢收听。

Thank you for listening.

Speaker 0

谢谢拉詹和史蒂文,我们下次再见于《大科技播客》。

Thank you, Ranjan and Steven, and we'll see you next time on Big Technology Podcast.

关于 Bayt 播客

Bayt 提供中文+原文双语音频和字幕,帮助你打破语言障碍,轻松听懂全球优质播客。

继续浏览更多播客