If you're a non-technical person, you are used to a world where you send a prompt and then you get a response within a couple minutes.
And once you send a prompt or a chat, you can't do anything else with that AI.
This is built for working with your AIs in an async way.
This new Claude Cowork app is a really good example of agent-native architectures, which means at the bottom of the app, instead of having software that works by deterministic rules, you have an agent, and the agent is wired up to the UI of the app.
I've been at Anthropic for a little bit, but this is the product that my team has built here.
We sprinted at this for the last week and a half.
What we're trying
The last week and a half?
That's it?
Come on.
We've got a new Anthropic drop.
So Anthropic just dropped Claude Cowork, which is basically Claude Code for non-technical people.
We got access to it early at Every.
And so I'm gonna give you a quick run-through of what it is and how it works.
We will have a full write-up on Every in a few hours, probably.
We're kinda figuring out how to do these things.
So I'm gonna add Kieran.
Kieran is here.
Hello, Kieran.
How you doing?
Hey.
What's up?
I was just telling everybody, if you just got here, we are about to do a vibe check of Anthropic's new Claude Cowork feature, which is basically Claude Code but for non-technical folks.
I'm gonna just demo it for you right now in here.
Okay, so let me just share my screen.
All right, so this is what it looks like.
It's Claude Cowork.
So you'll see we've got the chat over here to the left.
So you've still got regular chat, you've got Code here, and then we've got Cowork.
So it's the three C's.
It starts with "Let's knock something off your list."
I love the copywriting here.
You can tell from the way they did this that it's really designed to do deeper tasks on your computer than maybe chat is.
So create a file, crunch data, make a prototype, send a message, organize files.
Over here it's got progress, it's got artifacts, context.
We've been playing around with this for a couple hours now.
I think it's cool.
I think it's cool.
I think there's a lot of really interesting questions from the UX perspective of chat versus Code versus Cowork.
And I think you can really think of Cowork as being like chat that has access to your computer and runs for a long time.
Which is essentially Claude Code but just less intimidating.
So a couple of things that we did that I think are kinda interesting to look at.
Okay.
So here's an example of it working.
I asked it to go to the every.to website and find five competitive companies that do the kind of consulting that we do, and then analyze our positioning.
And you'll see it just went and used my computer.
It's running in a long loop.
So this is a really interesting one where you can do this with regular Claude, but the number of iterations that it's going through, this is many, many minutes of iterations.
So it looks a lot more like, actually, can you make this markdown?
So it looks a lot more like Claude Code, but it's friendly enough for anyone to use.
Well, we'll look at another thing I had to do.
I have a dinner that I have to go to tomorrow night that I had to prepare some remarks for.
And I asked it to basically go to my Gmail and prepare remarks.
It has a connector to Gmail and to Google Calendar, but the connector wasn't working.
I think because this is very beta and it was not out when I was testing this.
And let's see.
And I asked it to draft a response and it drafted a response.
And I think that the response is actually pretty good.
So this actually sounds like me.
This is kinda crazy.
Kieran, you gotta look at this because this has implications for Cora.
So basically the setup was I have a dinner tomorrow that I have to do remarks for.
And the person who organized the dinner was asking me, can you tell me what you wanna talk about?
And so I just said, find the email and draft a response based on what you think I would say.
And this is something that I think I actually could send with minimal edits, which is pretty cool.
Are you seeing this, Kieran?
Yes.
Mhmm.
Looks good.
I think one of the things that's good about this is it's gone through many, many steps to both identify what I would say and how I would say it, and it has all the context, which is pretty cool.
Yeah.
I, yeah.
Go for it.
Yeah.
So how is Cowork different from chat?
Like, Cowork is more made to, you can share your screen, but it's more made to go longer, to really work on something.
So it's more focused on getting you some work done.
And this is what we know in Claude Code already, like when you trigger ultrathink or you trigger planning mode, things like that.
So it is similar in certain ways, but it will also unlock a lot of things that you use as a developer in Claude Code that now suddenly work very well for non-developer tasks.
Yeah.
I can sort of see this being a thing that just really accelerates our growth team or people who are in our consulting business who wanna do some of these tasks and maybe are using Claude Code, but maybe it's a little bit less intuitive.
It should be released now.
I think that they're maybe holding it back, but the blog post is live.
So you can take a look at the blog post.
It's a Cowork research preview.
I'll throw it in the chat.
And yeah.
So I think it'll be out soon.
Let me, I can check with my Anthropic people.
But okay.
Let me go back to this.
So I also asked it to do a calendar audit.
It looks like it's actually still running on this, which is crazy.
I asked this like an hour ago.
Go through the past month of my calendar and do an audit.
Tell me how this reflects my priorities and whether it's aligned with my goals.
Just browsing Chrome.
Can you share your screen again, please?
Oh, shoot.
Yeah.
Thanks.
Yep.
Go through the past month of my calendar and do an audit.
Tell me how this relates to my goals.
And it's been browsing on my computer for hours and hours.
Or not hours and hours, but for about an hour.
One thing that's different about this, which I think is really interesting, is in Claude when you start a chat and it's responding, you have to stop it in order to send any message.
But with this, I can just add to the queue.
So this is more like the Claude Code experience.
So I think this is one of those situations where it's built for you to send stuff as it's working, and it's not like one message, one response, one message, one response.
So it's really more built for long tasks.
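The queueing behavior described above can be pictured with a small Python sketch. This is only an illustration of the idea, not how Cowork is actually implemented; the class name `QueuedAgentSession` and its methods are hypothetical, and each "task" is just a fixed number of pretend steps.

```python
from collections import deque


class QueuedAgentSession:
    """Toy model of a session where new messages queue up behind running work.

    Hypothetical illustration only; not Cowork's real design.
    """

    def __init__(self, steps_per_task=2):
        self.pending = deque()   # messages waiting their turn
        self.current = None      # task the agent is working on right now
        self.steps_left = 0
        self.steps_per_task = steps_per_task
        self.finished = []       # tasks completed, in order

    def send(self, message):
        # Sending never interrupts the running task; it only enqueues.
        self.pending.append(message)

    def step(self):
        """Advance the agent by one unit of work; return False when idle."""
        if self.current is None:
            if not self.pending:
                return False
            self.current = self.pending.popleft()
            self.steps_left = self.steps_per_task
        self.steps_left -= 1
        if self.steps_left == 0:
            self.finished.append(self.current)
            self.current = None
        return True


session = QueuedAgentSession()
session.send("audit my calendar")
session.step()                             # the agent is now mid-task
session.send("draft the dinner remarks")   # queued without stopping the agent
while session.step():
    pass
print(session.finished)
```

The contrast with a one-message, one-response chat is that `send` can be called at any point, including while a task is only half done.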
Yeah.
And it has the to-do tasks also built in on the right, which is nice.
So you can see where it is and what it's doing.
Yep.
Exactly.
Another thing I did is I fed it a book. It's called The Outsider, and I've been reading it for a book that I'm writing.
And I just asked it to basically read the entire book and construct a taxonomy of all the main characters and ideas.
And it looks like it did this.
And this is something that you could also do with regular Claude, but it would just be less detailed.
This is really interesting.
Wonderful.
So yeah.
So if you wanna trigger those longer-running tasks, you can just say make a plan, do this.
So I assume if you just push it to do that more, it will just run for longer.
So if you say take an hour, read through every single email, it won't give up as easily as before, and it will just keep going.
So please try those things.
Yeah.
It does have a plan mode.
That said, I have not used the plan mode yet, but that's pretty cool.
I also asked it to go through our PostHog analytics and do some data gathering.
So we published this guide.
If you haven't seen this guide, I keep talking about this, and everyone at Every is making fun of me because I can't stop talking about agent-native architectures.
We published this guide last week and we have these buttons: read with Claude, read with ChatGPT.
I was curious, okay, how many people click these buttons?
And that's the kind of thing where I would go ask Andre, who runs our platform, and be like, can you go look this up?
But instead I just went to Cowork and I said, can you go into PostHog and just find all this stuff?
And it said, okay.
Total chat with Claude button clicks: 4,000.
That's actually freaking crazy.
Oh my god.
We should get money.
We should do referrals.
Referral fees, baby.
What about chat with ChatGPT?
What about copy for agent and
So this is cool because we were in a rush since we started this morning, and we didn't have any MCP set up.
What we just did was we connected Chrome, and Dan is logged in on Chrome in PostHog.
So it just browsed through the thing and got the things.
So that's very handy.
Like, MVP, just make sure you connect Chrome.
That's a very good one to add already.
You can also do that with the normal one, but it's very good at browsing and figuring things out, especially now that it doesn't stop as quickly, which is really, really handy.
So anything you can do in a browser, you can now use Cowork for, to run longer tasks and kick off things, and you can use multiple tabs as well.
So you can have five tabs being controlled by Claude and running, which is great.
Totally.
Totally.
And it sounds like this is down for people.
So come hang out this week.
Yeah.
It's down for me as well, but for Dan it's working, so
I'm the only one it's open for.
So we've got a monopoly here on Anthropic Cowork content.
So yeah.
If you're here and you have stuff you want me to try, just let me know.
I'm happy to throw it in the chat.
And make sure, wait.
I have something fun to do.
Make sure that you read Every, because Every is the only subscription that you need to stay at the edge of AI.
Every.
We've got these vibe checks: when new models come out, we get them beforehand.
We had this several hours ago.
We knew it was coming since last week.
And we always have all the up-to-the-minute stuff that you might need.
We also have a bundle of apps that we make.
We have an app that Kieran makes called Cora, which is an email assistant.
We've got one called Sparkle that helps you organize your files.
We've got Spiral, which helps you write.
And we've also got Monologue, which is a speech-to-text app, which I will show you shortly.
And let's actually go back to the Claude demo real quick.
So, Kieran, if you see anyone asking questions, just
Yeah, well, there's one good question from Hunter.
He says, how good is it at research?
Like, one of the things I love normal Claude for is the deep research.
Like, does that exist?
So maybe we can see if it can do deep research on something.
Yeah.
Let me know if you have a research query you want me to try, but I can show you, actually, you know what?
I'm gonna share a different screen.
Hold on.
Plus share screen.
Okay.
I'm figuring out my livestream setup.
This is the first time that we've really done a livestream for Vibe Check.
Okay.
So now you should be able to see my screen again.
For research, okay.
It depends on what you mean by research, right?
Because this is a research query.
Can you connect to PostHog and tell me, for the agent-native guide that we published last week, how many people clicked the chat?
Like, that is research, and it does give me actually a really good answer.
This is so interesting.
But another form of research that we talked about is analyze my competitors.
This is specifically analyze our competitors for Every's consulting business.
And I'm gonna say open in Proof.
Proof is the agent-native markdown editor that I built over the weekend.
Can you believe I just said that?
And so this is a research document that Claude Cowork put together, which, you know, it's not the [unclear] thing I've ever seen.
Let's see.
I think this is not bad.
I'm not noticing a super significant difference between this and what a normal Claude would pull out.
But I can see the research itself is much more extensive than normal Claude would do.
And let me just actually throw this into normal
You can see also, if you do deep research, the research agents in chat, that's pretty extensive normally, but here you can see more.
Yeah.
So yeah.
I don't know.
Maybe that is also available in that version.
We're still figuring out what everything is that is available, but it's very close to what you can do in Claude Code.
So the kind of fusion of chat and Claude Code is Cowork.
Yeah.
Yeah.
You can see that they're calling these tasks as opposed to chats.
So I think here's a good way to think about it.
A good way to think about it is if you're a non-technical person, you are used to a world where you send a prompt and then you get a response within a couple minutes.
And once you send a prompt or a chat, you can't do anything else with that AI.
You have to move on to something else.
This is built for working with your AIs in an async way.
So everything is set up around the idea of a task, the idea of having a queue.
This is all set up so that you can say go do something and then not think about it for a while and then come back.
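The "hand it off, then come back" pattern is the same shape as starting a background task and collecting its result later. Here is a minimal sketch using Python's standard `asyncio` module; the function `run_task` and the task name are made up for illustration, and the short sleep simply stands in for a long-running agent job.

```python
import asyncio


async def run_task(name, seconds):
    # Stand-in for a long-running agent task; the sleep simulates the work.
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def main():
    # Hand off the task and get a handle back immediately.
    handle = asyncio.create_task(run_task("calendar audit", 0.05))
    # Do something else while the task runs in the background.
    other_work = "wrote an email while waiting"
    # Come back later and collect the result.
    result = await handle
    return other_work, result


other_work, result = asyncio.run(main())
print(other_work)
print(result)
```

In a synchronous chat, the caller would instead wait on `run_task` directly and could do nothing else until it returned.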
Which is very different from the normal Claude app, where you're trying to get an answer pretty quick.
And I think that's the best mental shift.
I think the real question that I have is, is this deserving of its own tab?
One reason it might be is there's a difference between how you might treat one of these versus one of these.
Like, these are more throwaway.
These are probably bigger chunks of work.
But honestly, it's kind of confusing.
I would rather just have it all in one tab and then have it do different levels of research and thinking and async based on the task, and maybe based on a setting, like saying, really fucking think about this.
I don't know.
What do you think, Kieran?
How would you solve this?
Yeah.
Yeah.
Like, for me, it's also confusing.
It's like, oh, great.
There's another tab, and I have to first think where to go.
But I do get it because, yeah.
I've seen this transition with engineers.
As an engineer, we had the copy-paste into ChatGPT, obviously, and then that evolved into Cursor, more agentic, which evolved to, like, I don't look at code anymore.
And I think there will be a similar transition for people that do research or coworking.
Like, maybe now people are used to going into Chrome and seeing what's going on, what's happening, and that moves towards more of, I'm just going to let it rip and do its thing.
And then after it gets back, I'm going to review whatever the output is rather than understanding every single step all the way.
And you're clearly already there.
That's how you're thinking.
But I do understand if you're not there, if you're more in the chat thing where it's like, oh, do this now.
Why don't you look at this?
Like, if you have this conversation, it's maybe more chat, and Cowork is more that you hand off a task to your agent.
And your agent comes back and you review the work, and you can follow up and give extra directions.
But I do understand introducing a new tab, because there you need to shift your mind to do that.
And obviously, we're doing that.
But I understand lots of people in the world, especially those who are not coders, still have to make that transition where you just hand off something and then let it do stuff for thirty minutes or an hour and then come back and review that.
So I do understand why they wanna separate it.
Even though it's all the same technology, because chat, Code, and Cowork, it's all the same model and a very similar harness around it.
It is philosophically, or in how you use it, maybe a little bit different.
So I guess that's why they did that.
Yeah.
It's interesting too because when we did this original vibe check, when we got it a couple hours ago, we had me and Kieran and a couple other people internally at Every on the phone and we were demoing it together.
And their initial reaction was like, I don't know how this is different.
Is this even that useful compared to regular Claude or Claude Code?
Because a lot of them are just using Claude Code directly.
And I think that was a really interesting thing for me, where you're not gonna actually realize how useful this is until you get your hands on it.
And there's probably gonna be a learning curve to it if you're a non-technical user who's not used to the idea that you can just hand off your work and then come back.
It's probably gonna take a while to actually figure that out and get used to this as a UX paradigm.
So maybe there's some benefit then to having it be a separate tab so people can basically realize, oh yeah, this is different and I should treat this differently.
It's a real adjustment.
Yeah.
Yeah.
Yeah.
And that's
So if you wanna learn about that adjustment, we're writing about that for coding, but you could really apply whatever is happening to coding probably to Cowork as well, like how that shift happens, how that goes.
So, yeah, we wrote about some of these things, and I created a plugin around this idea.
So what I will do in the coming days as well is see if the pattern or the paradigm of compound engineering applies to Cowork.
Can I get that working in Cowork?
Because I would be very curious to expand that and see if it works inside here as well.
Anthony, I see that you're from Anthropic.
Do you wanna come on the stream?
I'm gonna send you a link.
I would love to hear what you have to say about this and anything that we should try or anything that's missing.
Here, give me a sec.
Copy.
Let's see.
Alright.
I sent you a stream link.
If you wanna come on, feel free.
If you're not feeling it, that's also totally fine.
Yeah.
He says it looks or is closer to Claude Code than chat, and it feels like that because all the tools, like the ask-user-question tool and stuff like that, have a UI, which is nice.
So you can ask it, hey.
Can you interview me?
Ask me a few questions.
And there's a nice UI with multiple choice and stuff like that.
So, yeah, it's really cool, that UI part.
It is really cool.
Okay.
So let's keep looking at this.
It's still working.
One of the things I noticed is when it was erroring, it gave an error from Claude Code, like the internal error message is Claude Code.
So it seems like it really is just a UI wrapper on Claude Code rather than a different agent harness, or maybe like a Claude SDK.
Anthony, if we're wrong about that, or anybody else from Anthropic who's listening, I'd be very curious for your thumbs up or thumbs down on that.
But that's an interesting design choice, not to just use actual Claude Code.
I assume that's because it's already in the app.
So it's pretty easy.
You don't need to use the SDK, but it's a really interesting thing to see.
One thing that I wanna show people, in case you have not been watching everything that we're doing at Every, and if you haven't been, I don't know why you would not.
I don't know what's wrong with you.
But one thing that's really cool that we're thinking a lot about is agent-native architectures.
And for an agent-native architecture, this app is a really good example.
This new Claude Cowork app is a really good example of agent-native architectures.
We think about agent-native architectures as sort of like Claude Code in a trench coat.
Which means at the bottom of the app, instead of having software that works by deterministic rules, you have an agent.
And the agent is wired up to the UI of the app.
And so when you click a button, it is actually just going to the agent with a prompt.
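To make the "a button click is just a prompt to the agent" idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: `make_button` and `echo_agent` are hypothetical names, and the agent is a stand-in function that only echoes what it receives. It is not Cowork's or Every's actual code.

```python
from typing import Callable


def make_button(prompt_template: str, agent: Callable[[str], str]):
    """Return a click handler that forwards a filled-in prompt to the agent."""
    def on_click(**context) -> str:
        # No hard-coded business logic here: the click becomes a prompt.
        prompt = prompt_template.format(**context)
        return agent(prompt)
    return on_click


def echo_agent(prompt: str) -> str:
    # Hypothetical stand-in for a real agent; it only reports what it received.
    return f"agent received: {prompt}"


organize_files = make_button("Organize the files in {folder} by project.", echo_agent)
reply = organize_files(folder="~/Downloads")
print(reply)
```

In a conventional app, `on_click` would call a fixed routine; in this sketch, swapping `echo_agent` for a real agent is the only thing that changes what the button can do.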
And I think this is a new way of building applications that we've been working on internally at Every.
If you're interested in that, I highly recommend that you look at this guide.
We'll put a link in the chat.
But this guide goes through how to build agent-native architectures.
It's pretty cool, and it makes you build stuff.
Like, I built this.
This is a markdown editor that I built with Claude Code over the weekend.
So in the last couple days, I built this whole thing.
We use it internally.
It's called Proof.
When I get a plan from Claude, it helps me track, okay, what things have I approved?
So you can see I'm approving things here.
So I can track what have I approved?
What have I looked at?
What's done?
What's not done?
And, yeah, there's a lot of cool stuff here.
I'm gonna stop blabbing about agent-native and go back to Claude.
Kieran, anything you wanna add here?
Really, what I'm curious about is, as a power user, can I load my own plugins and stuff like that?
And there were questions about, can you do custom MCPs?
I think you can do all the MCPs, and also this app has access to your machine.
So it can use AppleScript to load things on your machine, which is really cool.
Yeah.
And, yeah, I'm really curious where it goes.
Like, clearly, it's trying to get Claude Code to the normal user, but I think as a power user, I would use this as well, because in the morning, I start up Claude Code to make a daily plan.
Like, what am I going to work on for the day?
But it's probably nicer to do in an app.
And also if this translates to mobile, this is super powerful because you can do more powerful work on your phone that's not necessarily code-related.
Mhmm.
So, yeah, we need to experiment.
But there are lots of interesting things.
I'm very curious as a power user, even though it is the same technology.
Maybe it's a new way to use it, which is interesting.
Yeah.
这是我让AI帮我做的日历审计。
So this is my calendar audit that I asked it to do.
我让它审计我的日历,并与我的目标进行对比。
So I asked it to audit my calendar and compare it to my goals.
我审查了您整个月的日历活动。
I reviewed your entire month of calendar activity.
这是一般版的Claude可能不会做的事情。
So this is something that like regular Claude probably would not do.
我有很多会议和每日站会。
I have a lot of meetings and stand ups.
所以这是一个有趣的点。
So that's an interesting one.
实际上我并没有参加很多这些会议,所以这可能不太公平。
I actually don't go to a lot of these, so that's probably not fair.
而且我还有很多单对单会议。
And I also have a lot of one on ones.
我安排了很多播客,但我正开始取消它们。
I have a lot of podcasts scheduled, which I'm starting to get rid of.
所以它实际上做得相当不错。
So it did it actually did a pretty good job.
内容媒体、健康、不可妥协,等等等等。
Content media, health non negotiable, blah blah blah.
很多天都有十到十五个以上的安排事件。
Many days have 10 to 15 plus scheduled events.
这听起来大概是正确的。
That sounds probably right.
这很有趣。
This is interesting.
它问:你的首要任务是什么?
It said, what are your top priorities?
我本来以为Claude会知道这个。
I would expect Claude to know this.
我在想它是否已经访问了我所有的记忆,因为Claude肯定知道我的优先事项是什么。
I wonder if it has access to all my memories yet because Claude definitely knows like what my priorities are.
但不管怎样,这挺酷的。
But anyway, this is pretty cool.
我喜欢这个。
I like this.
看看它有没有做到,哦,我们得到了分类体系。
See if it did oh, we got we did we got the taxonomy.
感觉它像是在慢慢懒加载整个对话历史。
It feels like it's slowly lazy loading all the conversation history.
所以它并没有发生,就像这件事没发生一样。
So it doesn't it's like this isn't happening.
这已经是我们做过的了。
This is already done.
只是它还没加载出来。
It just didn't load it.
我觉得这里的很多功能还没来得及实现状态提示,所以很容易看出来,比如在代码标签页里,我通常能清楚地看到哪些内容已经合并了,哪些还在等我处理,但这里却只是一个杂乱无章的列表,大概是按时间新旧排序的,而且没有任何视觉上的区分,这挺有意思的。
I feel like a lot of the affordances here, they haven't built the statuses yet. In the code tab, for example, I could usually pretty easily see what I've merged and what's waiting for me and stuff, but this is just an unorganized list, I guess by recency, and there's no visual differentiation, which is interesting.
你好,凯特。
Hello, Kate.
我们的主编凯特就在镜头外。
Our editor in chief, Kate, is just off camera.
凯特,事情进展得很顺利。
Kate, things are going well.
我们这里有2300人。
We've got 2,300 people here.
太惊人了。
Incredible.
一起查看Claude Code。
Looking at Claude Code together.
这太令人兴奋了。
That's so exciting.
是的。
Yep.
哦,你想要,好的。
Oh, you want okay.
让我先,嗯。
Let me just yeah.
也许我该试试这个。
Maybe I should try that.
如果有人来自Anthropic想来直播和我们交流,我能看到你们在聊天区留言。
If anyone from Anthropic wants to come on the stream and talk to us, I can see you commenting in the chat.
我会发给你们一个StreamYard的链接,只要说一声‘是的,我想来直播聊聊’就行。
I'll send you a link to StreamYard just like say, yes, I would like to come on the stream and chat.
我们非常友好,如果你们已经在场,我相信很多人都很想听听你们的声音。
We're very, very friendly and I think a lot of people would love to hear from you if you're already here.
我会更新这个应用。
And I will update the app.
这确实是,这其实很重要。
This is yeah, this is actually, this is important.
我现在有个测试版,所以这个功能我还没更新到我的应用里,我有点不敢在直播中更新。
I have a beta build right now, so I haven't updated my app, and I'm a little afraid to do it on a livestream.
基兰,你的应用里有这个功能吗?
Kieran, do you have it in your app?
我很好奇。
I'm curious.
我正试着让它运行起来,但这里一度瘫痪了,不过现在看起来又恢复了。
I'm trying to get it working, but it was down for a while here, but it looks like it's woken up again.
太好了。
Cool.
还有其他人有什么问题或者想让我试试的吗?
Anybody have other questions or things they want me to try?
我在收请求。
I'm taking requests.
所以尽管提出任何有趣的查询吧。
So ask any interesting queries.
最有趣的查询,我们会在做Vibe Check时放进Every里。
The most interesting query, we'll put in Every when we do our Vibe Check.
我们来看看。
Let's see.
我想知道我能不能用这个来写代码。
I wonder if I could use this to code.
我有点想看看。
I'm kind of like let's see.
哦,这个,它确实对我们Every的代理原生指南做了审计。
Oh, this, it did our audit of the Every agent native guide.
你能看看它有没有出现伪影吗?
Can you see if it has artifacts as well?
如果类似的话,可能吧?
If it's similar, maybe?
好的。
Okay.
它确实有,嗯,这个没有产生伪影,但我们确实有其他伪影,好的。
It does have a well, it didn't do this one in an artifact, but we do have artifacts Okay.
在另一个地方。
In another place.
让我找一下。
Let me just find it.
我注意到右侧明确地写出了这个上下文。
I do like the fact that the context is clearly spelled out on the right.
是的。
Yeah.
这很不错。
That is nice.
看,上下文就在这里。
See, yeah, context is here.
我曾在某个地方看到过工件标签。
I saw the artifacts tab somewhere.
是的。
Yeah.
它对我来说就像是一个更友好的 Claude Code 版本。
It's just like a friendlier version of Claude Code to me.
是的。
Yeah.
我的也在正常运行。
Mine is working as well.
你能找到我在Cascade项目中的Every Proof仓库吗?
Can you find my Every Proof repo in Cascade projects?
基本上,我想让你总结一下我一直在Every Proof中开发的新功能,也就是溯源追踪功能,并写成一个漂亮的HTML文件工件,让我可以发给Kieran,向他解释新的溯源功能将如何工作。
Basically, I want you to do a summary of the new feature, the provenance tracking that I've been building in Every Proof, and write it up in a nice, like, HTML file artifact that I can send to Kieran to explain to him how the new provenance is gonna work.
我觉得这是一些既与开发工作相关,但又可能不是我会让Claude Code去做的那种任务。
I think it's this is a combination of something that is kinda it's dev work related, but it's probably not something I would ask Claude Code to do.
在这个Linux虚拟机环境中,我无法访问你的Mac文件系统。
I don't have access to your Mac's file system in this Linux VM environment.
有意思。
Interesting.
这很有趣,因为它确实有访问权限。
So that's interesting because it definitely does have access.
我们来看看。
Let's see.
我觉得让人困惑的一点是,当它在你的电脑上运行时,和不在你电脑上运行时的区别。
One of the things that gets confusing about this I guess is when it's running on your computer versus when it's not.
我认为对普通人来说,可能很难理解的是,当你在聊天中使用它时,所有操作都是在线进行的。
I think it's sort of unintuitive to the average person probably that when you're using it in chat, it is all online.
而当你在这里使用它时,实际上是在你自己的电脑上运行。
And when you're using it in here, it's actually on your own computer.
我很好奇,他们在用户体验方面是如何设计来明确这一点的。
I'm very curious like how they thought about making that clear from a UX perspective.
如果有人来自Anthropic并正在观看这个直播,为什么它会认为无法访问我的Mac文件系统?
And if anyone from Anthropic is on this stream, why does it think that it can't access my Mac's file system?
哦,也许我需要专门添加这个文件夹。
Oh, maybe I have to add that folder specifically.
哦,对了。
Oh, yeah.
原来就是这个原因。
That's what it is.
这太有趣了。
That's so interesting.
说实话,我只是想随便给它访问我文件系统的权限。
I just wanted to YOLO, give it YOLO access to my file system, to be honest with you.
看看这些项目,这就是你知道我有问题的方式。
Look at all these projects, by the way. This is how you know, like, I have a problem.
这些基本上都是Vibe Coded项目。
Like, these are all Vibe Coded projects, basically.
我们来看看。
Let's see.
Every Proof。
Every Proof.
你知道吗,就是这个。
You know what, it's this one.
好的,始终允许。
Okay, always allow.
好吧。
All right.
现在我可以……Monologue的好处是。
And now I can... The nice thing about Monologue...
Monologue是我们Every的一个应用,它有一个快捷方式可以用。
So Monologue is one of our apps at Every. There's a shortcut for this.
我只需点击一下,然后重新粘贴,真棒。
I can just click it and then repaste. And cool.
但说实话,我真正想要的是访问我整个电脑。
But yeah, what I really want, I just wanted to access my whole computer.
它有那个,这很有趣。
It has that That's interesting.
它有那个文件清理提示。
It has that file cleaning prompt.
我想知道如果它无法访问,这个功能是怎么工作的。
I wonder how that works if it can't access.
是的
Yeah.
对,没错
There yeah.
对,我现在也在测试它
I'm testing it now as well.
我想看看能不能运用我已有的技能
I'm trying to see if I can use the skills I already have.
我也可以
I can also
哦,有意思
Oh, interesting.
我说的是整理并清理我的下载文件夹,结果它似乎自己弄明白了,这真是太代理原生了。
So I said organize and tidy up my downloads, and then it seems like it figured out... This is so agent native.
它居然知道怎么选择,就好像我手动选了那个文件夹一样,嗯
It figured out how to select It's as if I selected that folder in the Mhmm.
在用户界面上。
In the UI.
就是因为我是特别这么要求的。
It just Because I specifically asked for it.
这很酷。
That's cool.
我觉得这是一个非常聪明的设计。
I think that's a really smart affordance.
我真希望我在这里也能被激活。
It just like I wish I had gotten activated here too.
它知道我正试图访问级联项目中的一个文件夹,就好像我点击了这个一样。
Like, it it knows that I'm trying to access a folder in cascade projects and it is as if I clicked this.
你确定吗?
Are you sure?
这是很自然的。
It's the natural.
看看这个是否真的有效。
See if this is actually working.
费利克斯,你加入吗?
Felix, are you joining?
太棒了。
Amazing.
我在 X 应用、也就是那个万能应用上找到了你。
Let me just find you on the X app, X, the everything app.
而且
And
如果你能共享我的屏幕,我就可以展示一个简短的操作,我会这么做。
Then if you can share my screen, then I can show a short I will do that.
好的。
Yeah.
我来共享一下。
Let me share it.
你那边操作的时候。
While you do that.
实际上,这很有帮助。
Actually, this is helpful.
好的。
Alright.
我要删掉了。
Gonna remove.
搞定。
There you go.
你已经开始了。
You're off to the races.
所以我刚才在试这个。
So I was trying this out.
帮我生成一个VST插件,问我一些用户问题,我找到了一点,发现了一些东西。
Help me generate a VST plugin, ask me user questions, and I found a little so I found some things.
所以,基本上,询问用户问题这个功能很棒,因为它的界面就是这样。
So, basically, the ask user question, I love, because it's this UI.
它会引导你一步步进行,你可以点击一、二、三、四、五,这非常不错。
It, like, runs you through, and you can hit one, two, three, four, five, which is very nice.
奇怪的是,我根本没有回答,它却自动跳过了这一步。
The weird thing is I didn't answer it, and it started automatically skipping this.
也许现在修好了,但在另一个版本里,它会自动开始——看这里。
Maybe it's fixed now, but in the other one, it started, automatically, ah, here.
跳过。
Skipping.
好了,就这样。
There we go.
看到了吗?
See?
所以如果你的鼠标不在这里,它就会以为用户不在,于是干脆直接跳过这一步,这非常令人困惑。
So if your mouse is not on here, it just thinks, oh, this user is not here, so we're going to skip this altogether, which is very confusing.
但我也很喜欢,因为我是个喜欢随意跳过权限的人,我能理解。
But, also, I love it because I'm a dangerously skip permissions person, and I understand.
但奇怪的是,如果你在这里,滚到最顶部,它就会跳到下一个。
But it's weird because if you're here, scrolled all the way up, it will skip to the next.
所以就让它这样吧,这里有点奇怪。
So just leave it, strange, here.
但酷的部分在这里。
But the cool part is here.
我可以说‘三’,它就会继续下去。
I can say three and it will go and continue.
所以我觉得它会一直继续下去,而且设置成会持续运行直到完成。
So there's like I like that it keeps keeps going, and it's set to, like, keep going and finish.
这里的跳过界面有点奇怪,但比如说多点延迟。
The skip UI here is a little bit weird, but let's say multitap delay.
你可以看到,它在使用我的技能,也就是JUCE技能。
And you can see, it's working with my skill, the JUCE skill.
哦,是的。
Oh, yeah.
这是我的。
It is mine.
Happy Sharp Hopper 只是这个事件发生的本地地点。
Happy Sharp Hopper is just the local place where this is happening.
说一下,这是一个以前从未存在过的界面,这很酷。
So this is an interface that never existed before, which is cool.
是的。
Yeah.
这有点奇怪,因为它在这里是内联显示的,但实际上你是在回答问题。
It's a little bit weird because it's inline here, but in reality, you're answering.
所以,我相信,我们来看看。
So, I'm sure let's see.
发送请求。
Sending requests.
是的。
Yeah.
所以这个跳过部分非常令人困惑。
So the skipping part is very confusing.
我不明白为什么现在会跳过,但我更希望在开始会话时就能知道这是什么类型的会话。
I don't know why it's skipping now, but I do like like, I would rather say maybe when I start the session, what kind of session it is.
比如,这是一个‘YOLO,冲吧’会话,还是一个会在Cowork标签里明确提醒我的会话:嘿。
Like, if it's, like, a YOLO, let's-go session, or one where it, like, pings me and, like, very clearly in the Cowork tab says, like, hey.
你需要填写一份问卷。
You need to answer a questionnaire.
我需要你的关注。因为两种方式各有道理,而现在它有点卡在中间,不够清晰。
I need your attention. Because, like, there's something to be said for both, and now it's kind of somewhere in the middle where it's not super clear.
所以我更希望它能直接喊我:嘿。
So I'd rather have it yell at me and say, yo.
我需要你的输入,但我并不想参与创建或使用这些功能。
I need your input on something. And I don't wanna give input on, like, creating or using things.
但如果是的话,它会改变这个提问方向,那这个功能应该挺有用的。
But, like, if it if it will change the direction of, what this will be with the ask user question probably that that is handy to have.
到目前为止是这样。
So far this.
挺有意思的。
Interesting.
既然这里好像有一些Anthropic的人,我有个想法。
Since it seems like there are some Anthropic people on here...
我对‘询问用户问题’这个功能有些感受。
I do have a a feeling about ask user question.
Kieran,我想听听你的看法。
I'm I'm curious what you think, Kieran.
我只是觉得,它显示的字符数有限,超出的部分就会被隐藏,这真的让我特别烦。
I just there's a limit to how many characters it displays, and then it just goes over and hides the rest of my answer, and that just annoys the shit out of me.
你知道我说的是什么吗?
Do you know what I'm talking about?
是的
Yeah.
我知道。
I I know.
是的
Yeah.
它需要更具灵活性一些。
It it needs to be a little bit more flexible.
而且,为什么不能有20个选项呢?
Also, like, I want, like, why not 20 options?
为什么上限是五个呢?
Like, why is five the maximum or something?
有时候你确实需要20个。
Like like, sometimes you just have 20 that you need to.
我明白,但确实也是。
Like, I get it, But also yeah.
别问问题了。
Stop questions.
去构建吧。
Go build.
这样很好。
So that is nice.
现在就做出来。
Just make it now.
好的。
Okay.
所以它并没有真正像……它那边还在做些事情,但这里,嗯,还是有点奇怪。
So it did not really like, it's still doing stuff there, but here, like, it's I mean, it's a little bit wonky still.
但我很喜欢询问用户的问题流程。
But I love I love the ask user question flow.
这非常有用。
It's very useful.
好的。
Okay.
所以它在这里执行,你可以看到右边的待办事项。
So it does it here, and you see the to-do list, which is here on the right.
等等。
Wait.
我刚才走神了。
I was distracted.
你在这里正在开发什么插件?
What plugin are you building here?
你能倒回去一下吗?
Can you can you back up?
我只是在埋头苦干。
I'm just grinding.
我想到了一千个。
I think about a thousand.
好的。
Okay.
所以我有一个叫做JUCE技能的技能,它对VST开发了如指掌,我问它:你能帮我头脑风暴一个VST插件吗?
So I have a skill called the JUCE skill, which knows everything about VST development, and I asked, can you help me brainstorm a VST plugin?
我们正在做一个延迟效果。
And we're doing a delay.
哦,就像是一种音频效果。
Oh, like, it's an audio effect.
所以这是你在音乐制作中使用的数字声音处理技术。
So digital sound processing that you use in your
我明白了。
I see.
是的。
Yeah.
音乐制作。
Music making.
是的
Yeah.
对
Yeah.
我现在正在制作一个延迟效果,并且正在构建它。
And I'm making a a delay now, and it's building the delay.
通常来说,构建延迟时,Claude Code 很棒,但有时候你只是想头脑风暴一下。
And, like, normally normally, the building of the delay, Claude Code is great for, but sometimes you wanna brainstorm.
你并不想真的去构建。
You don't wanna build.
这就是我喜欢 Cowork 的原因,我只是想稍微头脑风暴一下。
And that's why I like Cowork because I just wanna brainstorm a little bit.
有趣的是,它具备这些技能。
And the cool part is it has these skills.
我明白了。
I see.
所以,是的。
So, yeah.
基兰,我想打断你一下,因为现在有一位来自Anthropic的团队成员在直播中。
I wanna interrupt you really quick, Kieran, because we have a member of the team from Anthropic here on the stream.
费利克斯,欢迎。
Felix, welcome.
大家好。
Hi, friends.
你们怎么样?
How are y'all?
嘿。
Hey.
很好。
Good.
你怎么样?
How are you?
我们以前没见过面。
We never met before.
跟我讲讲你自己吧。
Tell me about you.
比如,你在Anthropic做什么工作?
Like, what do you do at Anthropic?
你是怎么参与进来的?
Like, how are you involved in this?
我怎么参与进来的?
How am I involved in this?
我在Anthropic待了一段时间,但这是我的团队在这里开发的产品。
I've been in Anthropic for a little bit, but this is the product that my team has built here.
我们过去一周半一直在全力推进这个项目。
We sprinted at this for the last week and a half.
我们试图做的
What we're trying to do
过去一周半?
The last week and a half?
就这样?
That's it?
拜托。
Come on.
明确地说,我认为很多人已经意识到,像Claude Code这样的工具用于非编程工作会对人们有所帮助,而我们真正想做的是帮助人们完成工作,无论是个人事务还是企业事务。
To be clear to be clear, I think many people have had the idea that something like Claude Code for noncoding work would be helpful and useful to people and fundamentally what we wanna do here is we do wanna help people out with their work, like whether that's a personal thing or a corporate thing.
我们之前有过多个原型,尤其是在圣诞节前,但我觉得在假期期间,我们发现了一件事——我相信很多人都注意到了,越来越多的人正在像我们一样,用Claude Code来做几乎任何事情。
And we've had a number of different prototypes, in particular before Christmas, but I think over the holidays, one thing we have seen, and I'm sure many people have seen this, is that an increasing number of people are using Claude Code for almost anything, just like we are, right?
我们正在用Claude Code自动化我们的整个生活。
We're sort of like automating our entire lives with Claude Code.
所以我们思考,有什么小而早期的功能,我们可以尝试推出,与用户一起迭代,真正弄清楚什么样的用户体验才是正确的,我们最终应该构建什么样的产品。
So we were thinking, what is a small early thing that we can try out and ship to people and iterate on with them together, to really figure out what is the right user experience, what is the right thing we aim to build.
这就是了。
And this is it.
这是个研究预览版,非常早期的alpha版本,有很多粗糙的地方,就像你已经看到的那样。
This is the, sort of, like, research preview, very early alpha, a lot of, like, rough edges, as you've already seen.
对吧?
Right?
关于它,我觉得我们很快就能改进很多地方。
There's a lot of things about it that I think we're going to improve very quickly.
嗯哼。
Mhmm.
但这是我们尝试公开构建,并与外界的人们共同合作的努力。
But this is our attempt to, like, build in the open and work together with people out there.
我非常喜欢。
I love it.
跟我们说说你们做出的一些设计决策吧。
Tell us about, like, some of the design decisions you made.
比如,早期的一个决策是,你们没有在聊天标签页里加一个Cowork模式,而是增加了第三个标签页。
Like, an early one, for example, is there's a third tab instead of maybe adding a Cowork mode into the chat tab.
你们在思考和制定当前产品功能设计时,过程是怎样的?
Like, how did you think about it, and what was the process to come to the design that you have currently for how the product works?
这是个很好的问题。
That's a great question.
我认为,目前你在各类智能体技术应用中看到的用户界面——不仅限于Anthropic,而是整个行业——在未来一两年内可能会发生巨大变化。
So I think one belief I have is that the current user interface that you see across agentic applications, not just at Anthropic but across the industry, is probably going to change pretty dramatically in about a year or two.
现在,我们有大量高度专业化的独立输入框,以及围绕特定任务的大量自定义框架。
Right now we have these hyper specialized individual input fields and we have a lot of custom scaffolding around the specific task that you're going to do.
但随着模型智能的提升,以及整个行业或许能更好地解决通用性问题,我预计我们最终会看到更少的界面,却能支持更广泛的应用场景。
But as we see the intelligence of models improve and as we also, like, maybe holistically as an industry figure out a little bit of the generalization problem, I expect that we're actually gonna see a smaller number of interfaces for a wider range of use cases.
目前我们这样分开设计的原因,是为了明确表明:这个独立的功能模块仍处于建设中。
For now, the reason we broke it out is because we wanna be pretty transparent that this separate thing is a construction site.
对吧?
Right?
我们相当于让你走进了我们的厨房。
That we're sort of letting you into our kitchen.
我们希望和你一起合作。
We want to, like, work together with you.
我们几乎每天都会上线一些新功能、修复一些bug,尝试一些新东西。
We wanna ship almost every single day some new features, some bug fixes, try out some things.
所以这个独立的标签页相当具有实验性。
So this separate tab is fairly experimental.
你可以说它处于前沿甚至最前沿,只是没那么精致,迭代速度更快,这也是为什么它被放在独立标签页里的主要原因之一。
It's, you could say, on the frontier or the bleeding edge, but it's just a little bit less polished and a little faster paced, and that's one of the main reasons why it's in a separate tab.
也有一些技术上的原因。
There are some technical reasons too.
我可以告诉你其中一个原因:目前这个功能是在你的电脑上运行的,所以你的聊天记录是本地的。
I could tell you, like, one of them is that, currently, this is running on your computer, so your chats are local.
它们不会与其他设备共享。
They're not shared with other devices.
我们在如何赋予Claude更多代理能力方面,正变得更加积极主动。
We're being a little bit more aggressive in how many agentic abilities we give Claude.
这些是主要原因。
Those are the main reasons.
你是怎么考虑的呢?因为我觉得这会是一个巨大的用户体验障碍。
How did you think about because I feel like that's such a huge UX hurdle to get over.
你们是怎么考虑让用户知道的呢,嘿。
How did you think about letting people know, hey.
这个功能实际上是在你的电脑上运行,而聊天功能虽然在同一个应用里,却不是这样,对吧?
This is actually running on your computer versus chat, which is in the same application is not?
是的。
Yeah.
这看起来太难了。
That seems so hard.
对。
Yeah.
我认为我有一个梦想,我相信很多人也有这样的梦想,那就是它其实并不重要。
I think the dream that I have, and I'm I'm sure many people have this dream, the dream that I have is that it doesn't really matter.
对吧?
Right?
比如,你的代码运行在哪里,这应该只是一个技术实现细节,就像你访问newyorktimes.com时,根本不在乎它用的是WebSocket还是别的技术一样。
Like, where your code runs, it should be a technical implementation detail, and it should matter to people about as much as, when you visit the newyorktimes.com, like, is it using WebSockets or not?
谁会在意呢?
And it's like, who cares?
是的。
Yeah.
我认为,对于我们来说,现在是一个机会,可以更快地推进、更快地发布产品,同时也更紧密地与我们为之开发产品的人合作。
I think for us right now, it's an opportunity to move a little bit faster and to ship a little bit quicker and also, like, work a little bit closer with the people for whom we're building this.
我坚信,独自一人闭门造车是很难打造出优秀产品的。
I have this strong belief that it's very hard to figure out a great product in isolation by yourself.
你一个人躲进洞里,埋头苦干一年,最后终于把东西做出来了。
You sort of go up into a cave and you work on something for a year, and eventually it comes out.
我认为以这种方式打造一个好产品真的很难,我经常提醒人们,就连初代iPhone也缺失了许多我们现在视为基本功能的东西。
I think it's really hard to build a good product that way. And I often like to remind people that, like, even the first iPhone was missing a bunch of things that we sort of consider to be table stakes.
所以,是的,我认为这是一个相当大的障碍。
So, yeah, I think it's a pretty big hurdle.
但我们目前对此并不在意,因为我们确实希望那些加入这次旅程的人能有意识地参与进来。
But we're okay with that for now because we do want people who are signing up for this ride to, like, sign up for it fairly intentionally.
是的。
Yeah.
我认为这是一个非常有趣的模式:让我们快速发布,把它作为一个新功能放进应用里,可能只有少数人会点击,这样我们就能把它公开出来,一起开始迭代,而不是试图把它做到完美。
I think that's a really interesting pattern is, like, let's ship really fast and we'll ship it as a new thing in the app that maybe fewer people will click on so that we can get it out in the open and start iterating together rather than, like, try to make it perfect.
尤其是在现在这个世界里,你说你们为这个版本只花了一周半时间,这简直不可思议。
Especially in this world where, you said you were working on this version for a week and a half, which is kind of insane.
基兰,你有什么问题吗?
Kieran, do you have any questions?
是的。
Yeah.
我很好奇。
I'm I'm curious.
比如,现在发布的是这个版本,但你们心里设想的版本是什么样的呢?
Like, clearly, this is the version that's out now, but, like, what is the version, like, in y'all's head?
你们接下来想往哪个方向走?
Like, where do you wanna go next?
或者,你们梦想中的功能有哪些?
Or, like, what are the things you're dreaming about?
你刚才用了‘梦想’这个词。
You you used the word dream.
你们真正想实现的那些目标是什么?
Like, what are those things where you wanna wanna go?
你们说,我肯定团队里的每个人都有过疯狂的想法,然后被否决了。
What what is because I'm sure everyone on the team had, like, wild ideas, and then they were like, no.
我们得在周一上线,所以你们能分享一下那些被放弃的想法吗?我们很想知道。
We need to ship Monday, so let's just, like what are if you can share any of those, we'd love to hear those.
我非常喜欢这个问题,因为我觉得我其实也想问你们两个同样的问题:你们希望这个产品往哪个方向发展?
I I love that question so much because I think I actually have the same question for the two of you, right, which is where do you want this to go?
你们想做什么?
What do you wanna do?
我已经听你们说过,你们想让这个工具能访问你们的整个电脑,就像那个多项选择的功能,但我们可以稍微调整一下做法吗?
I've already heard you say you kinda wanna give it access to your entire computer, and the multiple choice thing, where, like, actually, can we shift around a little bit on how we wanna do this?
我觉得现在我更倾向于先看看大家的想法,然后尝试无数种可能性。
I think right now, I am much more in a mode of, okay, let's see what people think and then try out a billion things.
其中一些可能并不对。
Some of them will probably be the wrong thing.
但也会有一些是正确的。
Some of them will be the right thing.
但对我来说,更重要的是人们想用这个工具做什么,而不是我自己的个人梦想或愿景。
But I think it's much more interesting to me what people wanna do with this rather than like what's my own personal dream or vision.
这说得通。
That makes sense.
我过去所构建的一些东西,总是会发生这种情况。
And with the things that I've sort of built in the past, this was always the thing that happened.
对吧?
Right?
你总会有一个想法,觉得人们会如何使用你所构建的东西。
You have like an idea of how people will use the thing that you build.
但他们实际上会以各种你意想不到的方式使用它,然后你就顺势而为。
They actually find a use for it in all these other ways, and then you lean into that.
所以我非常希望,我们能深入了解人们想要什么、不想要什么、喜欢什么、不喜欢什么。
So I'm really hoping that we can learn a lot about what people want, what people don't want, what they like, and what they dislike.
我肯定人们会对这个产品有一些不满,然后我们就能调整和迭代。
I'm sure people will, like, dislike a few things about this, and then we, like, adjust and, like, iterate on it.
这正是Cowork很酷的一点,基兰。
That's a really cool thing about Cowork, Kieran.
是的。
Yeah.
所以鲍里斯在构建Claude Code方面非常出色,能让人们弄清楚自己想要什么。
So Boris is very good in building Claude Code in a way that people can figure out what they want.
比如,你有没有采用类似的策略,提供一些构建模块或工具给我们?
Like, do you use that strategy in a way as well, where you give us some building blocks or things?
例如,我可以添加自己的插件或技能吗?
Like, for example, can I include my own plugins or skills?
或者,人们是否也能在Cowork中以非编码的方式进行实验?
Or, like, is there a way for people to experiment inside Cowork as well in the, like, maybe the non coding way?
还是说,这本身就是最终产品?
Or is it really like, this is the product.
它就是这样的。
It's that's what it is.
因为Claude Code和使用者之间有一个很棒的平衡,它非常易于改造。
Like, how because there's a cool balance between, like, how Claude Code works and people that use it because it's super hackable.
那么在Cowork中,对于非技术人员,也有类似的哲学吗?
Like, is there a similar philosophy in Cowork as well for noncoders?
是的。
Yeah.
就像非常可组合的。
Like, very composable.
对吧?
Right?
你刚才提到鲍里斯非常擅长引导Claude Code朝向尽早发布并迭代、观察用户使用方式的方向,这其实挺有意思的,因为我认为我们今天之所以发布,甚至可能发布得更早,正是因为鲍里斯推动了我,他说:嘿。
Like, the first thing you said about, like, Boris being very good at steering Claude Code in this direction of shipping early and then iterating on it and seeing how people use it, it's really funny that you mentioned that, because I think one of the reasons we've shipped today, and maybe shipped a little earlier, was because Boris pushed me and was like, hey.
你最好把这个展示给人们看看。
You should probably show this to people.
看看他们会怎么做。
See see what they do.
关于可组合性这一点,我觉得在过去几周、甚至过去两个月里,我自己最印象深刻的是,我越来越依赖技能了。
And on the composable piece, I think the thing that I found most impressive in my own work over the last couple of weeks, and maybe sort of the last two months, is that I am really leaning into skills.
所以,以前我会写MCP工具,或者一些非常针对Claude的特定框架,但现在我直接写技能。
So instead of, like, previously writing MCP tools or, like, this, like, very specific harness that is, like, very tailored towards just Claude, I instead just write Skills.
有时候我仍然会写一个二进制程序,然后在技能文件中描述如何完成某件事。
Sometimes I still write a binary, and then I describe in a Skill how to do something.
对吧?
Right?
我在想,有什么好例子呢?
I'm like what's a good example?
我正在为自己制定一个马拉松训练计划,写了一个小的二进制程序,从各个平台获取我的运动数据。
I'm working on like a marathon training plan for myself and I wrote a little binary that fetches all my athletic activities from various pages.
但接着我只在技能文件的Markdown里写:嘿,Claude,如果你想制定一个训练计划,请遵循以下指南。
But then I just write in markdown in a skill file, hey, Claude, if you wanna if you wanna make a training plan, please follow the following guidelines.
我们会自动将你在Claude.ai中安装的任何技能加载到Cowork中,我认为这一点会变得越来越重要,尤其是有了Opus 4.5之后。
We do automatically load any skill you have installed in Claude.ai into Cowork, and I think that's probably going to be increasingly important, especially with Opus 4.5.
它在遵循技能方面表现得非常好。
It is so good at following skills.
所以,技能可能是我目前主要利用的可扩展接口。
So Skills is probably the primary hackable surface that I'm exploiting right now.
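The pattern Felix describes here, a small program that fetches data plus plain markdown instructions in a skill file, can be sketched roughly as follows. This is only an illustration under stated assumptions, not Anthropic's implementation: the file name `training-plan.md`, the skill text, the function names, and the sample activities are all hypothetical stand-ins.

```python
from pathlib import Path
import tempfile

# Hypothetical skill text: plain markdown guidance, in the spirit of
# "hey, Claude, if you wanna make a training plan, please follow these guidelines".
SKILL_TEXT = """# Marathon training plan (hypothetical skill)

When asked for a training plan:
1. Run the activity fetcher to get recent workouts.
2. Base weekly mileage on the total running distance it reports.
"""


def fetch_activities() -> list[dict]:
    """Stand-in for the 'little binary'; returns made-up sample data."""
    return [
        {"type": "run", "km": 8.0},
        {"type": "ride", "km": 25.0},
        {"type": "run", "km": 10.0},
    ]


def weekly_run_km(activities: list[dict]) -> float:
    """Sum only the running distance, ignoring other activity types."""
    return sum(a["km"] for a in activities if a["type"] == "run")


def write_skill(directory: Path) -> Path:
    """Write the markdown skill file next to the fetcher (location is an assumption)."""
    path = directory / "training-plan.md"
    path.write_text(SKILL_TEXT, encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skill_path = write_skill(Path(tmp))
        print(skill_path.name)
        print(weekly_run_km(fetch_activities()))
```

The point of the split is that the program stays small and deterministic, while the judgment about how to use its output lives in editable prose.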
这很棒。
That's great.
你之前在对话中提到,你觉得将来UI界面会越来越少。
One thing you said earlier in the conversation is you think that there's gonna be fewer, like, UI surfaces.
这是否意味着随着时间推移,UI界面会越来越少?
Does that mean that, like, over time there'll be fewer UI surfaces?
过去几年里,大家一直在争论聊天是否是AI的最终交互形式,很多人都说不,还需要更多UI。
Does that mean because there's a lot of debate over the last couple years about is chat the final form factor for AI and everyone was like, no, need more UI.
你是否坚定地认为,自然语言才是未来的主流,我们会逐渐减少UI界面,转而通过与一个代理或代理协调器对话来完成任务,而这个协调器再去与多个其他代理交互?这就是你所推崇的方向吗?
Are you putting your stake in the ground that natural language actually is here to stay, and we're gonna have fewer UI surfaces, where you just talk to an agent or maybe an agent orchestrator that goes and talks to a bunch of other agents, and that's the kind of form factor you're pushing towards?
所以它看起来有点像今天Claude Code的样子。
So it looks a little bit like how Claude Code does today.
是的。
Yeah.
我认为这仍然存在很大争议,目前并没有Anthropic的官方立场。
I think this is still very heavily debated, and there's certainly no Anthropic viewpoint.
我甚至不确定我们这个小团队能否在整体上达成一致的看法。
I'm not even sure that there's a viewpoint that my fairly small team would, like, holistically agree with.
对吧?
Right?
我认为人们对未来如何与AI和模型互动有着非常不同的设想。
I think people have very different visions about how will people interact with AI and models in the future.
如果你问我个人的看法,我认为我信奉两点。
If you ask me very personally, I think I believe two things.
第一点是,聊天输入及其各种形式,不仅针对模型,而是普遍而言。
One is that the chat input and its various forms, not just for models, but in general.
对吧?
Right?
比如这个文本框的概念,你把你想要的东西输入进去。
Like the idea of this text box, and you put into the text box what you want.
如果你足够广义地理解,甚至把google.com或者Chrome的地址栏都看作是‘我想要某样东西’的输入框,我认为这种形式会比我们所有人想象的都要持久得多。
If you generalize it enough to say even google.com or the address bar in Chrome is like a I want something input box, I think that is going to stick around for much longer than we all think.
这是我首先想到的。
That's the first thing I think.
我认为我们会继续保留类似‘搜索我想要什么’的输入框。
I think we will continue to have something that looks like a search, I-want-something box.
但第二个问题是,你到底有多少个独立的输入框?
But the second question is like how many separate boxes do you have?
对吧?
Right?
比如,你是否有一个专门用于代码的输入框?
Like do you have one box for code?
你是否还有另一个用于个人娱乐的输入框?
Do you have another box for maybe like a personal entertainment?
你是否还有一个用于医疗相关事务的输入框?
Do you have another box for healthcare related concerns?
我不认为我们会拥有太多这样的输入框,也许我还是会回到谷歌。
I'm not sure we're gonna have too many boxes of those, and there too maybe I would go back to Google.
我觉得我依稀记得2000年代初,每个Google子产品都有独立的搜索框,但如今,你只需在Chrome的搜索栏里输入你想找的内容,就可以了。
I think I sort of remember the early two thousands, where you had a different search box for every single Google sub product, and increasingly, you just type what you want into your Chrome search bar and you... Yeah.
你不再需要逐层点击进入子页面了。
You no longer actually, like, go to a subpage of a subpage.
比如我现在专门在找购物相关的东西,就会直接去Google购物。
Like, I am in the mode right now of looking specifically for shopping things, I go to Google Shopping.
是的。
Yeah.
如果未来我们看不到一种更智能的通用化趋势,能自动判断你的意图,我会感到惊讶。
And I would be surprised if we don't see a similar generalization that is smarter about figuring out what you wanna do in the future.
我们可能仍会保留不同的界面,它们会区分并理解你想要做什么,因此为你展示对应的操作界面,但入口点呢?
We might still have different interfaces where it sort of, like, splits out, right, and says, I understand that you're trying to do x, therefore I'm gonna show you UI for x, but the entrance point...
我觉得
I think
是的。
the yeah.
对此的一个有趣反例是微软的Excel,我认为它在某种程度上与AI的通用工作方式有相似之处,它是一个通用型产品。
The interesting counterpoint to that is something like Microsoft Excel, which I think it also has some similarities to the way that, you know, just generally AI works is it's this general purpose product.
上手非常简单。
It's super simple to get started.
你可以用Excel做出无限复杂的东西,而Excel催生了B2B SaaS浪潮。
You can make things, like, endlessly complex with Excel, and then Excel sort of spawned the B2B SaaS wave.
如果没有Excel,可能就不会有B2B SaaS。
Like, you probably don't get B2B SaaS without Excel.
所以我认为还有另一种观点:你拥有这些极其通用的工具,然后人们在其中发现了强大的工作流,这些工作流随后被拆分出来。
So there's I think there's also the other argument that you have these sort of general really general tools, and then people find power workflows within them that then get split out.
是的。
Yeah.
对。
Yeah.
我认为Excel是一个绝佳的例证,它体现了太多东西,因为对许多开发者来说,它某种程度上存在于边缘地带。
I think Excel is, like, such a beautiful example of so many things because it's, for many developers, something that sort of exists a little bit on the periphery.
对吧?
Right?
我经常听到这样的类比:Excel的日活跃用户数量,和全球开发者总数相比。
I've often heard the comparison between, like, how many daily active users Excel has versus, like, how many developers even exist on the planet.
是的。
Yeah.
这是一个有趣的数字。
It's an interesting number.
我认为,让我觉得有趣的是,Excel及其核心用户对产品的忠诚度在于,这些核心用户并不太在意边际性的生产力提升或界面体验的微小改进,而更看重对产品的深度熟悉。
And I think the thing that I find interesting about Excel and, you know, the commitment it has from its power users is that those power users are less interested in marginal productivity gains or marginal UI or UX gains than in deep familiarity with the product.
我觉得这很有趣。
And I think that's interesting.
我认为这其中蕴含着某种启示,而且我确实在其他各种场景中也见过类似情况:作为开发者,你有时会看到某人的工作流程,心想:‘如果我为他专门开发一个工具,或许能让这个流程稍微好一点。’
I think there's a lesson there in some shape or form, and I think I've actually seen that across, like, various other surfaces, where you as a developer sometimes look at someone's workflow and you say, oh, I can make this workflow, like, slightly better for you if I make you a specific use case tool over here on the side.
但人们却往往不采纳这个新工具,因为他们更习惯在现有产品中完成特定操作。
And then people sort of fail to adopt that thing because they're actually more comfortable doing specific things within their product.
举个例子,我认为这是我学到的一个教训。
As an example, I think that's a lesson that I have learned.
我以前在Slack工作了很多年,在那里我一再学到这个教训:你可以创建一些看似更能满足用户需求的独立功能,但用户还是会继续在原产品里完成这些操作。
I was previously at Slack for many years, and that's a lesson that I've learned there over and over again is that you can make these separate surfaces that you think might serve people's use cases much better, but they will continue to just do it in
这真是一个非常非常好的教训。
That's a really, really good lesson.
我非常喜欢这一点。
I love that.
说到这个,今天的内容面向非开发者,但我感觉现在有很多开发者正在观看。
Speaking of that, I think today is for the non developers, but I feel like there's a lot of developers who are watching this right now.
而你,如果你开发了这类应用,那你一定深入理解了如何构建原生智能体应用。
And you're someone who, if you built this, like, you're deep into how to build agent native applications.
关于我们一直在思考和讨论的内容,Every最近发布了一份名为《原生智能体架构》的指南。
And this is something that we've been thinking about and talking about a lot at Every. We just published a guide called agent native architectures.
我们一直在思考原生智能体应用的核心原则是什么。
And we've been thinking about what are the core principles of agent native apps.
我很好奇这些观点是否引起你的共鸣,或者你认为它们有误,又或者你们在Anthropic构建智能体时还有哪些其他关键点可以补充。
And I'm really curious if these resonate with you, if you think they're wrong or if there are things that you would add that are part of how you guys at Anthropic think about building agents.
举个例子,就是对等性。
So an example is parity.
我们在Every内部构建智能体时,会考虑一点:用户通过界面能做的任何操作,智能体也应该能够做到。
So one of the things that we think about when we build agents internally at Every is whatever the user can do through the UI, the agent should be able to do.
我看到Claude Code在这方面有类似的设计,而在你们开发的Cowork中,我也看到一些类似的思路——比如,如果你没有手动选择文件选择器,它会自动判断你希望它选取某个文件夹,并在无需你干预界面的情况下完成操作。
That's basically how Claude Code works, and I see that a little bit in what you built with Cowork where, for example, if you didn't use the file picker, it'll automatically determine that you are asking it to pick a particular folder and it will do that for you without you having to touch the UI.
这就可以算是对等性的一个例子。
So that would be an example of parity.
是的。
Mhmm.
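[Editor's sketch] The parity principle described above can be illustrated with a small example. This is a minimal sketch under stated assumptions, not how Cowork or Claude Code is implemented: the registry `ACTIONS`, the decorator `action`, and the functions `select_folder`, `rename_file`, `ui_click`, and `agent_tool_call` are all hypothetical names made up for illustration. The idea is that every user-facing action is registered once, so the UI and the agent reach the same code path.

```python
from typing import Callable, Dict

# Hypothetical registry: every user-facing action is registered once,
# and both the UI and the agent dispatch through it.
ACTIONS: Dict[str, Callable[..., str]] = {}


def action(name: str):
    """Register a function as an action available to both UI and agent."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        ACTIONS[name] = fn
        return fn
    return register


@action("select_folder")
def select_folder(path: str) -> str:
    return f"selected {path}"


@action("rename_file")
def rename_file(old: str, new: str) -> str:
    return f"renamed {old} to {new}"


def ui_click(name: str, **kwargs) -> str:
    """What a button or file picker would call."""
    return ACTIONS[name](**kwargs)


def agent_tool_call(name: str, **kwargs) -> str:
    """What the agent's tool-calling layer would call."""
    return ACTIONS[name](**kwargs)


def agent_tool_names() -> list:
    """The agent sees exactly the actions the UI can trigger."""
    return sorted(ACTIONS)
```

Because both entry points share one registry, adding a new UI action gives the agent the same capability without extra work, which is one plausible way to keep the two from drifting apart.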
另一个原则是粒度,也就是说,工具应该比功能更底层,而功能应该体现在提示词或技能中,这样你就能以之前未曾预料的方式组合这些工具。
Another one is granularity, which is basically that tools should mostly sit at a lower level than features, and the feature should live in the prompt or the skill, so that you can combine tools in new ways that you didn't predict previously.
而这又带来了第三个原则——可组合性,你可以将这些新组合方式结合起来,进而产生第四个原则:涌现能力。
And then that allows for the third one, which is composability, which is that you can combine those tools in new ways, and you get the fourth one, which is emergent capabilities.
所以人们只是在做你没想到的事情,你看到了潜在的需求,然后你就为此进行构建。
So people are just using it for stuff that you didn't expect, and you see the latent demand and then you, like, build for that.
因此,我认为这基本上是我对Claude Code工作方式的总结。
And so that's this is essentially, I think, like, a lot of my summary of how Claude Code works.
我想知道,你对这些看法如何?你觉得我们遗漏了什么吗?或者你在大规模生产环境中实践时,有没有学到什么可以让人更好地构建这类应用的经验?
I'm curious how this sounds to you, and if you think that we're missing anything, or whether there are things that you've learned from doing this in production at a huge scale that could make people better at building these kinds of applications.
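[Editor's sketch] The granularity and composability principles above can be made concrete with a toy example. This is a sketch under stated assumptions: the in-memory file store, the three primitive tools, the text in `MERGE_NOTES_SKILL`, and the function `scripted_agent` (which stands in for a model choosing tool calls) are all hypothetical and are not taken from Claude Code or Cowork. The point is that "merge notes" is not a dedicated tool; it is text the model reads, carried out by composing lower-level tools.

```python
from typing import Callable, Dict, List


def make_tools(files: Dict[str, str]) -> Dict[str, Callable]:
    """Build three low-level tools over an in-memory file store."""
    def list_files(prefix: str) -> List[str]:
        return sorted(p for p in files if p.startswith(prefix))

    def read_file(path: str) -> str:
        return files[path]

    def write_file(path: str, text: str) -> str:
        files[path] = text
        return path

    return {"list_files": list_files, "read_file": read_file, "write_file": write_file}


# The "feature" is plain text the model reads, not a dedicated merge_notes tool.
MERGE_NOTES_SKILL = (
    "To merge notes: list the files in the folder, read each one, "
    "and write the joined text to merged.txt in the same folder."
)


def scripted_agent(tools: Dict[str, Callable], folder: str) -> str:
    """Stand-in for a model following MERGE_NOTES_SKILL by composing the primitives."""
    paths = tools["list_files"](folder)
    joined = "\n".join(tools["read_file"](p) for p in paths)
    return tools["write_file"](folder + "merged.txt", joined)
```

A different skill text (say, "find the longest note") could reuse the same three tools unchanged, which is the sense in which new capabilities can emerge without new tools being written.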
我觉得这些观点非常引起我的共鸣。
I think this really resonates with me.
对吧?
Right?
我认为,在涌现能力背后隐藏着一点:个人和孤立团队很难预测,如果你给代理提供相当基础的工具,它最终会如何变得极其有用。
And I think one thing that's hidden in emergent capabilities is the inability, especially of individuals and siloed teams, to predict how an agent actually ends up being super useful if you give it fairly primitive tools.
我认为将工具下沉到一个通用空间是非常强大的。
I think pushing down tools into, like, a general space is very powerful.
工具的可组合性越强,通用性越高,你就越能从模型智能的提升中获益。
Like, the more composable they become, the more generalizable the tool is, the more you will benefit from improvements on model intelligence.
我认为,对于我过去接触过的许多开发者来说,模型智能以及模型有效调用工具的能力提升速度,实际上远快于你开发更多工具并教育用户使用它们的速度。
And I think for many developers that I've been talking to in the past, it seems like the rate at which model intelligence and models' ability to call tools effectively improve is actually much faster than your ability to churn out additional tools and educate users on them.
我认为,如果你能退一步思考,如何构建一个高度通用的工具,
And I think if you sort of take a step back and you think, well, how can I build a very generalizable tool?
你就更有可能打造出能够适应新使用场景的东西。
You have a much better chance to build something that can adapt to new use cases.
所以我觉得这一点非常契合我的想法。
So I think that resonates with me quite a bit.
是的。
Mhmm.
确实如此。
It really does.
那关于权衡呢?
What about the trade-offs?
我一直在和基兰讨论工具方面的权衡。
Like, I've been talking to Kieran about the trade offs in tools.
Kieran,你能不能谈谈你在Cora观察到的现象,以及你现在的想法?
Kieran, do you wanna talk about what you noticed in Cora and what you're thinking about?
是的。
Yeah.
我觉得把内容放进提示词里很棒,再配合工具使用。
So I think putting things in a prompt is great, and then having the tools.
但我们现在突然需要创建一些能读取技能之类的东西的工具。
But now we suddenly need to create tools that then read Skills or something like that.
所以我们得发明这样一个元层。
So, like, we have to invent this meta layer.
比如,技能就像是即时提示注入,但我们需要构建这样的机制。
Like, Skills is like just in time prompt injection, but, like, we need to create that thing.
现在,所有在构建东西的人,除非你使用Claude Code或Claude SDK,否则都是自己从头搭建的。
And now everyone that's building stuff, unless you use Claude Code or the Claude SDK, it's all built.
但问题是,现在出现了一种困境:要么在工具里描述内容,要么创建一个包装它并调用其他东西的工具。
But now there's this struggle: either you describe stuff in the tool, or you create a tool that wraps around it and then calls something else.
所以这里存在某种摩擦,但让事物可组合确实很棒。
So there's like this friction there and, like, it is great to make things composable.
比如,最初你创建五个工具调用,想要搜索邮件、阅读邮件,等等这些操作。
Like, originally you might create, for example, five tool calls: search email, read email, this and this and this.
但你也可以选择说不。
But you can also say no.
我们可以直接执行一个工具调用,然后创建能完成这些任务的技能,或者使用MCP,或者某种封装机制。
We just do, like, an execute tool call and we create skills that can do those things, or an MCP, or some abstraction there.
所以现在正在发生这种转变,显然Claude Code和Claude SDK在这方面是很好的推动力。
So there's this change happening, and obviously Claude Code and the Claude SDK are a very good push for that.
但我在这里感受到了摩擦。
But I feel friction there.
我相信你也感受到了这种摩擦。
I'm sure you felt that friction too.
所以你有没有一些最佳实践,可以帮那些还停留在传统AI模式、需要转向更原生智能体的人?你从使用Claude SDK中总结出了哪些经验或观察?
So maybe you have some best practices for people that are stuck in the old-fashioned AI world and need to go to the more agent-native one, things that you learned or that you've noticed, because you've implemented on top of, I assume, Claude SDK
是的。
Yeah.
也许吧,或者说你可以使用它并在其上构建东西。
Maybe, or something very like it, so you use that and you implement things on top.
所以我很好奇,你在那里学到了什么吗?
So I'm very curious if you learned anything there.
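[Editor's sketch] Kieran's description of Skills as "just in time prompt injection", with one generic tool in place of five dedicated ones, might look roughly like the following. This is a hedged illustration only: the `SKILLS` dictionary, its contents, and the functions `build_system_prompt` and `load_skill` are hypothetical names invented here, and this is not how Claude Code or the Claude SDK actually implement skills.

```python
from typing import Dict

# Hypothetical skills: a one-line description plus a longer body of instructions.
SKILLS: Dict[str, Dict[str, str]] = {
    "triage_email": {
        "description": "Sort unread mail into urgent and later.",
        "body": "1. Search for unread mail.\n2. Read each message.\n3. Label it urgent or later.",
    },
    "weekly_digest": {
        "description": "Summarize the week's threads.",
        "body": "1. Search mail from the last 7 days.\n2. Group by thread.\n3. Write a summary.",
    },
}


def build_system_prompt() -> str:
    """Only names and descriptions go into the prompt up front."""
    lines = ["Skills available (call load_skill(name) to read one):"]
    for name in sorted(SKILLS):
        lines.append(f"- {name}: {SKILLS[name]['description']}")
    return "\n".join(lines)


def load_skill(name: str) -> str:
    """The one generic tool: returns a skill's full instructions on demand."""
    if name not in SKILLS:
        return f"Unknown skill: {name}"
    return SKILLS[name]["body"]
```

The trade-off Kieran points to shows up here: the base prompt stays short and new skills need no new tools, but someone has to build and maintain this loading layer, and the model must decide on its own when to call `load_skill`.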
我不确定我有什么高深的见解,能比你的更有价值。
I'm not sure that I have any, like, wisdom from the mountain that's going to be more valuable than yours.
但我认为,你说的让我产生共鸣的是,你得做出一个决定。
But I think what you're saying that resonates with me is that you sort of need to make a call.
对吧?
Right?
比如,你希望输出的哪些部分是非确定性的,哪些地方你可以放心依赖模型的智能?
Like, which part of the outputs do you want to be nondeterministic, and where are you comfortable relying on model intelligence?
而每次你依赖模型智能时,如果你选了一个更便宜的、或者更笨的模型,嗯。
And every single time you do rely on model intelligence, if you pick a cheaper model or, like, a dumber model Mhmm.