嘿,大家好。
Hey, everyone.
欢迎收听《Latent Space》播客。
Welcome to the Latent Space podcast.
这是我们位于Kernel的新工作室录制的第一期节目。
Our first one in the new studio at Kernel.
我是Kernel Labs的创始人Alessio,和我一起的是《Latent Space》的编辑swyx。
This is Alessio, founder of Kernel Labs, and I'm joined by swyx, editor of Latent Space.
是的。
Yeah.
能来到这里真是太好了。
So nice to be here.
感谢TJ、Alessio和Alan帮忙布置一切。
Thanks to TJ, Alessio, Alan helping to set everything up.
看起来真漂亮。
It looks beautiful.
我们外面甚至还有logo。
We even have the logo outside.
是的。
Yeah.
真的挺棒的。
It's like really nice.
我觉得当客人走进这里时,会立刻觉得:哦,这是一场专业的制作。
I think when you walk in here as a guest, you're like, oh, this is a serious production.
你马上就能感受到。
You feel it immediately.
是的。
Yeah.
菲利克斯,你目前是Cowork的产品经理,或者从技术上讲,是负责人。
Felix, you're currently product manager of Cowork, or really lead, technically.
负责人。
Lead.
对。
Yeah.
这些职位头衔都有点模糊不清。
The identities are kind of vague.
技术团队成员。
Member of technical staff.
我懂的。
I know.
“技术团队成员”算是我们会一直沿用的正式头衔了。
Member of technical staff is like the official title we'll carry around forever.
对。
Yeah.
我最近有点想……我们这段时间一直挺着迷的。
I recently kind of wanted, like, we've been kinda obsessed.
我最近经常用它,哪怕是用来打理 Latent Space,Cowork 都能帮我上传视频、给内容起标题、做剪辑等等。
I've been using it a lot, even for managing Latent Space; like, Cowork helps me upload videos and title things and edit and everything.
这真的太棒了。
It's really amazing.
酷。
Cool.
他在群聊里多次说过 Cowork 就是 AGI。
He's said multiple times in the group chat that Cowork is AGI.
是的。
Yeah.
所以我们为 Latent Space TV 开了第二个频道,基本上就是我们的 Discord 聚会。
So we have a second channel for Latent Space TV, which is basically our Discord meetup.
而且我们有过类似“Claude Cowork 可能就是 AGI”这样的内容。
And we have, like, "Claude Cowork, it might be AGI."
我不确定我们有没有上传,但其中一场分享就是关于 Claude Cowork 的。
I don't know if we have uploaded it yet, but one of the sessions was a Claude Cowork thing.
我非常想
I would love
看看它。
to see it.
说实话,我特别好奇,我工作中最有趣的部分之一就是不断看到人们用Cowork做的各种奇怪事情,因为显然我们很难为特定使用场景做设计。
Like, I'm so curious, like, one of the most fun parts of my job is that I constantly see the weird things people use Cowork for, because it's obviously like very hard for us to actually design for specific use cases.
我们确实如此。
We do.
但每个最惊讶的人,通常都是对某个我根本没想到Cowork能做好的事情感到惊讶。
But every single person who's most amazed is usually amazed about a thing that I didn't even expect Cowork would be good at.
我们有一位新设计师,这是他接手的第一个小任务。
We have a new designer, and it's one of the first small tasks.
我当时说:嘿,需要为Cowork设计一个新表情符号,用在我们内部的Slack上。
I was like, hey, need a new emoji for Cowork, for our internal Slack.
这是一件很小的事情。
It's a pretty small thing.
我说:你能帮忙做一下吗?
I was like, can you please do it?
他画了一个SVG,直接给了Cowork,然后问:你能把这个表情动起来吗?
And he drew an SVG and just gave it to Cowork, and was like, can you animate this emoji?
现在它有了一个非常漂亮的循环动画。
And now it has like this beautiful loopy animation.
我的意思是,显然这说明了代码实际上能做的事情比你想象的要多。
And, I mean, I think obviously this goes down to like, it turns out you can do more things with code than you expected.
但正是这类事情让我觉得特别有趣。
But it is like that kind of stuff that is really fun to me.
所以长话短说,我真的很想看看你们都在做些什么。
So long story short, I would love to see, like, the kind of things you're doing.
我马上调出来。
I'll pull it up.
我马上调出来。
I'll pull it up.
是的。
Yeah.
是的。
Yeah.
但在我们深入之前,我想先从一个高层次的角度介绍一下,对于没听说过或没试过的人,Claude Cowork 到底是什么?
But before we get into it, I I think always wanna start with, like, a top level, what is Claude Cowork for people who haven't heard of it, haven't tried it out.
好的。
Okay.
简单来说,Claude Cowork 是 Claude Code 的用户友好版本。
Real quick, Claude Cowork is a user friendly version of Claude Code.
它的基本运作方式是:我们有 Claude Code,一个在我们看来相当出色的智能体框架。去年十二月,我们注意到越来越多的人在使用它:有的人并不懂技术、不习惯终端;有的人虽然习惯终端,却开始用 Claude Code 处理非编程类的任务。
So the way it basically works is: we have Claude Code, for us a fairly impressive agent harness, and over December we noticed more and more people using it, either even though they're not technical and not at home in the terminal, or they are at home in the terminal but they started using Claude Code for non-coding workloads.
对吧?
Right?
比如管理开支、填写报销单,或者整理知识库。
Like managing expenses, or like filling out receipts, or organizing a knowledge base.
比如,当时出现过一波很多人都喜欢的 Obsidian 热潮。
Like, there was a big Obsidian moment that a lot of people liked.
我们想抓住这个机会,同时也让那些不熟悉终端、不知道如何用 brew install 安装工具的人也能使用这项能力。
And we wanted to capitalize on that, but also bring this capability to people who are not terminal native, and who might not know how to, like, brew install something.
所以 Cowork 就是在虚拟机中运行的 Claude Code,增加了一些缓冲和更多限制,让它对那些不想一上班就打开终端的人来说更安全、更方便。
So Cowork is Claude Code running in a virtual machine with a little bit of padding, a little bit more guardrails, making it a little safer and a little bit more convenient for people who don't wanna first open up the terminal when they go to work.
有趣的是,它被定位为更用户友好的工具,因为对我来说,我本来就熟悉 Claude Code,我们做过一期关于 Claude Code 的节目,
It's interesting that it's kind of pitched that way, as a more user friendly thing, because to me, I'm familiar with Claude Code; like, we did a Claude Code episode
一年前。
a year ago.
但这个版本更像是面向高级用户的工具,因为它与 Claude in Chrome 以及其他所有工具的集成要好得多。
But this one is like even more of a power user tool, because it integrates much better with, like, Claude in Chrome and all the other tooling.
但也许这只是一个认知上的差异。
But, like, maybe maybe that's like a perception thing.
对吧?
Right?
就像
Like
不。
No.
说实话,我觉得你没说错。
Honestly, I don't think you're wrong.
这真的是我过去两周一直在思考的问题。
This is like a thing I've been thinking a lot about for, like, the last two weeks.
当人们说“用户友好”时,听起来像是简化版。
When people say user friendly, it sounds like it's the dumbed-down version.
但其实,这才是超集。
But no, actually, this is the superset.
是的。
Yeah.
就像大约十年前,或者十二年前,我在微软工作时,我们开始开发 Electron 和基于浏览器的跨平台技术,类似的事情也发生在我身上。
Like, I think a similar thing happened to me about ten years ago, maybe twelve years ago, when I was at Microsoft, and we started working on Electron and browser-based technologies and cross-platform stuff.
其中一个最早的应用场景是 Visual Studio Code,它原本是一个网站。
And one of the first use cases was Visual Studio Code, which used to be a website.
最初的宣传说法是,Visual Studio Code 是 Visual Studio 的更用户友好版本。
And the initial narrative was, well, Visual Studio Code is like a more user friendly version of Visual Studio.
但类似地,当时也有人表示,这不适合严肃的开发者使用。
But in a similar vein, I think there were some voices saying, oh, this is not for serious developers.
比如,我们不会用它来做任何正经事,对吧?
Like, we're not gonna use this, right, for like anything.
我认为最终,人们对 Visual Studio Code 为何如此成功有不同的说法。
And I think in the end what happened is people have different stories about why Visual Studio Code became such a big thing.
但我个人认为,它的可定制性和可扩展性起到了相当重要的作用。
But my personal belief is that the hackability and the extendability have played a pretty big role.
对吧?
Right?
你可以将 Visual Studio Code 挂接到几乎任何工作负载上。
You can hook in Visual Studio Code to like almost any workload.
它非常容易修改,也很容易为其开发扩展。
It's so easy to hack on, so easy to build extensions for it.
我认为Cowork可能正面临类似的情况——它非常容易扩展,也极易融入你的工作流程。
And I think Cowork might be hitting a similar thing where it's very easy to extend, and it's very easy to bring into your workflows.
所以,便捷性我认为是我们开发者一直追求的目标。
So the convenience, I think, is a bit of a it's obviously the thing we strive for as developers.
但我觉得,人们发现其价值的方式,可能是将其映射到自己工作中实际要完成的任务上。
But I think the way people find value in it then is by probably mapping it onto whatever they actually have to do in their job.
所以去年年底,你看到Claude Code在非技术人员中的使用量出现了激增。
So end of last year, you see the spike of like non technical usage in Claude Code.
你们是通过怎样的设计过程,决定要做 Claude Cowork 的?
What's the design process to say we should make Claude Cowork?
毕竟,你们只用了十天就把它做出来了。
Because I mean, you built it in only ten days.
我肯定之前有过一些讨论,关于‘更易用’到底意味着什么?
I'm sure there was some discussion before on what does easier to use mean?
制作一个桌面图形界面显然是一种方式,但这个产品中有很多细微之处。
Making like a desktop GUI is obviously one way to do it, but like there's a lot of nuance in the product.
也许可以给大家讲讲,是什么促使你们决定要做一个独立的东西,
Like, maybe talk people through what was the trigger of, like, we should build a separate thing,
而不是再做一个不同的 Claude Code。
and we should not build a different Claude Code thing.
然后是一些你可能没有做出的、更有趣的设计决策。
And then maybe some of the more interesting design decisions that maybe you didn't take.
是的。
Yeah.
我认为,在Anthropic,我们一直在思考如何让那些习惯用Claude回答问题的人,进一步利用它的强大功能来为你执行任务。
I think, at Anthropic, we've been thinking about ways to move people who are comfortable with using Claude to answer questions, and bring more of the power of, like, this thing to now, like, execute tasks for you.
对吧?
Right?
它能帮你解决问题吗?
Can, like, solve problems for you?
能为你构建东西吗?
Can, like, build things for you?
我们如何将这种能力带给那些目前主要习惯于在聊天中使用问答模式的人?
How do we bring that capability to people who are currently mostly comfortable with, like, a like, question answer paradigm within the chat?
我们为此做了很多原型。
And we've had a lot of prototypes around that.
这可以追溯到至少一年半以前。
This goes back as far as, like, easily a year and a half.
我们有很多人一直在做这方面的研究。
Like, we had a lot of people working on that.
在内部,Anthropic 是一种非常注重原型和演示的文化。
And internally, Anthropic is a very prototype demo first culture.
我们有很多内部原型从未面向公众发布。
We have a lot of like internal prototypes that don't reach the public.
Cowork 最终成型的过程,就是从我们众多原型中挑选出最合适的部分。
And what Cowork actually became is like we sort of picked the right pieces out of the many prototypes that we had.
对吧?
Right?
而且,当人们提到这个十天的数字时,我认为这也是一个重要的限定条件。
And that's that's maybe also like, I think an important qualifier whenever people mention this like ten day number.
我觉得有必要说明,我们并不是从零开始的。
I do think it's important to me to mention that we didn't start from scratch.
当时已经有很多工作在进行了。
There was like a lot of stuff already happening.
对吧?
Right?
而且我认为,当人们建网站、使用React时,会用到很多其他东西,这一点很重要。
Like and I think it's important for people to remember that when you build a website, use React, you use like a bunch of other things.
而我们现在的情况也类似,很多组件我们早就有了。
And this is like a similar scenario with like a lot of pieces we already had.
就决策路径而言,我认为我们正生活在一个执行成本实际上很低的全新世界里。
In in terms of decision path, I think we live in like an interesting new world where execution is actually quite cheap.
嗯。
Mhmm.
所以也许你会做的是……
So maybe what you would do...
这听起来太疯狂了。
That's so crazy to hear.
我知道这很离谱。
I know it's wild.
过去的说法是:想法不值钱。
It used to be: ideas are cheap.
执行才是最难的部分。
Execution is the hard part.
不。
No.
不像以前……
Unlike the...
我们过去也许生活在这样一个世界里:产品经理会去接触许多潜在客户,以一种带宽很低的方式,试图摸清他们面临什么问题、愿意买什么,以及你能做出什么来满足他们的需求。
We used to live in this world, maybe, where you would take a product manager, and the product manager would go to a number of potential customers, and in this very low-bandwidth way would try to tease out what problems they're having, what they're willing to buy, and then maybe what you can build to address their need.
然后你回去起草一份规格文档,仔细思考,设计产品,再执行开发。
And then you go back and you draft a spec, you think about it, and then you make a design, and you execute it.
在 Anthropic 内部,我们现在可能更接近这样的状态:连备忘录都不用写。
We internally at Anthropic are now probably much closer to the point where we're like, don't even write a memo.
直接动手构建吧,让我们快速做出所有可能的候选方案。
Just like build like, let's build all the candidates very quickly.
让我们直接把所有方案都做出来,嗯。
Like, let's let's just build all of them Mhmm.
然后挑选出最好的那些。
And then pick the best ones.
我认为,目前对产品和用户来说最具影响力的那个决定,是我们如何在你的本地计算机上体现价值。
I think the the decision that is most impactful, both for the product as well for the users right now, is like the way we put value on your local computer.
我觉得这是一个关键的决策点。
I think that's a big decision point.
很多人一直在思考,这个东西——无论它是什么——最终应该运行在你的电脑上,还是应该运行在云端?
A lot of people have thought about, should this thing, whatever it is, should it ultimately run on your computer, or should it run in the cloud?
因为这两者之间有重大的权衡。
Because there are big trade-offs.
对吧?
Right?
我想,如果我们解决了身份验证问题,在云端实现就会容易得多。
I guess, like, if we solved auth, it would be easy to do in the cloud.
但我认为,我能从任何地方下载任何文件,然后直接放到 Cowork 里,这真是一个巨大的突破。
But I think, like, the fact that I can just download any file from anywhere and then put it in Cowork there, it's like a big unlock.
有意思的是,你提到了复用某些部分。
I mean, it's interesting you mentioned reusing certain pieces.
这正是我一直在思考的问题。
I think this is something I've been thinking about.
即使是 Claude Code 也是如此。
Even with Claude Code.
对吧?
Right?
写代码的成本正在趋近于零,诸如此类的说法。
The price of like writing code is going to zero, blah blah blah.
但实际上,拥有某种平台基础架构的价值似乎正在上升。
But it actually seems like the value of having some sort of platform substrate is like increasing.
因为当你构建这些新事物时,你可以将它们相互连接起来。
Because as you build these new things, you can kind of plug them together.
是的。
Yeah.
所以当人们说,很多软件的价值会归零,因为你能够重新创建它时,
So I almost feel like when people are saying, oh, the value of a lot of software is going to zero because you can recreate it,
在我看来,这几乎恰恰相反。
To me, it's almost like the opposite.
拥有一个现有的平台作为基础,反而更加有价值,因为你可以在上面轻松地附加功能。
It's like having an existing platform to build on top of is like even more valuable because you can kinda bolt things on.
是的。
Yeah.
你显然有MCPs,有技能,还有模型,这很重要。
You have obviously MCPs, you have skills, you have like, obviously the models, which is a big part.
所有这些因素都结合在一起。
All these things kinda come together.
你觉得这种看法合理吗?即人们应该更多地投资于这些基础组件,以便重新构建?
Do you feel like that's a valid way to think about it, where people should invest even more in kind of, like, these primitives to rebuild on?
还是说你每次都在重新创造很多东西,因为事物在变化,重写比复用更容易?
Or are you, like, recreating a lot of it each time because, like, things change and it's easier to rewrite than reuse?
你知道吗,我觉得你说得对。
You know, I think I think you're right.
我觉得你是对的,整体平台确实非常有用。
I think you're right that the holistic platform is really useful.
这可能与AI领域许多人的观点相悖。
And this is maybe a whole, like, a somewhat contrarian view to a lot of people in AI.
我实际上并不认为未来会发展到每个人都运行自己版本的超个性化软件。
I actually don't think that the future is going to be hyper personalized software down to the point where everyone is running their own version.
事实上,我认为让我们每个人都拥有各自的内部聊天工具,会相当困难。
Like, I actually think it's going to be quite hard for all of us to each have our own internal chat tool.
如果我想和你聊天,那这该怎么实现呢?
And like, if I wanna talk to you, like, how is that gonna work?
对吧?
Right?
在 Cowork 的背景下,以及我们构建它的方式,我认为这有点是两者的结合。
In the in the context of Cowork and how we build it, I think it's a bit of a combination.
比如,变得便宜的执行方式并不一定是重新构建所有基础组件。
Like, the execution that gets cheap is not necessarily rebuilding all the primitives.
我认为从一开始,这样做也没有太多价值。
I think a priori, there's also not a lot of value in it.
所以,例如,我的团队就没有考虑过重新构建Claude Code。
So for instance, my team did not think about rebuilding Claude Code.
我们一开始的核心理念就是这应该是Claude Code。
We very much started with the core thesis of: this should be Claude Code.
是的。
Mhmm.
然后我们会在其基础上构建其他功能。
And then we'll like build things on top of it.
执行中稍微便宜一点的部分是,如何把这些乐高积木以对用户有意义的方式组合起来?
The part of the execution that gets a little cheaper is, like, how do you take all of these Lego pieces and put them together in a way that makes sense for users?
这实际上是有价值的。
It's, like, actually valuable.
现在你有这么多不同的方法,来决定哪些东西真正应该提升为基本组件?
You have so many different approaches now in terms of what kind of what kind of things do you actually elevate to a primitive?
你是否坚信,你的所有产品都应仅通过组合公众也可用的基本组件来构建?
Do you strongly believe that all your products should be built by just combining primitives that the public also has available?
你会把一些东西留在内部吗?
Do you keep some things internal?
我认为这仍在发展中,但我觉得可能会消失的是——我不确定是否会完全消失,但就我个人而言,我大概不会再试图在没有与人测试的情况下推出一个好产品了。
And I think that's still evolving, but I think what's probably gonna go away is like I'm not sure if it's gonna fully go away, but I'm gonna say I think for me personally, I will probably no longer try to come up with a really good product without testing out with people.
这并不是一个新概念。
This is not a new concept.
但过去你不得不在选择技术A还是技术B,或者以这种方式构建还是以另一种方式构建之间做出昂贵的决策,而我现在已经有了非常明确的想法。
But wherever you used to have to make costly decisions around do we pick technology a or technology b, or do we like build it this way, build it the other way, I have really strong ideas now.
你只需把所有方案都做出来,找一小群目标用户试用,然后选效果更好的那个继续推进。
You just build all of them and try them out with like a small focus group, and then whatever whatever is better is what you go with.
对吧?
Right?
这甚至可能和我们一年前的工作方式大不相同。
And that that is probably quite different even from how we maybe worked a year ago.
对吧?
Right?
我觉得这种事情是最近才发生的。
Like, think I think this happened very recently.
是的。
Yeah.
我开始用 Electron 做一些东西,既然你在这儿,真是巧合。
I started building something on Electron; since you're here, coincidence.
但 Electron 和 SQLite 在开发和构建过程中总会遇到一些问题。
But then Electron and, like, SQLite, there's, like, some issues when you're doing development and building anyway.
于是我决定干脆用 Swift 重做整个项目,完全用 Swift 重新实现一遍。
And I was like, let's just rebuild the whole thing in Swift and just recreated the whole thing in Swift.
结果它变得。
And it's like, it's gotten.
你知道的。
You know?
他说,然后我就没有
He goes, then I didn't
费任何力气。
take any effort.
我根本不会Swift。
I I I don't even know Swift.
没错。
Exactly.
我当时想,反正我也不会审查代码,随便吧。
I was like, I'm the I'm not reviewing it anyway, whatever.
你可以用任何你选择的语言来写。
You can write in whatever language you pick.
但我真正做的重要工作并不是写Electron的绑定。
But the important stuff that I did was not write the Electron bindings.
是的。
Yeah.
而是应用里发生逻辑的那部分,你知道的?
It was like the logic of what happens in the app, you know?
然后模型就是,没错。
And then the model is like, yep.
我可以直接用 Swift 重现同样的东西。
I can just recreate the same thing in Swift.
是的。
Yeah.
我认为,特别是对于从事高性能软件、非常复杂软件的人来说,你仍然需要对架构有一个清晰的了解,但你可以用 Markdown 来实现这一点。
I think you still want, especially for people who are doing, like, high performance software, you know, very complex software, you still want some view of the architecture, but you can use markdown for that.
对。
Right.
是的。
Yeah.
你其实并不需要阅读代码。
You don't actually have to read the code.
再说一遍,我仍然在纠结这个定义性的问题。
Again, I'm still, like, on the the sort of, like, a definitional thing.
是的。
Yeah.
我们能对 Claude Cowork 建立一个良好的心理模型吗?
Can we build a good mental model of Claude Cowork?
这就是我的理解。
This is what I have.
对吧?
Right?
就像你说的,它本质上就是 Claude Code,我们不想碰它。
Like, you said it's like fundamentally Claude Code, we don't wanna touch it.
有 Claude 应用,还有 Claude in Chrome。
There's the Claude app, there's Claude in Chrome.
我觉得你们在规划上做了些不同的事,但我跟 Claude Code 团队的 Tarik 聊过,他说:不,我们只是把规划暴露出来了。
I think you guys do something different in planning, but I've been talking with Tarik, who is on the Claude Code team, and he's like, no, we just exposed planning.
也许你能澄清一下,人们应该了解构成 Cowork 的主要组成部分有哪些?
Maybe you can clarify, like, what are the major pieces that people should be aware goes into Cowork?
我觉得你基本上已经说全了。
Like, okay, I think you basically have them.
真的吗?
Really?
你可以基本把规划功能去掉。
You can take planning more or less out.
我认为Cowork中有一些非常有价值的东西。
I think there's a few things that are really valuable in Cowork.
虚拟机可能是最强大的功能。
The virtual machine is probably the most powerful thing.
所以我们目前运行的是一个轻量级虚拟机,并把 Claude Code 放进这个虚拟机里。
So we currently run like a lightweight VM, and we put Claude Code into the VM.
我们这样做的原因有很多。
And we do that for for a number of reasons.
安全性和保密性是一个重要因素。
Safety and security is a big one.
但即使你暂时忽略安全性和保密性,只是想:‘管他呢,我要这东西做任何事’,给Claude一个子计算机也相当强大。
But even if you even if you ignore for a second safety and security, and you're just like, okay, YOLO, I want this thing to do whatever, it is quite powerful to give Claude a subcomputer.
这通常是个好主意。
That is like generally a good idea.
在Anthropic我们所设计的架构、用户体验等方面,往往非常有用的是,你得积极地把Claude当成一个真人来看待。
And in terms of architecture and UX and everything else that we've been working on at Anthropic, it often is quite useful to anthropomorphize Claude aggressively and just be like, this is a person.
是的。
Yeah.
如果你有个真人,你会怎么做?
What would you do if you had a person?
对吧?
Right?
对。
Yeah.
今天早上我给爸爸举了个例子,他仍然坚持用聊天方式来处理编程之类的事情:假设你是个开发者,你的雇主告诉你不需要电脑,只会给你发邮件附上代码,你也通过邮件回传代码。
And the analogy I've given my dad this morning, who is still, like, quite insistent on using chat even for, like, coding things is, if you were a developer and your employer told you that you don't need a computer, they're just gonna, like, send you emails with a code and you send emails with code back.
这种方式在过去靠补丁文件协作的年代也许行得通,但效率并不高。
Like, that maybe worked for patch files back in the day, but that is not very effective.
所以我们用虚拟机的好处是,因为它是一个 Linux 系统,Claude Code 几乎可以自由安装任何它需要的软件。
So what we can do with the VM is because it's a it's a Linux system, Claude Code has more or less free rein to install whatever it needs to install.
它可以安装 Python。
It can install Python.
它可以安装 Node.js。
It can install Node.js.
我们确实有严格的网络入站和出站控制,因此你仍然可以用通俗的人类语言,明确告诉整个系统你允许什么、不允许什么。
We do have strict network ingress and egress controls, so you can still, as as a user in like plain human language, make it clear to to the entire system what you're okay with and what you're not okay with.
但我们从来不需要去询问一个真实的人,比如市场部或法务部门的员工。
But at no point do we have to ask a real person, like a like a person who might be in marketing or a lawyer.
我总不能跑去问法务:‘你同意我安装 Homebrew 吗?’
I'd have to go to a lawyer and be like, are you okay with me installing Homebrew?
是的。
Yeah.
是的
Yeah.
对吧?
Right?
因为这个问题和答案的含义复杂而微妙,不容易推理。
Because the implications of the question and the answer are complex and nuanced and, like, not not easy to reason about.
这为我们提供了很多抽象能力,使Claude非常强大。
And this gives us a lot of abstraction that makes Claude very powerful.
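[Editor's sketch] The network ingress and egress controls mentioned above are not described in any detail in the episode. As a rough illustration of what a host allowlist for outbound requests could look like, here is a minimal Python sketch. The function name `is_egress_allowed` and the example hosts are assumptions made for illustration only, and this is not how Cowork actually enforces its controls.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_egress_allowed(url: str, allowed_hosts: Iterable[str]) -> bool:
    """Return True if the URL's host is an allowed host or a subdomain of one.

    Hypothetical helper: a real sandbox would enforce this at the network
    layer, not only by inspecting URLs.
    """
    host = (urlsplit(url).hostname or "").lower()
    for allowed in allowed_hosts:
        allowed = allowed.lower()
        # Exact match, or a true subdomain ("files.example.org" for "example.org").
        if host == allowed or host.endswith("." + allowed):
            return True
    return False
```

The subdomain check deliberately requires a leading dot, so a lookalike host such as "evilpypi.org" does not match an allowlist entry of "pypi.org".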
在此之上,我们还有不少东西,而且几乎每周都在增加,你可能已经注意到,它们让 Cowork 在某些任务上比单独使用 Claude Code 更出色。
Now then, around it, we do probably have a number of things, which also keep growing almost every single week, that you're probably noticing, that make Cowork maybe better for certain tasks than just Claude Code on its own.
是的
Yeah.
但大多数这些功能实际上都存在于系统提示中。
But most of those actually live in the system prompt.
它们关乎我们可以推断出你从事的工作是什么?
They're about like, what can we infer about the work that you do?
我们可以在系统提示中引入什么来让这一点更有效?
What can we what can we introduce into the system prompt to make that more effective?
当然,还有与 Claude in Chrome 的紧密集成。
It's of course the very tight integration with Claude in Chrome.
你注意到,随着模型变得更好,很多人在MCP连接器这方面都束手无策了。
You're noticing that a lot of people, especially as the models get better, a lot of people throw up their hands when it comes to MCP connectors in this area.
我不会去逐一尝试25个MCP连接器,到处点击授权,结果还有一半根本没法完成我想做的事。
I'm not gonna go through like 25 MCP connectors, click auth everywhere, and then like half of them don't let me do the things anyway.
所以 Claude in Chrome 非常强大,因为我们只需与 Claude in Chrome 子代理对话,它就会帮你完成这些操作。
So Claude in Chrome is quite powerful, because we can just talk to the Claude in Chrome sub agent, and that will just do things for you.
是的。
Yeah.
所以举个例子,在MCP方面,说实话,我觉得MCP的集成现状真的非常困难。
So one example, right, in MCP: honestly, I think that the state of MCP is kinda, like, really hard to integrate.
我需要把Figma MCP添加到我使用的编码代理中。
I needed to add Figma MCP to the coding agent that I use.
是的。
Yeah.
但我懒得看文档,就让Claude帮我做了。
But I didn't wanna read the docs, so I just had Claude do it.
它读文档的能力真的很棒。
And it's it's great at reading docs.
同样地,我得为我正在做的一个项目设置一个Google Cloud账户,并获取某个API密钥。
And in the same way, I had to set up a Google Cloud account for some project I was working on and get some API key somewhere.
Google Cloud以极其难用著称。
And Google Cloud is famously super hard to navigate.
所以我根本不想处理这些麻烦事。
So I just didn't wanna deal with any of it.
我就直接用了Claude Cowork。
I just used Claude Cowork.
在使用Cowork开发的第一周内,这种情况就很快发生了。
Within the first week of developing on Cowork, this happened very, very quickly.
我发现自己开始用Cowork来处理编码任务,而这并不是我们最初开发它的初衷。
I caught myself, like, starting to use Cowork for coding tasks, which is not ostensibly what we built it for.
对吧?
Right?
我们其实没必要这么做。
We don't need to.
但我发现自己正在使用我们内部用来收集崩溃信息和调试数据的工具。
But I found myself, like, on our internal tool that we have to collect crashes and just, like, debugging information.
我还会挑出那些我觉得可以轻松修复的错误,而不是那些可能是内核损坏或操作系统其他问题的错误。
And I found myself sort like picking out the ones that I think we can easily fix versus the ones that might be like kernel corruption or something else on the operating system.
我会把这些错误挑出来,然后告诉Claude:去修这个bug。
And I found myself sort of picking these out and then just telling Claude, go fix this bug.
我当时就想:我这是在干什么?
I was like, what am I doing here?
再往上走一步。
Go one level up.
告诉 Cowork:我要你去查看所有这些崩溃工具。
Tell Cowork, I want you to go to all these crash tools.
我要你找出所有你觉得可以修复的bug,而不是操作系统崩溃那样的问题。
I want you to find all the bugs that you think are fixable and not like an operating system crash.
然后你再让另一个Claude去修复所有这些问题。
And then I want you to tell another Claude to like fix all of that.
而这就是……
And that's...
所以它们会再去告诉另一个 Claude?
So they tell another Claude?
是的。
Yeah.
所以他们可以启动另一个实例,或者
So they can spin up another instance or
目前我的做法是,这有点像是个取巧的办法,但我让它使用 Claude Code remote,
Currently what I do is, and this is a bit of a hack, but I tell it to use Claude Code remote
其实就是它自己。
Which is just itself.
是的。
Yeah.
这很有趣。
That's interesting.
所以,如果你想象一个带有20美元的仪表盘的话,你基本上就是这么做。
So you basically take if you if you imagine like a dashboard with like $20.
这是远程控制还是远程?
Is this remote control or remote?
抱歉。
Sorry.
我只是想确认一下
I just wanted to confirm what
我使用它的方式是,我让Cowork运行着,然后告诉Cowork:这是我每天早上通常去查找最新漏洞的地方。
The way I'm using it is I have Cowork running, and I'm telling Cowork, here's where I normally go every morning to find the latest bugs.
去阅读整个错误列表,区分哪些是可以修复的,哪些是无法修复的。
Go read the entire bug list, separate out which ones are fixable, which ones are not fixable.
对于可修复的错误,这几乎是一个循环:对每一个 bug,编写一个包含提示词的 Markdown 文件。
And then for the fixable ones, and this is almost a loop, for each bug, write a markdown file with a prompt.
然后,为每个作为提示词的 Markdown 文件启动一个 Claude Code。
And then for each markdown file that is a prompt, start off a Claude Code.
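[Editor's sketch] The loop described above (triage the bug list, write one markdown prompt per fixable bug, then start a task per prompt) could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the `Bug` type, its `fixable` flag, and the `launch` callable are hypothetical stand-ins. In the episode, Cowork itself does the triage and hands each prompt to a Claude Code remote task, and no real CLI or API call is shown here.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterable, List


@dataclass
class Bug:
    bug_id: str
    title: str
    fixable: bool  # hypothetical triage result; in the episode Cowork makes this call


def write_prompt_files(bugs: Iterable[Bug], out_dir: Path) -> List[Path]:
    """Write one markdown prompt per fixable bug and return the file paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for bug in bugs:
        if not bug.fixable:
            continue  # skip OS-level crashes and other reports judged unfixable
        path = out_dir / f"{bug.bug_id}.md"
        path.write_text(f"# Fix {bug.bug_id}\n\n{bug.title}\n", encoding="utf-8")
        paths.append(path)
    return paths


def dispatch(paths: Iterable[Path], launch: Callable[[str], None]) -> int:
    """Pass each prompt's text to a launcher and return how many tasks were started.

    `launch` is a placeholder for whatever actually starts a coding task.
    """
    started = 0
    for path in paths:
        launch(path.read_text(encoding="utf-8"))
        started += 1
    return started
```

Keeping the launcher as an injected callable means the triage and prompt-writing steps can be checked on their own, without assuming anything about how the remote task is started.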
所以,Claude Code 原生支持子代理的概念,而你这实际上就是一个子代理,只是没有使用它的功能。
So natively, Claude Code has this concept of sub agents, and this is basically a sub agent, but you're not using the sub agent's functionality.
我没有使用子代理的功能,原因是我将它作为Claude Code的远程任务来执行。
I'm not using the sub agent's functionality, and the reason I'm not is because I'm firing that off as a Claude Code remote task.
是的。
Yes.
这样挺好的,因为它可以立即启动任务,我可以去参加下一个会议,而在Claude Code远程模式下,工作已经开始进行了。
It's kinda nice, because then it can just fire it off, I can go to my next meeting, and in Claude Code remote, now the work's happening.
对。
Yeah.
你看,你已经开始用云服务代替本地机器了,我觉得这正是一个问题:难道不应该一切事情都优先基于云吗?
You see, like, you're already starting to use the cloud over your local machine, and I think this is one of those things where like, well, shouldn't just everything just be cloud first, right?
这真是个好问题,我可以光聊这个聊很久。
This is such a good question; I could solely talk about this.
关于这个话题,我有太多想法了。
I have so many thoughts about this.
好吧。
Okay.
我通常认为,硅谷整体上低估了本地计算机的价值,我常举的例子是:为什么我们都在用MacBook,而不是iPad或Chromebook?
So I generally believe that Silicon Valley overall is undervaluing the local computer, and my default argument for that is always, how come we're all using MacBooks and not like an iPad or a Chromebook?
本地机器依然有其价值。
There's, like, still value in in having a local machine.
当我把Claude看作一个应该对你非常有用、极其有用的实体时,我认为这个实体必须能访问你所能访问的所有工具。
And now when I think about Claude as this entity that is supposed to be very useful to you, like tremendously useful to you, I think that entity needs to have access to all the same tools you have access to.
否则,它会在各种复杂情况下受到限制。
Otherwise, it's gonna be hamstrung in, like, all these complex ways.
我们可以采取两种不同的方法。
And there's there's sort of two approaches we could take.
我们可以说,好吧。
We could say, okay.
我们可以一点一点地把电脑上所有的东西都迁移到云端。
We're gonna, like, one by one chip away at everything that is at your computer and move it into the cloud.
这是一种做法。
That's that's one way to do it.
我认为其他产品已经走了这条路。
And I think other products have taken that path.
这是非常个人的看法,但就我使用的工具数量而言,我真的没有耐心去给另一个工具授权访问我每一样东西,还要持续更新这些权限。
Personally, and this is a very personal opinion, for the amount of tools that I use, I just don't have the patience to give another tool permissions to every single thing and keep those permissions up to date.
我还在思考的第二件事是,如果有人把你的整个工作内容都抓取到云端,那会是什么样子?目前我还没有一个明确的答案。
The second thing that I'm still grappling with, and I don't have a good answer for anyone just yet, but the second thing I'm still grappling with is, what does it look like for someone to slurp up your entire work and put that in the cloud?
比如,如果我点击一个按钮,它就把你的整个电脑内容完全克隆到云端,你会希望这样吗?
Like, if I just as an example, like, you would click a button and it just clone your entire computer into the cloud, is that something that you would want?
我并不完全相信所有人都会同意这一点。
I'm not totally convinced that everyone will.
是的。
Mhmm.
这实际上比我们即将面对的所有技术问题都更靠前。
And that is sort of like upstream of all the technical issues we're gonna have.
因为一般来说,我认为世界还没有准备好应对这种事。
Because, like, in general, I think the world is not ready for this kind of stuff.
我给你举个简单的例子,这对我们来说可能很容易实现。
Like, I'll give you one quick example that would probably be very easy for us.
作为桌面应用,理论上,在获得你授权的情况下,我们可以做很多事,比如读取你的 Chrome Cookie,如果我们真想这么做的话。
So as a desktop app, we, in theory, with your permission, can do a lot of things on your computer, including reading your Chrome cookies if we really wanted to.
对吧?
Right?
我们可以获取你的 Chrome Cookie,你需要为我们解密它们,但如果我们真想这么做,也可以把它们上传到云端。
We could take your Chrome cookies, you would have to decrypt them for us, but we could put those on the cloud if we really felt like it.
这是一个非常简单却超酷的解决方案。
Pretty easy solution that would be super cool.
我们可以直接说:现在所有任务都可以在云端完成了。
We could just be like, oh, we can do all your tasks in the cloud now.
很多网站,包括银行,如果发现同一认证信息来自两个不同地点,就会直接锁定你的账户。
A lot of websites, banks included, if if they see the same authentication from, like, two different locations, will just lock down your account.
然后你就得跑去网点,说:好吧。
And now you have to go to the branch and be like, okay.
我带着我的护照来了。
I'm with I'm here with my passport.
你真的知道这一点。
You actually know that.
哇。
Wow.
尽管我们对‘智能体’这个词已经听腻了,但我觉得还有很多事情需要慢慢跟上。
You know, as tired as we all are of the term agent, for the agentic future, I think there's a lot of stuff that sort of slowly needs to catch up.
在那之前,作为正在开发Claude的人,我能使Claude最有效的方法就是把它放在你工作的地方。
And until that's the case, the way I, as someone who's working on Claude, can make Claude most effective is to like put it where you're working.
关于我们的心理模型,还有别的要补充吗?
Anything else on our mental model?
所以,基本上,我也只是希望,我越了解它的工作原理,就越能充分发挥它的潜力。
So like basically, like, part of me also just want like, the more I understand how it works, the more I can use it to its full potential.
对吧?
Right?
是的。
Yeah.
所以我从你这里听到的是,你让我把规划那一项去掉,你们并没有做什么 Claude Cowork 独有的特殊处理。
And so what I'm hearing from you is, you told me to delete the planning thing; you're not doing anything special that's exclusive to Claude Cowork.
我们有一些技巧,但它们每周都在变。
We have some tricks, but these sort of change week over week.
我们评估 Cowork 时,用的用例可能与评估 Claude Code 时不同。
We eval Cowork maybe against different use cases than you would eval Claude Code.
对吧?
Right?
如果你这样想的话。
If you think about it this way.
好的。
Okay.
那么,你们是怎么评估 Claude Code 的?
So, like, Claude Code is... how do you eval Claude Code?
是的。
Yeah.
所以 Claude Code 非常适合编码任务,我们主要评估它在典型软件工程师工作中的表现是变好了还是变差了。
So Claude Code is quite optimized for coding tasks, and we mostly evaluate whether we're getting better or worse depending on how good it is at, like, a typical SWE job.
而 Claude Cowork,另一方面,我们更多地根据典型的知识型工作来评估,比如金融领域或法律办公室里常见的那种工作。
And Claude Cowork, on the other hand, we evaluate more against typical knowledge work, the kind of stuff you would find in finance or in like maybe like in like a legal office.
我个人的使用场景总是像管理我的事情,比如管理我的个人房贷之类的。
My personal use case is always like managing my things like managing my personal mortgage or something like that.
对吧?
Right?
或者像是为我和我的家人做财富规划。
Or like wealth planning for me and my family.
这些就是我们用来评估 Claude Cowork 的使用场景。
Those are the kinds of use cases we eval Claude Cowork on.
你可能注意到的是,我们对系统提示词所做的细微调整,我们在系统提示中加入的内容,以及我们通过工具引导Claude的方式。
And what you might be picking up on is like the subtle changes we make to the system prompt, what we put in the system prompt, how we steer Claude with the tools we give it.
所以它在某一个方向上会更好,而在另一个方向上则会变差,这种权衡无处不在,Claude Code 在编程上更优,而 Claude Cowork 在非编程任务上更出色。
So like either it'd be better in one or the other direction, where there's a trade off, trade offs exist a lot, Claude Code will be better for code, and Claude Cowork will be better for non coding tasks.
在接下来几代模型中,这些差距还会存在吗?
Will those gaps still exist in the next few generations of models?
不过这一点对我来说还不太清楚。
It's like a little unclear to me though.
是的。
Yeah.
因为目前这些高度优化的调整,我不确定它们会持续多久才依然相关。
Because right now these like hyper optimizations we make, I'm not sure for how long this is gonna be relevant.
我想我之前提到的,是它在感觉上确实有所不同,但可能这纯粹是提示词的效果,是我过度解读了。
I think what I was referring to was also it it just qualitatively felt different when I probably it's just all prompting and I'm reading too much into it.
但关键是,它会输出一个九步计划,我可以编辑这个计划、提供反馈,然后看到它执行这个计划。
But like the fact that it comes out with like a nine-step plan, I can edit the plan and give feedback and see it execute the plan.
是的。
Yeah.
它感觉比Claude Code更具长期性,但也许这些功能在Claude Code中本来就存在,只是你们为它做了更友好的界面。
It felt more long range than in Claude Code, but maybe that already existed in Claude Code and you just built a nicer UI for it.
这其实是两者兼有。
It's kind of both.
比如,开发Claude Code规划功能的团队,他们可能会说:是的,我们在Claude Code里已经具备了所有这些功能,而他们确实有。
Like if the Claude Code people who built the planning functionality were sitting here, they would probably say, yes, we have all of those things in Claude Code, and they do.
我认为人们通常会给CoWork分配时间跨度更长的任务。
I think people tend to give CoWork tasks that are maybe of longer time horizon.
我觉得这太长了。
I thought it's so long.
是的。
Yeah.
这是其中一点。
So that's like one thing.
对吧?
Right?
你只是觉得这一大块工作往往稍微大一点。
It's just that the chunk of work tends to be maybe a little bigger.
第二点是,当工作时间变长时,它会变得稍微更模糊,我们会要求CoWork大量使用规划工具或询问用户的问题工具。
And then the second thing is that because the work, when it gets longer, it gets a little bit more ambiguous, we do tell Cowork to make heavy use of the planning tool or to make heavy use of the ask user question tool.
对吧?
Right?
我们希望它能提出不同的场景,比如弄清楚用户真正想要什么。
We do want it to come up with like different scenarios of, okay, tease out what the user actually wants.
别去工作四个小时,然后回来做错了事。
Don't go off to work for like four hours and then come back with the wrong thing.
你可能也注意到了这一点。
And you're probably picking up on that.
是的。
Yeah.
我真希望能告诉你,我打造了一个神奇的东西,里面有什么秘密配方。
I wish I could tell you I like built this magical thing, and it's like there's some secret sauce on
哦,不不不。
Oh, no no no.
我的意思是,清晰明了总是好的。
I mean, it's it's just clarity is good.
你知道,工程师们只是想搞清楚,这样他们才能做好规划。
You know, engineers just wanna know, so they can they can plan around it.
现在我觉得,对我而言,我也得切换到我的另一台机器了,因为这台是新机器,没有我的会话记录。
And now I think also for me, I am realizing I have to switch to my my other machine because this is a new machine that doesn't have my session.
但是,是的,规划对我来说真的很重要,以便我能够批准或判断它是否正确。
But, yeah, the the the planning is really important for for me to, approve or, like, to see whether it's, like, it's right.
这个用户提问的方式呈现得非常出色。
The ask user question is so beautifully presented.
我的意思是,它在Cursor和Claude Code中也同样可用。
I mean, it all it's also available in, like, Cursor and and in Claude Code.
但我觉得,看到它真的理解我,明白我想做什么,这种感觉非常好。
But, like, I I think, like, it's so nice to see that it, like it's kind of for me, like, to understand that it gets me, it gets what I wanna do.
是的。
Yeah.
对。
Yeah.
这非常难以实现。
That applies very hard.
只是在
Just on the
关于评估的话题。
topic of evals.
当你提到评估时,我觉得人们对它的含义很模糊。
When you say eval, I think people are very vague about what it means.
这仅仅是凭感觉测试,还是你有针对Claude Cowork的自动化程序化评估?
Is it just like vibe testing, or do you have like automated programmatic evals of Claude Cowork?
当我们说评估时,我们的意思是,我们会完整地获取整个对话记录,包括Claude最终可用的所有工具,然后根据我们的调整来衡量其输出结果。
When we say eval, what we really mean is that we essentially take the entire transcript, including all the tools that Claude has available ultimately to it, and we then measure what are the outputs depending on what we tweak.
对吧?
Right?
所以我们经常进行这种评估。
So we do run that a lot.
我们在训练中使用它。
We use that in training.
我们在训练中使用它;如果你把训练后阶段和其周围的支撑体系区分开,Cowork基本上属于支撑体系的一部分,但显然我们也会用它进行一些训练。
We use that in in, like if you sort of separate out post training from, like, the scaffolding around it, Cowork sort of exists in the scaffolding space, but obviously we also train on it a little bit.
所以当我们说评估时,意思是给定某个对话记录,输出会是什么样子?
So when we say eval, we mean given the certain transcript, what do the outputs look like?
包括文件输出以及实际的令牌输出,也就是你在聊天窗口中看到的内容。
Including the file outputs as well as the the actual token outputs, like the ones that you see in the chat window.
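Editor's sketch: the eval described above (fix a transcript, change one thing such as the system prompt, then compare the chat output and the files produced) could be scored roughly as follows. This is only an illustration under stated assumptions. `RunOutput`, `score_run`, `compare_variants`, and the pass/fail checks are hypothetical names invented here, not the team's actual harness.

```python
from dataclasses import dataclass, field


@dataclass
class RunOutput:
    """What one run produced: chat text plus files written (hypothetical shape)."""
    chat_text: str
    files: dict[str, str] = field(default_factory=dict)  # filename -> contents


def score_run(output: RunOutput, required_files: list[str], required_phrases: list[str]) -> float:
    """Return the fraction of expectations that a single run satisfied."""
    checks = [name in output.files for name in required_files]
    checks += [phrase.lower() in output.chat_text.lower() for phrase in required_phrases]
    return sum(checks) / len(checks) if checks else 1.0


def compare_variants(runs: dict[str, RunOutput], required_files: list[str],
                     required_phrases: list[str]) -> dict[str, float]:
    """Score the same task under several tweaks (for example, prompt variants)."""
    return {name: score_run(out, required_files, required_phrases)
            for name, out in runs.items()}
```

A real eval would grade far richer signals than substring checks; the point is only that both file outputs and chat outputs are compared across variants of the same transcript.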
我想知道,失败模式在多大程度上是模型智能的问题,而不是最终工具用来发挥智能的问题。
I'm curious how much of the failure modes are the model intelligence versus like the usage of the end tool to put the intelligence in.
比如财富规划就是一个很好的例子。
Like, the wealth planning is like a good example.
对吧?
Right?
制定一个计划是一回事。
It's like one thing is to come up with a plan.
另一回事是制作一个漂亮的电子表格,是的。
The other thing is like make a nice spreadsheet Yeah.
它能帮你一步步执行这个计划。
That kinda runs you through the plan.
你是怎么看到这个演变的?
Like, how have you seen that evolve?
我经常纠结的是,无论你设计出什么样的辅助结构,我们似乎仍然存在一种模型能力过剩的现象,即模型的能力远超用户实际使用的程度。
The thing that I grapple with a lot is that whatever scaffolding you come up with, I think we still have a bit of sort of like model overhang where the model is dramatically more capable than where users end up using it for.
我认为部分原因在于,我们并没有给模型提供它理论上能够使用的所有工具。
And I think part of that is that we're just not giving the model all the tools to do all the things that it's in theory capable of.
对吧?
Right?
这是一方面。
That's like one thing.
然而,每当你构建了这样的辅助结构时,我一直在想,什么时候这些辅助结构会消失?以及你该投入多少精力去找到正确的辅助方式。
However, whenever you do build the scaffolding, I'm sort of wondering at what point at what point will that scaffolding go away, and, like, how much you invest in figuring out what the right scaffolding is.
这有点像在下注。
It's kind of up to it's a little bit of a bet.
对吧?
Right?
作为一名工程师,我特别享受的一点是,在Anthropic和前沿实验室工作,我能稍微多了解一些即将推出的新模型——它们的能力如何、擅长什么、不擅长什么。
And one thing that I as an engineer quite enjoy is that, working at Anthropic and working at a frontier lab, I maybe have a little bit more insight into what's coming down the chute in terms of like what's the next model, what is the model capable of, what is it good at, what is it bad at.
我越来越想知道,我们是否应该在这些‘脚手架式修正’上投入太多精力,即模型本身并不会出错,只是没有执行你希望它做的任务?
And I'm increasingly wondering, is the right thing for us to like really invest too much in sort of these like scaffolding corrections where the model might otherwise not misbehave, but just not do the thing that you want?
是的。
Yeah.
还是应该尽可能赋予模型更多能力,努力确保这些能力是安全的,这样最坏的情况也只是比原本稍好一些,然后只需等待下一个模型发布?
Or is it to just like give it as many capabilities as possible, try to make those safe so that the worst case scenario is like, no, it's better than it might be otherwise, and then just simply wait a second for the next model drop?
我个人目前更倾向于后者。
I'm personally currently more leaning into the latter.
我认为我们会看到大量应用和公司利用AI做出令人印象深刻的事情,短期内这些应用可能非常有效,因为它们高度针对特定使用场景。
I think we're gonna see a lot of, like, applications and companies that do very impressive things with AI that in the short term might seem very effective because they're very specialized to individual use cases.
但我觉得,一旦模型在泛化能力上提升,并且在没有被过度引导的情况下也能更好地处理这些特定场景,这种优势能持续多久就不好说了。
But I think once models get better at generalization and get better at, like, those specific use cases without being super guided on those, I'm not sure how long that's gonna stick around.
你其实已经能在Skills和MCP服务器中看到这种趋势了。
And you can sort of already see this in Skills and MCP servers.
对吧?
Right?
我们已经看到从MCP服务器到Skills的缓慢转变。
We've already seen things sort of, like, slowly shift from MCP servers to Skills.
一个很好的例子是Barry,他开发了Skills。
And like maybe a good example is Barry who made Skills.
他最初在做一些东西,老实说,看起来和Cowork今天的产品非常相似。
He was initially hacking on something that honestly looked a lot like what Cowork does today.
他在思考,如果有一个Cowork,但面向那些不想写代码的人,会怎么样?
He was sort of thinking about what if Cowork, but for like people who don't wanna build code.
他也把这作为一个原型,放在桌面应用里。
And he too did that as a prototype inside the desktop app.
我们最早想到的用例之一是:哪些编程相关的场景能从图形界面中受益,以及从底层代码中稍微分离出来?
One of the first use cases we thought of was, okay, what are like coding-like use cases that could really benefit from graphical interfaces, and like from being a little separated from the actual underlying code.
每个人都会想到同一个答案:数据分析。
And everyone comes up with the same answer: data analysis.
对。
Right.
而我们现在有多少用户呢?
Whereas, like, how many users do we have today?
总是数据分析。
How many like, it's always data analysis.
我认为最终促成技能开发的原因是我们想把这个小原型连接到我们的数据仓库。
And I think the thing that ultimately led to skills is that we wanted to connect this little prototype to our data warehouse.
团队很快发现,与其为这个工具构建一个自定义的接口来连接我们的数据仓库,他们干脆直接写了一个Markdown文件,内容大概是:亲爱的Claude,如果你想获取数据,这是端点,这是API的格式,你来搞定。
And the team very quickly discovered that, like, instead of building a custom tool for the thing to talk to our data warehouse, they just, like, made a markdown file, like dear Claude, if you wanna get data, here's the endpoint, here's what the API looks like, you figure it out.
然后他们就把控制权交出去了。
And then they Like, hand over control.
是的。
Yeah.
是的。
Yeah.
而且,也许可以再往上提升一层抽象层次。
Also just like, maybe go one step up in the layer of abstractions.
对吧?
Right?
而不是告诉它:这是CLI,请调用这个CLI。
Just like, instead of telling the thing, here's the CLI, please call this CLI.
或者这是一个MCP,请调用这个接口规范。
Or here's an MCP, please call this interface shape.
直接给它这个端点就行了。
Just like, this is the endpoint.
如果你想知道什么,只要在这里发个请求,也许你可以直接发SQL。
If you wanna know something, you POST here; maybe you can even POST SQL.
这样也没问题。
It's gonna be okay.
嗯哼。
Mhmm.
这最终变得非常有效,他们开始尝试同样的模式,即直接给模型一个描述其所需任务的 Markdown 文件,整个过程最终演变成了 Skills。
And that ended up being so effective that they started trying the same pattern of like just giving the model a markdown file that describes whatever it needs to do, that the whole thing eventually became Skills.
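Editor's sketch: the pattern described above is just a markdown file that tells the model where the endpoint is and roughly how to call it. The snippet below writes such a file to disk. The endpoint URL, the file name `data-warehouse.md`, and the claim that responses are JSON rows are all placeholders assumed for illustration; the conversation does not specify them.

```python
from pathlib import Path

# Hypothetical skill text; the endpoint and response shape are made-up placeholders.
SKILL_TEXT = """\
# Query the data warehouse

Dear Claude, if you want data:

- Endpoint: https://warehouse.example/query
- Send an HTTP POST whose body is a SQL query.
- The response is JSON rows. You figure out the rest.
"""


def write_skill(folder: Path) -> Path:
    """Write the plain-markdown skill file into `folder` and return its path."""
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / "data-warehouse.md"
    path.write_text(SKILL_TEXT, encoding="utf-8")
    return path
```

Nothing here is a tool definition or a schema: the whole "integration" is prose the model reads, which is the point being made in the conversation.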
我们觉得,我们应该把这个整理成一个工具包。
And we're like, we should package this up.
这是个非常好的想法。
This is a very good idea.
是的。
Yeah.
我们在会议上邀请过 Barry 和 Mahesh,他确实提出了一个很好的想法。
We've had Barry and Mahesh at our conference, and he's definitely got a good idea there.
对。
Yeah.
我想给你展示一下我是如何使用 Claude Cowork 的。
I wanted to show you how I've been using Claude Cowork.
这简直是我最喜欢的部分。
This is like it was my favorite part.
这是我的情况,这就是我们运营Discord的方式。
This is so this is like me, this is how we run the Discord.
一开始,我根本不信任Claude Cowork,这是我第一次使用。
We literally at first, I didn't trust Claude Cowork, this is my very first usage.
好的。
Okay.
对吧?
Right?
于是我心想,好吧,我就手动从Zoom下载所有录音,然后上传到YouTube。
So then I was like, okay, I will just try to manually download from Zoom all my recordings and upload it to YouTube.
因为这个过程非常繁琐,我得不停地点击、点击、点击,YouTube的用户体验也不太友好。
Because this is a very laborious process, I gotta click click click, YouTube isn't super user friendly.
结果它就自动完成了。
And it just did it.
然后我想,其实连从Zoom下载的部分,也应该交给Claude Cowork来处理,于是我这么做了。
And then I was like, actually, you know, even the download from Zoom part, I should also put that into Claude Cowork, and then I did it.
对吧?
Right?
这里有一堆,它开始在这里压缩,并且甚至能够逐帧查看视频内容来为视频命名,这样我就可以自动上传了。
Here's a bunch of and it starts compacting here, and it starts to even be able to do things like look through the individual frames of the video to name the video, so that I can upload it automatically.
哦,这真是太棒了,这取代了我作为YouTuber的工作。
Oh, that is And this replaces my job as a YouTuber.
我们会永远感激你的创造力。
We will forever appreciate your creativity.
是的。
Yes.
你知道吗?
You know?
所以这很棒。
And so that's great.
但顺便说一下,它压缩后还生成了新的东西。
But then, by the way, it compacts and makes like a new thing.
对吧?
Right?
所以我没有你最初的那个东西。
So I I don't I don't have your initial initial thing.
但后来我让它自己开发技能,让那些重复性、一次性且需要人工指导的任务变得更自动化,我可以独立使用并重复利用这些技能,当然也可以编写新的技能。
But then I asked it to make its own skills so that it so that something that's repetitive and one off and human guided becomes more automated, and I can use the skills independently and reuse them, and then obviously can write skills.
这些内容会进入下面的上下文和技能部分,这真是太好了。
And that goes into context and skills at the bottom here, which is which is so nice.
所以我现在每周都会用到这些技能。
So I have all these skills that that I now sort of do on a weekly basis.
我知道你已经发布了定时协作功能,但我还没用过。
I know you've released scheduled co works, which I haven't done yet.
但是
But
你当然应该试试。
You really, of course, you should try them.
我觉得看到这一切对我来说非常棒、非常有趣,因为对我来说,技能特别有趣的一点是它们制作起来太简单了。
I I think this is like so wonderful and fun for me to see, because I think one thing that is very fun for me about skills in particular is that they're so easy to make.
任何人都能创建一个技能。
Like, anyone can make a skill.
比如,一条短信就可以是一个技能,而且它们可以高度个性化地贴合你的需求。
Like, a text message could be a skill, and they can be so hyper personalized to you.
这就像一种抽象层。
And this is like sort of the abstraction layer.
对吧?
Right?
我只是猜的,但我假设你在工作中非常出色。
Like, I I'm just guessing, but I assume you're very good at your job.
你可能已经给它一些指导,告诉它该如何做。
You've probably given this thing some guidance about how to do it.
对吧?
Right?
我只是说把所有东西都封装成一个技能。
I I just said wrap everything up into into a skill.
对吧?
Right?
是的。
Yeah.
然后我想,实际上,有时候我需要把东西拆开,因为某些部分可能会失败,或者某些部分可能需要单独使用,所以我让它把一个技能拆分成三个技能。
And then and then I was like, actually, sometimes I might need to break things apart because some parts fail or some parts might be needed individually, so I told it to split one skill into three skills.
这就像是一个技能拆分功能,然后还有一个父技能,如果我想使用的话,它会协调所有这些技能。
So it's like a skill splitting thing, and then there's like a parent skill that just orchestrates all of them if I wanna use that.
你知道,我觉得这真的很棒。
You know, like, I think that's that's like really good.
还有一部分,就是我跟你说过的那个谷歌浏览器的东西。
And there's there's one more part, which is the Google Chrome thing that I told you about.
是的。
Yeah.
我想知道,有什么比用 Claude Cowork 上传到 YouTube 更好的方法吗?
Where I'm like, okay, you know what's better than uploading using Claude Cowork to YouTube?
比如,实际去看看文档,学习如何编程上传到 YouTube,然后把这个功能封装成一个技能。
Like, actually looking at the docs to like, programmatically upload to YouTube, and then putting that in a skill.
我以前从没做过这个,也不想处理 Google Cloud,所以 Claude Cowork 帮我做了。
And I've never done that before, I don't want to deal with Google Cloud, so Claude Cowork does it for me.
这真的很酷。
That is really cool.
所以我根本不在乎,我就只是去做,其实也没那么重要。
So I just I don't care, I just like, do thing, it doesn't really matter.
这真的很酷。
That is really cool.
然后你应该是把这个技能和它对应的脚本配对了吧?
And then you, I assume, paired the skill just with the script that it's built?
是的。
Yeah.
然后我就更新一下这个技能。
And then I just update update the skill.
哦,这太棒了。
Oh, that is beautiful.
是的,这太好了。
Yeah, that's wonderful.
这有点像一个技能。我觉得,人们进入 Claude Cowork 的方式通常是:取一个你平时需要手动点击完成的知识型任务,然后尝试把它自动化,接着你就会想,如果再进一步会怎样?
It's kind of like a skill. Basically, I think, like, the way that people ease into Claude Cowork is like, take a knowledge work task that you would normally be clicking around for, and then try to automate that, and then you do the, okay, well, what if you went further?
好的。
Okay.
如果再进一步会怎样?
And then what if you went further?
当你对它越来越信任时,你就会逐步扩大Cowork的工作范围,同时教会它如何取代你。
And then you sort of expand the scope of Cowork as you gain trust with it, and then also teach it how to replace you.
是的。
Yeah.
这就像玩《Factorio》,但为你自己的生活而玩。
It's like a little bit like playing Factorio, but for your own life.
就像你说的,你从非常小的事情开始。
Like you say, you start really small.
是的。
Yeah.
你先自动化一件非常小的事情,一旦上手了,就会不断扩展这个自动化帝国,让生活越来越轻松。
You start automating something really tiny, and like once it clicks, you keep adding onto this like automation empire, just like make your life easier and easier.
我最喜欢的一个技能是,每天早上,Cowork都会查看我的日历,确保没有时间冲突。
My favorite skill has been that every single morning, Cowork starts looking at my calendar and makes sure that there are no conflicts.
因为人们常常会安排很多会议,有时是临时安排,有时会漏掉。
Because people tend to schedule a lot of meetings, sometimes last minute, or sometimes miss it.
这常常让人很痛苦。
It's often painful.
嗯。
Mhmm.
很多产品都是以这种方式存在的。
And a lot of products have existed like that a lot.
我已经为它写了一个自定义提示。
I've written a custom prompt there.
我还没有把它变成一项技能。
I haven't made it a skill.
说实话,我应该这么做。
Honestly, I should.
是的。
Yeah.
但我已经给了它们相当明确的指示,比如,这些人是这样的。
But I've given them, like, pretty clear instructions about, okay, here are some people.
如果他们把会议安排在其他会议之上,我大概会去参加他们的会议。
If they book over other meetings, I'm probably gonna go to their meeting.
比如,如果达里奥安排了一个会议。
Like, if Dario schedules a meeting.
对。
Right.
不要试图重新安排到后面。
Not try to reschedule down.
对吧?
Right?
我想里面还有一些其他规则,比如我更关注哪种会议、不太关注哪种会议,什么情况下可以推迟,以及我希望什么时候工作、什么时候不工作。
And I think there's some other rules in there about, like, what kind of meetings I care more about, what kind of meetings I care less about, what is okay to, like, maybe punt, like, when I want to be working, when I don't want to be working.
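Editor's sketch: in the conversation these rules live in a natural-language prompt, not in code. Still, the logic being described (find overlapping meetings, then prefer the one booked by a priority person) can be written down in a few lines. `Meeting`, `find_conflicts`, and `pick_meeting` are hypothetical names, and the "priority organizers" set is an assumed stand-in for the rules in the prompt.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Meeting:
    title: str
    organizer: str
    start: datetime
    end: datetime


def find_conflicts(meetings):
    """Return pairs of meetings whose time ranges overlap."""
    ordered = sorted(meetings, key=lambda m: m.start)
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.start >= first.end:
                break  # sorted by start, so nothing later can overlap `first`
            pairs.append((first, second))
    return pairs


def pick_meeting(a, b, priority_organizers):
    """Prefer the priority organizer's meeting; return None to ask the user."""
    a_priority = a.organizer in priority_organizers
    b_priority = b.organizer in priority_organizers
    if a_priority != b_priority:
        return a if a_priority else b
    return None
```

Returning `None` when neither (or both) organizers are on the priority list mirrors the "ask the user" behavior discussed earlier, instead of guessing.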
正是这些细微之处,我觉得能让人产生共鸣。
And it's those really small things that I think kinda click with people.
我们刚推出 Cowork 时,我在 Twitter(X)上看到最火的用例之一是‘整理一下你的桌面’,这当然有点傻。
Right when we launched Cowork, I think one of the use cases that went most viral on Twitter/X was "clean up your desktop", which is of course silly.
这真是件很小的事。
That's such a small thing.
对吧?
Right?
你根本不需要一个模型来整理你的桌面。
Like, you don't need a model to clean up your desktop.
其实不用。
Not really.
像这样吗?
Like this?
就是,整理我的桌面?
Like, clean up my desktop?
是的。
Yeah.
没错。
Exactly.
对。
Yeah.
我得选一下我的桌面,我想?
I need to I need to choose my desktop, I guess?
给我桌面的访问权限?
Give it access to my desktop?
是的。
Yeah.
好的。
Okay.
好的。
Okay.
这太吓人了。
This is very scary.
我们来处理。
We'll do it.
我之前用过
I did I did it with
我的下载文件夹。
my downloads folder.
当时的情况是,你有那么多条款清单,还有八份你办公室租赁合同的副本。
It was like, you have so many term sheets and there's like eight copies of your rental lease for your office.
我当时就想,好吧。
I was like, alright.
别冲我喊。
Like, don't yell at me.
这根本就是个小任务。
Like, it's such a small task.
而且我平时根本不会走出去跟别人说:我开发了一个产品,
And I like, I would never go out there normally otherwise and tell people, I've built a product,
它可以帮你整理照片。因为这感觉起来太小了。
it can organize your photos. Because it feels small.
但我觉得你说得对,比如
But I think to your point, like
哦,这就是要问用户问题的地方。
Oh, here's the ask user questions.
是的
Yeah.
太棒了
Beautiful.
对吧
Right?
这明显是垃圾吗?
Is it obvious junk?
你可能不该点那个。
You probably shouldn't click that.
不
No.
如果不是直接的。
If it's not direct.
只要可以撤销,我
As long as it's reversible, I
别制定计划。
don't Make a plan.
是的。
Yeah.
我有一个典型的乱七八糟的文件夹,所以是的。
I I have a I have a typical everything is super messy folder, so yes.
我觉得这非常有帮助。
I think this this is super helpful.
所以这是一个相当简单的任务,但我已经好了。
So this is a pretty simple task, but I've okay.
在这里。
Here it is.
对吧?
Right?
这是进度。
Here's the progress.
我没在这一个里看到这个,这肯定和……不一样
I don't see this in this one, like, this gotta be something different than
和Claude Code不一样,因为我们的做法就是这样。
than Claude Code because I'm like We do yeah.
这就是我们的系统提示,我们会说:好吧,我们希望你思考一下这个任务。
That's what we do in the system prompt; we're like, alright, we want you to think about like this task.
是的。
Yeah.
一种方法,是的。
A method yeah.
然后我可以为这些事情提供一些小建议。
And then I can do like little suggestions for these things.
太棒了。
It's beautiful.
看看这个。
Look at this.
我可以这样说:哦,别这么做,别干那个。
I can like say, oh, don't do that, don't do this.
太棒了。
It's amazing.
我很高兴你喜欢。
I'm so happy you like it.
我的意思是反过来讲,我们是Claude Code团队的一员,如果你希望在Claude Code里也有这个功能。
I mean, the other way around, like, we're part of the Claude Code team, if you would like this in Claude Code.
是的。
Yeah.
天啊。
Damn.
所以,没错,这真的非常好。
So so, yeah, I mean, this is really good.
显然,我简直在夸个不停。
Obviously, I'm I'm, like, kinda raving about it.
你知道的,我还有其他事情,比如注册PG&E。
And, you know, I have other things, like, sign up for PG&E.
所以如果你能帮我打电话,那就太好了。
So if you can do phone calls for me, that'd be great.
我,人们都
I I do People have
做过这事。
done that.
显然,你不能原生做到这一点,但人们已经通过其他各种服务商实现了。
Obviously, you can't do that natively, but people have done that with like various other providers.
是的。
Yeah.
然后这是注册Figma MCP。
And then this is like signing up for the Figma MCP.
我确实想把所有事情都做一遍,包括数据分析。
I really am trying to do like everything, data analysis as well.
我觉得,设计转代码非常好。
I do think, oh, design to code, very, very good.
对吧?
Right?
所以,这里有一个Figma文件,拿去吧。
So like, here's a Figma file, take it.
而其他很多任务都是像知识工作那样,比如我手动点击,但这个不是。
And then this is where, like, a lot of other tasks is like knowledge work, like, my manual clicking, but this is no.
我通常会用Claude Code来做这个,但我觉得你对Chrome的集成更好。
I would normally use Claude Code for this, but because I perceive that you have better Chrome integration
嗯。
Mhmm.
我觉得你实际上能做得更好,这次它一次就搞定了我的会议网站。
I think you can actually do a better job of... And this one-shot my conference website.
这挺酷的。
That's pretty cool.
说真的,我挺想听听你对桌面应用里代码功能的看法的,那个功能我从来没用过,但其实是同一个团队做的。
Like, at some point, I would love to, like, hear how you feel about code in the desktop app, which is like... I never use it... which is by the same team.
所以我是在终端里用 Claude Code,我觉得这是使用 Claude Code 的默认方式。
So I use Claude Code in the terminal, which I perceive to be the default way of using Claude Code.
所以有一件事我想说,抱歉,我只是觉得,我好像不在状态。
So one thing this has sorry, I'm just like, I'm not here I'm not here.
我不确定外面的人想不想听我花一个小时宣传我的东西。
I'm not sure if people out there wanna like hear me advertise my stuff for like an hour.
请尽管这么做吧。
Please do that.
这个工具内置了一个浏览器,很多产品都有这个功能。
This thing has, a built in browser, which is a thing a lot of products have.
他说,是的,这是内置浏览器,我觉得让Claude能看到你实际在做什么,会让它变得高效得多。
He said, yeah, it's a built in browser, and I think giving Claude eyes into, like, what you're actually working on makes it so much more effective.
这可能就是你在Cowork里看到的那些功能。
And that's probably what you've seen in Cowork.
因为它能查看 Chrome,可以调试 DOM,能看到一些东西。
Because it can see Chrome, it can, like, debug the DOM, it can, like, see things.
这确实让它更强大。
That does make it more powerful.
是的。
Yeah.
所以我觉得我的认知模型有点混乱,因为我只用Cowork,是因为我以为它内置了浏览器功能。
So I think my mental model is kinda broken, because I only use Cowork because I thought it had a browser thing in it.
但我明白,Claude Code 应用程序,或者 Claude Code 的桌面版,确实是内置了浏览器的。
But I understand that the Claude Code app, or the app version of Claude Code does have a built in browser.
我见过这个预览功能。
I've seen I've seen this preview thing.
是的。
Yeah.
我只是从来没用过。
I just I've never used it.
但最终,你还是会遇到困难。
But in the end in the end, you sort of get like Hard.
是的。
Yeah.
你基本上得到的是同样的东西。
You basically get the same thing.
对吧?
Right?
就像你所描述的,额外的技能是,如果 Claude 能看到它正在处理的内容,它会表现得更好。
Like, the additional skill that you're describing is Claude is better if it can see what it's working on.
对吧?
Right?
这基本上就是这里的总结。
That's that's sort of like the summary here.
而且这实际上是利用了你的 Chrome
And like This is it's really using your Chrome
是的。
Yeah.
或者它只是在创建自己一个小浏览器,其实没什么区别,因为无论哪种方式,它都能看到自己正在处理的内容,而这只是让它变得更强大,然后是的。
Or it's just like making up its own little browser, it doesn't really make a big difference, because either way it's gonna see what it's working on, and that just makes it much... And then... Yeah.
你不需要替你的 Claude 做 QA。
You don't have to run QA for your Claude.
为什么它没有加载我现有的Claude Code会话?
Why doesn't it pick up my existing Claude Code sessions?
因为,我的意思是,我当然用过Claude Code,但是
Because I I mean, obviously, I've used Claude Code, but
非常好的问题。
Excellent question.
除了坦诚地说,我们真的没有好的答案。
Don't have a good answer other than like, we're honest.
就是还没实现。
Just haven't.
是的。
Yeah.
是的。
Yeah.
这就是OpenAI团队的做法。
This is what the OpenAI team does.
好的。
Okay.
不错。
Cool.
我没有什么其他的,我只是真的想拓展人们的思维,也可能让那些还没尝试过的人看到,我觉得我有时用这个的频率甚至超过了我的使用Dia的频率,这非常有趣。
I don't have other... like, I just do wanna expand people's minds and also maybe show people if they haven't really done it, but like, I think it's very interesting how I sometimes use this more than I use... I mean, I use Dia.
对吧?
Right?
是的。
Yeah.
我用过所有其他的智能浏览器,而Anthropic不需要自己构建一个智能浏览器,因为有Claude Cowork就足够了。
I and I use I've used, like, all the other agentic browsers, and Anthropic didn't have to build an agentic browser because you just had Claude Cowork, and that's enough.
是的。
Yeah.
我也觉得,目前我个人的优先事项可能更倾向于与市面上一些优秀的浏览器集成,而不是从零开始重新开发一个浏览器。
I also think, like, maybe integrating with the number of excellent browsers out there is, like, currently on my personal priority list a little higher than trying to rebuild a browser from scratch.
是的。
Yeah.
你知道,我从不绝对说永远不会,但我认为回到我们想将这个工具融入你整个现有工作流程的理念,我们的目标实际上并不是取代你电脑上的任何应用程序,而是与你的新工作流程很好地配合。
You know, I never say never, but I think going back to this idea of, like, we wanna plug this into your entire existing workflow, I think our goal is actually to not replace any of the applications you have on your computer, but instead like work really well with the new workflow.
打造新的那个。
Make the new one.
是的。
Yeah.
是的。
Yeah.
如今,尤其是在浏览器领域,大部分创新似乎都集中在用户人体工学上。
It seems that nowadays, especially on the browser, most of the innovation is like user ergonomics.
而不是底层的浏览器引擎。
It's not really like the underlying browser engine.
所以我觉得对Claude来说,用的是Dia、Chrome还是Atlas,其实都没什么区别。
So I feel like to Claude, it doesn't really matter if it's like Dia or Chrome or Atlas, whatever.
是的。
Yeah.
我们希望在您所在的地方与您相遇,这听起来显而易见,但确实如此,因为我不想通过限定‘只服务愿意更换浏览器的人’来人为限制我的潜在用户群。
We wanna we wanna meet you wherever you are, which is like like, obviously I would say that, but it's also just generally true because I don't wanna restrict my potential user base artificially by saying, okay, like I'm gonna start building for the people who are willing to switch browsers.
没错。
Right.
这涉及到很多诉讼,关于谁该掌控浏览器,还有大量资金因默认浏览器和默认搜索引擎的归属问题而易手。
That's such a like, you know, like many lawsuits have been filed over who gets to own the browser, and like a lot of money has changed hands over the question of like, which browser is default, and which search engine is default within the browser.
我只是想为……对。
I just wanna build for... yeah.
我想为swyx构建产品。
I wanna build for swyx essentially.
我想为那些有一系列烦人任务、觉得Claude可以帮他们完成的人构建产品。
Like, I wanna build for people who have a number of annoying tasks that they feel like maybe Claude could do
为他们完成。
it for them.
是的。
Yeah.
你对技能可移植性怎么看?
What do you think about skills portability?
我觉得有一件事。
I think there's been one thing.
我用另一个叫Zo的工具,它有点像云电脑加智能代理,我有一个技能可以把访客添加到办公室。
I use another thing called Zo, which is kinda like a cloud computer plus agent, and I have a skill to add visitors to the office.
是的。
Yeah.
所以每当有人需要在非工作时间进来时,他们都需要到楼下签到。
So whenever somebody has to come in, after hours, they need to check-in downstairs.
但我希望可以发个短信来完成这个操作。
But I wanna, like, text the thing.
所以这在Cowork里其实行不通。
So it doesn't really work in Cowork.
但现在这个技能放在Zo的框架里,而不在我的Cowork里。
But now that skill is in the Zo harness, and it's not in my Cowork thing.
如果我做了修改,就得手动同步两者。
And then if I make a change, I gotta sync them.
你觉得这种情况会怎么发展?
How do you see that going?
我觉得记忆就像是Claude的个人记忆。
Like, I see memory as like Claude personal.
换句话说,我并不希望我的记忆互相交叉。
Kinda like, I don't necessarily want my memories to be crossing.
是的
Yeah.
但我确实希望我使用的技能能在不同代理之间共享。
But I do want my skills to be cross agent that I use.
我认为在MCPs中,人们也做同样的事情。
I think with MCPs, people do the same thing.
就像是,哦,MCP网关,MCP注册中心。
It's like, oh, MCP gateway, MCP registry.
我不太确定这是否能成为一个商业模式。
I don't really know if that's like a business.
所以我很好奇,你在这方面有没有什么想法?
So I'm curious, like, if you've had any thoughts in the area.
对我来说,这让我回到了最基础的原语。
I think for me, this is sort of where I go back to the really basic primitives.
我们的技能是基于文件的,而不是那种存在于某个地方、非常专有的复杂系统。
Our skills are file-based, instead of like this complicated thing that exists inside a place somewhere that is like super proprietary.
我非常倾向于认为,这一切其实就是文件和文件夹,这本身就让它们非常易于移植。
I'm really leaning into the idea of like it's all just files and folders, and that makes it very portable on its own.
对吧?
Right?
我们确实将技能作为这种容器格式的一部分,我们称之为插件。
We do have skills as part of this container format, which we just call plugins.
嗯哼。
Mhmm.
这些插件同时适用于Claude Code和Claude Cowork,格式相同。
And plugins are available both for Claude Code and Claude Cowork, the same format.
你可以安装插件。
And you can install plugins.
在Cowork中,这目前已经可以使用。
This works in Cowork today.
你可以直接说,要把整个GitHub仓库作为技能市场或插件市场添加进来。
You can basically say, I'm gonna add a whole, like, just a GitHub repo as a skills marketplace or like a plugin marketplace.
这就是我们实现可移植性的方式。
And that's how we're doing portability.
我认为在如何让人们轻松知道自己可以编写技能这方面,我们还有很大的提升空间。
I think we have a lot of room left to grow in how do we make it easy for people to know that they can write skills?
我们该如何让他们轻松地与你分享一个技能呢?
How do we make it easy for them to just like share a skill with you?
因为显然,我刚才说的这些话。
Because obviously, all the words I just said.
对吧?
Right?
我正在失去大部分的知识工作者群体。
Like, I'm losing most of the knowledge worker base out there.
没错。
Right.
要说你可以连接到 GitHub 仓库,这确实有点难传达。
And it's hard to say, oh, you can connect to a GitHub repo.
这并不是大多数人在通用知识工作者领域中最终会采用的工作方式。
It's not exactly how most people will end up working in like a general knowledge worker space.
但我认为这其中有一些值得挖掘的东西。
But I think there's something there.
还有一件事,我认为尚未被充分探索,那就是技能中哪些部分具有高度可移植性,而哪些部分则非常个人化。
And another thing that's there that I think has not really been properly explored is the the the combination of which part of the skill is very portable, and then which part of the skill is like very personal to you.
对吧?
Right?
我认为,作为整个行业,我们至今尚未真正解决这个问题。
And I think that's something we haven't really solved yet as an industry.
嗯。
Mhmm.
因为,你希望在技能中引入更多结构,还是始终区分公开技能、私有技能,或者搭配使用?
Because like, do you wanna introduce more structure to the skill, or always have like public skill, private skill, you know, pairs?
是的。
Yeah.
是的。
Yeah.
有点吧。
Kind of.
我认为最简单的方法是我们使用字符串插值之类的方式。
I think there's like a like the easiest way to do this, which is we do, like, use string interpolation or something.
对吧?
Right?
没错。
Right.
是的。
Yeah.
对。
Yeah.
插入用户名,插入电话号码,插入已知的文件夹路径,诸如此类的东西。
Insert username here, insert, like, phone number, insert, like, known folder locations, that kind of stuff.
这可能有点笨拙。
That's probably clunky.
这就是我们没有开发它的原因。
That's why we haven't built it.
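Editor's sketch: the "string interpolation" idea floated above, which the speaker calls clunky and says has not been built, would amount to a shared skill with placeholders plus a personal values file. The snippet uses Python's standard `string.Template`; the skill text and the field names `username`, `phone`, and `visitor_log_folder` are invented for illustration.

```python
from string import Template

# Portable part: safe to share. Placeholders mark the personal part.
SHARED_SKILL = Template(
    "When you text the front desk, sign off as $username.\n"
    "Give $phone as the callback number.\n"
    "Save visitor logs under $visitor_log_folder.\n"
)


def personalize(skill: Template, personal: dict[str, str]) -> str:
    """Fill personal values into a shared skill; KeyError if one is missing."""
    return skill.substitute(personal)
```

Using `substitute` (not `safe_substitute`) means a missing personal value fails loudly instead of leaving a raw placeholder in the text the model reads.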
但我确实认为有人会想出一种有趣的方式来保留我们喜欢的技能特性。
But I do think someone is going to come up with, like, an interesting way to keep everything we like about skills.
可移植性就是一个文件。
The portability is just a file.
就是 Markdown。
It's just markdown.
其实就是文字,说实话。
It's just text, honestly.
写一个文本文件里的文字。
Write like a text file words.
完全没有结构,这意味着你不需要任何教程就能编写一个技能。
The complete lack of structure, which means you don't need any kind of tutorial to write a skill.
就像你向我解释一样向Claude解释它,Claude可能会在我之前就理解了。
Just like explain it to Claude the way you would explain it to me, and Claude will probably get it before I work in.
对吧?
Right?
比如预订航班,就像我们向今天刚入职的人解释一样,告诉Claude如何预订航班。
You just, like, for booking a flight, tell Claude how to book a flight the same way you would tell someone who just started working here today.
但再结合一些非常个人化的东西。
But combine that with a very, like, personal thing.
也许我们还是用预订航班的例子吧。
Maybe we'll stick with the booking a flight example.
实际上,我不认为AI应该去预订航班。
I don't actually think AIs should be booking flights.
我认为我们现有的工具是可行的。
I think the tools we have is Yes.
可能性。
Possibility.
是的。
Yeah.
终于有人说了。
Finally, somebody says it.
这是每个人都在做的默认演示。
It's the default demo that everyone was making.
我就是觉得,我对航班预订的演示有点抵触。
And I'm like, I'm leaning against flight booking demos.
这确实是个不错的展示。
It's a good showcase.
是的。
Yeah.
我就想自己订机票。
I'm like, I just wanna book my flight myself.
但没错。
But Right.
我认为很多事情都既有个人成分,也有非个人成分,这或许就是为什么人们总爱拿航班预订来举例。
I think there's a lot of things that have a personal and a non personal component, and that's maybe why people reach for flight booking.
因为有些事情是非常普遍的。
Because some things are very universal.
更便宜的航班通常更好。
Cheaper flight is usually better.
对吧?
Right?
很少有人会特意选择最贵的航班。
Like, few people try to book the most expensive flight.
而有些事情则非常个人化,比如你偏好的时间、座位和机场。
And then some things are quite personal about, like, what times you prefer, which seats you prefer, which airports you prefer.
将这些整合成一种技能形式,真正可移植、兼容且易于理解。
Combining that in like a skill format that is actually portable, compatible, easy to understand for people.
我认为这会非常令人兴奋。
I think that would be very exciting.
我们只是还没找到解决方案而已。
We just haven't figured it out yet.
是的。
Yeah.
我觉得文本部分,现在每个人应该都有某种云文件服务。
I think the text part... I think everybody by now has some sort of, like, cloud file thing.
无论是Dropbox、Google Drive,还是别的什么。
Either Dropbox, Google Drive, whatever.
所以某种程度上,感觉应该把我的技能符号链接到我所有的智能体框架中。
So it feels like in a way, it should basically like symlink my skills into all my agent harnesses.
对。
Yeah.
让它们保持同步。
Just keep those in sync.
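Editor's note: the "symlink my skills into all my agent harnesses" idea can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how any particular agent product works: the function name `link_skills`, the flat folder of `.md` skill files, and the per-harness directories are all hypothetical.

```python
from pathlib import Path


def link_skills(source: Path, harness_dirs: list[Path]) -> list[Path]:
    """Symlink every Markdown skill in `source` into each harness directory.

    `source` stands in for a synced cloud folder and `harness_dirs` for the
    skill folders of individual agents; both are illustrative assumptions.
    """
    created = []
    for harness in harness_dirs:
        harness.mkdir(parents=True, exist_ok=True)
        for skill in sorted(source.glob("*.md")):
            link = harness / skill.name
            if link.is_symlink():
                link.unlink()  # refresh a stale or broken link
            elif link.exists():
                continue  # never overwrite a real file the user put there
            link.symlink_to(skill.resolve())
            created.append(link)
    return created
```

Because every harness sees a link rather than a copy, editing the skill once in the source folder updates it everywhere, which is the "keep those in sync" property being asked for.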
比如,我们内部有一个价值令牌仓库,里面包含了所有命令和子智能体。
Like, we have internally this like valuable tokens repo, which is like all the commands and sub agents.
还不错。
It's good.
然后我搞了个TUI,你可以启动它,比如安装这个命令和三个子代理到这个代理和这个文件夹里,直接复制粘贴就行。
And then I built, like, a TUI where you can start and be, like, you know, install this command and the three sub agents into this agent and this folder and just copy paste this.
但它什么也不做。
It doesn't do anything.
它真的只是把文件复制过去。
It literally, like, cp's the file into that.
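Editor's note: the copy-based installer described here amounts to something like the following. It is a rough sketch only; the real TUI is not shown in the conversation, and the function name `install`, the repo layout, and the file names are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def install(repo: Path, names: list[str], dest: Path) -> list[Path]:
    """Copy the selected command and sub-agent files from `repo` into `dest`."""
    copied = []
    for name in names:
        src = repo / name
        if not src.is_file():
            raise FileNotFoundError(f"{name} is not in {repo}")
        target = dest / name
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # a plain copy, like `cp`; no link back to the repo
        copied.append(target)
    return copied
```

The trade-off of copying is that each installed file is an independent snapshot: later edits in the shared repo do not reach it until the files are installed again.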
我觉得应该有个类似的东西,每次我进入一个新项目时,它会说:嘿。
I feel like there should be something similar where, like, whenever I go into a new thing, it's like, hey.
这是那个云文件夹的链接,直接把这些技能下载下来。
Here's, like, the link to exactly the cloud folder and just bring down these skills into this.
比如现在,它并不能这样工作。
Like, today, it doesn't quite work like that.
比如,如果我安装了一个新代理,我必须手动复制粘贴所有技能,而且我甚至都不知道它们在哪。
Like, if I install a new agent, I cannot I have to, like, copy paste all the skills, and I don't even know where they are.
是的。
Yeah.
这正是主要问题。
That's like the big problem.
我到哪里去找它们呢?
It's like, where do I find them?
是的。
Yeah.
所以我很想知道,未来我的个人生产力可能就体现在我的技能上。
So I'm curious, like, in the future, like, that that almost feels like my personal productivity thing will be my skills.
是的。
Yeah.
真正重要的不是我使用的工具,因为每个人都能使用同样的工具。
It's not really the product that I use because everybody has access to the same product.
但今天,这看起来就像是复制粘贴 MD 文件。
But today, that just looks like copy-pasting MD files.
我觉得有太多事情了。
I think so many things.
我真的很喜欢把智能体和大语言模型看作是另一个同事。
I I really like thinking about agents and LLMs just as, another coworker.
已经有很多尝试想要建立文档公司,声称要解决你所有的文档问题。
So many attempts have been made to build documentation companies that are like, oh, we're gonna solve all your documentation problems.
我自己也曾在Notion工作过一段时间。
I myself, like, spent a little bit of time working at Notion.
对吧?
Right?
我非常熟悉让所有人都达成一致这个概念。
I'm like deeply familiar with the concept of let's get everyone on the same page.
嗯。
Mhmm.
对吧?
Right?
你在这里基本上是说,你希望你所有的智能体都能在你的偏好、技能以及工作方式和执行方式上保持一致。
And what you're basically saying here is you want all your agents to be on the same page about your preferences, about the skills, about the way they ought to work, and like how they ought to execute.
我不确定什么才是正确的做法。
And I'm not sure what the right thing is going to be.
如果是一家公司能够说:好吧,我们作为一个独立实体,并不试图强行介入任何特定产品。
If it's going to be some some company that can say, alright, we're as an independent body, we're not trying to like push into any particular product.
我们的职责是成为技能权威,提供某种东西,比如说,我们要成为技能领域的Dropbox,可以直接把我们符号链接到他们想使用的任何产品中。
It's our job to be like the skill authority, and we provide, I don't know, we're gonna be the Dropbox of skills, and we can just symlink us into all the products they wanna use.
我不确定这个想法能否成为可行的商业模式,但作为一个构想,它会很酷。
I'm not sure that's gonna be a viable business, but as an idea, it would be cool.
对吧?
Right?
是的。
Yeah.
对。
Yeah.
我觉得有太多事情正在作为业务消失。
I think so many things are just going away as businesses.
那我该怎么处理呢?
It's like, how am I supposed to do it?
我甚至没要求别人为此开发一个产品。
I'm not even asking somebody to make a product about it.
我只是想亲自了解。
Like, I wanna personally know.
就像你说的,有些事情你几乎需要一种技能,然后在个人和工作之间进行调和。
And there's things, like you said, it's like, you almost want a skill and then interpolate it between personal and work.
所以,如果我为工作订机票,和为自己订机票是不一样的。
So if I'm booking a flight for work, it's different than I'm booking a flight personally Yeah.
在某些方面。
In some ways.
是的。
Yeah.
但很多基础结构其实是相同的。
But, like, a lot of the scaffolding is the same.
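Editor's note: one way to picture "the scaffolding is the same, the preferences differ" is a shared base skill plus a small overlay per context. The sketch below assumes skills are plain text and simply appends the overlay; the function name `compose_skill` and the sample wording are hypothetical and not taken from any product.

```python
def compose_skill(base: str, overlays: dict[str, str], context: str) -> str:
    """Return the shared skill text followed by the preferences for `context`."""
    if context not in overlays:
        raise KeyError(f"no overlay defined for context {context!r}")
    return (
        base.rstrip()
        + f"\n\nPreferences ({context}):\n"
        + overlays[context].strip()
        + "\n"
    )


base = "How to book a flight:\nPrefer the cheaper fare when options are otherwise equal.\n"
overlays = {
    "personal": "Window seat, morning departures.",
    "work": "Follow the company travel policy.",
}
work_skill = compose_skill(base, overlays, "work")
```

The universal part (cheaper is usually better) lives in one place, while the personal and work variants stay a couple of lines each.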
你知道的?
You know?
不错。
Cool.
我是说,作为工程师,说句实在话,我会直接说:用符号链接。
I mean, as an engineer, I will tell you, like, you know, person to person, I would just be like: symlinks.
这正是我对 CLAUDE.md 和 AGENTS.md 的做法。
Well, that's what I do with CLAUDE.md and AGENTS.md.
用符号链接也是一样的道理。
It's just the same with a symlink.
所以它确实有效,但感觉就是这样。
And so it's like that works, but it feels like yeah.
我不知道。
I don't know.
也许吧。
Maybe.
我们总是可以再往上一层。
We can always go one level up.
你可以随时把问题告诉Cowork,然后Cowork会帮你解决。
You can always tell Cowork your problem, and then Cowork will solve it for you.
它会直接创建符号链接。
It just makes the symlinks.
这是一种可以采取的方式。
That's like one way to do it.